Why Morningstar Moved to the Cloud: 97% Cost Reduction

Enterprises won't move to the cloud. If they do, it's tantamount to admitting their IT group sucks. That has been the common wisdom. Morningstar, an investment research provider, is moving to the cloud and they're about as enterprisey as it gets. And they don't strike me as incompetent; they just don't want to worry about all the low-level IT stuff anymore.

Mitch Shue, Morningstar's CTO, gave a short talk at AWS Summit Series 2017 on their move to AWS. It's not full of nitty-gritty technical details. That's not the interesting part. The talk is more about their motivations, the process they used to make the move, and some of the results they've experienced. While that's more interesting, we've heard a lot of it before.

What I found most interesting was the idea of Morningstar as a canary test. If Morningstar succeeds, the dam might burst and we'll see a lot more adoption of the cloud by stodgy mainstream enterprises. It's a copycat world. That sort of precedent gives other CTOs the cover they need to make the same decision.

The most important idea in the whole talk: the cost savings of moving to the cloud are nice, but what they were more interested in is "creating a frictionless development experience to spur innovation and creativity."

Software is eating the world. Morningstar is no doubt looking at the future and sees the winners will be those who can develop the best software, the fastest. They need to get better at developing software. Owning your own infrastructure is a form of technical debt. Time to pay down the debt and get to the real work of innovating, not plumbing.

Here's my gloss of the talk:

Infrastructure

  • 8 datacenters across the world
  • 11,318 application servers
  • 7,894 database instances
  • 4PB storage

Time for a Change

  • Imagine the effort that goes into patching, upgrading, managing, and securing 8 global datacenters. It's very complex to run.
  • The problem is not their IT staff. They are happy with IT.
  • Want to maximize talents, own less infrastructure, and focus on core business.
  • Want to simplify and reduce complexity.
  • Running datacenters is not a differentiator for them.

Changing the Culture

  • Transitioning to the cloud required aligning 1400 technologists from across the world. Not easy to do.
  • There are many semi-autonomous groups from around the world, each with their own individual road maps and team goals.
  • Established high level technology goals: maximize talents, own less infrastructure, reduce complexity, improve product completeness, product security, product recoverability, product reliability, better uptime, better monitoring, and faster incident response.
  • Goals were chosen to be easy to repeat, simple to understand, and non-expiring.
  • Don't be passive about your culture. Weave these goals into your culture through intentional repetition: lots of meetings, blog posts, presentations, and conversations.
  • Moving to AWS supports each of these goals.

Strategy: Move Fast then Optimize

  • The Data Collection team initially moved over to AWS using a lift and shift approach. That got them into the cloud quickly.
  • Once everything was working they began to reduce complexity.
  • Replaced SQL Server with RDS for PostgreSQL.
  • Added Kinesis for messaging, so they'd have better control over workload distribution.
  • Since their application was only active 2-4 hours a day they replaced all EC2 instances with Lambda functions.
  • Committed to a strategy of creating and destroying the entire compute environment on demand.
  • Moved their data lake to S3.
  • Continued to iterate to reduce complexity.
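
The talk doesn't show any code, so here's my own minimal sketch of what the Kinesis-plus-Lambda pattern generally looks like in Python. It is not Morningstar's implementation. The handler follows the standard shape of a Lambda function fed by a Kinesis event source, where each record's payload arrives base64-encoded. The JSON payload format and the process_message helper are assumptions made up for illustration.

```python
import base64
import json


def process_message(message):
    # Hypothetical placeholder for the real work (parse, validate, store).
    # The talk does not describe what Morningstar does at this step.
    print(f"processing message: {message}")


def handler(event, context):
    """Lambda entry point invoked with a batch of Kinesis records."""
    processed = 0
    for record in event.get("Records", []):
        # Kinesis payloads arrive base64-encoded under record["kinesis"]["data"].
        payload = base64.b64decode(record["kinesis"]["data"])
        # Assumption: producers write JSON. Swap in your own decoder if not.
        message = json.loads(payload)
        process_message(message)
        processed += 1
    return {"processed": processed}
```

The appeal is that nothing here runs, or bills, until records show up on the stream, which is what you want for a workload that's only busy a few hours a day.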

Result: 97% Cost Reduction

  • Of course this project was a classic Lambda use case. With so much idle time, full time EC2 instances didn't make a lot of sense.
  • But then again you could imagine many such potential optimizations across an enterprise.
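
To put a rough number on the idle time, here's a back-of-the-envelope sketch using only the 2-4 active hours a day mentioned above. It covers the idle fraction of an always-on server and nothing else. The talk doesn't break down where the 97% comes from, so don't read this as a reconstruction of that figure.

```python
HOURS_PER_DAY = 24


def idle_fraction(active_hours_per_day):
    """Fraction of the day an always-on server sits idle."""
    return 1 - active_hours_per_day / HOURS_PER_DAY


# The talk says the application was only active 2-4 hours a day.
for active in (2, 4):
    print(f"{active} active hours/day -> {idle_fraction(active):.0%} idle")
```

So a full-time instance for this workload would be idle roughly 83% to 92% of the time, which goes a long way toward explaining why pay-per-invocation looked so attractive.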

The Future

  • By 2017 core data APIs will move terabytes of data and process millions of messages a day.
  • By 2018 they'll store more than 2PB of data and process over 2 billion messages a day in support of over $500 million of product revenue.
  • Goal of 70% reduction in own infrastructure by 2020.
  • Closing Sydney datacenter in 2018. Shrinking Shenzhen footprint by moving non-production environments out.
  • Created a cloud center of excellence to rethink areas like security, deployment, and operations.
  • Will launch retraining programs to invest in and modernize skill sets of in-house people (nice!).
  • Cost efficiencies are great, but they're also interested in creating a frictionless development experience to spur innovation and creativity. Seems like a big reason they made the move.

Why AWS?

  • This section of the talk sounds like an AWS press release (this was an AWS conference after all), but it's a good example of how an enterprise CTO would think about things.
  • Liked AWS for pace of innovation, breadth of services, and professional support.
  • Moving to AWS and owning less infrastructure means more capacity, more security, more reliability, more peace of mind, and more freedom to innovate.
