Scaling MySQL in the cloud with Vitess and Kubernetes

[Cross-posted from the Google Cloud Platform Blog]

Your new website is growing exponentially. After a few rounds of high fives, you start scaling to meet this unexpected demand. While you can always add more front-end servers, eventually your database becomes a bottleneck, which leads you to . . .

  • Add more replicas for better read throughput and data durability
  • Introduce sharding to scale your write throughput and let your data set grow beyond a single machine
  • Create separate replica pools for batch jobs and backups, to isolate them from live traffic
  • Clone the whole deployment into multiple datacenters worldwide for disaster recovery and lower latency
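To make the first two items concrete, here is a minimal sketch of how an application might route queries once it has shards and replicas. It is an illustration only, not how Vitess does it: the Shard class, the shard_for and route functions, and the hash-modulo scheme are all assumptions made up for this example.

```python
import hashlib
from itertools import cycle


class Shard:
    """One shard: a single primary for writes plus a pool of read replicas."""

    def __init__(self, name, primary, replicas):
        self.name = name
        self.primary = primary
        self.replicas = list(replicas)
        # Round-robin iterator so reads are spread across the replicas.
        self._replica_cycle = cycle(self.replicas)

    def next_replica(self):
        return next(self._replica_cycle)


def shard_for(key, shards):
    """Pick a shard by hashing the key (hypothetical hash-modulo scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


def route(key, is_write, shards):
    """Send writes to the shard's primary and reads to one of its replicas."""
    shard = shard_for(key, shards)
    return shard.primary if is_write else shard.next_replica()


if __name__ == "__main__":
    shards = [
        Shard("shard-0", "primary-0", ["replica-0a", "replica-0b"]),
        Shard("shard-1", "primary-1", ["replica-1a", "replica-1b"]),
    ]
    print(route("user:42", True, shards))   # always the same primary for this key
    print(route("user:42", False, shards))  # one of that shard's replicas
```

Even this toy version shows the cost: every query path in the application has to know about shards, primaries, and replicas, and a hash-modulo scheme forces data to move whenever the shard count changes.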

At YouTube, we went on that journey as we scaled our MySQL deployment, which today handles the metadata for billions of daily video views and 300 hours of new video uploads per minute. To do this, we developed the Vitess platform, which addresses scaling challenges while hiding the associated complexity from the application layer.

Vitess is available as an open-source project and runs best in a containerized environment. With Kubernetes and Google Container Engine as your container cluster manager, it's now a lot easier to get started. We’ve created a single deployment configuration for Vitess that works on any platform that Kubernetes supports.

In addition to being easy to deploy in a container cluster, Vitess also takes full advantage of the benefits offered by a container cluster manager, in particular:

  • Horizontal scaling – add capacity by launching additional nodes rather than making one huge node
  • Dynamic placement – let the cluster manager schedule Vitess containers wherever it wants
  • Declarative specification – describe your desired end state, and let the cluster manager create it
  • Self-healing components – recover automatically from machine failures
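The declarative and self-healing items share one underlying idea: compare the desired state with the observed state and act on the difference. The sketch below shows that idea in its simplest form. It is not the Kubernetes API; the reconcile function and the component names ("db-replica", "query-router", "old-job") are hypothetical and exist only for this example.

```python
def reconcile(desired, actual):
    """Return the actions needed to move `actual` toward `desired`.

    Both arguments map a component name to a replica count. This is a
    simplified, hypothetical model, not a real cluster manager interface.
    """
    actions = []
    for name in sorted(set(desired) | set(actual)):
        want = desired.get(name, 0)
        have = actual.get(name, 0)
        if want > have:
            actions.append(("start", name, want - have))
        elif want < have:
            actions.append(("stop", name, have - want))
    return actions


if __name__ == "__main__":
    desired = {"db-replica": 3, "query-router": 2}
    # One db-replica was lost to a machine failure; an old job is still running.
    actual = {"db-replica": 2, "query-router": 2, "old-job": 1}
    print(reconcile(desired, actual))
    # [('start', 'db-replica', 1), ('stop', 'old-job', 1)]
```

Running a check like this repeatedly is what makes recovery automatic in this model: a failed machine lowers the observed count, and the next pass asks for a replacement without anyone editing the desired state.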

In this environment, Vitess provides a MySQL storage layer with improved durability, scalability, and manageability.

We're just getting started with this integration, but you can already run Vitess on Kubernetes yourself. For more on Vitess, check out vitess.io, ask questions on our forum, or join us on GitHub. In particular, take a look at our overview to understand the trade-offs of Vitess versus NoSQL solutions and fully managed MySQL solutions like Google Cloud SQL.

-Posted by Anthony Yeh, Software Engineer, YouTube
