
Scaling Apache Kafka Clusters on Confluent Cloud ft. Ajit Yagaty and Aashish Kohli



https://cnfl.io/podcast-episode-214 | How much can Apache Kafka® scale horizontally, and how can you automatically balance, or rebalance, data to ensure optimal performance?

You may require the flexibility to expand or shrink your Kafka clusters based on demand. With experience engineering cluster elasticity and capacity management features for cloud-native Kafka, Ajit Yagaty (Confluent Cloud Control Plane Engineering) and Aashish Kohli (Confluent Cloud Product Management) join Kris Jenkins in this episode to explain how the architecture of Confluent Cloud supports elasticity.

Kris suggests that optimal elasticity is like water from a faucet: you should be able to quickly obtain as many resources as you need, but at the same time you don't want the slightest amount to go to waste. But how do you specify the amount of capacity by which to adjust, and how do you know when it's necessary?

Aashish begins by explaining how elasticity on Confluent Cloud has come a long way since the early days of scaling via support tickets. It's now self-serve and can be accomplished by dialing the desired number of CKUs (Confluent Units for Kafka) up or down. A CKU corresponds to a specific amount of Kafka resources and is consistent across all three major clouds. You can specify the number of CKUs you need via the API, the CLI, or the Confluent Cloud UI.

Ajit explains in detail that, once you have made your request, cluster resizing is a two-step process. First, capacity is added, and then your data is rebalanced. Rebalancing data on the cluster is critical to getting optimal performance from the available capacity. The amount of time it takes to resize a Kafka cluster depends on the number of CKUs being added or removed, as well as the amount of data to be rebalanced.
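
To make the two-step resize concrete, here is a minimal Python sketch of the idea. It is a toy model, not Confluent Cloud's implementation: the broker names, the partition-count balancing rule, and the functions add_capacity and rebalance are all hypothetical, chosen only to illustrate "add capacity first, then rebalance".

```python
def add_capacity(assignment, new_brokers):
    """Step 1 (hypothetical): register new, empty brokers. No data moves yet."""
    brokers = {b: list(parts) for b, parts in assignment.items()}
    for b in new_brokers:
        brokers.setdefault(b, [])
    return brokers


def rebalance(brokers):
    """Step 2 (hypothetical): move partitions from the fullest broker to the
    emptiest one until partition counts differ by at most one."""
    brokers = {b: list(parts) for b, parts in brokers.items()}
    moves = 0
    while True:
        fullest = max(brokers, key=lambda b: len(brokers[b]))
        emptiest = min(brokers, key=lambda b: len(brokers[b]))
        if len(brokers[fullest]) - len(brokers[emptiest]) <= 1:
            return brokers, moves
        brokers[emptiest].append(brokers[fullest].pop())
        moves += 1


# Toy cluster: 12 partitions spread over 2 brokers.
before = {"b0": list(range(0, 6)), "b1": list(range(6, 12))}

# Step 1: capacity is added, but the new brokers hold nothing yet.
expanded = add_capacity(before, ["b2", "b3"])

# Step 2: data is rebalanced so the new capacity is actually used.
after, moves = rebalance(expanded)

sizes = sorted(len(parts) for parts in after.values())
print(sizes, moves)  # [3, 3, 3, 3] 6
```

In this toy model the new brokers do no work until the rebalance step runs, and the number of moves grows with the amount of data that has to shift, which mirrors why resize time depends on both the capacity change and the data to be rebalanced.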

Of course, to request more or fewer CKUs in the first place, you have to know when it's necessary for your Kafka cluster(s). This can be challenging, as clusters emit a large variety of metrics. Fortunately, there is a single composite metric that you can monitor to help you decide, as Ajit explains in the episode.

Other topics covered by the trio include an in-depth explanation of how Confluent Cloud achieves elasticity under the hood (separate control and data planes, along with some Kafka dogfooding), future plans for autoscaling elasticity, scenarios where elasticity is critical, and much more.

EPISODE LINKS
► Shrink a Dedicated Kafka Cluster in Confluent Cloud: https://cnfl.io/cluster-shrinking-episode-214
► Elastic Apache Kafka Clusters in Confluent Cloud: https://cnfl.io/project-metamorphosis-episode-214
► Kris Jenkins’ Twitter: https://twitter.com/krisajenkins
► Streaming Audio Playlist: https://www.youtube.com/playlist?list=PLa7VYi0yPIH1B0i7mhzVi78TIkKSd-0vE
► Join the Confluent Community: https://cnfl.io/join-community-episode-214
► Learn more with Kafka tutorials, resources, and guides: https://cnfl.io/confluent-developer-episode-214
► Live demo: Intro to Event-Driven Microservices with Confluent: https://cnfl.io/event-driven-microservices-demo-episode-214
► Use PODCAST100 to get $100 of free Confluent Cloud usage: https://cnfl.io/try-cloud-episode-214
► Promo code details: https://cnfl.io/podcast100-details-episode-214

TIMESTAMPS
00:00 - Intro
1:30 - What is elasticity?
4:04 - Elasticity in the Cloud
7:17 - Kafka cluster performance metrics
9:01 - Self-service ability
11:12 - What does it take to expand a cluster with Kafka?
14:16 - Confluent for Kubernetes
22:31 - Architecture Overview
26:43 - Self-balancing cluster
28:59 - Cluster data rebalancing
29:43 - Cluster expansion/shrink behaviors
36:58 - User experience
42:15 - What's next
47:14 - It's a wrap

CONNECT
Subscribe: https://youtube.com/c/confluent?sub_confirmation=1
Site: https://confluent.io
GitHub: https://github.com/confluentinc
Facebook: https://facebook.com/confluentinc
Twitter: https://twitter.com/confluentinc
LinkedIn: https://www.linkedin.com/company/confluent
Instagram: https://www.instagram.com/confluent_inc

ABOUT CONFLUENT
Confluent is pioneering a fundamentally new category of data infrastructure focused on data in motion. Confluent’s cloud-native offering is the foundational platform for data in motion – designed to be the intelligent connective tissue enabling real-time data, from multiple sources, to constantly stream across the organization. With Confluent, organizations can meet the new business imperative of delivering rich, digital front-end customer experiences and transitioning to sophisticated, real-time, software-driven backend operations. To learn more, please visit www.confluent.io.

#cloudnative #apachekafka #kafka #confluent