With WarpStream, Confluent Got a New Type of Kafka Platform
Mountain View, California-based Confluent offers the Kafka streaming software on an enterprise platform. It also offers a cloud version, Confluent Cloud. And now it has a third way to deliver Kafka, thanks to its acquisition of WarpStream last month.
Considering the buzz at the company’s annual Current user conference, held in Austin, this buy was anything but a routine acquisition of a potential competitor.
WarpStream developed its own “Kafka-compatible,” real-time data streaming distribution, based loosely on (Buzzword Alert!) the Bring-Your-Own-Cloud (BYOC) model, though WarpStream’s founders insist it’s not standard BYOC.
Confluent, as always, is bullish on streaming data, and at this year’s conference, it reinforced the idea that most IT systems will eventually move from a batch-processing mindset to analyzing and processing streaming data. The company’s latest data streaming report found that 91% of IT leaders plan to move their organizations further toward data streaming.
WarpStream fits a particular niche: it is perfect for logging, observability and feeding data lakes, all cases where a little bit of latency is an acceptable tradeoff for large cloud cost savings, Confluent execs noted.
“Our goal is to make data streaming the central nervous system of every company, and to do that we need to make it something that is a great fit for a vast array of use cases and companies,” noted Jay Kreps, CEO and co-founder of Confluent, in a company statement about the acquisition.
WarpStream Ahead
WarpStream’s approach provides customers with a set of containers that make up a fully scalable Kafka deployment, along with supporting tools.
It is managed under what is called a “shared responsibility” model of cloud computing, where both the customer and vendor share in the responsibility of maintaining the application.
Unlike other BYOC approaches, however, it is up to the customer, not the service provider, to scale to the needed level of service. But that can be done automatically.
In a press conference at the event, WarpStream co-creator Ryan Worl said he did not feel comfortable with a business model in which the vendor could scale up operations on the customer’s behalf. So that part is left to the customer.
“The thing we realized is that we could avoid a lot of the problems with the traditional BYOC model by re-architecting the underlying system in a different way,” explained WarpStream co-founder Richard Artoul during the Current keynote.
In this setup, the control plane and data plane are split, with WarpStream managing the control plane from its own cloud account, which handles transactions, consumer groups, metadata and consensus.
The customer manages the data plane, which is a set of containerized agents, each basically the equivalent of a stateless Kafka broker, handling batching, caching and the raw TCP protocol. Scaling up simply involves adding more containers, without the messy work of rebalancing partitions. The entire operation can be managed with Kubernetes and a set of Helm charts.
“We don’t need access to the customer’s environment because there are no stateful components to manage in the first place,” Artoul said. All the customer’s data remains within the customer’s system.
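Because the agents are stateless, scaling up really is just a request for more replicas. Here is a minimal sketch of that operation using the official Kubernetes Python client; the Deployment name and namespace are hypothetical stand-ins, not WarpStream’s actual chart values:

```python
# A sketch of scaling stateless agents: just ask Kubernetes for more replicas.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() when running in a pod
apps = client.AppsV1Api()

# Because the agents hold no state, adding capacity is a one-line change:
# no partition reassignment, no data migration between brokers.
apps.patch_namespaced_deployment_scale(
    name="warpstream-agent",      # hypothetical Deployment name
    namespace="streaming",        # hypothetical namespace
    body={"spec": {"replicas": 12}},
)
```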
Get Down on the Storage Bucket
But wait! This setup gets even more unusual!
Everything is run from an Amazon Web Services (AWS) S3 storage bucket.
“Object storage is such a solid API that you can depend on. We don’t need to have any access in the customer’s cloud account, no cross-account permissions or anything like that to manage stateful infrastructure in the old style BYOC,” Worl said. AWS makes sure the data is durable and always available.
“That’s why the old-style BYOC vendors need those permissions to punch into your cloud account, get root on the machines if they need to fix something when it goes wrong,” Worl added. In a modern setup, this can sometimes involve granting hundreds of different sets of permissions to the vendor, something any good CSO would bristle at.
WarpStream eliminates all of this through what the company called a “zero disk architecture.”
There are no local disks, no disk caches and no Amazon Elastic Block Store (EBS) volumes used in this setup.
The agents never touch disk. Instead, they operate entirely out of object storage.
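Conceptually, the write path reduces to something like the sketch below: records accumulate in the agent’s memory, are flushed as a single object to S3, and the object’s location is then reported to the control plane. This is an illustrative sketch of the idea, not WarpStream’s actual code; the bucket name and commit_metadata() are hypothetical stand-ins:

```python
import time
import uuid
import boto3  # assumes AWS credentials are already configured

s3 = boto3.client("s3")
BUCKET = "customer-owned-bucket"  # hypothetical: lives in the customer's account

def flush_batch(records: list[bytes]) -> str:
    """Write an in-memory batch as one immutable S3 object.
    Memory in, object storage out: no local disk is ever touched."""
    key = f"segments/{int(time.time() * 1000)}-{uuid.uuid4().hex}"
    s3.put_object(Bucket=BUCKET, Key=key, Body=b"".join(records))
    return key

def commit_metadata(key: str, record_count: int) -> None:
    """Hypothetical stand-in for reporting the new object's location
    to the vendor-run control plane, which sequences offsets."""
    pass

# An agent's core loop, reduced to its essence:
batch = [b"event-1\n", b"event-2\n", b"event-3\n"]
object_key = flush_batch(batch)
commit_metadata(object_key, len(batch))
```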
Sure, it is a “bit slower” than a conventional setup running on standard AWS compute instances with attached disks, Artoul explained. But it is a lot simpler and much, much cheaper.
For one, storing data on EBS-based disks rather than in object storage is about 25 times more expensive, once replication is factored in.
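Some back-of-the-envelope arithmetic shows how a figure of that size can arise. The prices below are approximate AWS list prices used purely for illustration, not numbers from Confluent or WarpStream:

```python
# Illustrative arithmetic only, using approximate AWS list prices.
s3_per_gb_month  = 0.023   # S3 Standard storage
ebs_per_gb_month = 0.08    # EBS gp3 storage
replication      = 3       # typical Kafka replication factor
headroom         = 2.5     # disks are provisioned well beyond the data stored

ebs_effective = ebs_per_gb_month * replication * headroom  # ~$0.60/GB-month
print(round(ebs_effective / s3_per_gb_month))  # ~26x, in the ballpark of the ~25x claim
```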
And then there are the savings on networking costs. One of the biggest costs of Kafka, especially for high-volume workloads (and is there any other kind of Kafka workload?), is replicating data across availability zones.
By using S3 not only as the storage layer but also as the networking layer, WarpStream eliminates all cross-availability-zone networking costs.
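A rough illustration, again with approximate AWS list prices (about $0.01 per GB in each direction for cross-AZ traffic) and a hypothetical workload, shows why this line item matters:

```python
# Illustrative only: hypothetical 10 TB/day workload, approximate AWS prices.
cross_az_per_gb = 0.02       # $/GB for one cross-AZ hop (out + in)
produced_gb_day = 10 * 1024  # 10 TB/day of producer traffic

# Classic Kafka with replication factor 3 across three AZs copies each
# byte to two other zones.
replicated_gb_day = produced_gb_day * 2
monthly_replication_bill = replicated_gb_day * cross_az_per_gb * 30
print(f"${monthly_replication_bill:,.0f}/month")  # ~$12,288/month

# With S3 as both the storage and "networking" layer, intra-region traffic
# between agents and S3 carries no per-GB transfer charge, so this
# replication line item drops to zero (S3 request fees still apply).
```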
Of course, users could run everything in-house and eliminate all data transfer costs. But then they would need people to run Kafka, which itself can be a considerable expense. Auto-scaling alone is a headache, with admins needing to add brokers, manage partitions and move data around.
“Since WarpStream agents are completely stateless and all the storage is offloaded to the object store, they are actually trivial to auto-scale,” Artoul explained.
Basically, the system can just monitor CPU usage. If it gets high, add more containers. When it gets low, take a few away. No partition balancing, no moving of data.
“So WarpStream is always perfectly right-sized for the amount of throughput it is serving right now, in the current moment and not how much throughput you might need in the future,” Artoul said.
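In Kubernetes terms, that rule is essentially a horizontal pod autoscaler keyed on CPU. A hand-rolled sketch of the same logic might look like the following, where avg_cpu_utilization() is a hypothetical stand-in for a metrics-server or Prometheus query:

```python
import time
from kubernetes import client, config

config.load_kube_config()
apps = client.AppsV1Api()

def avg_cpu_utilization() -> float:
    """Hypothetical placeholder: return mean CPU utilization (0.0-1.0)
    across agent pods, e.g. from metrics-server or Prometheus."""
    return 0.5  # wire up a real metrics query here

while True:
    scale = apps.read_namespaced_deployment_scale("warpstream-agent", "streaming")
    replicas = scale.spec.replicas
    cpu = avg_cpu_utilization()
    if cpu > 0.80:                     # running hot: add a container
        replicas += 1
    elif cpu < 0.30 and replicas > 1:  # running cold: take one away
        replicas -= 1
    apps.patch_namespaced_deployment_scale(
        "warpstream-agent", "streaming",
        body={"spec": {"replicas": replicas}},
    )
    time.sleep(60)  # no partition rebalancing or data movement required
```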
Disclosure: Confluent paid for the reporter’s travel and lodging to attend Current 24.