WalkMe Optimizes Business-Critical Apache Kafka® Clusters and Reduces TCO by 40%

WalkMe leverages Aiven for Kafka, tiered storage, and the BYOC model to maximize its infrastructure investment while maintaining high performance and stability

WalkMe, a leading Digital Adoption Platform (DAP) provider, faced a challenge: soaring data volumes were driving up the cost of its Apache Kafka infrastructure. Discover how the company achieved the following with Aiven:

  • Reduced Kafka TCO by 40% while handling data traffic that grows 30% a year.
  • Managed over 100,000 data events per second with optimal performance.
  • Improved compliance with data residency requirements.

Serving 2,000 customers and 35 million users with rapid growth on the horizon

WalkMe, a fast-growing and leading Digital Adoption Platform (DAP) provider, empowers businesses to accelerate digital transformation and streamline processes. Its advanced algorithms provide real-time guidance to 35 million users across 2,000 global customers, driving faster adoption of new technologies.

WalkMe's impressive growth trajectory has been further enhanced by its recent acquisition by SAP. The company will expand its reach within the SAP ecosystem while continuing to grow its business as an independent, market-leading DAP.

Kafka costs surge as data volumes soar

Since 2011, WalkMe has relied heavily on Apache Kafka to power its Digital Adoption Platform. Kafka’s real-time data streaming capabilities are essential for analyzing user behavior and providing personalized guidance. However, rapid customer growth led to increasing data volumes and soaring Kafka infrastructure costs. For example, storage constraints within WalkMe's Kafka clusters necessitated continuous infrastructure scaling to acquire the necessary servers equipped with costly Solid State Drives (SSDs).

Anticipating further increases in data volume and velocity alongside company growth, WalkMe sought a more cost-effective and scalable solution. “Our challenge was clear,” says Yotam Spenser, Head of Data Engineering at WalkMe. “We needed to reduce the Kafka costs associated with a rapid increase in data, without affecting performance or the reliability of our service.”

BYOC and tiered storage promise substantial cost-saving opportunities

After evaluating various vendors, WalkMe selected Aiven for Apache Kafka. Aiven proposed a two-pronged approach to cost-effectively deal with rising Kafka costs:

  1. Bring Your Own Cloud (BYOC): Deploying Aiven for Kafka on WalkMe’s existing Google Cloud Platform infrastructure, thereby allowing the company to leverage existing commitments and discounts.
  2. Tiered Storage: Implementing tiered storage to enable WalkMe to move less critical data to the more cost-effective Google Cloud Storage, reducing reliance on expensive SSDs.
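As a rough illustration of the tiered storage idea, the sketch below builds topic-level settings using the standard Apache Kafka tiered storage properties (`remote.storage.enable`, `retention.ms`, `local.retention.ms`). The helper name and the retention values are hypothetical and are not WalkMe's actual configuration; how these settings are applied on an Aiven service is also not covered here.

```python
MS_PER_HOUR = 60 * 60 * 1000


def tiered_topic_config(total_retention_hours: int, local_retention_hours: int) -> dict[str, str]:
    """Build topic settings that keep only recent data on local (SSD) disks.

    Hypothetical helper: the property names are standard Kafka topic configs,
    but the retention values passed in are illustrative assumptions.
    """
    if local_retention_hours > total_retention_hours:
        raise ValueError("local retention cannot exceed total retention")
    return {
        # Allow closed log segments to be copied to remote (object) storage.
        "remote.storage.enable": "true",
        # Total time data stays readable, across local and remote tiers.
        "retention.ms": str(total_retention_hours * MS_PER_HOUR),
        # Time data stays on the broker's local disks before local deletion.
        "local.retention.ms": str(local_retention_hours * MS_PER_HOUR),
    }


# Example (assumed values): keep 7 days in total, only 1 day on local SSD.
config = tiered_topic_config(total_retention_hours=168, local_retention_hours=24)
for key, value in config.items():
    print(f"{key}={value}")
```

With settings of this kind, the most recent data stays on fast local disks while older segments are served from object storage such as Google Cloud Storage.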

Kafka TCO reduced by 40% while data volumes grow at 30% a year

The migration went well and, importantly, was completed within the planned six-month timeframe, avoiding any financial implications of contract over-run with the previous vendor. “When upgrading infrastructure, stability is always a concern,” says Spenser. “But Aiven carried out a smooth migration with no loss of service.”

Since adopting the Aiven service, the company has optimized the size of its Kafka clusters and regained control of its Kafka costs. “Our proactive approach to optimization has paid off. Together, the tiered storage and BYOC model have reduced the TCO of our Kafka environment by 40% and we’ve maintained the high levels of performance and reliability that the business demands,” says Spenser.

Over the same period, data traffic has grown at 30% a year. But thanks to the optimizations implemented with Aiven, such as tiered storage, the size of the Kafka clusters has remained fairly constant, helping to keep costs down.

“Previously, the storage for each Kafka cluster was substantial,” says Spenser. “Now, thanks to tiered storage, we’ve reduced it by around 80% which has saved us a huge amount of money — and all without any latency issues.”

Seamlessly scaling Kafka to 1+ GB/second

With Aiven for Kafka in place, WalkMe continues to sustain high throughput despite highly volatile data volumes. For example, the clusters average several hundred megabytes per second, rising to more than a gigabyte per second at peak.

“We manage over 100,000 events and sometimes more than a gigabyte of data per second,” says Spenser. “It’s an extraordinary scale, but our Aiven for Apache Kafka clusters can handle these kinds of high-velocity data streams.”
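To put these rates in perspective, here is a small back-of-the-envelope sketch. The 300 MB/s average is an assumed stand-in for "several hundred megabytes per second", and the average event size assumes the 100,000 events and the 1 GB arrive in the same second; neither figure is stated in the case study.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_volume_tb(avg_mb_per_second: float) -> float:
    """Data written per day in terabytes (decimal units: 1 TB = 1,000,000 MB)."""
    return avg_mb_per_second * SECONDS_PER_DAY / 1_000_000


def avg_event_size_bytes(bytes_per_second: float, events_per_second: float) -> float:
    """Average payload per event, assuming both rates describe the same moment."""
    return bytes_per_second / events_per_second


# Assumed average of 300 MB/s (illustrative only).
print(f"{daily_volume_tb(300):.2f} TB/day")

# Peak figures from the quote: 1 GB/s and 100,000 events/s.
print(f"{avg_event_size_bytes(1e9, 100_000):.0f} bytes/event")
```

Even at the assumed average rate, a day of traffic amounts to tens of terabytes, which is why keeping every retained byte on SSD becomes expensive.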

BYOC model supports both data compliance and convenience

The BYOC model allows WalkMe to further enhance its control of data by keeping it within its own secure Google Cloud environment. This provides additional assurance in meeting stringent data location and residency requirements, such as GDPR-related obligations governing where user data is processed and stored.

This strategic approach enables WalkMe to achieve the best of both worlds. “We have a fully managed service from Aiven’s Apache Kafka experts while ensuring the highest standards of data sovereignty and regulatory compliance,” says Spenser.

Additional cost savings in the future

For the immediate future, the focus is on the SAP integration. Beyond that, Spenser is exploring options such as follower fetching with Aiven to bring network costs down further. WalkMe is also looking forward to using the labeling feature in Kafka, which will provide insight into spending and help identify further opportunities for optimization and cost-cutting.
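For readers unfamiliar with follower fetching: in Apache Kafka, a consumer that sets `client.rack` can read from a replica in its own zone when the brokers use a rack-aware replica selector, which avoids cross-zone network traffic. The sketch below only assembles consumer properties; the helper name, bootstrap address, group name, and zone are hypothetical, and the broker-side setup on a managed service is assumed to be handled by the provider.

```python
def follower_fetch_consumer_config(bootstrap_servers: str, group_id: str, zone: str) -> dict[str, str]:
    """Consumer properties that let a client fetch from a same-zone replica.

    Hypothetical helper. `client.rack` is a standard Kafka consumer property;
    it only takes effect when brokers have `broker.rack` set and use a
    rack-aware replica selector (assumed to be configured separately).
    """
    if not zone:
        raise ValueError("zone must be a non-empty string")
    return {
        "bootstrap.servers": bootstrap_servers,
        "group.id": group_id,
        # Must match the `broker.rack` value of brokers in the same zone.
        "client.rack": zone,
    }


# Illustrative values only; none of these come from the case study.
props = follower_fetch_consumer_config(
    bootstrap_servers="kafka.example.internal:9092",
    group_id="analytics-consumers",
    zone="europe-west1-b",
)
print(props)
```

A dictionary like this could be passed to a Kafka client library that accepts standard property names.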

“Choosing to work with Aiven was a good decision from both a technical and business perspective,” says Spenser. “We do not want to manage Kafka on a day-to-day basis, so we need a vendor to deal with that for us. We are very happy with the service from Aiven: they have created and manage a really cost effective and robust set-up for our business-critical data infrastructure.”
