Real Use Case: Handling Third-Party Rate Limits

Scaling with Kafka, Redis, and Intelligent Retry Logic


Facing strict rate limits from a critical third-party chat system, our application struggled with dropped connections and user dissatisfaction as we scaled.

To solve this, I designed a resilient, scalable solution leveraging Kafka for asynchronous processing and Redis for intelligent retry management.

This approach not only addressed immediate issues but also streamlined future scalability and reliability.


The Problem

Our product utilized a third-party chat system that enforced strict rate limits for creating users and connecting them to chat channels.
As our user base grew, we frequently hit these limits; blocked users lost their connection to the event chat and could only rejoin by manually restarting their app or browser.
This led to significant client dissatisfaction and a degraded user experience.

Our backend (BE) chat service received REST requests from the front end (FE) and issued further REST requests to the third-party chat service for user creation and channel connection. Rate limits imposed by the third-party service resulted in request failures, dropped connections, and ultimately unhappy end-users.

System components before applying solution.
As the senior backend developer on this project, I worked closely with a mid-level developer, providing mentorship and guidance throughout the implementation. Additionally, collaboration with our backend guild and backend architect ensured adherence to best practices, scalability, and alignment with the overall architectural strategy.

The Designed Solution

To address this critical issue, I designed a scalable and resilient solution employing asynchronous processing, message queuing, and intelligent retry logic.

Introducing Kafka for Asynchronous Processing

I integrated Kafka into our architecture to handle requests asynchronously.
Kafka is a distributed event streaming platform for building real-time data pipelines and streaming applications. Kafka uses "topics" to organize messages, allowing services to publish and subscribe efficiently.

Backend services now send messages, categorized by message type, to specific Kafka topics.
Front-end-facing endpoints were modified to store data in the MySQL database and publish to Kafka, returning a quick response to the user.
Decoupling the front-end and backend service flows from direct dependencies on the third-party service allowed quick acknowledgment of user requests and significantly reduced synchronous REST interactions with the third-party service.
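The store-then-publish pattern described above can be sketched as follows. This is a minimal illustration with in-memory stand-ins for MySQL and the Kafka producer; the function names, the topic name `chat.create_user`, and the field names are assumptions, not the actual service code.

```python
import json
from collections import defaultdict

# In-memory stand-ins for MySQL and Kafka, purely for illustration.
db = {}
topics = defaultdict(list)

def publish(topic: str, message: dict) -> None:
    """Stand-in for a Kafka producer's send(): append a serialized message."""
    topics[topic].append(json.dumps(message))

def handle_create_user(user_id: str, details: dict) -> int:
    """FE-facing endpoint: persist, enqueue, acknowledge immediately."""
    # 1. Store the user with a status the async worker will look for.
    db[user_id] = {**details, "status": "ready_to_connect"}
    # 2. Publish the details needed later for the channel connection.
    publish("chat.create_user", {"user_id": user_id, **details})
    # 3. Return 202 Accepted: queued for processing, not yet completed.
    return 202

status = handle_create_user("u-42", {"name": "Dana"})
```

The endpoint never talks to the third-party service directly, which is what makes the fast 202 response possible.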

BE logic modifications - Splitting "Create User" Requests
Create user asynchronous sequence diagram.
  • Immediate Response:

    Upon receiving a REST request from the front end, the backend inserted the user details into our MySQL database with a status of 'ready_to_connect'.
    It then sent a message to Kafka containing the user details required later for the connection.

    This step promptly returned a 202 status to the front end, indicating that the request was accepted for processing but not yet completed.

  • Asynchronous Processing:

    Once the front end had been acknowledged, Kafka consumers processed the queued requests asynchronously.
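A per-message consumer loop for this step might look like the sketch below. Again, Kafka and MySQL are replaced by in-memory stand-ins, and `create_user_fn` (the REST call to the chat provider) is a hypothetical callable, not the provider's actual API.

```python
import json

# Stand-ins: a row written by the FE-facing endpoint, and the queued message.
db = {"u-42": {"status": "ready_to_connect"}}
queue = [json.dumps({"user_id": "u-42", "name": "Dana"})]

def consume(messages, create_user_fn):
    """Stand-in for a Kafka consumer: create each user, then mark it done."""
    for raw in messages:
        msg = json.loads(raw)
        create_user_fn(msg)  # REST call to the third-party chat provider
        db[msg["user_id"]]["status"] = "connected"

consume(queue, lambda msg: None)  # no-op provider call for the sketch
```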

BE logic modifications - Optimizing BE services integration

For channel connection requests originating from other backend services:

  • Services that could adapt promptly were modified to directly publish messages to appropriate Kafka topics.

  • For services constrained by time or complexity, we modified their API interactions by storing necessary details in our MySQL database and subsequently pushing messages to Kafka.

System components after applying solution.

Leveraging the third-party Bulk API capabilities

Kafka consumer logic was enhanced to batch-process between 1 and 100 messages at a time.
Instead of sending multiple single-message requests, we utilized the third-party bulk API (up to 100 connections in a single request), dramatically reducing the total request volume and minimizing rate-limit triggers.
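The batching step can be sketched like this. `bulk_connect_fn` stands in for the provider's bulk endpoint (an assumption, not its real API), and the 100-message cap mirrors the limit described above.

```python
def chunked(items, size=100):
    """Split queued messages into bulk-API-sized batches."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def process_bulk(messages, bulk_connect_fn):
    """Send up to 100 connections per request instead of one request each."""
    calls = 0
    for batch in chunked(messages, 100):
        bulk_connect_fn(batch)  # single bulk REST call to the provider
        calls += 1
    return calls

# 250 queued messages -> 3 bulk requests instead of 250 single ones.
calls = process_bulk(list(range(250)), lambda batch: None)
```

The reduction in request count (here 250 → 3) is what keeps the service under the provider's rate limits.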

Intelligent Retry Logic with Redis

Redis is an open-source, in-memory data structure store commonly used for caching, session management, and as a fast database.

To manage retry attempts effectively and avoid overwhelming the third-party service during rate-limit periods:

  • On encountering a 429 HTTP status (rate limit exceeded), the consumer did not commit the message offset, so Kafka would redeliver the message for a retry.

  • Redis caching was introduced to store endpoint-specific rate-limit expiration times.
    Before any request, the service checked Redis. If the endpoint had an active rate-limit entry in Redis, the request was delayed, preventing unnecessary retries while blocked.

Estimated Performance Improvements

Estimated performance improvements.
Rough Summary Estimate

Overall, we estimated that this solution achieved a 4×–8× improvement in throughput and efficiency, alongside drastically lower error rates and a better user experience, especially under peak load or event-driven spikes.

Alternative Solutions Considered

Before settling on Kafka and Redis, we explored alternative approaches such as introducing exponential backoff strategies directly within our REST API layer, using MySQL with cron jobs, or adopting alternative queue systems like RabbitMQ.
However, these solutions either lacked sufficient scalability or introduced unnecessary complexity in managing state and retries.
Ultimately, Kafka's robust event-driven architecture, combined with Redis's fast in-memory caching and the company's existing knowledge base, provided the optimal balance of reliability, scalability, and maintainability.

Summary

By thoughtfully modifying our backend processes, incorporating Kafka for asynchronous handling, intelligently using Redis for rate limit management, and leveraging bulk APIs, we successfully navigated and mitigated third-party rate limits.

This practical, scalable solution utilized existing company infrastructure (Kafka, Redis) to effectively handle peak usage and provide users a consistently reliable experience.

