Building Real-Time Data Pipelines at Scale
Alex Kim
Principal Engineer

The Scale Problem
Our client's IoT platform ingests telemetry data from 500,000 connected devices. Each device reports every 2 seconds, which works out to 250,000 events per second hitting the pipeline. Every single one needs to be processed, enriched, and routed in under 200 ms.
Batch processing wasn't an option. When a pressure valve reads abnormal levels, you can't wait for a nightly ETL job.
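The back-of-envelope math is worth making explicit, since it drives every sizing decision below. A minimal sketch (the figures come straight from the numbers above):

```java
public class ThroughputMath {
    public static void main(String[] args) {
        long devices = 500_000;          // connected devices
        double reportIntervalSec = 2.0;  // each device reports every 2 seconds
        double eventsPerSec = devices / reportIntervalSec;
        System.out.println((long) eventsPerSec); // 250000
    }
}
```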
Architecture Overview
We built a three-layer streaming architecture:
- Ingestion Layer: Apache Kafka with 120 partitions, geo-distributed across 3 regions
- Processing Layer: Apache Flink for stateful stream processing with exactly-once semantics
- Serving Layer: ClickHouse for real-time analytics, Redis for hot-path caching
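To make the ingestion layer concrete: each event is keyed by device ID so all telemetry from one device lands on the same partition, preserving per-device ordering. A minimal sketch of that routing (illustrative only; Kafka's actual default partitioner hashes the key bytes with murmur2, and `PartitionRouter` is a hypothetical name, not our production code):

```java
public class PartitionRouter {
    private final int partitionCount;

    public PartitionRouter(int partitionCount) {
        this.partitionCount = partitionCount;
    }

    // Map a device ID to a partition. Kafka's default partitioner uses
    // murmur2 on the serialized key; String.hashCode() is a stand-in here.
    public int partitionFor(String deviceId) {
        return (deviceId.hashCode() & 0x7fffffff) % partitionCount;
    }

    public static void main(String[] args) {
        PartitionRouter router = new PartitionRouter(120); // 120 partitions, as above
        int p = router.partitionFor("device-42");
        System.out.println(p >= 0 && p < 120); // true
    }
}
```

The key property is determinism: the same device always maps to the same partition, so per-device event order survives the fan-out across 120 partitions.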
"At scale, the system that doesn't exist is the one that breaks. Design for the failure modes you haven't imagined yet."
Custom Backpressure
The standard Flink backpressure mechanism wasn't sufficient for our burst patterns. We built a custom adaptive throttling system.
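The handler shown below relies on a `RingBuffer` with a percentile view, which is not a JDK type. Here is a minimal sketch of what such a latency window could look like (names and sizing are illustrative, not the production implementation; a real version would avoid the copy-and-sort per query, e.g. with HdrHistogram):

```java
import java.util.Arrays;

// Fixed-size ring of recent latency samples with a naive percentile query.
public class LatencyWindow {
    private final long[] samples;
    private int next = 0;
    private int count = 0;

    public LatencyWindow(int capacity) {
        this.samples = new long[capacity];
    }

    public void record(long latencyMs) {
        samples[next] = latencyMs;
        next = (next + 1) % samples.length;       // overwrite oldest sample
        count = Math.min(count + 1, samples.length);
    }

    // q in (0, 1], e.g. 0.99 for p99. Returns 0 when no samples recorded.
    public long percentile(double q) {
        if (count == 0) return 0;
        long[] window = Arrays.copyOf(samples, count);
        Arrays.sort(window);
        int idx = (int) Math.ceil(q * count) - 1; // nearest-rank method
        return window[Math.max(idx, 0)];
    }
}
```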
public class AdaptiveThrottle implements BackpressureHandler {
    // Latency target derived from the 200 ms end-to-end budget above.
    private static final double TARGET_LATENCY_MS = 200.0;

    private final RingBuffer<Long> latencyWindow;

    public AdaptiveThrottle(RingBuffer<Long> latencyWindow) {
        this.latencyWindow = latencyWindow;
    }

    // Scale intake down when observed p99 exceeds the target;
    // a ratio of 1.0 means no throttling.
    public double computeThrottleRatio() {
        double p99 = latencyWindow.percentile(0.99);
        return Math.min(1.0, TARGET_LATENCY_MS / p99);
    }
}

Results
- 250K events/sec sustained throughput
- Sub-100ms p95 end-to-end latency
- 99.99% uptime over 14 months