Show HN: 500k+ events/sec transformations for ClickHouse ingestion

8 points, posted 6 hours ago
by super_ar

2 Comments

MarkSfik

4 hours ago

As someone who has wrestled with Flink's JVM heap management and the complexity of TaskManagers/JobManagers, the 'scaling within a single pipeline' idea is compelling. Why should I choose this over Flink for a ClickHouse sink? Is the main draw the operational simplicity (no cluster management), or are there specific ClickHouse-native optimizations in your implementation that Flink’s JDBC/official connectors are missing?

super_ar

3 hours ago

Good question. I wouldn’t say this replaces Flink in general. If you already run Flink and are comfortable with it, it’s a very powerful system.

Where we saw friction with Flink was mainly:

1. Operational overhead (jobs, state backends, checkpointing)
2. Generic sinks not being optimized for ClickHouse (batching, small inserts, etc.)

We focused on making scaling a property of the pipeline itself (just add replicas) and optimizing specifically for ClickHouse ingestion patterns.
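To make the batching point concrete, here is a minimal Python sketch of the general idea: buffer rows and hand them to the sink in one large insert instead of many small ones. This is not the project's actual implementation. The `BatchingBuffer` class, its `max_rows` and `max_age_s` parameters, and the `flush` callback are all hypothetical names chosen for illustration, and the callback stands in for whatever client call performs the real ClickHouse insert.

```python
import time
from typing import Callable, List, Optional


class BatchingBuffer:
    """Hypothetical helper: collects rows and flushes them in large batches."""

    def __init__(
        self,
        flush: Callable[[List[dict]], None],  # assumed stand-in for the real insert call
        max_rows: int = 10_000,               # illustrative threshold, not a tuned value
        max_age_s: float = 1.0,               # illustrative threshold, not a tuned value
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._flush = flush
        self._max_rows = max_rows
        self._max_age_s = max_age_s
        self._clock = clock
        self._rows: List[dict] = []
        self._first_row_at: Optional[float] = None

    def add(self, row: dict) -> None:
        # Remember when the current batch started so it can be flushed by age.
        if not self._rows:
            self._first_row_at = self._clock()
        self._rows.append(row)
        too_big = len(self._rows) >= self._max_rows
        too_old = self._clock() - self._first_row_at >= self._max_age_s
        # Note: age is only checked when a row arrives; a real pipeline
        # would also need a timer to flush an idle buffer.
        if too_big or too_old:
            self.flush()

    def flush(self) -> None:
        # Send everything buffered so far as a single batch.
        if not self._rows:
            return
        batch, self._rows = self._rows, []
        self._first_row_at = None
        self._flush(batch)
```

Usage would look like `buf = BatchingBuffer(send_to_clickhouse)`, then `buf.add(row)` per event and a final `buf.flush()` on shutdown, where `send_to_clickhouse` is a placeholder for your own insert function.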

So Flink is more general; this is more opinionated and focused on the ClickHouse ingestion use case.