
> ClickHouse was designed for single-machine installations

This is incorrect. ClickHouse was designed for distributed setups from the beginning, including cross-DC installations. It had been used on large production clusters even before it was open-sourced. When it became open-source in June 2016, the largest cluster was 394 machines across 6 data centers, with 25 ms RTT between the most distant data centers.



On a side note, can someone please comment on this part

> for example if you add additional nodes it is not easy to redistribute data.

This is precisely one of the issues I predict we'll face with our cluster as we ramp up OTEL data. It's being sent to a small cluster, and I'm deathly afraid that it will keep writing to every shard in equal measure without moving existing data around. I cannot find any good method of redistributing the load other than "use the third-party backup program and pray it doesn't shit the bed".
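A toy sketch (plain Python, not ClickHouse internals) of why this happens, under the assumption that the cluster routes rows the way a Distributed table does by default: hash of the sharding key modulo the number of shards, decided only at insert time. Existing rows never move when a shard is added; all names and numbers below are illustrative.

```python
# Toy model of hash-based shard routing. Routing is applied per insert,
# so growing the cluster changes where NEW rows go but leaves old rows
# in place -- they must be rewritten (e.g. via INSERT ... SELECT into a
# new table) to rebalance.

def shard_for(key: int, num_shards: int) -> int:
    """Route a row to a shard by key hash (toy stand-in for sipHash64)."""
    return hash(key) % num_shards

# 1. Insert 1000 rows into a 3-shard cluster and record placement.
placement = {key: shard_for(key, 3) for key in range(1000)}

# 2. Grow the cluster to 4 shards. Routing changes for new inserts,
#    but nothing moves the 1000 existing rows.
moved = sum(1 for key in range(1000)
            if shard_for(key, 4) != placement[key])

# Most keys now hash to a different shard than the one actually
# holding their data.
print(f"{moved} of 1000 rows are on the 'wrong' shard after expansion")
```

The same effect is why simply adding nodes leaves the old shards carrying all the historical data while new data spreads evenly.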


It's like saying that Postgres was designed for distributed setups just because there are large Postgres installations. We all understand that ClickHouse (and Postgres) are great databases, but it's strange to call them designed for distributed setups. What about insertion not going through a single master? Scalable replication? And a bunch of other important features -- not just the ability to keep independent shards that can be queried in a single query.


ClickHouse does not have a master replica (every replica is equal), and every machine processes inserts in parallel. It allocates block numbers through the distributed consensus in Keeper. This allows for a very high insertion rate, with several hundred million rows per second in production. The cluster can scale both by the number of shards and by the number of replicas per shard.

Scaling by the number of replicas of a single shard is less efficient than scaling by the number of shards. For ReplicatedMergeTree tables, due to physical replication of data, the practical limit is typically under 10 replicas per shard: 3 replicas per shard are practical for servers with non-redundant disks (RAID-0 and JBOD), and 2 replicas per shard for servers with more redundant disks. For SharedMergeTree (in ClickHouse Cloud), which uses shared storage and does not physically replicate data (but still has to replicate metadata), the practical number of replicas is up to 300, and inserts scale quite well on these setups.
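A back-of-envelope comparison of the two models, as a Python sketch with illustrative numbers (the data and metadata sizes are assumptions, not measurements): physical replication multiplies the written data by the replica count, while shared storage writes the data once and replicates only small metadata.

```python
# Rough write-volume arithmetic for the two replication models.
# All sizes below are illustrative assumptions.

insert_gb = 100      # data volume of one batch of inserts
metadata_gb = 0.001  # assumed per-replica metadata cost (tiny)

def replicated_merge_tree_writes(replicas: int) -> float:
    # Physical replication: every replica stores a full copy.
    return insert_gb * replicas

def shared_merge_tree_writes(replicas: int) -> float:
    # Shared storage: data written once, only metadata per replica.
    return insert_gb + metadata_gb * replicas

for r in (3, 10, 300):
    print(r, replicated_merge_tree_writes(r), shared_merge_tree_writes(r))
```

This is why physical replication stops paying off somewhere under 10 replicas per shard, while the shared-storage model can reach hundreds.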



