3 Comments
Neural Foundry

Solid practical guidance on the liquid vs. partitioned choice. The cardinality thresholds you outline (sub-5k for partition + Z-order, above that for liquid) match what we've seen in production pretty well, but the streaming ingestion tradeoff is interesting. We've had cases where eager clustering hurt latency enough that the gain in downstream query speed wasn't really worth it, and predictive I/O didn't catch up for hours.

Yogesh Gowda

QQ: In your example under the section ‘Parallel Write Considerations’, even though the Kafka writers are writing concurrently, they are append-only operations, so there won’t be conflicts between concurrent writers even if we decide to use liquid clustering, right?

Let’s say I am writing to a destination table with less than 1 TB of data.

Canadian Data Guy

Appends are blind inserts, so they do not conflict with each other. Whether the table uses liquid clustering or Z-order does not matter.