Discussion about this post

Jayasurya Pilli

Thank you for such a detailed explanation of the options for handling data skew.

However, I have a question, probably a quick one...

Now that Liquid Clustering is also available in Databricks, should it be considered the first and foremost recommended solution, even ahead of AQE?

In other words, shouldn't the recommended order be Liquid Clustering first, followed by AQE, then BroadcastHashJoin, then Salting?

Please correct me if my understanding is wrong.

Andrii Fadieiev

Great stuff, thanks for sharing!
