Archive - Canadian Data Guy Unfiltered

Inside Delta Lake’s Idempotency Magic: The Secret to Exactly-Once Spark

Learn how txnAppId and epochId work together to create a bulletproof distributed two-phase commit. Achieve true exactly-once semantics for your…

Jan 27 • Canadian Data Guy

6:18

How to Choose Between Liquid Clustering and Partitioning with Z-Order in Databricks

The views expressed in this blog are my own and do not represent official guidance from Databricks

Jan 15 • Canadian Data Guy and Geethu

Unlocking Sub-Second Latency with Databricks

Watch now | How Spark Real Time Mode Achieves Millisecond Latency with a Simple Trigger Switch

Jan 14 • Canadian Data Guy

17:55

I Knew the Answer. I Just Couldn’t Remember It.

How you can turn your notes into a personal Knowledge Agent — no code required

Jan 10 • Canadian Data Guy

13:41

December 2025

4 Surprising Truths That Will Change How You Think About Spark Streaming

Spark gives you Real-Time without the complexity and pain

Dec 15, 2025 • Canadian Data Guy

November 2025

Why I Materialize Delta History for Debugging

Just a Quick Tip

Nov 27, 2025 • Canadian Data Guy

Stop Waiting for Connectors: Stream ANYTHING into Spark (It's 4 Functions)

Listen now | How to ingest data from any source into Apache Spark, demystified with a real-world example of blockchain ingestion

Nov 3, 2025 • Canadian Data Guy and Yogita Nesargi

26:05

October 2025

How to write your first Spark application with Stream-Stream Joins with working code

A Practical, Hands-On Guide to Joining Real-Time Data Streams in Spark Structured Streaming

Oct 15, 2025 • Canadian Data Guy

September 2025

Build an Ethereum ETL Pipeline for Free Using Databricks Free Edition

Build a zero-infrastructure streaming pipeline: Step-by-step Ethereum data ingestion, schema evolution, and Delta storage

Sep 23, 2025 • Yogita Nesargi

June 2025

How to ace and structure your Data Modelling Interview

Prescriptive guidance for conducting your Data Modelling Interview

Jun 18, 2025 • Canadian Data Guy

A Deep Dive into Skewed Joins, GroupBy Bottlenecks, and Smart Strategies to Keep Your Spark Jobs Flying

Unlock comprehensive, practical solutions to conquer data skew in Apache Spark—step-by-step from basics to advanced strategies for perfectly balanced…

Jun 6, 2025 • Canadian Data Guy

May 2025

Decode the Join: A Spark Data Engineer’s Visual Handbook

Understand when and why to use Broadcast, Shuffle, or Sort-Merge Joins in Spark— with clear visuals, real-world use cases, and strategy tips tailored…

May 9, 2025 • Canadian Data Guy and Harathi Pasam
