sparklearning
Adaptive Query Execution
Stories about how Spark rewrites query plans at runtime using real shuffle statistics.
Stories
AQE: How Spark Rewrites Plans After the Shuffle
— coalescing shuffle partitions, join conversion, skew join handling, and dynamic partition pruning
Related stories
How Spark Chooses a Join
— the join strategies AQE can switch between at runtime
What Spark Knows About Your Data: Statistics and the Cost-Based Optimizer
— how AQE complements static CBO with real runtime statistics
The Journey of a Shuffle Record
— the shuffle mechanism AQE observes to make its decisions
When One Partition Holds Up Everyone: The Data Skew Story
— the skew problem AQE’s skew join handler solves