📖 Published as GitHub Pages
A growing collection of story-style explanations of Apache Spark internals. Each story focuses on one concept or subsystem and explains it as a narrative: what problem it solves, how it works, and how the pieces fit together. Stories are written to be engaging and readable without diving into code, so the ideas stick.
Stories are grouped by topic (each has its own directory); related topics are grouped into themes in the index below.
New stories are added over time and linked from this README.
How jobs become stages and tasks, how data moves, and how memory and fault tolerance work.
| Topic | Description | Stories |
|---|---|---|
| Execution & scheduling | From actions to DAG, stages, tasks; driver and executors | Coming soon |
| Scheduler | DAG Scheduler, Task Scheduler; how stages and tasks are submitted and run | From One Action to Many Tasks |
| Locality and delay scheduling | Preferred locations, locality levels, delay scheduling; when Spark waits for a good executor | Locality and Delay Scheduling |
| Scheduling pools and fair sharing | Pools, minimum share, weight; how multiple jobs share resources in fair mode | Scheduling Pools and Fair Sharing |
| Shuffle | Shuffle write/read, sort shuffle, external shuffle service | The Journey of a Shuffle Record |
| Memory & storage | Unified memory, BlockManager, caching and eviction | Coming soon |
| Fault tolerance | Lineage, recomputation, checkpointing, speculation | Coming soon |
| Partitioning | Partitions, coalesce vs repartition, partition pruning | Coming soon |
| Broadcast & shared state | Broadcast variables, accumulators | Coming soon |
How a DataFrame or SQL query becomes a plan, how that plan is optimized, and how joins and adaptive execution work.
| Topic | Description | Stories |
|---|---|---|
| Query planning (Catalyst) | Logical plan, optimization rules, physical plan, codegen | Coming soon |
| Adaptive & runtime | Adaptive Query Execution (AQE), dynamic partition pruning | Coming soon |
| Join strategies | Sort-merge, broadcast, hash join; when each is chosen | Coming soon |
State, checkpointing, and the lifecycle of micro-batches.
| Topic | Description | Stories |
|---|---|---|
| Structured Streaming | State stores, checkpointing, micro-batches, exactly-once | RocksDB in Structured Streaming |
Reading and writing data, formats, and data source APIs.
| Topic | Description | Stories |
|---|---|---|
| Data sources | Reading/writing, V1 vs V2 API, file formats | Coming soon |
| Serialization | Tungsten binary format, Kryo, wire format | Coming soon |
How PySpark and UDFs integrate with the JVM.
| Topic | Description | Stories |
|---|---|---|
| Python (PySpark) | JVM ↔ Python, Arrow, Pandas UDFs | Coming soon |
| UDFs | Scala/Java UDFs, registration, execution path | Coming soon |
How Spark runs on clusters and how you observe it.
| Topic | Description | Stories |
|---|---|---|
| Cluster & deploy | Cluster managers, driver/executor lifecycle, resource negotiation | Coming soon |
| UI & metrics | Spark UI, event log, history server, where metrics come from | Coming soon |
| Configuration | SparkConf, important configs, how they flow through the app | Coming soon |
Deeper internals: Tungsten, catalog, and table metadata.
| Topic | Description | Stories |
|---|---|---|
| Tungsten | Binary rows, off-heap, cache-friendly layout | Coming soon |
| Catalog & tables | Spark catalog, table metadata, session catalog | Coming soon |