
About PySpark Koans

What is PySpark Koans?

PySpark Koans is a browser-based, test-driven learning environment for PySpark and Delta Lake. It runs entirely in your browser using Pyodide (Python in WebAssembly) with a pandas-backed PySpark shim that emulates Spark APIs without requiring a real Spark cluster.
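As a rough illustration of the shim idea, the sketch below wraps a pandas DataFrame behind a few Spark-style method names. It is not the project's actual code: the class names `ShimSession` and `ShimDataFrame` are hypothetical, and only a handful of methods are emulated.

```python
import pandas as pd


class ShimDataFrame:
    """Hypothetical stand-in for a Spark DataFrame, backed by pandas."""

    def __init__(self, pdf: pd.DataFrame):
        self._pdf = pdf

    @property
    def columns(self):
        # Spark exposes column names as a plain list of strings.
        return list(self._pdf.columns)

    def select(self, *cols):
        # Keep only the named columns, returning a new wrapper.
        return ShimDataFrame(self._pdf[list(cols)])

    def limit(self, n):
        # Keep at most the first n rows.
        return ShimDataFrame(self._pdf.head(n))

    def count(self):
        # Number of rows, computed eagerly by pandas.
        return len(self._pdf)


class ShimSession:
    """Hypothetical stand-in for a SparkSession."""

    def createDataFrame(self, data, schema):
        # Here `schema` is assumed to be a simple list of column names.
        return ShimDataFrame(pd.DataFrame(data, columns=schema))


spark = ShimSession()
df = spark.createDataFrame(
    [("Alice", 34), ("Bob", 45), ("Cara", 29)], ["name", "age"]
)
names = df.select("name").limit(2)
```

Because every call runs on pandas in the same process, code written against these method names needs no cluster, which is what makes an in-browser environment feasible.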

How It Works

Each "koan" is a small exercise where you fill in the blanks to make tests pass. You'll learn PySpark concepts by fixing failing tests, guided by hints and immediate feedback.

  • No installation required - runs entirely in your browser
  • Progressive difficulty from beginner to advanced
  • Covers PySpark DataFrames, SQL operations, and Delta Lake
  • Earn achievement badges as you complete learning tracks
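The fill-in-the-blank cycle described above can be sketched in plain Python. This is a simplified illustration rather than a real koan from the site: the `___` placeholder, the `run_koan` helper, and the sample rows are all hypothetical, and a list of dictionaries stands in for a DataFrame.

```python
# Hypothetical placeholder marking the blank the learner must fill in.
___ = object()

# Sample rows standing in for a small DataFrame.
ROWS = [
    {"name": "Alice", "age": 34},
    {"name": "Bob", "age": 45},
    {"name": "Cara", "age": 29},
]


def run_koan(min_age):
    """Check the learner's answer and return a feedback message."""
    if min_age is ___:
        # The blank is still unfilled, so the test fails with a hint.
        return "FAIL: replace ___ with a minimum age that keeps two rows"
    kept = [row for row in ROWS if row["age"] >= min_age]
    if len(kept) != 2:
        return f"FAIL: expected 2 rows, got {len(kept)}"
    return "PASS"


unsolved = run_koan(___)  # the koan as first presented
solved = run_koan(30)     # after the learner fills in the blank
```

The exercise starts in a failing state, and the feedback message tells the learner what the test expects, so each attempt gets an immediate result.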

What's Covered

PySpark

DataFrame basics, column operations, string functions, aggregations, joins, window functions, null handling, and advanced operations.

Delta Lake

Time travel, merge operations, optimization, and transaction history.
