PySpark Koans

Learn by fixing tests

Progress0/39

© 2025-2026 Alex Cole. All Rights Reserved.

Spark Koans is an independent community tool.

BasicsKoan 5

Grouping and Aggregating

Learn how to group data and calculate aggregates. Replace ___ with the correct code.

How it works: Replace the ___ blanks in the code editor with the correct PySpark code, then hit Run Code. Stuck? Try the Hint button.
Setup (read-only)
data = [
    ("Sales", "Alice", 5000),
    ("Sales", "Bob", 4500),
    ("Engineering", "Charlie", 6000),
    ("Engineering", "Diana", 6500),
    ("Engineering", "Eve", 5500)
]
df = spark.createDataFrame(data, ["department", "name", "salary"])
Your CodeCtrl/Cmd+Enter to run
Output
Output will appear here...