PySpark Koans

Learn by fixing tests

Progress0/39

© 2025-2026 Alex Cole. All Rights Reserved.

Spark Koans is an independent community tool.

JoinsKoan 22

Join on Multiple Columns

Join on multiple columns. Replace ___ with the correct code.

How it works: Replace the ___ blanks in the code editor with the correct PySpark code, then hit Run Code. Stuck? Try the Hint button.
Setup (read-only)
orders = spark.createDataFrame([
    ("2024", "Q1", "Alice", 100),
    ("2024", "Q2", "Alice", 150),
    ("2024", "Q1", "Bob", 200)
], ["year", "quarter", "rep", "amount"])

targets = spark.createDataFrame([
    ("2024", "Q1", 120),
    ("2024", "Q2", 140)
], ["year", "quarter", "target"])
Your CodeCtrl/Cmd+Enter to run
Output
Output will appear here...