PySpark Koans

Learn by fixing tests


© 2025-2026 Alex Cole. All Rights Reserved.

Spark Koans is an independent community tool.

Joins: Koan 20

Inner Join

Join two DataFrames on a shared key column to combine related data. Replace ___ with the correct code.

How it works: Replace the ___ blanks in the code editor with the correct PySpark code, then hit Run Code. Stuck? Try the Hint button.
Setup (read-only)
employees = spark.createDataFrame([
    (1, "Alice", 101),
    (2, "Bob", 102),
    (3, "Charlie", 101)
], ["emp_id", "name", "dept_id"])

departments = spark.createDataFrame([
    (101, "Engineering"),
    (102, "Sales"),
    (103, "Marketing")
], ["dept_id", "dept_name"])
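To make the expected result concrete, here is a minimal plain-Python sketch of inner-join semantics over the same rows as the setup. It is not Spark code: the `inner_join` helper is hypothetical and written only for illustration, and it assumes the join key is `dept_id`. In PySpark itself, one common way to express this is `DataFrame.join` with its `on` and `how` parameters.

```python
# Plain-Python illustration of an inner join (not the PySpark API).
# Rows mirror the setup data above.
employees = [
    (1, "Alice", 101),
    (2, "Bob", 102),
    (3, "Charlie", 101),
]
departments = [
    (101, "Engineering"),
    (102, "Sales"),
    (103, "Marketing"),
]


def inner_join(left, right):
    """Hypothetical helper: pair each employee with departments sharing its dept_id."""
    # Index department names by the join key.
    by_dept = {}
    for dept_id, dept_name in right:
        by_dept.setdefault(dept_id, []).append(dept_name)

    joined = []
    for emp_id, name, dept_id in left:
        # Employees with no matching department would be skipped here.
        for dept_name in by_dept.get(dept_id, []):
            joined.append((dept_id, emp_id, name, dept_name))
    return joined


result = inner_join(employees, departments)
for row in result:
    print(row)
```

Only keys present on both sides survive, so the three employees each pick up a department name while Marketing (103), which has no employees, does not appear. Spark does not guarantee row order after a join, so the order printed by this sketch should not be relied on in the koan.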
Your Code (Ctrl/Cmd+Enter to run)