PySpark Koans

Learn by fixing tests

Progress0/39

© 2025-2026 Alex Cole. All Rights Reserved.

Spark Koans is an independent community tool.

Delta LakeKoan 110

VACUUM Old Files

Clean up old files from a Delta table. Replace ___ with the correct code.

How it works: Replace the ___ blanks in the code editor with the correct PySpark code, then hit Run Code. Stuck? Try the Hint button.
Setup (read-only)
_reset_delta_tables()

# Create table and make several updates
for i in range(3):
    data = [(j, f"version_{i}") for j in range(10)]
    df = spark.createDataFrame(data, ["id", "data"])
    df.write.format("delta").mode("overwrite").save("/data/versions")
Your CodeCtrl/Cmd+Enter to run
Output
Output will appear here...