Apache Spark uses local disk on AWS Glue workers to spill data from memory when it exceeds the heap space defined by the spark.executor.memory configuration parameter.
Wide transformations such as groupByKey(), reduceByKey(), and join() cause a shuffle: Spark writes intermediate data to local disk before it can exchange that data between workers.
At this point, you might get a "No space left on device" error or a MetadataFetchFailedException. Spark throws these errors when there isn't enough local disk space left on the executor, and there is no automatic recovery.
These errors commonly occur when the job processes a dataset with significant skew, because one shuffle partition ends up holding most of the rows. The approaches below help prevent or mitigate them.
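One way to check whether a dataset is skewed is to look at the per-key row counts before the shuffle. Below is a plain-Python sketch of the idea (the keys list is hypothetical sample data; in a Spark job you would get the counts from something like df.groupBy("key").count()):

```python
from collections import Counter

# Hypothetical sample of join/group-by keys; in a real job these would
# come from an aggregation such as df.groupBy("key").count().
keys = ["a"] * 9000 + ["b"] * 500 + ["c"] * 500

counts = Counter(keys)
largest = max(counts.values())
average = sum(counts.values()) / len(counts)

# A large ratio means one shuffle partition receives most of the rows,
# so a single executor does most of the disk spilling.
skew_ratio = largest / average
print(f"skew ratio: {skew_ratio:.1f}")  # prints "skew ratio: 2.7"
```

A ratio close to 1 means the keys are evenly distributed; a ratio far above 1 means one worker's local disk bears most of the shuffle load.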
3.1 Use dedicated serverless storage
AWS Glue can offload shuffle files to Amazon S3 instead of local disk. Set the following job parameters:
Key: --write-shuffle-files-to-s3
Value: true
Key: --conf
Value: spark.shuffle.storage.path=s3://custom_shuffle_bucket
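These parameters can also be set when you create or update the job. A sketch of the corresponding DefaultArguments JSON (the bucket name is a placeholder; use your own shuffle bucket):

```json
{
  "--write-shuffle-files-to-s3": "true",
  "--conf": "spark.shuffle.storage.path=s3://custom_shuffle_bucket"
}
```

Because shuffle files in the bucket are not always cleaned up after the job finishes, consider adding an S3 lifecycle rule on the shuffle bucket to expire old objects.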
3.2 Scale out
Add more workers or use a larger worker type (for example, G.2X instead of G.1X) so the shuffle data is spread across more memory and local disks, and each worker spills less.
3.3 Reduce and filter input data
Filter rows early and read only the columns and partitions you need (for example, with pushdown predicates) so less data reaches the shuffle stage in the first place.
3.4 Broadcast small tables
When one side of a join is small enough to fit in executor memory, broadcast it to every executor. The join then runs map-side and the large table is never shuffled.
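The reason a broadcast join avoids the shuffle is that the small table is copied whole to every executor, so each large-table row is joined locally. A plain-Python analogy of that mechanism (in PySpark you would write large_df.join(broadcast(small_df), "key"); the tables here are made-up examples):

```python
# Plain-Python analogy of a broadcast (map-side) join: the small table
# fits in memory on every worker, so each row of the large table is
# joined locally, with no exchange of large-table rows between workers.
small_table = {"us": "United States", "de": "Germany"}  # broadcast side

large_table = [("us", 10), ("de", 20), ("us", 30)]  # streamed side

joined = [
    (code, value, small_table[code])
    for code, value in large_table
    if code in small_table
]
print(joined)
# [('us', 10, 'United States'), ('de', 20, 'Germany'), ('us', 30, 'United States')]
```

Since no shuffle occurs, no shuffle files are written to local disk for this join.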
3.5 Use AQE
Adaptive Query Execution (available in Spark 3.0+, and therefore Glue 3.0+) can coalesce small shuffle partitions and split skewed ones at runtime, which reduces spilling on any single worker.
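AQE and its skew-join handling are controlled by standard Spark properties, which on Glue you pass through the --conf job parameter. The relevant settings:

```
spark.sql.adaptive.enabled=true
spark.sql.adaptive.skewJoin.enabled=true
spark.sql.adaptive.coalescePartitions.enabled=true
```

Note that spark.sql.adaptive.enabled is already true by default in Spark 3.2 and later, so on recent Glue versions you may only need to verify it has not been disabled.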
https://repost.aws/knowledge-center/glue-no-spaces-on-device-error
https://docs.aws.amazon.com/glue/latest/dg/monitor-spark-shuffle-manager.html
https://docs.aws.amazon.com/AmazonS3/latest/userguide/how-to-set-lifecycle-configuration-intro.html