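- The YARN ResourceManager reports its uptime and listens on several ports
- Uptime : how long the ResourceManager has been running.
- Port : 8088 is the default port of the ResourceManager web UI and REST API. By default, client RPC requests go to port 8032, and NodeManagers report to the resource tracker on port 8031.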
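As a rough illustration, the uptime can be derived from the `startedOn` timestamp (milliseconds since the epoch) that the ResourceManager REST API returns at `/ws/v1/cluster/info`. This is a minimal sketch: the helper names are hypothetical, and the host is a placeholder you would replace with your own ResourceManager address.

```python
import json
import time
import urllib.request


def cluster_info_url(host, port=8088):
    """Build the cluster-info URL; 8088 is the default web UI / REST port."""
    return f"http://{host}:{port}/ws/v1/cluster/info"


def uptime_seconds(started_on_ms, now_ms=None):
    """Convert a startedOn timestamp (ms since epoch) into uptime in seconds."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return (now_ms - started_on_ms) / 1000.0


def fetch_uptime(host, port=8088, timeout=5):
    """Query the ResourceManager and return its uptime in seconds."""
    request = urllib.request.Request(
        cluster_info_url(host, port),
        headers={"Accept": "application/json"},  # ask for JSON rather than XML
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        info = json.load(response)["clusterInfo"]
    return uptime_seconds(info["startedOn"])
```

Calling `fetch_uptime("rm.example.com")` against a reachable ResourceManager would return the uptime in seconds; `rm.example.com` is only a placeholder.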
HDFS (Hadoop Distributed File System)
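- The final output of a MapReduce job is written to HDFS, in the directory configured by mapred.output.dir (named mapreduce.output.fileoutputformat.outputdir in newer releases). Intermediate map output is written to the local disk of the node running the map task; map output goes directly to HDFS only in a map-only job (zero reducers).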
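To make the output directory concrete: with the newer `mapreduce` API, each task that writes final output creates one file in that directory, named `part-r-NNNNN` for a reduce task or `part-m-NNNNN` for a map task in a map-only job. The sketch below lists the paths you would expect to find. The helper names and the directory `/user/demo/out` are hypothetical, and jobs using the older `mapred` API or a custom output format may name files differently.

```python
def part_file_name(task_index, map_only=False):
    """Return the default part-file name for one output-writing task."""
    kind = "m" if map_only else "r"  # m = map-only job, r = reduce output
    return f"part-{kind}-{task_index:05d}"


def output_paths(output_dir, num_tasks, map_only=False):
    """List the expected part files under the job's output directory."""
    base = output_dir.rstrip("/")
    return [f"{base}/{part_file_name(i, map_only)}" for i in range(num_tasks)]
```

For example, `output_paths("/user/demo/out", 2)` describes a job with two reducers, one file per reducer.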
Spark SQL query execution phases
- query parsing > analysis (resolving table and column names against the catalog) > logical optimization > physical planning > code generation > query execution > data serialization/deserialization > result materialization > resource cleanup
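The planning phases can be inspected directly: in Spark 3.x, `DataFrame.explain(mode="extended")` prints the parsed, analyzed, and optimized logical plans followed by the physical plan. The sketch below captures that printed output and splits it by section header. It assumes an existing `SparkSession` is passed in as `spark` and that PySpark writes the plan to standard output; both helper names are hypothetical.

```python
import contextlib
import io
import re

# Section headers look like "== Parsed Logical Plan ==" in the explain output.
_HEADER = re.compile(r"^== (.+?) ==$", re.MULTILINE)


def split_plan_sections(explain_text):
    """Split explain output into {section title: section body}."""
    sections = {}
    matches = list(_HEADER.finditer(explain_text))
    for i, match in enumerate(matches):
        # A section runs until the next header, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(explain_text)
        sections[match.group(1)] = explain_text[match.end():end].strip()
    return sections


def explain_extended(spark, query):
    """Capture the extended plan for `query` using an existing SparkSession."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        spark.sql(query).explain(mode="extended")
    return split_plan_sections(buffer.getvalue())
```

With a session available, `explain_extended(spark, "SELECT id FROM range(5)")` would return one entry per plan section. The generated code for the code generation phase can be viewed separately with `explain(mode="codegen")`.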