diff --git a/docs/configuration.md b/docs/configuration.md
index dc5553f3da770..1f4f89b14361d 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -353,16 +353,6 @@ Apart from these, the following properties are also available, and may be useful
 Port for the driver to listen on.
-
| Property Name | Default | Meaning |
|---|---|---|
| spark.cleaner.ttl | (infinite) | Duration, in seconds, for which Spark remembers metadata (stages generated, tasks generated, etc.). Periodic cleanups ensure that metadata older than this duration is forgotten. This is useful for running Spark for many hours or days (for example, running 24/7 in the case of Spark Streaming applications). Note that any RDD that persists in memory for longer than this duration is cleared as well. |
| spark.cleaner.ttl.MAP_OUTPUT_TRACKER | spark.cleaner.ttl, with a minimum value of 10 seconds | Cleans up the map containing the mapper information (the input block manager ID and the output result size) corresponding to a shuffle ID. |
| spark.cleaner.ttl.SHUFFLE_MAP_TASK | spark.cleaner.ttl, with a minimum value of 10 seconds | Clears the cache used for shuffle map tasks (tasks in the earlier stages of the job): a map from stageId to the serialised byte array of the task. |
| spark.cleaner.ttl.RESULT_TASK | spark.cleaner.ttl, with a minimum value of 10 seconds | Clears the cache used to store the final tasks (tasks in the last stage of the job): a map from stageId to the serialised byte array of the final task. |
| spark.cleaner.ttl.SPARK_CONTEXT | spark.cleaner.ttl, with a minimum value of 10 seconds | Cleans up all old persistent (cached) RDDs. |
| spark.cleaner.ttl.HTTP_BROADCAST | spark.cleaner.ttl, with a minimum value of 10 seconds | Cleans up all broadcast files timestamped older than the assigned cleanup value. |
| spark.cleaner.ttl.DAG_SCHEDULER | spark.cleaner.ttl, with a minimum value of 10 seconds | Clears the entries in the maps kept inside the DAG Scheduler (such as stageIdToStage, pendingTasks, stageIdToJobIds, etc.) that are timestamped older than the assigned cleanup value. |
| spark.cleaner.ttl.BLOCK_MANAGER | spark.cleaner.ttl, with a minimum value of 10 seconds | Clears old non-broadcast blocks from memory. |
| spark.cleaner.ttl.BROADCAST_VARS | spark.cleaner.ttl, with a minimum value of 10 seconds | Clears old broadcast blocks from memory. |
| spark.cleaner.ttl.SHUFFLE_BLOCK_MANAGER | spark.cleaner.ttl, with a minimum value of 10 seconds | Deletes old physical files on disk that were created by shuffling transformations/actions such as a reduce job. |
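
The fallback rule in the Default column can be illustrated with a small sketch. This is not Spark's implementation: `effective_ttl` and `MIN_TTL_SECONDS` are hypothetical names, the configuration is modelled as a plain dictionary of strings, and applying the 10-second minimum to the resolved per-component value is one reading of the table, assumed here for illustration.

```python
from typing import Mapping, Optional

BASE_KEY = "spark.cleaner.ttl"
MIN_TTL_SECONDS = 10  # assumption: the "minimum value of 10 seconds" from the table


def effective_ttl(conf: Mapping[str, str], component: str) -> Optional[int]:
    """Resolve the TTL (seconds) for one cleaner component.

    Hypothetical helper: looks up spark.cleaner.ttl.<component>, falls back
    to spark.cleaner.ttl, and returns None when neither is set (no cleanup,
    i.e. the "(infinite)" default).
    """
    raw = conf.get(f"{BASE_KEY}.{component}", conf.get(BASE_KEY))
    if raw is None:
        return None  # neither key is set: metadata is kept indefinitely
    # Clamp to the documented minimum (assumed to apply to the resolved value).
    return max(MIN_TTL_SECONDS, int(raw))


conf = {
    "spark.cleaner.ttl": "3600",
    "spark.cleaner.ttl.BROADCAST_VARS": "5",
}
print(effective_ttl(conf, "DAG_SCHEDULER"))   # falls back to spark.cleaner.ttl
print(effective_ttl(conf, "BROADCAST_VARS"))  # component override, clamped to the minimum
print(effective_ttl({}, "BLOCK_MANAGER"))     # nothing set: no cleanup
```

Under these assumptions, a component-specific key only needs to be set when one cleaner should run on a different schedule from the rest; every other component inherits `spark.cleaner.ttl`.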