From 497868145c537a198a162fe0b0649353b8fd14ed Mon Sep 17 00:00:00 2001
From: Holden Karau
Date: Sun, 6 Sep 2015 22:31:08 -0700
Subject: [PATCH 1/2] Try and document the three options

---
 docs/configuration.md | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/docs/configuration.md b/docs/configuration.md
index 29a36bd67f28..46ce770e11a1 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -458,9 +458,12 @@ Apart from these, the following properties are also available, and may be useful
 spark.shuffle.manager
 sort
- Implementation to use for shuffling data. There are two implementations available:
- sort and hash. Sort-based shuffle is more memory-efficient and is
- the default option starting in 1.2.
+ Implementation to use for shuffling data. There are three implementations available:
+ sort, hash and the new tungsten-sort.
+ Sort-based shuffle is more memory-efficient and is the default option starting in 1.2.
+ Tungsten-sort is similar to sort based shuffle, with a direct binary cache-friendly
+ implementation with a fall back to regular sort based shuffle if its requirements are not
+ met.

From 0aada24ed1c8344d836100c65f6f213a4831cab2 Mon Sep 17 00:00:00 2001
From: Holden Karau
Date: Mon, 7 Sep 2015 03:06:08 -0700
Subject: [PATCH 2/2] s/to sort based/to the sort based/ and tag as 1.5+

---
 docs/configuration.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/configuration.md b/docs/configuration.md
index 46ce770e11a1..aeb42201c8b7 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -459,9 +459,9 @@ Apart from these, the following properties are also available, and may be useful
 sort
 Implementation to use for shuffling data. There are three implementations available:
- sort, hash and the new tungsten-sort.
+ sort, hash and the new (1.5+) tungsten-sort.
 Sort-based shuffle is more memory-efficient and is the default option starting in 1.2.
- Tungsten-sort is similar to sort based shuffle, with a direct binary cache-friendly
+ Tungsten-sort is similar to the sort based shuffle, with a direct binary cache-friendly
 implementation with a fall back to regular sort based shuffle if its requirements are not
 met.
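
For reviewers who want to try the documented options, here is a minimal sketch of selecting a shuffle implementation. It is not part of the patch. It assumes the property is set in the conventional `conf/spark-defaults.conf` file, where each line is a key followed by whitespace and a value; the file location and the choice of `tungsten-sort` are illustrative assumptions.

```properties
# conf/spark-defaults.conf (assumed location; illustrative, not part of this patch)
# Values named in the patched text: sort (default since 1.2), hash, tungsten-sort (1.5+)
spark.shuffle.manager   tungsten-sort
```

The same key can presumably be passed per job instead, for example `--conf spark.shuffle.manager=tungsten-sort` on the `spark-submit` command line. Per the patched text, `tungsten-sort` falls back to the regular sort based shuffle when its requirements are not met, so setting it should not by itself cause a job to fail.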