[SPARK-24717][SS] Split out max retain version of state for memory in HDFSBackedStateStoreProvider #21700
Conversation
Pasting the JIRA issue description to explain why this patch is needed: the default value of "spark.sql.streaming.minBatchesToRetain" is high (100). This doesn't strictly require 100x of memory, but I'm seeing 10x ~ 80x memory consumption for various workloads. In addition, in some cases requiring even 2x of memory is unacceptable, so we should split out a configuration for memory and let users adjust the trade-off between memory usage and cache misses (rebuilding state from files). In the normal case, the default value '2' would cover both cases (success, and restoring from failure) with less than or around 2x of memory usage, and '1' would only cover the success case but no longer require more than 1x of memory. In the extreme case, a user can set the value to '0' to completely disable the map cache and maximize executor memory usage (covers #21500).
Test build #92546 has finished for PR 21700 at commit

retest this, please

Test build #92547 has finished for PR 21700 at commit
…ackedStateStoreProvider

* introduce BoundedSortedMap which implements a bounded-size sorted map
* only the first N elements will be retained
* replace loadedMaps with BoundedSortedMap to retain only N versions of states
* no need to clean up in the maintenance phase
* introduce new configuration: spark.sql.streaming.minBatchesToRetainInMemory
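The bounded behavior described in the commit summary can be sketched roughly as follows. This is a hypothetical, simplified version (class name and details are mine, not the PR's actual code): a `TreeMap` wrapper whose `put` keeps only the first N entries under the map's ordering.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

// Simplified sketch of a bounded sorted map: only the first `limit` entries
// (per the map's ordering) are retained; later-sorting entries are evicted,
// or never inserted at all.
class BoundedSortedMapSketch<K, V> extends TreeMap<K, V> {
    private final int limit;

    BoundedSortedMapSketch(Comparator<? super K> comparator, int limit) {
        super(comparator);
        this.limit = limit;
    }

    @Override
    public V put(K key, V value) {
        if (limit <= 0) {
            return null;  // capacity zero: caching disabled entirely
        }
        if (size() >= limit && !containsKey(key)) {
            // If the new key would sort after the current last key, inserting
            // it would just evict it again immediately: skip the insert.
            if (comparator().compare(key, lastKey()) > 0) {
                return null;
            }
            // Otherwise make room by dropping the last (lowest-priority) entry.
            remove(lastKey());
        }
        return super.put(key, value);
    }

    @Override
    public void putAll(Map<? extends K, ? extends V> map) {
        // Defer to put() so the size bound is always enforced.
        for (Map.Entry<? extends K, ? extends V> entry : map.entrySet()) {
            put(entry.getKey(), entry.getValue());
        }
    }
}
```

With a reversed comparator (the Java analogue of `Ordering[Long].reverse`), `firstKey()` is the newest version, and only the latest N versions survive.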
Missing newline at EOF for two new Java files. Just addressed.

Test build #92548 has finished for PR 21700 at commit

Test build #92549 has finished for PR 21700 at commit

Test build #92550 has finished for PR 21700 at commit

Test build #92587 has finished for PR 21700 at commit

Test build #92588 has finished for PR 21700 at commit
```java
@Override
public void putAll(Map<? extends K, ? extends V> map) {
```
Should the `map` parameter be of type `SortedMap`? With an ordinary `Map`, the traversal order is not fixed, so it may produce a non-deterministic result if the map's size is bigger than this BoundedSortedMap's size.
Unfortunately this signature is inherited from the `Map` interface, so we can't modify it. And assuming `put` is implemented correctly, this still guarantees the size bound of BoundedSortedMap, since `putAll` defers to `put`, which restricts the map's size.
```java
@Override
public void putAll(Map<? extends K, ? extends V> map) {
    for (Map.Entry<? extends K, ? extends V> entry : map.entrySet()) {
```
I can think of an optimization here: if the map's size is bigger than or equal to this BoundedSortedMap's bound, you can call `clear` on this sorted map first when `map.lastKey()` is lower than `this.firstKey()`, since all of this sorted map's elements would be evicted anyway. On the other hand, if `map.firstKey()` is higher than `this.lastKey()` and this sorted map is at full capacity, there is no need to enter the loop: no element from `map` would be taken anyway.
Thanks for the great suggestion. While we can't assume that the map's type is SortedMap, it looks like we can check the type of the map at runtime and apply your suggestion. Will apply it.
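A sketch of how those short-circuits might look, assuming for illustration an ascending ordering where the bounded target keeps its first `limit` (smallest) keys; all names here are hypothetical, not the PR's actual code:

```java
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Illustrative short-circuits for putAll when the source is itself a
// SortedMap sharing the target's ordering.
final class PutAllShortCircuit {
    static <K extends Comparable<K>, V> void boundedPutAll(
            TreeMap<K, V> target, SortedMap<K, ? extends V> source, int limit) {
        if (source.isEmpty()) {
            return;
        }
        if (!target.isEmpty() && source.size() >= limit
                && source.lastKey().compareTo(target.firstKey()) < 0) {
            // Every existing entry would be evicted anyway: drop them up front.
            target.clear();
        } else if (target.size() >= limit
                && source.firstKey().compareTo(target.lastKey()) > 0) {
            // No entry from `source` can displace anything: skip the loop.
            return;
        }
        for (Map.Entry<K, ? extends V> entry : source.entrySet()) {
            if (target.size() >= limit && !target.containsKey(entry.getKey())) {
                if (entry.getKey().compareTo(target.lastKey()) > 0) {
                    continue;  // would be evicted immediately
                }
                target.remove(target.lastKey());
            }
            target.put(entry.getKey(), entry.getValue());
        }
    }
}
```

The two whole-map comparisons cost O(log n) each, so in the disjoint-range cases the per-entry loop (and its repeated evictions) is avoided entirely.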
```scala
private lazy val loadedMaps = new mutable.HashMap[Long, MapType]
// taking default value first: this will be updated by init method with configuration
@volatile private var numberOfVersionsRetainInMemory: Int = 2
```
`numberOfVersionsRetainInMemory` -> `numberOfVersionsToRetainInMemory`
Will fix.
@tedyu Thanks for the detailed review comments. Addressed.

Test build #92645 has finished for PR 21700 at commit

Test build #92647 has finished for PR 21700 at commit

Test build #92649 has finished for PR 21700 at commit

Test build #92829 has finished for PR 21700 at commit

Test build #92830 has finished for PR 21700 at commit

Test build #92832 has finished for PR 21700 at commit
```diff
  @volatile private var numberOfVersionsToRetainInMemory: Int = _

- private lazy val loadedMaps = new mutable.HashMap[Long, MapType]
+ private lazy val loadedMaps = new util.TreeMap[Long, MapType](Ordering[Long].reverse)
```
Just FYI: referring to java.util.TreeMap is unavoidable because Scala doesn't support mutable SortedMap until Scala 2.12+, which would need additional changes for interop.
Yeah, I was wondering about that. Makes sense.
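For reference, the Java equivalent of `new util.TreeMap[Long, MapType](Ordering[Long].reverse)` is a `TreeMap` built with a reversed comparator, so the newest version sits at `firstKey()`. A minimal illustration (names are mine):

```java
import java.util.Comparator;
import java.util.TreeMap;

// A descending TreeMap keyed by version: firstKey() is the newest
// retained version, lastKey() the oldest.
final class ReverseOrderedVersions {
    static TreeMap<Long, String> newLoadedMaps() {
        return new TreeMap<>(Comparator.reverseOrder());
    }
}
```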
```scala
  require(!StateStore.isMaintenanceRunning)
}

test("retaining only latest configured size of versions in memory") {
```
Sorry I didn't catch this earlier. We should ideally have tests that directly validate the specific behaviors we're documenting in the conf:
- '2' will read from cache in the direct failure case
- '1' will read from cache in the happy path but not if there's a failure
- '0' will never read from the cache, and more importantly will maximize memory by never populating it
It is fairly easy to check whether we read from cache or from files with c9aada5 in #21469, since it introduces metrics for cache hits and cache misses, but it's not easy to check in this PR itself. So I just rely on inspecting the cache to ensure the data is correctly evicted and not available in the cache, as expected. Hope this is OK.

Btw, I caught a silly bug while adding tests to cover your suggestion. Thanks!
…and fix a silly bug
```scala
var currentVersion = 0

def restoreOriginValues(map: provider.MapType): Map[String, Int] = {
```
I've allowed the redundant function definition because there's no way to use provider.MapType as a parameter type unless provider is defined. If we really want to get rid of the redundant definition, we may have to change it to ConcurrentMap directly.
+1 on making it ConcurrentMap. Maybe even better, you can use Scala implicit classes to add methods to HDFSBackedStateStoreProvider:

```scala
implicit class ProviderHelper(provider: StateStoreProvider) {
  def toStringIntMap(): Map[String, Int] = { ... }
}
```

This should avoid the problem. Either way, I hate having these duplicate methods, so we should fix it one way or the other.
On second thought, if you make the convenience method checkVersion I mentioned above, you may not have to do this at all.
@jose-torres Addressed review comment. Please take a look again.

Test build #92904 has finished for PR 21700 at commit
tdas left a comment:

Overall looks good! Some nits to solve, mainly on the test code.
```scala
if (size == numberOfVersionsToRetainInMemory) {
  val versionIdForLastKey = loadedMaps.lastKey()
  if (versionIdForLastKey > newVersion) {
    // this is the only case which put doesn't need
```
Can you clarify when this case can happen?
Will update the comment to clarify a bit more. We just avoid the case where the element would be added at the end and then be required to be evicted right away.
```scala
  loadedMaps.clone().asInstanceOf[util.SortedMap[Long, MapType]]
}

private def putStateIntoStateCache(newVersion: Long, map: MapType): Unit = synchronized {
```
`putStateIntoStateCache` -> `cacheMap`, to keep consistent with `loadedMaps` etc.
```scala
}

def updateVersionTo(provider: StateStoreProvider, currentVersion: => Int,
  targetVersion: Int): Int = {
```
This is incorrect indenting by the Spark style guide. It should be:

```scala
def updateVersionTo(
    provider: StateStoreProvider,
    currentVersion: => Int,
    targetVersion: Int): Int = {
```
Why are you using `=> Int` for `currentVersion` instead of simply using `Int`?
Also, since you are frequently incrementing the version by 1 (i.e. targetVersion = currentVersion + 1, always), you can add another convenience method called incrementVersion(provider, currentVersion)
Thanks for pointing out the style guide issue. Will fix.

Regarding `currentVersion: => Int`: at some point I was trying to modify currentVersion itself, stuck with the current approach, and didn't roll it back. Will fix.

And I agree it would be better to have incrementVersion to shorten the code. Will address.
```scala
  map.asScala.map(entry => rowToString(entry._1) -> rowToInt(entry._2)).toMap
}

var currentVersion = 0
```
Nit: please add comments on each section here to make it clear what are you testing
```scala
loadedMaps = provider.getClonedLoadedMaps()
assert(loadedMaps.size() === 2)
assert(loadedMaps.firstKey() === 2L)
assert(loadedMaps.lastKey() === 1L)
```
You can make this a convenience function: `def checkLoadedVersions(num: Int, earliest: Int, latest: Int)`.
```scala
assert(loadedMaps.firstKey() === 3L)
assert(loadedMaps.lastKey() === 2L)
assert(restoreOriginValues(loadedMaps.get(3L)) === Map("a" -> 3))
assert(restoreOriginValues(loadedMaps.get(2L)) === Map("a" -> 2))
```
This can be boiled down to a convenience method as well to reduce the verbosity: `def checkVersion(version: Int, expectedData: Map[String, Int])`.
@tdas Thanks for the detailed review! Addressed review comments.

Test build #93163 has finished for PR 21700 at commit
tdas left a comment:

Changes look great! Just a couple of more nits and we are good to go.
```scala
currentVersion = incrementVersion(provider, currentVersion)
assert(getData(provider) === Set("a" -> 1))
var loadedMaps = provider.getClonedLoadedMaps()
checkLoadedVersions(loadedMaps, 1, 1L, 1L)
```
Can you make these `checkLoadedVersions(loadedMaps, count = 1, min = 1L, max = 1L)` so that it's obvious while reading what those numbers are?
Also, does it need the `L` suffix? It seems like they are everywhere and they really don't need to be.
Yeah, I'd add `L` everywhere the literal should be a Long, so that we don't rely on auto-casting and are explicit about the type, but I have no strong opinion about this. I can follow existing Spark preferences.
```scala
}

/** This method is intended to be only used for unit test(s). DO NOT TOUCH ELEMENTS IN MAP! */
private[state] def getClonedLoadedMaps(): util.SortedMap[Long, MapType] = synchronized {
```
Just `getLoadedMaps()` is fine. The fact that it's cloned is just an implementation detail.
Agreed. Will address.
@tdas Addressed review comments. Please take a look again. Thanks in advance!

Test build #93256 has finished for PR 21700 at commit

LGTM! I am merging it! Thank you for all the hard work. And my apologies for not being able to give it time earlier to review it.

My pleasure. Thanks all for spending your time to review thoughtfully, and merge this!
… HDFSBackedStateStoreProvider

This patch proposes breaking down the configuration of retained batch size on state into two pieces: files and in-memory (cache). While this patch reuses the existing configuration for files, it introduces a new configuration, "spark.sql.streaming.maxBatchesToRetainInMemory", to configure the max count of batches to retain in memory.

Apply this patch on top of SPARK-24441 (apache#21469), and manually tested in various workloads to ensure the overall size of states in memory is around 2x or less of the size of the latest version of state, while it was 10x ~ 80x before applying the patch.

Author: Jungtaek Lim <[email protected]>

Closes apache#21700 from HeartSaVioR/SPARK-24717.
What changes were proposed in this pull request?

This patch proposes breaking down the configuration of retained batch size on state into two pieces: files and in-memory (cache). While this patch reuses the existing configuration for files, it introduces a new configuration, "spark.sql.streaming.maxBatchesToRetainInMemory", to configure the max count of batches to retain in memory.
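To make the split concrete, the two knobs might be set together like this (a sketch; the values shown are illustrative, not recommendations):

```
# Batches retained as snapshot/delta files on the checkpoint location (existing).
spark.sql.streaming.minBatchesToRetain            100

# Versions of state retained in the executor's in-memory cache (new in this PR).
# 2 covers success plus restore-after-failure, 1 covers only the success path,
# 0 disables the cache entirely.
spark.sql.streaming.maxBatchesToRetainInMemory    2
```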
How was this patch tested?

Applied this patch on top of SPARK-24441 (#21469), and manually tested in various workloads to ensure the overall size of states in memory is around 2x or less of the size of the latest version of state, while it was 10x ~ 80x before applying the patch.