[MINOR][CORE] Cleanup dead code and duplication in Mem. Management #19254
Conversation
Test build #3922 has finished for PR 19254 at commit
Test failure in PySpark appears unrelated; various tests OOMed, e.g. FAILED (errors=2):
[Running <class 'pyspark.streaming.tests.StreamingListenerTests'>]
ERROR
======================================================================
ERROR: setUpClass (pyspark.streaming.tests.StreamingListenerTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/jenkins/workspace/NewSparkPullRequestBuilder/python/pyspark/streaming/tests.py", line 65, in setUpClass
cls.sc = SparkContext(appName=class_name, conf=conf)
File "/home/jenkins/workspace/NewSparkPullRequestBuilder/python/pyspark/context.py", line 118, in __init__
conf, jsc, profiler_cls)
File "/home/jenkins/workspace/NewSparkPullRequestBuilder/python/pyspark/context.py", line 180, in _do_init
self._jsc = jsc or self._initialize_context(self._conf._jconf)
File "/home/jenkins/workspace/NewSparkPullRequestBuilder/python/pyspark/context.py", line 270, in _initialize_context
return self._jvm.JavaSparkContext(jconf)
File "/home/jenkins/workspace/NewSparkPullRequestBuilder/python/lib/py4j-0.10.6-src.zip/py4j/java_gateway.py", line 1428, in __call__
answer, self._gateway_client, None, self._fqn)
File "/home/jenkins/workspace/NewSparkPullRequestBuilder/python/lib/py4j-0.10.6-src.zip/py4j/protocol.py", line 320, in get_return_value
format(target_id, ".", name), value)
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.OutOfMemoryError: Java heap space
srowen left a comment:
Looks OK. Minor cleanup, but valid.
While here, pull this check into the method and call it "checkAllocation" or something?
@srowen I didn't do this on purpose, so that allocatePage and allocateArray compile to smaller methods, since the condition isn't true very often. Not worth the duplication here?
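The trade-off being discussed can be sketched as follows. This is a hypothetical illustration, not Spark's actual code: keeping the rarely-taken throw in a separate method keeps the hot allocation path's bytecode small, which makes it easier for the JIT to inline. All names here are illustrative.

```java
// Hypothetical sketch of the "extract the exception path" pattern:
// the hot allocation method stays small by delegating the rarely-taken
// error path to a separate helper method.
class ArrayAllocator {
    static final long MAX_ELEMENTS = Integer.MAX_VALUE;

    long[] allocateArray(long size) {
        if (size > MAX_ELEMENTS) {
            throwTooLarge(size); // cold path, extracted below
        }
        return new long[(int) size];
    }

    // Extracted on purpose: the caller's bytecode stays under the JIT
    // inlining threshold, and this method is only ever reached on error.
    private void throwTooLarge(long size) {
        throw new IllegalArgumentException(
            "Cannot allocate " + size + " elements; max is " + MAX_ELEMENTS);
    }
}
```

The HotSpot JIT decides whether to inline a method partly by its bytecode size, so moving the error-construction code out of the hot method can make the common path cheaper even though behavior is unchanged.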
The two if conditions could be collapsed
Right, will collapse those :)
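As a minimal illustration of the collapse being agreed on here (hypothetical names, not the actual Spark code): two sequential guards that lead to the same outcome can be merged into a single condition.

```java
// Hypothetical sketch: two sequential guard conditions that both reject
// the request are collapsed into one, since they share the same outcome.
class AllocationGuard {
    static boolean canAllocate(long required, long maxAllowed, long freeBytes) {
        // Collapsed form of:
        //   if (required > maxAllowed) return false;
        //   if (required > freeBytes) return false;
        //   return true;
        return required <= maxAllowed && required <= freeBytes;
    }
}
```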
fixed
Test build #3923 has finished for PR 19254 at commit
Test build #3924 has finished for PR 19254 at commit
Test failure looks unrelated (even though it happened twice in a row now):
[info] org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite *** ABORTED *** (1 second, 114 milliseconds)
[info] spark-submit returned with exit code 1.
[info] Command line: './bin/spark-submit' '--name' 'prepare testing tables' '--master' 'local[2]' '--conf' 'spark.ui.enabled=false' '--conf' 'spark.master.rest.enabled=false' '--conf' 'spark.sql.warehouse.dir=/home/jenkins/workspace/NewSparkPullRequestBuilder/target/tmp/warehouse-10257aaf-1819-4ac2-a504-d4d325c474eb' '--conf' 'spark.sql.test.version.index=0' '--driver-java-options' '-Dderby.system.home=/home/jenkins/workspace/NewSparkPullRequestBuilder/target/tmp/warehouse-10257aaf-1819-4ac2-a504-d4d325c474eb' '/home/jenkins/workspace/NewSparkPullRequestBuilder/target/tmp/test5614170310518985767.py'
[info]
[info] 2017-09-17 06:11:01.557 - stderr> Error: Could not find or load main class org.apache.spark.launcher.Main (SparkSubmitTestUtils.scala:81)
[info] org.scalatest.exceptions.TestFailedException:
[info] at org.scalatest.Assertions$class.newAssertionFailedException(Assertions.scala:528)
[info] at org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1560)
[info] at org.scalatest.Assertions$class.fail(Assertions.scala:1089)
[info] at org.scalatest.FunSuite.fail(FunSuite.scala:1560)
[info] at org.apache.spark.sql.hive.SparkSubmitTestUtils$class.runSparkSubmit(SparkSubmitTestUtils.scala:81)
[info] at org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite.runSparkSubmit(HiveExternalCatalogVersionsSuite.scala:38)
[info] at org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite$$anonfun$beforeAll$1.apply(HiveExternalCatalogVersionsSuite.scala:120)
[info] at org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite$$anonfun$beforeAll$1.apply(HiveExternalCatalogVersionsSuite.scala:105)
[info] at scala.collection.immutable.List.foreach(List.scala:381)
[info] at org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite.beforeAll(HiveExternalCatalogVersionsSuite.scala:105)
[info] at org.scalatest.BeforeAndAfterAll$class.liftedTree1$1(BeforeAndAfterAll.scala:212)
[info] at org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:210)
[info] at org.apache.spark.SparkFunSuite.run(SparkFunSuite.scala:31)
[info] at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:314)
[info] at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:480)
[info] at sbt.ForkMain$Run$2.call(ForkMain.java:296)
[info] at sbt.ForkMain$Run$2.call(ForkMain.java:286)
[info] at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[info] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[info] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[info] at java.lang.Thread.run(Thread.java:745)
The same test passes fine for me locally (and passed on Jenkins once before, when only PySpark failed).
Seems to be broken on Jenkins.
I think it's a problem with a recent change, yeah. Just wait until it can be resolved, then we can retest.
Sure, added a JIRA for the failure here: https://issues.apache.org/jira/browse/SPARK-22047 :)
* Removed the method `org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter#alignToWords`. It became unused as a result of 85b0a15 (SPARK-15962) introducing word alignment for unsafe arrays.
* Cleaned up duplicate code in memory management and the unsafe sorters.
* The change extracting the exception paths is more than just cosmetics, since it definitely reduces the size the affected methods compile to.
* The build still passes after removing the method; grepping the codebase for `alignToWords` shows no reference to it anywhere either.
* The dried-up code is covered by existing tests.
Force-pushed from e2d05d1 to 6b408a0
Test build #3925 has finished for PR 19254 at commit
Merged to master.
What changes were proposed in this pull request?
Removed the unused method `org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter#alignToWords`. It became unused as a result of 85b0a15 (SPARK-15962) introducing word alignment for unsafe arrays.
How was this patch tested?
The build still passes after removing the method; grepping the codebase for `alignToWords` shows no reference to it anywhere either.