[SPARK-27323][CORE][SQL][STREAMING] Use Single-Abstract-Method support in Scala 2.12 to simplify code #24241

srowen · 2019-03-29T18:05:06Z

What changes were proposed in this pull request?

Use Single Abstract Method syntax where possible (and minor related cleanup). Comments below. No logic should change here.

How was this patch tested?

Existing tests.

srowen · 2019-03-29T18:05:39Z

core/src/main/scala/org/apache/spark/BarrierCoordinator.scala

-        override def apply(key: ContextBarrierId): ContextBarrierState =
-          new ContextBarrierState(key, numTasks)
-      })
+      states.computeIfAbsent(barrierId,


90% of the changes are like this, converting an anonymous inner class to a SAM expression.
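For readers unfamiliar with the pattern, here is a minimal sketch of the same conversion outside Spark. It assumes Scala 2.12 or later; `State`, `getOrCreateOld` and `getOrCreateNew` are hypothetical names used only for illustration, and `State` is a stand-in for `ContextBarrierState`, not the real class.

```scala
import java.util.concurrent.ConcurrentHashMap
import java.util.function.{Function => JFunction}

object SamExample {
  // Hypothetical stand-in for Spark's ContextBarrierState.
  final case class State(key: String, numTasks: Int)

  val states = new ConcurrentHashMap[String, State]()

  // Before: an explicit anonymous inner class implementing java.util.function.Function.
  def getOrCreateOld(key: String, numTasks: Int): State =
    states.computeIfAbsent(key, new JFunction[String, State] {
      override def apply(k: String): State = State(k, numTasks)
    })

  // After: Scala 2.12 converts the lambda to the single-abstract-method interface.
  def getOrCreateNew(key: String, numTasks: Int): State =
    states.computeIfAbsent(key, (k: String) => State(k, numTasks))
}
```

Both forms should behave the same; only the call-site syntax differs.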

srowen · 2019-03-29T18:06:11Z

core/src/main/scala/org/apache/spark/SparkConf.scala

  /** Get an optional value, applying variable substitution. */
  private[spark] def getWithSubstitution(key: String): Option[String] = {
-    getOption(key).map(reader.substitute(_))
+    getOption(key).map(reader.substitute)


In the files I did change, I cleaned up a few other things in nearby code, like this. Other examples are using .nonEmpty, removing redundant braces, etc.
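A rough illustration of these smaller cleanups, under stated assumptions: `Reader`, `settings`, `getOld`, `getNew` and `hasSettings` are hypothetical names, and `Reader` is only a toy substitute for the config reader used in SparkConf.

```scala
object CleanupExample {
  // Hypothetical stand-in for the config reader; not Spark's real class.
  final class Reader(vars: Map[String, String]) {
    // Replace each "${name}" occurrence with its value from vars.
    def substitute(input: String): String =
      vars.foldLeft(input) { case (acc, (k, v)) => acc.replace("${" + k + "}", v) }
  }

  private val reader = new Reader(Map("user" -> "spark"))
  private val settings = Map("home" -> "/home/${user}")

  // Before: placeholder lambda.
  def getOld(key: String): Option[String] = settings.get(key).map(reader.substitute(_))

  // After: pass the method directly; the compiler eta-expands it to a function.
  def getNew(key: String): Option[String] = settings.get(key).map(reader.substitute)

  // .nonEmpty instead of a size comparison or a negated isEmpty.
  def hasSettings: Boolean = settings.nonEmpty
}
```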

SparkQA · 2019-03-29T22:19:12Z

Test build #104088 has finished for PR 24241 at commit 6a95688.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2019-03-30T04:53:53Z

Test build #104100 has finished for PR 24241 at commit 49fabf2.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

core/src/main/scala/org/apache/spark/executor/Executor.scala

dongjoon-hyun

All transformations look nice. Thanks! If we add the following two cases, it will be complete. Could you add these together in this PR, @srowen ?

KinesisCheckpointerSuite.scala

-    when(checkpointerMock.checkpoint(anyString)).thenAnswer(new Answer[Unit] {
-      override def answer(invocations: InvocationOnMock): Unit = {
-        clock.waitTillTime(clock.getTimeMillis() + checkpointInterval.milliseconds / 2)
-      }
-    })
+    when(checkpointerMock.checkpoint(anyString)).thenAnswer((_: InvocationOnMock) =>
+      clock.waitTillTime(clock.getTimeMillis() + checkpointInterval.milliseconds / 2))

SparkSQLCLIDriver.scala

-    HiveInterruptUtils.add(new HiveInterruptCallback {
-      override def interrupt() {
-        // Handle remote execution mode
-        if (SparkSQLEnv.sparkContext != null) {
-          SparkSQLEnv.sparkContext.cancelAllJobs()
-        } else {
-          if (transport != null) {
-            // Force closing of TCP connection upon session termination
-            transport.getSocket.close()
-          }
+    HiveInterruptUtils.add(() => {
+      // Handle remote execution mode
+      if (SparkSQLEnv.sparkContext != null) {
+        SparkSQLEnv.sparkContext.cancelAllJobs()
+      } else {
+        if (transport != null) {
+          // Force closing of TCP connection upon session termination
+          transport.getSocket.close()

SparkQA · 2019-03-31T06:03:13Z

Test build #104120 has finished for PR 24241 at commit 205ca98.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2019-03-31T18:27:01Z

Test build #104140 has finished for PR 24241 at commit a94508f.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2019-03-31T23:50:59Z

Test build #4673 has finished for PR 24241 at commit a94508f.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

HyukjinKwon · 2019-04-01T01:23:55Z

Looks like the test failures are not persistent.

HyukjinKwon · 2019-04-01T01:23:59Z

retest this please

SparkQA · 2019-04-01T05:54:02Z

Test build #104151 has finished for PR 24241 at commit a94508f.

This patch fails PySpark unit tests.
This patch merges cleanly.
This patch adds no public classes.

attilapiros · 2019-04-01T17:21:13Z

core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala

      // multiple distinct keys might be treated as equal by the ordering. To deal with this, we
      // need to read all keys considered equal by the ordering at once and compare them.
-      new Iterator[Iterator[Product2[K, C]]] {
+      val it = new Iterator[Iterator[Product2[K, C]]] {


Nit (more like a question): why is a new val introduced here?

I was trying to convert the flatMap(i => i) below to simply flatten. For reasons even I'm not clear about, I had to introduce an intermediate val here and then return it.flatten to get it to work. Seems harmless enough as a change, but I didn't get the difference in type inference.
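As a small sketch of the equivalence being relied on here (not the ExternalSorter code itself): `flatMap(i => i)` and `flatten` on an iterator of iterators should yield the same elements. The names `groups`, `viaFlatMap` and `viaFlatten` are hypothetical, and this does not attempt to explain the type-inference difference mentioned above.

```scala
object FlattenExample {
  // Yields the groups [1], [2, 2], [3, 3, 3] one at a time.
  private def groups(): Iterator[Iterator[Int]] = new Iterator[Iterator[Int]] {
    private var n = 0
    override def hasNext: Boolean = n < 3
    override def next(): Iterator[Int] = {
      n += 1
      val k = n // capture the current value; Iterator.fill takes its element by name
      Iterator.fill(k)(k)
    }
  }

  // Before: flatten via an identity flatMap.
  def viaFlatMap(): List[Int] = groups().flatMap(i => i).toList

  // After: bind the iterator to a val, then call flatten on it.
  def viaFlatten(): List[Int] = {
    val it = groups()
    it.flatten.toList
  }
}
```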

attilapiros · 2019-04-01T17:57:23Z

core/src/test/scala/org/apache/spark/deploy/worker/DriverRunnerTest.scala

-        runner.runCommandWithRetry(processBuilder, p => (), supervise = superviseRetry)
-      }
-    }).when(runner).prepareAndRunDriver()
+    doAnswer((_: InvocationOnMock) =>


Nit: in this case, wouldn't we prefer to use {, like:

doAnswer { (_: InvocationOnMock) => }

I can change it. I don't have a strong preference. I default to not using blocks where not necessary, but for non-trivial blocks it's probably clearer to use them anyway.
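To make the two styles concrete, here is a sketch that avoids a Mockito dependency. `Invocation`, `Answer` and `run` are hypothetical stand-ins for `InvocationOnMock`, Mockito's `Answer` and `doAnswer`; they are not the real API.

```scala
object BlockStyleExample {
  // Hypothetical stand-ins; not Mockito's real types.
  final case class Invocation(args: Seq[Any])
  trait Answer { def answer(invocation: Invocation): Int }

  def run(a: Answer): Int = a.answer(Invocation(Seq("x", "y")))

  // Parentheses: reads fine when the lambda body is a single short expression.
  val short: Int = run((_: Invocation) => 1)

  // Braces: arguably clearer once the body spans several statements.
  val block: Int = run { (inv: Invocation) =>
    val n = inv.args.size
    n + 1
  }
}
```

Either form compiles to the same SAM conversion, so the choice is purely about readability.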

attilapiros · 2019-04-01T18:02:53Z

core/src/test/scala/org/apache/spark/scheduler/BlacklistTrackerSuite.scala

-          throw new IllegalStateException("hostA should be on the blacklist")
-        }
+    when(allocationClientMock.killExecutorsOnHost("hostA")).thenAnswer((_: InvocationOnMock) =>
+      if (blacklist.nodeBlacklist.contains("hostA")) {


Is the comment lost on purpose?

Oops, no, probably lost by automated refactoring.

I am kinda glad, because I had started to question whether I wanted to read all of this... Now I see it has value, so I'll continue :)

attilapiros · 2019-04-01T18:06:46Z

core/src/test/scala/org/apache/spark/scheduler/BlacklistTrackerSuite.scala

-          throw new IllegalStateException("hostA should be on the blacklist")
-        }
+    when(allocationClientMock.killExecutorsOnHost("hostA")).thenAnswer((_: InvocationOnMock) =>
+      if (blacklist.nodeBlacklist.contains("hostA")) {


Now I am quite sure it is deleted on purpose.

attilapiros · 2019-04-01T18:34:06Z

streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceiverTracker.scala

-  def numReceivers(): Int = {
-    receiverInputStreams.size
-  }
+  def numReceivers(): Int = receiverInputStreams.length


This is fine (I would just like to help the next reviewer); the reason for this change:

Replace .size with .length on arrays and strings

Inspection info: This inspection reports array.size and string.size calls. While such calls are legitimate, they require an additional implicit conversion to SeqLike to be made. A common use case would be calling length on arrays and strings may provide significant advantages.

Yep, we ought to make this kind of change (and use lengthCompare on a similar note where sizes are compared) wherever perf matters. Probably a good habit in general, but, wouldn't change it unless the code is already changing. Here I just took the liberty of adjusting this one, though there's no big reason for it here.
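A short sketch of both habits, assuming Scala 2.12-era collections; `count` and `hasAtLeast` are hypothetical helper names.

```scala
object LengthExample {
  // .length reads the array's length field directly; per the inspection text above,
  // .size on an array goes through an implicit conversion first.
  def count(xs: Array[Int]): Int = xs.length

  // lengthCompare avoids computing the full length of a linear sequence such as a List:
  // it only walks as far as needed to decide the comparison.
  def hasAtLeast(xs: Seq[Int], n: Int): Boolean = xs.lengthCompare(n) >= 0
}
```

For an indexed sequence the second form gains little, so it mostly matters for lists and other linear collections.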

More generally I have a number of large code cleanup changes I want to get into Spark 3, and although I won't change this particular issue wholesale, I do want to get in some code cleanup before a new major version.

attilapiros · 2019-04-01T18:40:11Z

There are a few more "((" which could be transformed to " { (":

(pr/24241) $ git diff 06abd06112 | grep "(([^)]" | grep "=>$"
+    when(resp.encodeRedirectURL(any())).thenAnswer((invocationOnMock: InvocationOnMock) =>
+    when(allocationClientMock.killExecutorsOnHost("hostA")).thenAnswer((_: InvocationOnMock) =>
+    when(allocationClientMock.killExecutorsOnHost("hostA")).thenAnswer((_: InvocationOnMock) =>
+    when(diskBlockManager.getFile(any[BlockId])).thenAnswer((invocation: InvocationOnMock) =>
+      Mockito.doAnswer((invocationOnMock: InvocationOnMock) =>
+    when(checkpointerMock.checkpoint(anyString)).thenAnswer((_: InvocationOnMock) =>

But I am fine with keeping them as they are.

SparkQA · 2019-04-01T18:49:16Z

Test build #4676 has finished for PR 24241 at commit a94508f.

This patch fails PySpark unit tests.
This patch merges cleanly.
This patch adds no public classes.

attilapiros · 2019-04-01T19:16:08Z

core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala

-      // Use the reverse order because PriorityQueue dequeues the max
-      override def compare(x: Iter, y: Iter): Int = comparator.compare(y.head._1, x.head._1)
-    })
+    val heap = new mutable.PriorityQueue[Iter]()(


Now I am checking for missing comments with grep. It is semi-automatic, so I am just sharing the places I have found. Here we lost:

// Use the reverse order because PriorityQueue dequeues the max

Sure, I can restore that, it's minor.
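For context on why that comment is worth keeping, here is a minimal sketch of a SAM-style Ordering passed to a PriorityQueue with the arguments swapped. It uses plain Ints rather than the sorter's buffered iterators, and `smallestFirst` is a hypothetical name.

```scala
import scala.collection.mutable

object HeapExample {
  def smallestFirst(values: Seq[Int]): List[Int] = {
    // Use the reverse order because PriorityQueue dequeues the max
    val heap = new mutable.PriorityQueue[Int]()(
      (x: Int, y: Int) => Integer.compare(y, x))
    heap.enqueue(values: _*)
    // Dequeue every element; the reversed ordering makes the smallest come out first.
    List.fill(values.size)(heap.dequeue())
  }
}
```

Without the comment, the swapped `y, x` arguments are easy to mistake for a bug.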

attilapiros · 2019-04-01T19:18:31Z

sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala

+      val dataSource =
+        DataSource(
+          sparkSession,
+          // In older version(prior to 2.1) of Spark, the table schema can be empty and should be


No comment is missing but should we still support this?

Fair question. I won't change it here, as maybe there's still a use case for reading stuff written by Spark 2.1 in Spark 2.4, for example.

attilapiros

Some questions left but after that LGTM.

SparkQA · 2019-04-01T22:55:35Z

Test build #104168 has finished for PR 24241 at commit 64e01d9.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2019-04-01T23:04:33Z

Test build #104169 has finished for PR 24241 at commit 028e1b0.

This patch fails PySpark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2019-04-01T23:56:17Z

Test build #104171 has finished for PR 24241 at commit 827c320.

This patch fails PySpark unit tests.
This patch merges cleanly.
This patch adds no public classes.

dongjoon-hyun · 2019-04-02T02:38:29Z

Hi, @srowen . This PR cannot pass the Jenkins due to a bug introduced by another commit, #23797.

I made a PR to fix the master branch. Please see #24268 .

SparkQA · 2019-04-02T05:50:53Z

Test build #4677 has finished for PR 24241 at commit 827c320.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

dongjoon-hyun · 2019-04-02T05:58:13Z

#24268 is merged now.

dongjoon-hyun · 2019-04-02T05:58:19Z

Retest this please.

SparkQA · 2019-04-02T07:05:02Z

Test build #104185 has finished for PR 24241 at commit 827c320.

This patch fails due to an unknown error code, -9.
This patch merges cleanly.
This patch adds no public classes.

dongjoon-hyun · 2019-04-02T07:18:19Z

Retest this please.

dongjoon-hyun · 2019-04-02T08:12:10Z

Retest this please. (There was a revert in master branch due to UT failure.)

SparkQA · 2019-04-02T11:44:10Z

Test build #104193 has finished for PR 24241 at commit 827c320.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

attilapiros · 2019-04-02T12:34:37Z

This must be either a flaky test or something with the auto-merge with master as I have checked this PR out locally and run:

> testOnly *.StreamingAggregationSuite

in sbt, and all tests passed.

attilapiros · 2019-04-02T12:34:48Z

Retest this please.

SparkQA · 2019-04-02T13:12:54Z

Test build #104197 has finished for PR 24241 at commit 827c320.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2019-04-02T13:25:05Z

Test build #104198 has finished for PR 24241 at commit 827c320.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

dongjoon-hyun · 2019-04-02T14:33:40Z

@attilapiros . I reverted the offending commit from the master. :) It was a persistent UT failure across all PRs and master branch.

dongjoon-hyun

+1, LGTM. Merged to master.
Thank you, @srowen and @attilapiros !

SparkQA · 2019-04-02T17:53:13Z

Test build #104212 has finished for PR 24241 at commit 827c320.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

Use Single Abstract Method syntax where possible (and minor related cleanup)

6a95688

srowen self-assigned this Mar 29, 2019

Restore private method

49fabf2

dongjoon-hyun reviewed Mar 30, 2019

Restore comment

205ca98

dongjoon-hyun reviewed Mar 31, 2019

Catch a few more cases

a94508f

attilapiros reviewed Apr 1, 2019

Review comments

64e01d9

attilapiros reviewed Apr 1, 2019

Review comments

028e1b0

attilapiros reviewed Apr 1, 2019

One more comment fix

827c320

apache deleted a comment from SparkQA Apr 2, 2019

dongjoon-hyun approved these changes Apr 2, 2019

dongjoon-hyun closed this in d4420b4 Apr 2, 2019

srowen deleted the SPARK-27323 branch April 7, 2019 17:30

squito mentioned this pull request Apr 16, 2019

[SPARK-26329][CORE] Faster polling of executor memory metrics. #23767

Closed
