Skip to content

Conversation

@srowen
Copy link
Member

@srowen srowen commented Mar 29, 2019

What changes were proposed in this pull request?

Use Single Abstract Method syntax where possible (and minor related cleanup). Comments below. No logic should change here.

How was this patch tested?

Existing tests.

@srowen srowen self-assigned this Mar 29, 2019
override def apply(key: ContextBarrierId): ContextBarrierState =
new ContextBarrierState(key, numTasks)
})
states.computeIfAbsent(barrierId,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

90% of the changes are like this, converting an anonymous inner class to a SAM expression.

/** Get an optional value, applying variable substitution. */
private[spark] def getWithSubstitution(key: String): Option[String] = {
getOption(key).map(reader.substitute(_))
getOption(key).map(reader.substitute)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the files I did change, I cleaned up a few other things in nearby code, like this. Other examples are using .nonEmpty and removing redundant braces, etc

@SparkQA
Copy link

SparkQA commented Mar 29, 2019

Test build #104088 has finished for PR 24241 at commit 6a95688.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Mar 30, 2019

Test build #104100 has finished for PR 24241 at commit 49fabf2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All transformations look nice. Thanks! If we add the following two cases, it will be complete. Could you add these together in this PR, @srowen ?

  • KinesisCheckpointerSuite.scala
-    when(checkpointerMock.checkpoint(anyString)).thenAnswer(new Answer[Unit] {
-      override def answer(invocations: InvocationOnMock): Unit = {
-        clock.waitTillTime(clock.getTimeMillis() + checkpointInterval.milliseconds / 2)
-      }
-    })
+    when(checkpointerMock.checkpoint(anyString)).thenAnswer((_: InvocationOnMock) =>
+      clock.waitTillTime(clock.getTimeMillis() + checkpointInterval.milliseconds / 2))
  • SparkSQLCLIDriver.scala
-    HiveInterruptUtils.add(new HiveInterruptCallback {
-      override def interrupt() {
-        // Handle remote execution mode
-        if (SparkSQLEnv.sparkContext != null) {
-          SparkSQLEnv.sparkContext.cancelAllJobs()
-        } else {
-          if (transport != null) {
-            // Force closing of TCP connection upon session termination
-            transport.getSocket.close()
-          }
+    HiveInterruptUtils.add(() => {
+      // Handle remote execution mode
+      if (SparkSQLEnv.sparkContext != null) {
+        SparkSQLEnv.sparkContext.cancelAllJobs()
+      } else {
+        if (transport != null) {
+          // Force closing of TCP connection upon session termination
+          transport.getSocket.close()

@SparkQA
Copy link

SparkQA commented Mar 31, 2019

Test build #104120 has finished for PR 24241 at commit 205ca98.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Mar 31, 2019

Test build #104140 has finished for PR 24241 at commit a94508f.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Mar 31, 2019

Test build #4673 has finished for PR 24241 at commit a94508f.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

Looks the test failures are not persistent.

@HyukjinKwon
Copy link
Member

retest this please

@SparkQA
Copy link

SparkQA commented Apr 1, 2019

Test build #104151 has finished for PR 24241 at commit a94508f.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

// multiple distinct keys might be treated as equal by the ordering. To deal with this, we
// need to read all keys considered equal by the ordering at once and compare them.
new Iterator[Iterator[Product2[K, C]]] {
val it = new Iterator[Iterator[Product2[K, C]]] {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit (more like a question): why a new val is introduced here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to convert the flatMap(i => i) below to simply flatten. For reasons even I'm not clear about, I had to introduce an intermediate val here then return it.flatten to get it to work. Seems harmless enough as a change but I didn't get the difference in type inference

runner.runCommandWithRetry(processBuilder, p => (), supervise = superviseRetry)
}
}).when(runner).prepareAndRunDriver()
doAnswer((_: InvocationOnMock) =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: In this case would not we prefer to have {, like:

doAnswer { (_: InvocationOnMock) =>
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can change it. I don't have a strong preference. I default to not using blocks where not necessary, but for non-trivial blocks it's probably clearer to use them anyway

throw new IllegalStateException("hostA should be on the blacklist")
}
when(allocationClientMock.killExecutorsOnHost("hostA")).thenAnswer((_: InvocationOnMock) =>
if (blacklist.nodeBlacklist.contains("hostA")) {
Copy link
Contributor

@attilapiros attilapiros Apr 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the comment lost on purpose?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, no probably lost by automated refactoring

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am kinda glad. Because I started to question whether I would like to read this all... Now I see it has value so, I continue it :)

throw new IllegalStateException("hostA should be on the blacklist")
}
when(allocationClientMock.killExecutorsOnHost("hostA")).thenAnswer((_: InvocationOnMock) =>
if (blacklist.nodeBlacklist.contains("hostA")) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I am quite sure it is deleted on purpose.

def numReceivers(): Int = {
receiverInputStreams.size
}
def numReceivers(): Int = receiverInputStreams.length
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fine (I just would like to help for the next reviewer), so the reason of this change:

Replace .size with .length on arrays and strings

Inspection info: This inspection reports array.size and string.size calls. While such calls are legitimate, they require an additional implicit conversion to SeqLike to be made. A common use case would be calling length on arrays and strings may provide significant advantages.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, we ought to make this kind of change (and use lengthCompare on a similar note where sizes are compared) wherever perf matters. Probably a good habit in general, but, wouldn't change it unless the code is already changing. Here I just took the liberty of adjusting this one, though there's no big reason for it here.

More generally I have a number of large code cleanup changes I want to get into Spark 3, and although I won't change this particular issue wholesale, I do want to get in some code cleanup before a new major version.

@attilapiros
Copy link
Contributor

There are a few more "((" which could be transformed to " { (":

(pr/24241) $ git diff 06abd06112 | grep "(([^)]" | grep "=>$"
+    when(resp.encodeRedirectURL(any())).thenAnswer((invocationOnMock: InvocationOnMock) =>
+    when(allocationClientMock.killExecutorsOnHost("hostA")).thenAnswer((_: InvocationOnMock) =>
+    when(allocationClientMock.killExecutorsOnHost("hostA")).thenAnswer((_: InvocationOnMock) =>
+    when(diskBlockManager.getFile(any[BlockId])).thenAnswer((invocation: InvocationOnMock) =>
+      Mockito.doAnswer((invocationOnMock: InvocationOnMock) =>
+    when(checkpointerMock.checkpoint(anyString)).thenAnswer((_: InvocationOnMock) =>

But I am fine to keep them as it is.

@SparkQA
Copy link

SparkQA commented Apr 1, 2019

Test build #4676 has finished for PR 24241 at commit a94508f.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

// Use the reverse order because PriorityQueue dequeues the max
override def compare(x: Iter, y: Iter): Int = comparator.compare(y.head._1, x.head._1)
})
val heap = new mutable.PriorityQueue[Iter]()(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I am checking for missing comments by grep. It is semi-automatic so I am just sharing you the places I have found, here we lost:

 // Use the reverse order because PriorityQueue dequeues the max

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I can restore that, it's minor

val dataSource =
DataSource(
sparkSession,
// In older version(prior to 2.1) of Spark, the table schema can be empty and should be
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No comment is missing but should we still support this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair question. I won't change it here, as maybe there's still a use case for reading stuff written by Spark 2.1 in Spark 2.4, for example.

Copy link
Contributor

@attilapiros attilapiros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some questions left but after that LGTM.

@SparkQA
Copy link

SparkQA commented Apr 1, 2019

Test build #104168 has finished for PR 24241 at commit 64e01d9.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 1, 2019

Test build #104169 has finished for PR 24241 at commit 028e1b0.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 1, 2019

Test build #104171 has finished for PR 24241 at commit 827c320.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member

Hi, @srowen . This PR cannot pass the Jenkins due to a bug introduced by another commit, #23797.

I made a PR to fix the master branch. Please see #24268 .

@SparkQA
Copy link

SparkQA commented Apr 2, 2019

Test build #4677 has finished for PR 24241 at commit 827c320.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member

#24268 is merged now.

@dongjoon-hyun
Copy link
Member

Retest this please.

@SparkQA
Copy link

SparkQA commented Apr 2, 2019

Test build #104185 has finished for PR 24241 at commit 827c320.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member

Retest this please.

@dongjoon-hyun
Copy link
Member

dongjoon-hyun commented Apr 2, 2019

Retest this please. (There was a revert in master branch due to UT failure.)

@apache apache deleted a comment from SparkQA Apr 2, 2019
@SparkQA
Copy link

SparkQA commented Apr 2, 2019

Test build #104193 has finished for PR 24241 at commit 827c320.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@attilapiros
Copy link
Contributor

attilapiros commented Apr 2, 2019

This must be either a flaky test or something with the auto-merge with master as I have checked this PR out locally and run:

> testOnly *.StreamingAggregationSuite

In sbt and All tests passed.

@attilapiros
Copy link
Contributor

Retest this please.

@SparkQA
Copy link

SparkQA commented Apr 2, 2019

Test build #104197 has finished for PR 24241 at commit 827c320.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 2, 2019

Test build #104198 has finished for PR 24241 at commit 827c320.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member

dongjoon-hyun commented Apr 2, 2019

@attilapiros . I reverted the offending commit from the master. :) It was a persistent UT failure across all PRs and master branch.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM. Merged to master.
Thank you, @srowen and @attilapiros !

@SparkQA
Copy link

SparkQA commented Apr 2, 2019

Test build #104212 has finished for PR 24241 at commit 827c320.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants