@gatorsmile
Member

What changes were proposed in this pull request?

This patch fixes several flaky tests.

How was this patch tested?

N/A

@gatorsmile changed the title from "[SPARK-27460][SPARK-27460] Fix flaky tests" to "[SPARK-27460][FOLLOW-UP][TESTS] Fix flaky tests" on Apr 22, 2019
@gatorsmile
Member Author

cc @gengliangwang

@SparkQA

SparkQA commented Apr 22, 2019

Test build #104794 has finished for PR 24434 at commit 6cbf7e7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gengliangwang
Member

LGTM. I did see some of these flaky tests in #24373. Increasing the timeout makes the parallel tests more robust.
I will trigger a round of concurrent Jenkins jobs to see whether any flaky tests still exist.
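
For context, the polling behaviour that a longer timeout relaxes can be sketched roughly as follows. This is a minimal illustration and not ScalaTest's `eventually` implementation; `RetryUntil` and `retryUntil` are hypothetical names, and the only library assumed is `scala.concurrent.duration` from the Scala standard library.

```scala
import scala.concurrent.duration._
import scala.util.control.NonFatal

object RetryUntil {
  // Hypothetical helper (not ScalaTest's `eventually`): re-runs `check` every
  // `interval` until it stops throwing; once `timeout` has elapsed, the next
  // failure is propagated to the caller instead of being retried.
  def retryUntil[T](timeout: FiniteDuration, interval: FiniteDuration)(check: => T): T = {
    val deadline = timeout.fromNow
    while (true) {
      try {
        return check
      } catch {
        // Retry only while there is time left; otherwise the guard fails
        // and the exception escapes the loop.
        case NonFatal(_) if deadline.hasTimeLeft() =>
          Thread.sleep(interval.toMillis)
      }
    }
    throw new IllegalStateException("unreachable")
  }

  def main(args: Array[String]): Unit = {
    var attempts = 0
    // The condition only holds from the third attempt onwards.
    val result = retryUntil(timeout = 2.seconds, interval = 10.milliseconds) {
      attempts += 1
      assert(attempts >= 3, s"not ready after $attempts attempts")
      attempts
    }
    assert(result == 3)
  }
}
```

Under this model a larger timeout does not slow down a passing test, because the loop exits on the first success; it only gives a slow machine more attempts before the test is reported as failed.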

@SparkQA

SparkQA commented Apr 22, 2019

Test build #4760 has finished for PR 24434 at commit 6cbf7e7.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 22, 2019

Test build #4757 has finished for PR 24434 at commit 6cbf7e7.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 22, 2019

Test build #4759 has finished for PR 24434 at commit 6cbf7e7.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 22, 2019

Test build #4758 has finished for PR 24434 at commit 6cbf7e7.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 22, 2019

Test build #4761 has finished for PR 24434 at commit 6cbf7e7.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member

retest this please

@SparkQA

SparkQA commented Apr 22, 2019

Test build #104798 has finished for PR 24434 at commit 6cbf7e7.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 22, 2019

Test build #4762 has finished for PR 24434 at commit 6cbf7e7.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


  try {
-   eventually(timeout(30.seconds), interval(100.milliseconds)) {
+   eventually(timeout(180.seconds), interval(100.milliseconds)) {
Member

You can write "3.minutes" in cases like this and others below, but it doesn't matter.
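
The equivalence behind this suggestion can be checked with a small sketch. It assumes the duration syntax comes from `scala.concurrent.duration` (ScalaTest's `SpanSugar` offers similar syntax, which is not shown here); `TimeoutEquivalence` and its value names are hypothetical.

```scala
import scala.concurrent.duration._

object TimeoutEquivalence {
  // Hypothetical names, used only to illustrate the review comment.
  val oldTimeout: FiniteDuration = 30.seconds
  val newTimeout: FiniteDuration = 180.seconds
  val sameAsNew: FiniteDuration = 3.minutes

  def main(args: Array[String]): Unit = {
    // FiniteDuration equality compares the length of time, not the unit.
    assert(newTimeout == sameAsNew)
    assert(newTimeout.toMinutes == 3L)
    // The new timeout is six times the old one.
    assert(newTimeout == oldTimeout * 6)
  }
}
```

Either spelling yields the same timeout, so the choice is purely about readability.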

@SparkQA

SparkQA commented Apr 22, 2019

Test build #4763 has finished for PR 24434 at commit 6cbf7e7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 22, 2019

Test build #4764 has finished for PR 24434 at commit 6cbf7e7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 22, 2019

Test build #4765 has finished for PR 24434 at commit 6cbf7e7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 22, 2019

Test build #104810 has finished for PR 24434 at commit 3224966.

  • This patch fails to generate documentation.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 22, 2019

Test build #4767 has finished for PR 24434 at commit 3224966.

  • This patch fails to generate documentation.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Apr 22, 2019

That's a weird error... it's from javadoc:

[error] /home/jenkins/workspace/NewSparkPullRequestBuilder/core/target/java/org/apache/spark/serializer/SerializationDebugger.java:159: error: cannot find symbol
[error]   static private  org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassReflection reflect ()  { throw new RuntimeException(); }
[error]                                                                    ^
[error]   symbol:   class ObjectStreamClassReflection
[error]   location: class SerializationDebugger
[error] /home/jenkins/workspace/NewSparkPullRequestBuilder/core/target/java/org/apache/spark/serializer/SerializationDebugger.java:22: error: class SerializationDebugger is already defined in package org.apache.spark.serializer
[error]   static private  class SerializationDebugger {
[error]                   ^
[error] /home/jenkins/workspace/NewSparkPullRequestBuilder/external/kafka-0-10-token-provider/target/java/org/apache/spark/kafka010/KafkaDelegationTokenTest.java:6: error: illegal combination of modifiers: public and private
[error]   private  class KafkaJaasConfiguration extends javax.security.auth.login.Configuration {
[error]            ^
...

It's from javadoc'ing the Java code auto-generated from Scala, and this kind of thing is kind of a known problem; I thought we didn't run javadoc on this part. I have no idea so far why it came up in running these tests!

@cloud-fan closed this Apr 23, 2019
@cloud-fan reopened this Apr 23, 2019
@cloud-fan
Contributor

ok to test

@HyukjinKwon
Member

If it still fails, let me fix it against this branch since I am kind of used to it.

@SparkQA

SparkQA commented Apr 23, 2019

Test build #104819 has finished for PR 24434 at commit 3224966.

  • This patch fails to generate documentation.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member

Will take a look late today (KST)

@SparkQA

SparkQA commented Apr 23, 2019

Test build #104820 has finished for PR 24434 at commit 3224966.

  • This patch fails to generate documentation.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Apr 23, 2019

Note that these errors also appear in 'normal' builds; they're just warnings. I still have no idea why they are rendered as errors due to this change. It doesn't touch the build. I could try separately updating sbt-unidoc, but don't think that's the issue. Still looking...

@HyukjinKwon
Member

HyukjinKwon commented Apr 23, 2019

Yes, I think we talked about this a few times long ago, @srowen. I did some analysis in gatorsmile#5.

@HyukjinKwon
Member

retest this please

@SparkQA

SparkQA commented Apr 24, 2019

Test build #104853 has finished for PR 24434 at commit 3224966.

  • This patch fails to generate documentation.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 24, 2019

Test build #104857 has finished for PR 24434 at commit e5a5991.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor

retest this please

@SparkQA

SparkQA commented Apr 24, 2019

Test build #104860 has finished for PR 24434 at commit e5a5991.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor

thanks, merging to master!

@cloud-fan closed this in cd4a284 Apr 24, 2019
dongjoon-hyun pushed a commit that referenced this pull request Sep 20, 2019
…rked JVMs for higher parallelism

## What changes were proposed in this pull request?

This is a backport of #24373, #24404 and #24434.

This patch modifies SparkBuild so that the largest / slowest test suites (or collections of suites) can run in their own forked JVMs, allowing them to run in parallel with each other. This opt-in / whitelisting approach lets us increase parallelism without having to fix a long tail of flakiness / brittleness issues in tests that aren't performance bottlenecks.

See the comments in SparkBuild.scala for details, including a summary of why we sometimes opt to run entire groups of tests in a single forked JVM.
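
The opt-in idea can be sketched in plain Scala as a mapping from suite name to group. This is only an illustration under stated assumptions: the suite names, the `default_test_group` label, and the `TestGroupingSketch` object are hypothetical, and the actual sbt wiring that forks one JVM per group lives in SparkBuild.scala and is not reproduced here.

```scala
object TestGroupingSketch {
  // Hypothetical whitelist of slow suites that should each get their own group
  // (and therefore, in the real build, their own forked JVM).
  val ownJvmSuites: Set[String] = Set("example.SlowSuiteA", "example.SlowSuiteB")

  // Hypothetical label for everything that has not opted in.
  val defaultGroup = "default_test_group"

  // Whitelisted suites map to a group named after themselves;
  // all other suites share the default group.
  def groupFor(suite: String): String =
    if (ownJvmSuites.contains(suite)) suite else defaultGroup

  // Partition a list of suites into named groups.
  def grouped(suites: Seq[String]): Map[String, Seq[String]] =
    suites.groupBy(groupFor)

  def main(args: Array[String]): Unit = {
    val groups = grouped(Seq("example.SlowSuiteA", "example.FastSuite1", "example.FastSuite2"))
    assert(groups("example.SlowSuiteA") == Seq("example.SlowSuiteA"))
    assert(groups(defaultGroup) == Seq("example.FastSuite1", "example.FastSuite2"))
  }
}
```

Because only whitelisted suites leave the default group, suites that have not been checked for parallel safety keep running together as before.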

The duration of a full pull request test run in Jenkins is reduced by around 53%:
before changes: 4hr 40min
after changes: 2hr 13min

## How was this patch tested?

Unit test

Closes #25861 from dongjoon-hyun/SPARK-27460.

Lead-authored-by: Gengliang Wang <[email protected]>
Co-authored-by: gatorsmile <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>