@jerqi jerqi commented Jul 24, 2021

What changes were proposed in this pull request?

The current GitHub Actions build runs TPCDSQueryTestSuite for the TPC-DS benchmark, but only under the default configuration. Now that the spark.sql.join.forceApplyShuffledHashJoin config has been added, we can test all 3 join strategies in TPC-DS to improve coverage.
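As a rough illustration of what "covering 3 join strategies" could look like, here is a minimal sketch of one config map per strategy. It is not the code from this PR: the object and value names are hypothetical, the choice of sort-merge, broadcast hash, and shuffled hash join as the three strategies is an assumption, and so are the concrete values. Only `spark.sql.join.forceApplyShuffledHashJoin` is named in the description above; the other two keys are standard Spark SQL settings.

```scala
// Sketch under stated assumptions; names and values are illustrative,
// not taken from the PR diff.
object JoinConfSketch {
  // Disable broadcast joins and prefer sort-merge join.
  val sortMergeJoinConf: Map[String, String] = Map(
    "spark.sql.autoBroadcastJoinThreshold" -> "-1",
    "spark.sql.join.preferSortMergeJoin" -> "true")

  // Allow broadcasting tables up to 10 MB (10485760 bytes).
  val broadcastHashJoinConf: Map[String, String] = Map(
    "spark.sql.autoBroadcastJoinThreshold" -> "10485760")

  // Disable broadcast joins and force shuffled hash join
  // (the config named in the PR description).
  val shuffledHashJoinConf: Map[String, String] = Map(
    "spark.sql.autoBroadcastJoinThreshold" -> "-1",
    "spark.sql.join.forceApplyShuffledHashJoin" -> "true")

  // One entry per join strategy the suite would iterate over.
  val allJoinConfs: Seq[Map[String, String]] =
    Seq(sortMergeJoinConf, broadcastHashJoinConf, shuffledHashJoinConf)
}
```

A test suite could then loop over `JoinConfSketch.allJoinConfs`, apply each map to the session, and run every TPC-DS query once per map.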

Why are the changes needed?

To improve the coverage of join strategies in the TPC-DS tests.

Does this PR introduce any user-facing change?

No, only for testing.

How was this patch tested?

No additional tests are needed; this PR only changes tests.

@github-actions github-actions bot added the SQL label Jul 24, 2021
@jerqi jerqi changed the title from "[SPARK-36223][SQL][TEST] Cover 3 kinds of join in the TPCDSQueryTestSuite" to "[WIP][SPARK-36223][SQL][TEST] Cover 3 kinds of join in the TPCDSQueryTestSuite" Jul 25, 2021
@jerqi jerqi force-pushed the SPARK-36223 branch 7 times, most recently from a603ffc to 46bbf1f July 25, 2021 16:11
@jerqi jerqi changed the title from "[WIP][SPARK-36223][SQL][TEST] Cover 3 kinds of join in the TPCDSQueryTestSuite" to "[SPARK-36223][SQL][TEST] Cover 3 kinds of join in the TPCDSQueryTestSuite" Jul 25, 2021
@jerqi jerqi force-pushed the SPARK-36223 branch 4 times, most recently from 42bb832 to 763c492 August 6, 2021 13:58
@cloud-fan
Contributor

ok to test

SparkQA commented Nov 10, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49535/

SparkQA commented Nov 10, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49535/

SparkQA commented Nov 10, 2021

Test build #145066 has finished for PR 33510 at commit 8e2f4fb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Nov 11, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49589/

SparkQA commented Nov 11, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49590/

SparkQA commented Nov 11, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49589/

SparkQA commented Nov 11, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49590/

SparkQA commented Nov 12, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49640/

SparkQA commented Nov 12, 2021

Test build #145167 has finished for PR 33510 at commit cb0d3bc.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Nov 12, 2021

Test build #145170 has finished for PR 33510 at commit cf38cab.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Nov 15, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49692/

SparkQA commented Nov 15, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49692/

SparkQA commented Nov 15, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49693/

@cloud-fan
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in 7070eb5 Nov 15, 2021

SparkQA commented Nov 15, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49693/

@jerqi
Contributor Author

jerqi commented Nov 15, 2021

Thank you @cloud-fan, @linhongliu-db, and @HyukjinKwon for the review.

@jerqi jerqi deleted the SPARK-36223 branch November 15, 2021 08:19

SparkQA commented Nov 15, 2021

Test build #145222 has finished for PR 33510 at commit c4ff4ab.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Nov 15, 2021

Test build #145223 has finished for PR 33510 at commit 510fa61.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

assertResult(expectedSchema, s"Schema did not match\n$queryString") { schema }
assertResult(expectedOutput, s"Result did not match\n$queryString") { outputString }
}
val joinConfSet: Set[Map[String, String]] =
Member

@HyukjinKwon HyukjinKwon Nov 24, 2021

Hm, why is this a Set? Then joinConfSet.head won't be deterministic below, and there would be no point in needSort.

Member

Let me fix it together at #34698

Contributor Author

Yes, a Seq may be better.
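The concern in this thread is that `Set` makes no ordering promise in its contract, so code that treats `head` as "the first config listed" is fragile. A minimal sketch of the `Seq` alternative follows. The object name, the config labels, and the idea of flagging the first entry as a baseline are all hypothetical, chosen only to illustrate the ordering point; they are not the suite's actual logic.

```scala
// Sketch only: hypothetical names, not the code in TPCDSQueryTestSuite.
object SeqVsSetSketch {
  // Illustrative labels standing in for the real config maps.
  val confNames: Seq[String] = Seq("sortMerge", "broadcastHash", "shuffledHash")

  // Seq is ordered, so head is always the first element as written.
  val baseline: String = confNames.head

  // Pair each config with a flag saying whether it is the baseline entry.
  val withBaselineFlag: Seq[(String, Boolean)] =
    confNames.map(name => (name, name == baseline))
}
```

With a `Seq`, which entry counts as the baseline is fixed by the source order, so any per-config decision derived from `head` stays stable across runs and Scala versions.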

HyukjinKwon added a commit that referenced this pull request Nov 25, 2021
### What changes were proposed in this pull request?

This is a follow-up to #33510 and #34641. This PR proposes to split the TPC-DS build in GitHub Actions.

### Why are the changes needed?

Running these queries easily causes out-of-memory errors on GitHub Actions machines and makes the build flaky. We should deflake it.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

GitHub Actions in this PR should test it out.

Closes #34698 from HyukjinKwon/split-tpcds.

Authored-by: Hyukjin Kwon
Signed-off-by: Hyukjin Kwon