Skip to content

Conversation

@cloud-fan
Copy link
Contributor

What changes were proposed in this pull request?

This PR fixes 3 slow tests:

  1. ParquetQuerySuite.read/write wide table: This is not a good unit test as it runs more than 5 minutes. This PR removes it and add a new regression test in CodeGenerationSuite, which is more "unit".
  2. ParquetQuerySuite.returning batch for wide table: reduce the threshold and use smaller data size.
  3. DatasetSuite.SPARK-14554: Dataset.map may generate wrong java code for wide table: Improve CodeFormatter.format(introduced at [SPARK-15165][SPARK-15205][SQL] Introduce place holder for comments in generated code #12979) can dramatically speed this it up.

How was this patch tested?

N/A

@cloud-fan
Copy link
Contributor Author

cc @davies @andrewor14 @yhuai

val trimmed = line.trim
formatter.addLine(code.comment.getOrElse(trimmed, trimmed))
}
formatter.result()
Copy link
Contributor Author

@cloud-fan cloud-fan May 24, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @sarutak , here I assume the placeholder will always take an entire line, is it corrected?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How slow is it if we use an regexp to match the placeholder here?

@SparkQA
Copy link

SparkQA commented May 24, 2016

Test build #59178 has finished for PR 13273 at commit 216fc5c.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor Author

retest this please

@SparkQA
Copy link

SparkQA commented May 24, 2016

Test build #59179 has finished for PR 13273 at commit 216fc5c.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yhuai
Copy link
Contributor

yhuai commented May 24, 2016

test this please

@SparkQA
Copy link

SparkQA commented May 24, 2016

Test build #59205 has finished for PR 13273 at commit 216fc5c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Copy link
Member

Thank you for fixing it!

@davies
Copy link
Contributor

davies commented May 24, 2016

LGTM

val line = l.trim()
val skip = lastLine.startsWith("/*") && lastLine.endsWith("*/") &&
line.startsWith("/*") && line.endsWith("*/") &&
map(lastLine).substring(3).contains(map(line).substring(3))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous one doesn't handle the case that the line is not in the comments map.

@SparkQA
Copy link

SparkQA commented May 24, 2016

Test build #59210 has finished for PR 13273 at commit 637bf4a.

  • This patch fails Spark unit tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented May 24, 2016

Test build #59219 has finished for PR 13273 at commit a004ed3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor Author

retest this please

@SparkQA
Copy link

SparkQA commented May 24, 2016

Test build #59220 has finished for PR 13273 at commit a004ed3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

m => code.comment.getOrElse(m.group(1), m.group(0)))
formatter.addLine(comment)
}
formatter.result()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, SGTM!

@SparkQA
Copy link

SparkQA commented May 24, 2016

Test build #59230 has finished for PR 13273 at commit 9e844b7.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor Author

retest this please

@SparkQA
Copy link

SparkQA commented May 25, 2016

Test build #59231 has finished for PR 13273 at commit 9e844b7.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented May 25, 2016

Test build #59242 has finished for PR 13273 at commit eff5e25.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@liancheng
Copy link
Contributor

Merging to master and branch-2.0.

asfgit pushed a commit that referenced this pull request May 25, 2016
## What changes were proposed in this pull request?

This PR fixes 3 slow tests:

1. `ParquetQuerySuite.read/write wide table`: This is not a good unit test as it runs more than 5 minutes. This PR removes it and add a new regression test in `CodeGenerationSuite`, which is more "unit".
2. `ParquetQuerySuite.returning batch for wide table`: reduce the threshold and use smaller data size.
3. `DatasetSuite.SPARK-14554: Dataset.map may generate wrong java code for wide table`: Improve `CodeFormatter.format`(introduced at #12979) can dramatically speed this it up.

## How was this patch tested?

N/A

Author: Wenchen Fan <[email protected]>

Closes #13273 from cloud-fan/test.

(cherry picked from commit 50b660d)
Signed-off-by: Cheng Lian <[email protected]>
@asfgit asfgit closed this in 50b660d May 25, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants