[SPARK-15165][SPARK-15205][SQL] Introduce place holder for comments in generated code #12979
Test build #58079 has finished for PR 12979 at commit
Test build #58394 has finished for PR 12979 at commit
@sarutak It's expected to compile twice for two different queries; it's not worth optimizing this corner case (ideally it should generate different source code even without comments).
Test build #58746 has finished for PR 12979 at commit
// as a function before. In that case, we just re-use it.
val code = s"/* ${toCommentSafeString(this.toString)} */"
ExprCode(code, subExprState.isNull, subExprState.value)
val placeHolder = s"/*{${ctx.freshName("comment_placeholder")}}*/"
Why add the extra {} here? We could also make comment_placeholder shorter.
Actually, the } is really needed.
Suppose we have placeholders like comment_placeholder1 and comment_placeholder12, and the actual comment Hello corresponds to comment_placeholder1. Without a closing delimiter, replacing comment_placeholder1 with Hello would also rewrite comment_placeholder12 into Hello2.
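The collision described in this reply can be reproduced with plain string replacement. The following is a minimal sketch, not Spark's actual code: the `PlaceholderCollision` object and its `substitute` helper are hypothetical names introduced only for illustration.

```scala
object PlaceholderCollision {
  // Hypothetical helper: replace every key in `comments` with its comment text.
  def substitute(code: String, comments: Map[String, String]): String =
    comments.foldLeft(code) { case (acc, (key, text)) => acc.replace(key, text) }

  // Bare names: comment_placeholder1 is a prefix of comment_placeholder12.
  val bare = "/*comment_placeholder1*/ a(); /*comment_placeholder12*/ b();"

  // Names wrapped in braces, in the style of the placeHolder line in the diff.
  val braced = "/*{comment_placeholder1}*/ a(); /*{comment_placeholder12}*/ b();"

  // The bare key also matches inside the longer name, leaving a stray "2".
  val bareResult: String = substitute(bare, Map("comment_placeholder1" -> "Hello"))

  // The closing brace ends the key, so the longer name is left untouched.
  val bracedResult: String = substitute(braced, Map("{comment_placeholder1}" -> "Hello"))
}
```

In this sketch `bareResult` contains `/*Hello2*/` in place of the second placeholder, while `bracedResult` leaves `/*{comment_placeholder12}*/` intact.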
These placeholders are good for safety. Could you update the PR title and description?
Test build #58866 has finished for PR 12979 at commit
Test build #58944 has finished for PR 12979 at commit
retest this please.
Test build #58945 has finished for PR 12979 at commit
Test build #58946 has finished for PR 12979 at commit
4029d44 to
7fd047c
Compare
Test build #59003 has finished for PR 12979 at commit
Test build #59008 has finished for PR 12979 at commit
LGTM, |
… in generated code

## What changes were proposed in this pull request?

This PR introduces placeholders for comments in generated code. The purpose is the same as #12939, but this approach is much safer.
The generated code that gets compiled doesn't include the actual comments; it includes placeholders instead.
The placeholders in the generated code are replaced with the actual comments only at logging time.
This PR also resolves SPARK-15205.

## How was this patch tested?

Existing tests.

Author: Kousuke Saruta <[email protected]>

Closes #12979 from sarutak/SPARK-15205.

(cherry picked from commit 22947cd)
Signed-off-by: Davies Liu <[email protected]>
@sarutak Could you create another PR for 1.6? (If we have not fixed the security bug in 1.6.)
O.K, I'll do it. Need another PR for |
Maybe 1.5 and 1.6 could share the same PR (if there are not many conflicts).
Test build #59001 has finished for PR 12979 at commit
Test build #59004 has finished for PR 12979 at commit
## What changes were proposed in this pull request?

This PR fixes 3 slow tests:

1. `ParquetQuerySuite.read/write wide table`: This is not a good unit test, as it runs for more than 5 minutes. This PR removes it and adds a new regression test in `CodeGenerationSuite`, which is more "unit".
2. `ParquetQuerySuite.returning batch for wide table`: reduce the threshold and use a smaller data size.
3. `DatasetSuite.SPARK-14554: Dataset.map may generate wrong java code for wide table`: improving `CodeFormatter.format` (introduced in #12979) can dramatically speed it up.

## How was this patch tested?

N/A

Author: Wenchen Fan <[email protected]>

Closes #13273 from cloud-fan/test.

(cherry picked from commit 50b660d)
Signed-off-by: Cheng Lian <[email protected]>
What changes were proposed in this pull request?
This PR introduces placeholders for comments in generated code. The purpose is the same as #12939, but this approach is much safer.
The generated code that gets compiled doesn't include the actual comments; it includes placeholders instead.
The placeholders in the generated code are replaced with the actual comments only at logging time.
This PR also resolves SPARK-15205.
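The flow described above (register a comment, embed only a placeholder in the code handed to the compiler, and substitute the real text when logging) could look roughly like the following. This is a simplified sketch under stated assumptions, not the code from this PR: `CommentRegistry`, `register`, `render` and the `c0`-style names are hypothetical and do not correspond to Spark's API.

```scala
import scala.collection.mutable
import scala.util.matching.Regex

// Hypothetical, simplified stand-in for a codegen context; not Spark's API.
final class CommentRegistry {
  private val comments = mutable.Map.empty[String, String]
  private var counter = 0

  // Matches placeholders of the form /*{name}*/ and captures the name.
  private val pattern: Regex = """/\*\{(\w+)\}\*/""".r

  /** Stores `text` and returns the placeholder to embed in generated code. */
  def register(text: String): String = {
    val name = s"c$counter"
    counter += 1
    comments(name) = text
    s"/*{$name}*/"
  }

  /** Swaps known placeholders for their comments; intended for log output only. */
  def render(code: String): String =
    pattern.replaceAllIn(code, m =>
      Regex.quoteReplacement(
        comments.get(m.group(1)).map(t => s"/* $t */").getOrElse(m.matched)))
}

object CommentRegistryDemo {
  val registry = new CommentRegistry

  // What would be handed to the compiler: only the placeholder appears.
  val compiled: String = registry.register("input[0, int]") + "\nint value = i.getInt(0);"

  // What would be written to the log: the placeholder is replaced by the comment.
  val logged: String = registry.render(compiled)
}
```

In this sketch the text passed to the compiler never contains the comment contents, so an unusual expression string cannot affect what is compiled. The single regex pass is a design choice for the example: each placeholder is matched as a whole, which avoids the prefix issue discussed in the review.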
How was this patch tested?
Existing tests.