-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-13432][SQL] add the source file name and line into a generated Java code #11301
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
This PR is derived from an offline discussion with @sarutak . |
|
Jenkins, retest this please |
|
Test build #51664 has finished for PR 11301 at commit
|
|
ping @sarutak |
|
Thanks @kiszk ! I'll check this later. |
|
JIRA-13644 handles the following feature. This PR does not implement the following feature in a stack trace I also add one feature. This is to show a message points out the origin of a generated method when an exception occurs in the generated method at runtime. An example of a message (the first line is newly added) Here is a part of generated code. |
|
Closed by my mistake. Reopen this. |
|
Test build #51977 has finished for PR 11301 at commit
|
|
Test build #52059 has finished for PR 11301 at commit
|
|
Test build #52108 has finished for PR 11301 at commit
|
|
Test build #52120 has finished for PR 11301 at commit
|
|
The failed test |
|
Test build #52131 has finished for PR 11301 at commit
|
|
@sarutak , this is ready for your review now. |
|
O.K. I'll inspect this change. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, constraints in QueryPlan is ExpressionSet and SparkPlan which is a subclass of QueryPlan is serializable so ExpressionSet should be also serializable strictly. But constraints is lazy val and it's not accessed when the receiver object is a instance of SparkPlan. In other word, constraints is accessed only when the receiver object is a instance of LogicalPlan.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your explanation on how constraints is accessed. From the implementation view, my understanding is that it is still necessary to declare Serializable for ExpressionSet . Is there another good idea to enable Serializable only for LogicalPlan?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If ExpressionSet is really serialized only in the case of LogicalPlan, we could move constraints from QueryPlan to LogicalPlan but I'm not sure it's correct way.
Have you ever got any problem because ExpressionSet is not Serializable ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I got an exception regarding non-serializable in test suites in hive when ExpressionSet is not Serializable. This is why I added Serialiable to ExpressionSet
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
O.K. I got it.
|
@davies Can you also check this? Because you have modified code changed in this PR many times. |
|
The information of callsite is helpful in general. The try-catch in generated code just added too much noise. You will see more information when the job fails, I'm not sure how helpful that logging is. cc @rxin |
…me API's call site
|
Test build #59847 has finished for PR 11301 at commit
|
|
Test build #59848 has finished for PR 11301 at commit
|
|
Test build #59851 has finished for PR 11301 at commit
|
|
(Hi all, i am just wondering where we are on this) |
|
I have less bandwidth for this since I am busy for others. Let me close for now. |
What changes were proposed in this pull request?
This PR adds the source file name and line into a comment of a Java code generated by Catalyst. It would be helpful to quickly identify the original source file from a position where an error occurs during a problem determination of a customer. It supports only DataFrame and SQL.
This PR consists of three parts:
This PR adds the information to a comment for existing operations. Other PRs will address the followings:
Here is an example. The original Java program.
Generated Java code
How was the this patch tested?
Unit test (add a test to keep Origin during SerDe)