-
Couldn't load subscription status.
- Fork 28.9k
[SPARK-24806][SQL] Brush up generated code so that JDK compilers can handle it #21770
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #93003 has finished for PR 21770 at commit
|
|
As I said in As @mgaido91 suggested there, there is no test to check this now. I think it'd be nice if we could. But, since the compilation of JDK compilers is too slow (see the performance values I showed in |
e19b804 to
2441b07
Compare
|
Test build #93004 has finished for PR 21770 at commit
|
|
cc @rednaxelafx |
2441b07 to
d817f9d
Compare
|
Also, |
|
Test build #93011 has finished for PR 21770 at commit
|
|
retest this please |
|
Test build #93012 has finished for PR 21770 at commit
|
|
retest this please |
|
Test build #93013 has finished for PR 21770 at commit
|
|
I'll fix the failures soon. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @maropu. As I already mentioned and you remembered here, my only concern here is the lack of testing. I am fine having this in, but without proper tests, I am afraid it could be not so useful (as we may missing other problems or other problems may be introduced later).
Anyway, as it is of no harm having it, if others agree that it is useful, I am fine having it in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: as we are adding this comment, shall we also mention that Janino works anyway, but JDK complains here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need to add this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the master, this test currently depends on the message of janino compilation errors. So, If we used JDK java compilers, the test could fail (because the compilers throw an exception with a different message). To fix this, the change resolves the setters before compilation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generic arrays not allowed;
/* 019 */ private org.apache.spark.util.random.BernoulliCellSampler<UnsafeRow>[] sample_mutableStateArray_0 = new org.apache.spark.util.random.BernoulliCellSampler<UnsafeRow>[1];
---
Cause: java.util.concurrent.ExecutionException: org.codehaus.commons.compiler.CompileException: failed to compile:
(Line 40, Column 101) generic array creation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though this may cause problems leading to the 64KB limit issue. So if we are not including the jdk support I'd be against this change...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aha, ok. let's us wait for the other developer's comments.
36cd767 to
9fdbefd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mgaido91 I a little misunderstood, so I updated the comment. I don't know why this fails in JDK compilers now, so I think I need to dig into this more. cc: @rednaxelafx
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, why is this change needed? Is it because it's really a scala int that's returned?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't look into the bytecode that JDK compilers generate though, the type of the return value seems to be erased in the bytecode;
...
/* 045 */ scala.Option<Integer> intOpt_0 =
/* 046 */ org.apache.spark.sql.catalyst.util.DateTimeUtils.stringToDate(((UTF8String) references[0] /* literal */));
/* 047 */ if (intOpt_0.isDefined()) {
/* 048 */ value_1 = ((Integer) intOpt_0.get()).intValue();
/* 049 */ } else {
/* 050 */ isNull_1 = true;
/* 051 */ }
/* 052 */
...
- Day / DayOfMonth *** FAILED ***
Code generation of dayofmonth(cast(2000-02-29 as date)) failed:
java.util.concurrent.ExecutionException: org.codehaus.commons.compiler.CompileException: failed to compile:
(Line 67, Column 62) incompatible types: scala.Option<java.lang.Object> cannot be converted to scala.Option<java.lang.Integer>
java.util.concurrent.ExecutionException: org.codehaus.commons.compiler.CompileException: failed to compile:
(Line 67, Column 62) incompatible types: scala.Option<java.lang.Object> cannot be converted to scala.Option<java.lang.Integer>
at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:306)
at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:293)
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would cause InternalError here? just wondering if there's a cleaner way.
Will getCanonicalName ever return null? otherwise seems like you don't need to fall back to getName above. Also return isn't needed below?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see the commit cc88d7f and this is the same fix with Utils.getSimpleName. Yea, I see and I think this is workaround now.
Yea, in some cases, the doc says that getCanonicalName could return null;
/**
* Returns the canonical name of the underlying class as
* defined by the Java Language Specification. Returns null if
* the underlying class does not have a canonical name (i.e., if
* it is a local or anonymous class or an array whose component
* type does not have a canonical name).
* @return the canonical name of the underlying class if it exists, and
* {@code null} otherwise.
* @since 1.5
*/
public String getCanonicalName() {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anyway, return removed.
|
PS I don't think SparkException should be a RuntimeException even if it were possible. It's possible to get the Scala code to declare SparkException in the byte code if that's what you need it to do for the benefit of a Java compiler -- |
|
Test build #93029 has finished for PR 21770 at commit
|
|
Test build #93030 has finished for PR 21770 at commit
|
|
retest this please |
|
@srowen Thx for the comment. aha, I see, It looks reasonable to me. We should try to fix so. |
|
Test build #93042 has finished for PR 21770 at commit
|
|
Test build #93045 has finished for PR 21770 at commit
|
|
Should this change be a part of #21777? Seems it shall be? |
|
yea, if we get the consensus to implement #21777, it sounds ok to me. |
|
Test build #93233 has finished for PR 21770 at commit
|
5d33d53 to
215a9a0
Compare
215a9a0 to
5a70a7c
Compare
|
Test build #95082 has finished for PR 21770 at commit
|
|
retest this please |
|
Test build #95103 has finished for PR 21770 at commit
|
What changes were proposed in this pull request?
This pr brushed up code so that JDK compilers can handle it because the master sometimes generates Java code that only janino can compile it. This is the issue I found when working on SPARK-24498.
How was this patch tested?
Existing tests.