Conversation

@maropu
Member

@maropu maropu commented Jul 14, 2018

What changes were proposed in this pull request?

This PR cleans up generated code so that JDK compilers can handle it, because master sometimes generates Java code that only Janino can compile. This is an issue I found while working on SPARK-24498.

How was this patch tested?

Existing tests.

@SparkQA

SparkQA commented Jul 14, 2018

Test build #93003 has finished for PR 21770 at commit 08fddcb.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • final public class BufferHolder

@maropu
Member Author

maropu commented Jul 14, 2018

As I said in SPARK-24498, I think that, whether or not SPARK-24498 is resolved, we should make the generated code compatible with JDK compilers as far as possible.

As @mgaido91 suggested there, there is currently no test to check this. I think it would be nice if we could add one. But, since compilation with the JDK compiler is too slow (see the performance numbers I showed in SPARK-24498), I think it is impractical for Jenkins to check that all the generated code can be compiled by the JDK compiler...

@maropu maropu force-pushed the FixJavaCompilerErrors branch 2 times, most recently from e19b804 to 2441b07 Compare July 14, 2018 14:21
@SparkQA

SparkQA commented Jul 14, 2018

Test build #93004 has finished for PR 21770 at commit 2441b07.

  • This patch fails Java style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • final public class BufferHolder

@gatorsmile
Member

cc @rednaxelafx

@maropu maropu force-pushed the FixJavaCompilerErrors branch from 2441b07 to d817f9d Compare July 15, 2018 02:03
@maropu
Member Author

maropu commented Jul 15, 2018

Also, SparkException seems to need to extend RuntimeException instead of Exception, because some of the generated code has no code to catch the exception. But that change causes MiMa test failures, so it has been removed from this PR for now.
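A minimal standalone Java sketch (my own illustration, not code from this PR) of why this matters: javac forces every caller to catch or declare a checked exception, while an unchecked RuntimeException compiles without any handling code:

```java
public class CheckedDemo {
    // Checked: every caller must catch or declare this, which generated
    // Java code often does not do.
    static void mayFail() throws Exception {
        throw new Exception("boom");
    }

    // Unchecked: callers compile without any try/catch or throws clause.
    static void mayFailUnchecked() {
        throw new RuntimeException("boom");
    }

    public static void main(String[] args) {
        try {
            mayFail();          // javac: must be caught or declared
        } catch (Exception e) {
            System.out.println("caught checked: " + e.getMessage());
        }
        try {
            mayFailUnchecked(); // no compile-time obligation, still catchable
        } catch (RuntimeException e) {
            System.out.println("caught unchecked: " + e.getMessage());
        }
    }
}
```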

@SparkQA

SparkQA commented Jul 15, 2018

Test build #93011 has finished for PR 21770 at commit d817f9d.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • public final class BufferHolder

@maropu
Member Author

maropu commented Jul 15, 2018

retest this please

@SparkQA

SparkQA commented Jul 15, 2018

Test build #93012 has finished for PR 21770 at commit d817f9d.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • public final class BufferHolder

@maropu
Member Author

maropu commented Jul 15, 2018

retest this please

@SparkQA

SparkQA commented Jul 15, 2018

Test build #93013 has finished for PR 21770 at commit d817f9d.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • public final class BufferHolder

@maropu
Member Author

maropu commented Jul 15, 2018

I'll fix the failures soon.

Contributor

@mgaido91 mgaido91 left a comment

thanks @maropu. As I already mentioned and you recalled here, my only concern is the lack of testing. I am fine with having this in, but without proper tests, I am afraid it may not be so useful (as we may be missing other problems, or other problems may be introduced later).

Anyway, since there is no harm in having it, if others agree that it is useful, I am fine with having it in.

Contributor

nit: as we are adding this comment, shall we also mention that Janino works anyway, but the JDK compiler complains here?

Contributor

why do we need to add this?

Member Author

@maropu maropu Jul 15, 2018

In master, this test currently depends on the message of Janino compilation errors. So, if we used the JDK compiler, the test could fail (because that compiler throws an exception with a different message). To fix this, the change resolves the setters before compilation.

Contributor

why do we need this?

Member Author

Generic arrays not allowed;

/* 019 */   private org.apache.spark.util.random.BernoulliCellSampler<UnsafeRow>[] sample_mutableStateArray_0 = new org.apache.spark.util.random.BernoulliCellSampler<UnsafeRow>[1];
--- 
 Cause: java.util.concurrent.ExecutionException: org.codehaus.commons.compiler.CompileException: failed to compile:
(Line 40, Column 101) generic array creation
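For context, here is a small standalone Java sketch (my own illustration, not the PR's code) of the "generic array creation" error above and the usual workaround: creating a raw array and casting it, which Janino happens not to require but javac does:

```java
import java.util.ArrayList;
import java.util.List;

public class GenericArrayDemo {
    @SuppressWarnings("unchecked")
    public static void main(String[] args) {
        // javac rejects this (Janino accepts it):
        //   List<String>[] arr = new List<String>[2];  // error: generic array creation
        // The portable form creates a raw array and applies an unchecked cast:
        List<String>[] arr = (List<String>[]) new List[2];
        arr[0] = new ArrayList<>();
        arr[0].add("ok");
        System.out.println(arr[0].get(0));
    }
}
```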

Contributor

Though this may cause problems related to the 64KB method limit issue. So if we are not including the JDK compiler support, I'd be against this change...

Member Author

aha, ok. let's wait for other developers' comments.

@maropu maropu force-pushed the FixJavaCompilerErrors branch from 36cd767 to 9fdbefd Compare July 15, 2018 16:05
Member Author

@mgaido91 I misunderstood a little, so I updated the comment. I don't know why this fails with the JDK compiler now, so I think I need to dig into this more. cc: @rednaxelafx

Member

Out of curiosity, why is this change needed? Is it because it's really a Scala Int that's returned?

Member Author

I haven't looked into the bytecode that the JDK compiler generates, but the type of the return value seems to be erased in the bytecode;

...
/* 045 */       scala.Option<Integer> intOpt_0 =
/* 046 */       org.apache.spark.sql.catalyst.util.DateTimeUtils.stringToDate(((UTF8String) references[0] /* literal */));
/* 047 */       if (intOpt_0.isDefined()) {
/* 048 */         value_1 = ((Integer) intOpt_0.get()).intValue();
/* 049 */       } else {
/* 050 */         isNull_1 = true;
/* 051 */       }
/* 052 */
...
- Day / DayOfMonth *** FAILED ***
  Code generation of dayofmonth(cast(2000-02-29 as date)) failed:
  java.util.concurrent.ExecutionException: org.codehaus.commons.compiler.CompileException: failed to compile:
  (Line 67, Column 62) incompatible types: scala.Option<java.lang.Object> cannot be converted to scala.Option<java.lang.Integer>
  java.util.concurrent.ExecutionException: org.codehaus.commons.compiler.CompileException: failed to compile:
  (Line 67, Column 62) incompatible types: scala.Option<java.lang.Object> cannot be converted to scala.Option<java.lang.Integer>
        at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:306)
        at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:293)
        at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
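To illustrate the erasure issue outside of Spark, here is a hypothetical Java sketch using `java.util.Optional` as a stand-in for `scala.Option` (the names and values are mine, not from the generated code). A Scala method returning `Option[SQLDate]`, where `SQLDate` is a type alias for `Int`, erases to a Java signature returning `Option<Object>`, so assigning it to `Option<Integer>` is rejected by javac:

```java
import java.util.Optional;

public class ErasureDemo {
    // Stand-in for the erased Scala signature: the element is typed as
    // Object, not Integer, in the bytecode.
    static Optional<Object> stringToDateErased() {
        return Optional.of((Object) 20000229);
    }

    public static void main(String[] args) {
        Optional<Object> intOpt = stringToDateErased();
        // javac rejects the direct assignment (Janino is laxer):
        //   Optional<Integer> bad = intOpt;  // error: incompatible types
        // The fix in the generated code: keep the erased type and
        // unbox the element explicitly, as in the snippet above.
        int value = intOpt.isPresent() ? ((Integer) intOpt.get()).intValue() : -1;
        System.out.println(value);
    }
}
```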

Member

What would cause InternalError here? Just wondering if there's a cleaner way.
Will getCanonicalName ever return null? Otherwise it seems like you don't need to fall back to getName above. Also, the return isn't needed below?

Member Author

Please see the commit cc88d7f; this is the same fix as in Utils.getSimpleName. Yeah, I see, and I think this is a workaround for now.

Yes, in some cases getCanonicalName can return null, as the doc says;

    /**
     * Returns the canonical name of the underlying class as
     * defined by the Java Language Specification.  Returns null if
     * the underlying class does not have a canonical name (i.e., if
     * it is a local or anonymous class or an array whose component
     * type does not have a canonical name).
     * @return the canonical name of the underlying class if it exists, and
     * {@code null} otherwise.
     * @since 1.5
     */
    public String getCanonicalName() {
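A small standalone Java sketch (a hypothetical helper, mirroring the idea behind Utils.getSimpleName rather than its actual code) showing the null case and the fallback to getName(): anonymous classes are one example where getCanonicalName returns null.

```java
public class CanonicalNameDemo {
    // Fall back to getName() when the class has no canonical name
    // (local classes, anonymous classes, some arrays).
    static String safeName(Class<?> cls) {
        String name = cls.getCanonicalName();
        return name != null ? name : cls.getName();
    }

    public static void main(String[] args) {
        Object anon = new Object() {};  // anonymous class: no canonical name
        System.out.println(anon.getClass().getCanonicalName());  // prints "null"
        // getName() still yields a usable identifier like CanonicalNameDemo$1:
        System.out.println(safeName(anon.getClass()).contains("$")
                ? "fallback used" : "canonical");
    }
}
```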

Member Author

Anyway, the return is removed.

@srowen
Member

srowen commented Jul 15, 2018

PS I don't think SparkException should be a RuntimeException even if it were possible. It's possible to get the Scala code to declare SparkException in the bytecode, if that's what you need it to do for the benefit of a Java compiler -- @throws[...]

@SparkQA

SparkQA commented Jul 15, 2018

Test build #93029 has finished for PR 21770 at commit 36cd767.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 15, 2018

Test build #93030 has finished for PR 21770 at commit 9fdbefd.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Member Author

maropu commented Jul 16, 2018

retest this please

@maropu
Member Author

maropu commented Jul 16, 2018

@srowen Thanks for the comment. Ah, I see, that looks reasonable to me. We should try to fix it that way.

@SparkQA

SparkQA commented Jul 16, 2018

Test build #93042 has finished for PR 21770 at commit 9fdbefd.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 16, 2018

Test build #93045 has finished for PR 21770 at commit 9880f26.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Member

viirya commented Jul 16, 2018

Should this change be a part of #21777? It seems it should be?

@maropu
Member Author

maropu commented Jul 18, 2018

yeah, if we get consensus to implement #21777, that sounds ok to me.

@SparkQA

SparkQA commented Jul 18, 2018

Test build #93233 has finished for PR 21770 at commit 5d33d53.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu maropu force-pushed the FixJavaCompilerErrors branch from 5d33d53 to 215a9a0 Compare August 22, 2018 03:53
@maropu maropu force-pushed the FixJavaCompilerErrors branch from 215a9a0 to 5a70a7c Compare August 22, 2018 03:54
@SparkQA

SparkQA commented Aug 22, 2018

Test build #95082 has finished for PR 21770 at commit 5a70a7c.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Member Author

maropu commented Aug 22, 2018

retest this please

@SparkQA

SparkQA commented Aug 22, 2018

Test build #95103 has finished for PR 21770 at commit 5a70a7c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
