Skip to content

Conversation

@dongjoon-hyun
Copy link
Member

@dongjoon-hyun dongjoon-hyun commented May 8, 2016

What changes were proposed in this pull request?

Currently, Java Linter is disabled in Jenkins tests.

https://github.com/apache/spark/blob/master/dev/run-tests.py#L554

However, as of today, Spark has 721 java files with 97362 code (without blank/comments). It's about 1/3 of Scala.

--------------------------------------------------------------------------------
Language                      files          blank        comment           code
--------------------------------------------------------------------------------
Scala                          2353          62819         124060         318747
Java                            721          18617          23314          97362

This PR aims to take advantage of Travis CI to handle the following static analysis by adding a single file, .travis.yml without any additional burden on the existing servers.

  • Java Linter
  • JDK7/JDK8 maven compile

Note that this PR does not propose to remove some of the above work items from the Jenkins. It's possible, but we need to observe the Travis CI stability for a while. The main goal of this issue is to remove committer's overhead on linter-related PRs (the original PR and the fixation PR).

How was this patch tested?

Pass the Travis CI tests. Please see the following link.

https://travis-ci.org/dongjoon-hyun/spark/builds/128595350
https://travis-ci.org/dongjoon-hyun/spark/builds/128708372

@dongjoon-hyun
Copy link
Member Author

Travis CI passed. It took less than 25 minutes.

@dongjoon-hyun
Copy link
Member Author

dongjoon-hyun commented May 8, 2016

Hi, @rxin and @andrewor14 .
I think this is the only way to re-enable Java linter without giving side-effects on AMP clusters.

@SparkQA
Copy link

SparkQA commented May 8, 2016

Test build #58080 has finished for PR 12980 at commit 6063822.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member Author

Definitely, the failure is not related to this PR since this PR just adds .travis.yml file.

[error] (mllib/test:test) sbt.TestsFailedException: Tests unsuccessful
[error] Total time: 5504 s, completed May 7, 2016 8:37:06 PM

@srowen
Copy link
Member

srowen commented May 8, 2016

I don't think this hurts, except that it adds yet another place to look for test results. (PS you want to add "test-compile" as a target too). What is the reason it's disabled for Jenkins, just time? Is it really that resource intensive to check lint?

@dongjoon-hyun
Copy link
Member Author

dongjoon-hyun commented May 8, 2016

Thank you for review, @srowen . Right, I'll add 'test-compile`, too.

I'm not sure the root cause of that. According to the github log, it was created, disabled (HOTFIX) by @rxin , re-enabled by @rxin and reverted by @zsxwing historically. I've heard that from @andrewor14 that it was resource intensive, too.

Anyway, two weeks ago, when I tried to re-enable that according to @rxin 's request during in the following PR, there was a functional problem on java linter on AMP clusters.

[SPARK-14868][BUILD] Enable NewLineAtEofChecker in checkstyle and fix lint-java errors

So, I fixed that in the following PR. Now I think that Java Linter works now in AMP cluster without any problems.

[SPARK-14867][BUILD] Remove --force option in build/mvn

However, in this PR, I propose to use Travis CI more actively for simple code analysis and compilation for the following main reasons. (I know that Spark uses Travis CI historically and Travis CI cannot supports Spark full testing.)

  • It's free.
  • Most GitHub community developers are more familiar with it than AMP clusters.
  • It will reduce the number of PR failures in AMP clusters.
    • If we have .travis.yml, all the developers' own forks have that, too. It means they will see the Travis CI result on their branches while preparing their PRs, not Spark PR list.
  • In the future, there is possibility to reduce the workload on AMP clusters. It means more faster processing the PR requests.

@dongjoon-hyun
Copy link
Member Author

dongjoon-hyun commented May 8, 2016

Hi, @srowen . I added test-compile, too. You can see the result here.

https://travis-ci.org/dongjoon-hyun/spark/builds/128708372

For your concern about Yet Another Place to look for, I think committers has no need to look into the result of Travis CI even if it fails. You can think this a kind of speculative execution of static analysis. Committers are able to do in the same way by just ignoring Travis CI comments. It has meaning for only the pull requester side.

As you know, pull requesters do not read carefully Spark Contribution Guide. However, they will try to succeed on Travis CI failures definitely if they are developers.

@SparkQA
Copy link

SparkQA commented May 8, 2016

Test build #58110 has finished for PR 12980 at commit 690ca52.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented May 9, 2016

I don't think it's a bad thing to have some kind of Travis build running in parallel as long as people understand what it is. Maybe some extra comments in the .travis.yml file to explain its purpose?

So it sounds like we could actually re-enable lint in the main build, but the concern is speed? but the Java lint checks themselves seem to take just a few seconds:

$ time ./dev/lint-java 
Using `mvn` from path: /usr/local/bin/mvn
Checkstyle checks passed.

real    0m7.949s
user    0m26.882s
sys 0m1.144s

Almost all the time in that Travis run is downloading stuff, compiling, and running the other lint checks.

@dongjoon-hyun
Copy link
Member Author

Yep, I did the same question before.
The answer was Jenkins also do the clean build as you mentioned for Travis CI. In other words, in AMP cluster, there occurs lots of maven traffic.

All CI suffers the same issue, but it's the purpose of CI.

@srowen
Copy link
Member

srowen commented May 9, 2016

Yeah but the Jenkins builds already have to download all that stuff to run everything else like the main build right? the additional cost of running lint-java is small? or did I misunderstand the problem?

@zsxwing
Copy link
Member

zsxwing commented May 9, 2016

Currently, Java Linter is disabled due to lack of testing resources.

FYI. This is not correct. The problem is mvn checkstyle requires that the artifacts has been published to some place. Since we publish nightly snapshots every day, mvn checkstyle works most of the time. However, when we upgrade the version in pom.xml or add some new projects, it requires a manual maven publish. Otherwise, mvn checkstyle will fail by maven resolution problem.

@zsxwing
Copy link
Member

zsxwing commented May 9, 2016

A solution is always running mvn install before mvn checkstyle. However, this wastes a lot time for sbt build.

@dongjoon-hyun
Copy link
Member Author

You're right. I maybe didn't remember the exact phrase I've read before. By the way, 'this wastes a lot time for sbt building' is effectively different from 'lack of resource'? Time/CPU/Network are all the resource.

@dongjoon-hyun
Copy link
Member Author

Hi, @zsxwing .
According to your advice, I updated the PR; run lint-java after mvn install. Thank you for advice!

@dongjoon-hyun
Copy link
Member Author

dongjoon-hyun commented May 10, 2016

Travic CI was finished with JDK7 (30min) and JDK8 (31min) in parallel 30 minutes ago.

https://travis-ci.org/dongjoon-hyun/spark/builds/129008329

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-15207][BUILD] Use Travis CI for Java/Scala/Python Linter and JDK7/8 compilation test [WIP][SPARK-15207][BUILD] Use Travis CI for Java/Scala/Python Linter and JDK7/8 compilation test May 10, 2016
@dongjoon-hyun
Copy link
Member Author

I found that the previous similar discussion on maven checkstyle overhead with @zsxwing and @srowen here, #11438

@SparkQA
Copy link

SparkQA commented May 10, 2016

Test build #58190 has finished for PR 12980 at commit ac2cf27.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member Author

Hi, @srowen .
I removes RAT and lint-python in order to focus lint-java.
Also, lint-scala is removed since it's already done by mvn install.

@dongjoon-hyun dongjoon-hyun changed the title [WIP][SPARK-15207][BUILD] Use Travis CI for Java/Scala/Python Linter and JDK7/8 compilation test [WIP][SPARK-15207][BUILD] Use Travis CI for Java Linter and JDK7/8 compilation test May 10, 2016
@dongjoon-hyun dongjoon-hyun changed the title [WIP][SPARK-15207][BUILD] Use Travis CI for Java Linter and JDK7/8 compilation test [SPARK-15207][BUILD] Use Travis CI for Java Linter and JDK7/8 compilation test May 10, 2016
@dongjoon-hyun
Copy link
Member Author

@srowen
I added comments, too.

@SparkQA
Copy link

SparkQA commented May 10, 2016

Test build #58204 has finished for PR 12980 at commit 2583abc.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented May 10, 2016

Test build #58208 has finished for PR 12980 at commit c4f5f8d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

.travis.yml Outdated
# See the License for the specific language governing permissions and
# limitations under the License.

# Travis CI Guide
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only other change I'd suggest here is clearly explaining the purpose of this test harness -- it exists just to run lint-java. A pointer to the discussions here would be more useful than general Travis pointers.

@dongjoon-hyun
Copy link
Member Author

@srowen , I updated the comments by explaining the purpose of this test harness clearly. I didn't described about the Jenkins-related stuff because that's not the main purpose of this PR or this test harness. As you said, the main purpose is to help contributors run lint-java during their preparing PR. I also described about Scalascala since Scalastyle is executed by mvn install.

@SparkQA
Copy link

SparkQA commented May 10, 2016

Test build #58243 has finished for PR 12980 at commit 893e8bc.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented May 10, 2016

OK, seems reasonable to give this a shot in master and see how it goes.

@srowen
Copy link
Member

srowen commented May 10, 2016

Merged in master

@asfgit asfgit closed this in 603c4f8 May 10, 2016
@nchammas
Copy link
Contributor

nchammas commented May 10, 2016

@dongjoon-hyun - What's the intended workflow here? Is the intention that contributors will manually open Travis to check the build results?

Because last time we asked, Apache would not grant us permission to set commit status on GitHub. (This gives the green checkmark next to each commit after the build passes.) That's why we currently have Jenkins post comments in the PR body.

@dongjoon-hyun
Copy link
Member Author

Thank you, @srowen !

@dongjoon-hyun
Copy link
Member Author

Hi, @nchammas . Thank you so much for fast attention!

First of all, I'm not sure about the history well. But, maybe, Apache policy seems to be changed now.
I'm using 3 CI (Jenkins, Travis, AppVoyer) in Apache REEF project. Please refer here.

apache/reef#994

In addition, there are many tools to check the your Travis CI status. For mac, the simplest tool is CCMenu. You don't need to open web browser absolutely. If you are a Mac user, could you try CCMenu and share your thought again?

Also, I think you want to see the green marks on Spark PR, right?
To make it work, I'll make a Apache INFRA Jira issue for this.

@srowen
Copy link
Member

srowen commented May 10, 2016

Yeah I don't think this will cause Travis to post those nice build messages. Right now it won't even email us. It does let anyone who cares go look to see if the lint tests have been failing. Right now there's not really a way to notice it other than trying it manually.

If it's just not useful we can disable it, or can have it send build failure emails, or whatever. I think it's worth just test driving

EDIT: Don't make an infra JIRA. We know the answer to that question already, don't raise it again please.

@nchammas
Copy link
Contributor

nchammas commented May 10, 2016

@dongjoon-hyun - Here's the previous discussion: https://issues.apache.org/jira/browse/INFRA-7367

It looks like Apache Infra has always supported posting build status from the Apache-owned Travis account to GitHub. That may be all we need. Otherwise, my concern is that without build status being communicated directly on the PR somehow, people are unlikely to use and benefit from this new piece of testing.

As @srowen said, if we want to do this anyway as a trial that's totally fine of course. I think for it to make a real impact, though, we need to connect that final piece of reporting status back to the PR.

@dongjoon-hyun
Copy link
Member Author

dongjoon-hyun commented May 10, 2016

Yep. Thank you for giving test drive chance, @srowen and for understanding that, @nchammas .
In terms of computing resources, now we can take advantage of more than AMP clusters. But, if it's not suitable tool for Spark, we can remove that anytime from Spark repository. I just hope we can use that potential usefully in anyway.

@dongjoon-hyun
Copy link
Member Author

@nchammas . I see what you mean now. For INFRA-7367 , Jenkins wanted to use Travis CI API. Yes, it's not possible as you said. We're able to see just pass/fail icons of Travis CI.

@dongjoon-hyun dongjoon-hyun deleted the SPARK-15207 branch May 12, 2016 01:01
@dongjoon-hyun
Copy link
Member Author

Hi, All.
I'm reporting about the recent failures of Travis CI here.

Since this PR was committed a week ago, I've been monitoring Travis CI on my forked master branch (with updating every 30 minutes). It had worked fine without failure until 3 hours ago.

Currently, after the commit, [SPARK-14906][ML] Copy linalg in PySpark to new ML package, lint-java takes over 10 minutes without any output. So, Travis CI kills it.

I'm going to investigate this further for a few days.

asfgit pushed a commit that referenced this pull request Oct 8, 2018
## What changes were proposed in this pull request?

#12980 added Travis CI file mainly for linter because we disabled Java lint check in Jenkins.

It's enabled as of #21399 and now SBT runs it. Looks we can now remove the file added before.

## How was this patch tested?

N/A

Closes #22665

Closes #22667 from HyukjinKwon/SPARK-25673.

Authored-by: hyukjinkwon <[email protected]>
Signed-off-by: hyukjinkwon <[email protected]>
(cherry picked from commit 2199224)
Signed-off-by: hyukjinkwon <[email protected]>
HyukjinKwon added a commit to HyukjinKwon/spark that referenced this pull request Oct 8, 2018
## What changes were proposed in this pull request?

apache#12980 added Travis CI file mainly for linter because we disabled Java lint check in Jenkins.

It's enabled as of apache#21399 and now SBT runs it. Looks we can now remove the file added before.

## How was this patch tested?

N/A

Closes apache#22665

Closes apache#22667 from HyukjinKwon/SPARK-25673.

Authored-by: hyukjinkwon <[email protected]>
Signed-off-by: hyukjinkwon <[email protected]>
jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
## What changes were proposed in this pull request?

apache#12980 added Travis CI file mainly for linter because we disabled Java lint check in Jenkins.

It's enabled as of apache#21399 and now SBT runs it. Looks we can now remove the file added before.

## How was this patch tested?

N/A

Closes apache#22665

Closes apache#22667 from HyukjinKwon/SPARK-25673.

Authored-by: hyukjinkwon <[email protected]>
Signed-off-by: hyukjinkwon <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants