[SPARK-24886][INFRA] Fix the testing script to increase timeout for Jenkins build (from 300m to 340m) #21845
Conversation
|
cc @rxin |
|
Test build #93429 has finished for PR 21845 at commit
|
|
Test build #93430 has finished for PR 21845 at commit
|
|
retest this please |
|
@HyukjinKwon do we have any idea why we are hitting a timeout? |
|
I am not really sure about that. I asked the same question before and got no answer. My rough guess is that something is wrong in the Jenkins cluster: I have been keeping an eye on build times, and they seem to have suddenly increased (in some cases, or on some machines?). |
|
Judging from the builds in #21822, most of the timeouts appear to have happened in |
|
Test build #93436 has finished for PR 21845 at commit
|
|
This helps, but it is not sustainable to keep increasing the threshold. What we need to do is to look at test time distribution and figure out what test suites are unnecessarily long and actually cut down the time there. @HyukjinKwon Would you be interested in doing that? |
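One way to look at the test time distribution mentioned above is to aggregate per-suite durations from JUnit-style XML reports. The following is a minimal sketch, not Spark tooling: it assumes the reports use the conventional `<testsuite name="..." time="...">` element with `time` in seconds, and the function name `slowest_suites` and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def slowest_suites(report_xmls, top_n=5):
    """Return the top_n (suite name, total seconds) pairs, slowest first.

    Assumes each report is a JUnit-style XML string whose <testsuite>
    elements carry `name` and `time` (seconds) attributes.
    """
    totals = defaultdict(float)
    for xml_text in report_xmls:
        root = ET.fromstring(xml_text)
        # root.iter() also yields the root itself, so this handles both a
        # bare <testsuite> root and a <testsuites> wrapper.
        for suite in root.iter("testsuite"):
            name = suite.get("name", "<unnamed>")
            totals[name] += float(suite.get("time", "0"))
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


# Hypothetical sample reports, for illustration only.
SAMPLE_REPORTS = [
    '<testsuite name="SuiteA" time="698.0"><testcase name="t1" time="698.0"/></testsuite>',
    '<testsuites>'
    '<testsuite name="SuiteB" time="12.5"/>'
    '<testsuite name="SuiteC" time="240.0"/>'
    '</testsuites>',
]

if __name__ == "__main__":
    for name, seconds in slowest_suites(SAMPLE_REPORTS):
        print(f"{seconds:8.1f}s  {name}")
```

Summing by suite name across reports keeps the ranking stable when the same suite is reported by more than one module or run.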
|
Of course I am, as usual. I have actually been taking care of it already. The thing is that tests keep being added even when they duplicate existing ones, and I feel it has become a bit excessive. In general, I don't think there are particular tests that take an especially long time, IMHO. What we should do is put some effort into deduplicating the tests. |
|
@rxin, btw do you want me to close this one or get it in? I will take another look at the build and test times this week for sure anyway. |
|
Are more pull requests failing due to timeouts right now?
|
|
Yup, looks so in your PR #21822 (comment) |
|
If that's the only one, I think that PR itself needs to be fixed (it significantly increases test runtime), and I wouldn't increase the timeout here.
|
|
Hm, yeah, then. I actually opened this PR to make the tests pass in your PR. Let me leave this closed then and reopen it when we hit the issue next time. |
|
@HyukjinKwon I saw the following test run for 11 minutes on Jenkins for one of my PRs. I'm not sure if it's a transient problem, but I thought I should let you know. On the nightly runs, should we have a test that runs for that long? SPARK-22499: Least and greatest should not generate codes beyond 64KB (11 minutes, 38 seconds) |
|
@HyukjinKwon Super. Thanks a lot for fixing. |
|
I am reopening this per #21898 (comment) cc @cloud-fan, @rxin and @shaneknapp |
|
Test build #94345 has finished for PR 21845 at commit
|
|
retest this please |
|
I think we still intermittently hit this limit. For instance: #22001 (comment). I have seen it multiple times here as well: #21991 (comment) |
|
Test build #94349 has finished for PR 21845 at commit
|
|
i'm also more than happy to bump the timeout in the PRB build, but i think that's just putting duct tape on a band-aid and spray painting it to hide the layers of tape. the builds and tests just take too long. i know that solving this problem is far beyond the scope of this PR, but build duration really needs some attention. |
7afc5c5 to
08b4ebe
Compare
  # format: http://linux.die.net/man/1/timeout
- # must be less than the timeout configured on Jenkins (currently 350m)
- tests_timeout = "300m"
+ # must be less than the timeout configured on Jenkins (currently 400m)
|
Let me push this in late tonight or early tomorrow. |
|
Test build #94396 has finished for PR 21845 at commit
|
|
I am getting this in. We are seeing more - #22011 (comment) |
|
Increased 330 -> 340 since even 330 does not look like enough. |
|
Merged to master. |
|
Test build #94533 has finished for PR 21845 at commit
|
- # must be less than the timeout configured on Jenkins (currently 350m)
- tests_timeout = "300m"
+ # must be less than the timeout configured on Jenkins (currently 400m)
+ tests_timeout = "340m"
we're STILL seeing test timeouts. let's bump this to 400m and i'll up the timeout in jenkins to 430m.
Woah .. got it ..
…enkins build (from 340m to 400m)
## What changes were proposed in this pull request?
This PR aims to increase the timeout from 340m to 400m. Please also see #21845 (comment)
## How was this patch tested?
N/A
Closes #22098 from HyukjinKwon/SPARK-24886-1.
Authored-by: hyukjinkwon <[email protected]>
Signed-off-by: hyukjinkwon <[email protected]>
What changes were proposed in this pull request?
Currently, it looks like we hit the time limit from time to time. It seems better to increase the limit a bit.
For instance, please see #21822
For clarification, the current Jenkins timeout is 400m. This PR only proposes to fix the test script so that its timeout is increased correspondingly.
This PR does not aim to change the build configuration.
How was this patch tested?
Jenkins tests.
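
The constraint stated in the script comment (the script's `tests_timeout` must be less than the timeout configured on Jenkins) can be checked mechanically. The sketch below is illustrative only and is not part of the Spark test script; `parse_duration_seconds` and `check_timeouts` are hypothetical helper names. It assumes duration strings follow the `timeout(1)` format linked in the script: a number with an optional `s`, `m`, `h`, or `d` suffix.

```python
# Seconds per unit suffix accepted by timeout(1); no suffix means seconds.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration_seconds(duration):
    """Convert a timeout(1)-style duration such as "340m" to seconds."""
    text = duration.strip()
    if not text:
        raise ValueError("empty duration")
    if text[-1] in _UNIT_SECONDS:
        number, multiplier = text[:-1], _UNIT_SECONDS[text[-1]]
    else:
        number, multiplier = text, 1
    return float(number) * multiplier


def check_timeouts(tests_timeout, jenkins_timeout):
    """Return the headroom in minutes; raise if the script would not time out first."""
    script_s = parse_duration_seconds(tests_timeout)
    jenkins_s = parse_duration_seconds(jenkins_timeout)
    if script_s >= jenkins_s:
        raise ValueError(
            f"tests_timeout ({tests_timeout}) must be less than "
            f"the Jenkins timeout ({jenkins_timeout})"
        )
    return (jenkins_s - script_s) / 60


if __name__ == "__main__":
    # Values taken from this thread: 340m in the script, 400m on Jenkins.
    print(check_timeouts("340m", "400m"))
```

With the values from this PR the headroom is 60 minutes; with the 400m and 430m values suggested later in the review it would shrink to 30 minutes.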