Skip to content

Conversation

@sarutak
Copy link
Member

@sarutak sarutak commented Jun 11, 2020

What changes were proposed in this pull request?

This PR proposes to add pagination support for all-jobs timeline.

Why are the changes needed?

If there are lots of jobs, rendering performance of all jobs timeline can significantly goes down. This issue is reported in SPARK-31967.
For example, after the following operation, UI loading can take >40 sec.

(1 to 1000).foreach(_ => sc.parallelize(1 to 10).collect) 

Although it's not the fundamental solution, pagination can mitigate the issue.

Does this PR introduce any user-facing change?

Yes.
Before this change, all displayable jobs are displayed in one page.
all-jobs-shown
After this change, 2 text boxes are added.
One is to set the maximum number of jobs to be displayed in a page and the other is to set the page number.
Go button is also added and if it's pushed, jobs are re-rendered based on the page size and page number.
all-jobs-pagination

How was this patch tested?

Additional testcase with the following command.

build/mvn -Dguava.version=25.0-jre -Dspark.test.webdriver.chrome.driver=/path/to/chromedriver -Dtest.default.exclude.tags=  -Dtest=none -DwildcardSuites="org.apache.spark.ui.ChromeUISeleniumSuite" test

@sarutak
Copy link
Member Author

sarutak commented Jun 11, 2020

cc: @gengliangwang
This improvement is not the fundamental solution for SPARK-31967 but mitigate it so I've filed as an independent issue.

@gengliangwang
Copy link
Member

@sarutak Thanks for the quick fix. I try on my local setup and it works.

@gengliangwang
Copy link
Member

image

I think we should make the two pagination sections consistent

@SparkQA
Copy link

SparkQA commented Jun 11, 2020

Test build #123864 has finished for PR 28803 at commit 90b132b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gengliangwang
Copy link
Member

@sarutak I just confirmed that the issue is caused by #28192 . The issus can't be reproduced before the change.

@sarutak
Copy link
Member Author

sarutak commented Jun 12, 2020

@gengliangwang Yes, I've noticed that. The performance issue of vis-timeline happens for newer version of vis.js.

@gengliangwang
Copy link
Member

So, I am actually -1 in this solution, for two reasons:

  1. if there are many stages in one job, the perf issue still exists
  2. the two paginations in one page seem weird.

I tried a different version 4.21.0 from https://cdnjs.com/libraries/vis
The infinite drawing issue seems better if the zoom is not enabled.
How about we changing to that version? or, just reverting #28192?

@gengliangwang
Copy link
Member

Let me create a PR for downgrading to 4.21.0 now.

@sarutak
Copy link
Member Author

sarutak commented Jun 12, 2020

  1. if there are many stages in one job, the perf issue still exists
  2. the two paginations in one page seem weird.

For 1), AllJobsPage is not affected even if a job has many stages isn't it? JobPage can be affect but we can apply the same solution. For 2), Yes, it's a little bit weird but the similar solution already exists in StagePage.

Even if no performance issue exists, lots of items rendered in one page can be too noisy.

@gengliangwang
Copy link
Member

@sarutak I just created #28806.
I believe downgrading the vis version is a safer solution. We need to hotfix this to Spark 3.0 ASAP

@gengliangwang
Copy link
Member

gengliangwang commented Jun 12, 2020

Even if no performance issue exists, lots of items rendered in one page can be too noisy.

OK, how about we merge #28806 to 3.0/2.4 first.
And start the pagination feature from 3.1 if you insist?

@sarutak
Copy link
Member Author

sarutak commented Jun 12, 2020

OK, how about we merge #28806 to 3.0/2.4 first.
And start the pagination feature from 3.1 if you insist?

I agree. I'll check that PR soon.

@sarutak
Copy link
Member Author

sarutak commented Jun 13, 2020

I think we should make the two pagination sections consistent

The upper text fields are for timeline, while the lower ones are for the table so I think they should be independent each other.

@SparkQA
Copy link

SparkQA commented Jun 13, 2020

Test build #123965 has finished for PR 28803 at commit 0270bb9.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@sarutak
Copy link
Member Author

sarutak commented Jun 13, 2020

retest this please.

@SparkQA
Copy link

SparkQA commented Jun 13, 2020

Test build #123969 has finished for PR 28803 at commit 0270bb9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@sarutak
Copy link
Member Author

sarutak commented Jul 1, 2020

Hi @gengliangwang, shall we restart discussion?

@sarutak
Copy link
Member Author

sarutak commented Jul 8, 2020

cc: @dongjoon-hyun

@gengliangwang
Copy link
Member

retest this please

method="get"
action=""
class="form-inline justify-content-end"
style="width: 50%; margin-left: auto; margin-bottom: 0px;">
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: shall we move the style to webui.css?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I'll consider it.

Pages. Jump to</label>
<input type="text"
name="jobs.eventTimelinePageNumber"
id={s"form-event-timeline-page-no"}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I find similar code in StagePage.scala. But why do we need {s".."} here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly. I'll remove s prefix.

@gengliangwang
Copy link
Member

@sarutak I am really sorry I missed your ping last week.
The perf problem is resolved in #28806. If we want to avoid perf issue in the long term, I think we can avoid rendering the timeline by default.
I am +0 with the proposal. The pagination makes the visualization simpler, but also makes the UI usage more complex. If you insist, let's merge this one.

@SparkQA
Copy link

SparkQA commented Jul 8, 2020

Test build #125295 has finished for PR 28803 at commit 0270bb9.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@sarutak
Copy link
Member Author

sarutak commented Jul 13, 2020

I am +0 with the proposal. The pagination makes the visualization simpler, but also makes the UI usage more complex. If you insist, let's merge this one.

I wondered the pagination feature for task timeline is one compromise so I came up with bringing the idea into jobs timeline but this solution has both upside and downside as you suggested.
So, do you have any better idea to avoid being the timeline noisy?

@github-actions
Copy link

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Oct 22, 2020
@github-actions github-actions bot closed this Oct 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants