-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-25566][SPARK-25567][WEBUI][SQL]Support pagination for SQL tab to avoid OOM #22645
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
[Spark Job History] SQL UI Page does not support Pagination
e2b45d5 to
a144270
Compare
|
cc @vanzin @srowen @cloud-fan @dongjoon-hyun . Kindly review the PR. |
4cce23b to
c48077c
Compare
c48077c to
4790b37
Compare
… with intercept with L1 regularization 1 min 10 sec
| import org.apache.spark.JobExecutionStatus | ||
| import org.apache.spark.internal.Logging | ||
| import org.apache.spark.ui.{UIUtils, WebUIPage} | ||
| import org.apache.spark.ui._ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: can we avoid wildcard imports? we usually avoid them unless there are a very large number of classes to import
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. done. Thanks
| val executionPrevPageSize = Option(parameterExecutionPrevPageSize).map(_.toInt). | ||
| getOrElse(executionPageSize) | ||
|
|
||
| val page: Int = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trivial nit: you don't need the type here or an extra block
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be private? and how about using interpolation for the whole thing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here and below, how about interpolation consistently?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be worth using case expressions to clarify what _3 and _1 are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Done.
sql/core/src/main/scala/org/apache/spark/sql/execution/ui/AllExecutionsPage.scala
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: pull this onto previous line? the indent is odd, at least
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this shouldn't be indented more
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I have used a different scalastyle and checkstyle xml in IJ it seems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same, here and around here do you really want to filter first instead of flatMap?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I have modified using filter
cf3a186 to
98e46ea
Compare
|
Thank you @srowen for the review. I have addressed the comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be a good place to wrap these to line up the elements in the Seq, one one each line (?) to make it easier to read its contents. They all seem to have an empty string as the second element? can it be removed or did I overlook something? Also you don't need the type here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I have modified based on your suggestions. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: pull this onto previous line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: space after if. Really, maybe just use require
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. used require. Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.map { case (title, _, _) => title }?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated. Thanks.
sql/core/src/main/scala/org/apache/spark/sql/execution/ui/AllExecutionsPage.scala
Outdated
Show resolved
Hide resolved
sql/core/src/main/scala/org/apache/spark/sql/execution/ui/AllExecutionsPage.scala
Outdated
Show resolved
Hide resolved
d9df4ba to
2a3c534
Compare
|
Thank you @srowen , I have modified the code based on your suggestions. |
2a3c534 to
6993b1a
Compare
sql/core/src/main/scala/org/apache/spark/sql/execution/ui/AllExecutionsPage.scala
Outdated
Show resolved
Hide resolved
|
Hi @srowen , There is one behavior change this PR introduces, which is correct. Sorting Job Ids in the previous versions of spark was not proper. After the PR the sorting is proper. I have updated the code accordingly. |
|
ok to test |
|
Test build #97144 has finished for PR 22645 at commit
|
|
@shahidki31 ^^ |
|
Hi @felixcheung , I will update the code |
|
@felixcheung I build locally, Now scalastyle issue is not happening. kindly re-trigger the PR builder. |
|
Test build #97146 has finished for PR 22645 at commit
|
|
@felixcheung It is a random failure. Could you please re-trigger the test. Thanks |
|
Test build #4361 has finished for PR 22645 at commit
|
|
Merged to master |
|
Thanks a lot @srowen |
|
I found the UI patches are very hard to review, because we embed HTML/Javascript in Scala code. Is there a plan to rewrite the Spark UI with some modern frontend frameworks? |
|
Hi @cloud-fan , Since other webui pages like jobs, stages, tasks etc. embed the javascript code in scala code, that is why I followed the same. It would be great if we rewrite the spark UI with some modern front end frameworks. Also, after this PR, loading time of the SQL page improved significantly. earlier loading 55,000 queries sql page took ~4.34 mins from the history server, but after the PR took only ~4 sec. |
… to avoid OOM ## What changes were proposed in this pull request? Currently SQL tab in the WEBUI doesn't support pagination. Because of that following issues are happening. 1) For large number of executions, SQL page is throwing OOM exception (around 40,000) 2) For large number of executions, loading SQL page is taking time. 3) Difficult to analyse the execution table for large number of execution. [Note: spark.sql.ui.retainedExecutions = 50000] All the tabs, Jobs, Stages etc. supports pagination. So, to make it consistent with other tabs SQL tab also should support pagination. I have followed the similar flow of the pagination code in the Jobs and Stages page for SQL page. Also, this patch doesn't make any behavior change for the SQL tab except the pagination support. ## How was this patch tested? bin/spark-shell --conf spark.sql.ui.retainedExecutions=50000 Run 50,000 sql queries. **Before this PR**   **After this PR** Loading of the page is faster, and OOM issue doesn't happen.  Closes apache#22645 from shahidki31/SPARK-25566. Authored-by: Shahid <[email protected]> Signed-off-by: Sean Owen <[email protected]>
### What changes were proposed in this pull request? Supports pagination for SQL Statisitcs table in the JDBC/ODBC tab using existing Spark pagination framework. ### Why are the changes needed? It will easier for user to analyse the table and it may fix the potential issues like oom while loading the page, that may occur similar to the SQL page (refer #22645) ### Does this PR introduce any user-facing change? There will be no change in the `SQLStatistics` table in JDBC/ODBC server page execpt pagination support. ### How was this patch tested? Manually verified. Before PR:  After PR:  Closes #26215 from shahidki31/jdbcPagination. Authored-by: shahid <[email protected]> Signed-off-by: Sean Owen <[email protected]>
… JDBC/ODBC Session page ### What changes were proposed in this pull request? In the PR #26215, we supported pagination for sqlstats table in JDBC/ODBC server page. In this PR, we are extending the support of pagination to sqlstats session table by making use of existing pagination classes in #26215. ### Why are the changes needed? Support pagination for sqlsessionstats table in JDBC/ODBC server page in the WEBUI. It will easier for user to analyse the table and it may fix the potential issues like oom while loading the page, that may occur similar to the SQL page (refer #22645) ### Does this PR introduce any user-facing change? There will be no change in the sqlsessionstats table in JDBC/ODBC server page execpt pagination support. ### How was this patch tested? Manually verified. Before:  After:  Closes #26246 from shahidki31/SPARK_29589. Authored-by: shahid <[email protected]> Signed-off-by: Sean Owen <[email protected]>
… JDBC/ODBC Session page In the PR apache#26215, we supported pagination for sqlstats table in JDBC/ODBC server page. In this PR, we are extending the support of pagination to sqlstats session table by making use of existing pagination classes in apache#26215. Support pagination for sqlsessionstats table in JDBC/ODBC server page in the WEBUI. It will easier for user to analyse the table and it may fix the potential issues like oom while loading the page, that may occur similar to the SQL page (refer apache#22645) There will be no change in the sqlsessionstats table in JDBC/ODBC server page execpt pagination support. Manually verified. Before:  After:  Closes apache#26246 from shahidki31/SPARK_29589. Authored-by: shahid <[email protected]> Signed-off-by: Sean Owen <[email protected]> Signed-off-by: dengziming <[email protected]>
…pache.spark.ui <!-- Thanks for sending a pull request! Here are some tips for you: 1. If this is your first time, please read our contributor guidelines: https://kyuubi.readthedocs.io/en/latest/community/contributions.html 2. If the PR is related to an issue in https://github.com/apache/incubator-kyuubi/issues, add '[KYUUBI #XXXX]' in your PR title, e.g., '[KYUUBI #XXXX] Your PR title ...'. 3. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP][KYUUBI #XXXX] Your PR title ...'. --> ### _Why are the changes needed?_ <!-- Please clarify why the changes are needed. For instance, 1. If you add a feature, you can talk about the use case of it. 2. If you fix a bug, you can clarify why it is a bug. --> In order to compaible with both Spark-2 and Spark-3, we should respect package `ui`, since apache/spark#22645, some class cope changed from package `ui` to `spark`. ### _How was this patch tested?_ Pass CI Closes #1430 from ulysses-you/spark-2.4. Closes #1429 8b40c60 [ulysses-you] nit f875cda [ulysses-you] ui Authored-by: ulysses-you <[email protected]> Signed-off-by: ulysses-you <[email protected]>




What changes were proposed in this pull request?
Currently SQL tab in the WEBUI doesn't support pagination. Because of that following issues are happening.
[Note: spark.sql.ui.retainedExecutions = 50000]
All the tabs, Jobs, Stages etc. supports pagination. So, to make it consistent with other tabs
SQL tab also should support pagination.
I have followed the similar flow of the pagination code in the Jobs and Stages page for SQL page.
Also, this patch doesn't make any behavior change for the SQL tab except the pagination support.
How was this patch tested?
bin/spark-shell --conf spark.sql.ui.retainedExecutions=50000

Run 50,000 sql queries.
Before this PR
After this PR
Loading of the page is faster, and OOM issue doesn't happen.
