@iRakson
Contributor

@iRakson iRakson commented May 3, 2020

What changes were proposed in this pull request?

  • Pagination support is added to all tables on the streaming page of the Spark web UI.
    The existing pagination classes from [SPARK-4598][WebUI] Task table pagination for the Stage page #7399 are reused.
  • Earlier, the streaming page had two tables, Active Batches and Completed Batches. Now it has three: Running Batches, Waiting Batches and Completed Batches. With a large number of waiting and running batches, keeping track of them in a single table is difficult. Other pages also use a separate table for each type of data.
  • Earlier, empty tables were shown. Now only non-empty tables are shown.
    The Active Batches table used to show details of waiting batches followed by running batches.
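
The behaviour described in the list above can be sketched roughly as follows. This is only an illustration, not the code in this PR: `Batch`, `PagingSketch`, `pageOf` and `visibleTables` are hypothetical names, and the real change reuses the paged-table classes from #7399 rather than slicing by hand.

```scala
// Hypothetical stand-in for the per-batch data shown on the streaming page.
final case class Batch(batchTime: Long, isRunning: Boolean)

object PagingSketch {
  /** Returns the rows of the 1-based `page`; an out-of-range page yields an empty Seq. */
  def pageOf[A](rows: Seq[A], page: Int, pageSize: Int): Seq[A] = {
    require(page >= 1 && pageSize >= 1, "page and pageSize must be positive")
    rows.slice((page - 1) * pageSize, page * pageSize)
  }

  /** Splits active batches into running and waiting, then drops empty tables. */
  def visibleTables(
      active: Seq[Batch],
      completed: Seq[Batch]): Seq[(String, Seq[Batch])] = {
    val (running, waiting) = active.partition(_.isRunning)
    Seq(
      "Running Batches" -> running,
      "Waiting Batches" -> waiting,
      "Completed Batches" -> completed
    ).filter { case (_, rows) => rows.nonEmpty }
  }
}
```

With one running and two waiting batches and no completed ones, `visibleTables` would return only the Running Batches and Waiting Batches entries.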

Why are the changes needed?

Pagination lets users analyse the tables much more easily. Every Spark web UI page except the streaming page already supports pagination, so this change also adds consistency. It might also prevent potential OOM errors.

Does this PR introduce any user-facing change?

Yes. The Active Batches table is split into two tables, Running Batches and Waiting Batches. Pagination support is added to all the tables. All other functionality is unchanged.

How was this patch tested?

UT added.

Before changes:
Screenshot 2020-05-03 at 7 07 14 PM

After Changes:
Screenshot 2020-06-01 at 11 26 48 PM

@iRakson
Contributor Author

iRakson commented May 3, 2020

cc @srowen @dongjoon-hyun @HyukjinKwon Kindly review

@srowen
Member

srowen commented May 3, 2020

Jenkins test this please

@SparkQA

SparkQA commented May 3, 2020

Test build #122231 has finished for PR 28439 at commit 1eff9f5.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@iRakson
Contributor Author

iRakson commented May 3, 2020

retest this please

@srowen
Member

srowen commented May 4, 2020

Jenkins test this please

@SparkQA

SparkQA commented May 4, 2020

Test build #122270 has finished for PR 28439 at commit c07f7bf.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented May 4, 2020

Should #28448 come before this?
I think we do want pagination here and trust your changes, but don't know this code well.
Not sure if @sarutak or @zsxwing wants to take a look.

@iRakson
Contributor Author

iRakson commented May 4, 2020

> Should #28448 come before this?

Yeah. I think it would be better if we first clean up.

@iRakson iRakson force-pushed the streamingPagination branch from c07f7bf to 701fb70 Compare May 22, 2020 11:43
@iRakson
Contributor Author

iRakson commented May 22, 2020

@srowen @sarutak PR updated with latest changes in Pagination Framework.

@srowen
Member

srowen commented May 22, 2020

Jenkins test this please

@SparkQA

SparkQA commented May 22, 2020

Test build #123004 has finished for PR 28439 at commit 701fb70.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented May 24, 2020

Does this need to go in before or after #28448? I want to make sure I understand the desired order and dependencies.

@iRakson
Contributor Author

iRakson commented May 24, 2020

Both are independent of each other. Either one can go first.
#28448 only cleans up code; it does not introduce any new functionality or structural changes.

@iRakson
Contributor Author

iRakson commented May 25, 2020

@sarutak Kindly take a look at this one.

@srowen
Member

srowen commented May 28, 2020

Jenkins test this please

@sarutak
Member

sarutak commented May 28, 2020

@iRakson Sorry for the late reply. I'll check this weekend.

@iRakson
Contributor Author

iRakson commented May 28, 2020

It's ok. Please check at your convenience. :)

@SparkQA

SparkQA commented May 28, 2020

Test build #123238 has finished for PR 28439 at commit 701fb70.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@sarutak sarutak left a comment

@iRakson This change adds new paged tables, so I think it's better to also add test cases that check whether the change uses the pagination framework correctly.

Seq(
("Batch Time", true, None),
("Records", true, None),
("Scheduling Delay", true, Some(tooltips._1)),
Member

Why not just write the tooltip texts as literals, like the other paged tables do?

Contributor Author

I worked on this before working on the other pages and missed this one. I will change them to literals.

<div class="col-12">
<span id="activeBatches" class="collapse-aggregated-activeBatches collapse-table"
onClick="collapseTable('collapse-aggregated-activeBatches',
'aggregated-activeBatches')">
Member

webui.js refers to collapse-aggregated-activeBatches and aggregated-activeBatches, so we need to change them.

subPath: String,
isRunningTable: Boolean,
isWaitingTable: Boolean,
isCompletedTable: Boolean,
Member

isRunningTable, isWaitingTable and isCompletedTable are mutually exclusive, so how about introducing constants that represent the table types and passing one of them to the constructor?

Contributor Author

Thank you for suggesting this. I will change them to constants.
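
A minimal sketch of what the suggestion above could look like, assuming hypothetical names (`BatchTableType`, `BatchTableConfig`, `heading`). The merged patch may model the table types differently.

```scala
// Hypothetical table-type constants; one value replaces the three Boolean flags.
sealed abstract class BatchTableType(val title: String)

object BatchTableType {
  case object Running extends BatchTableType("Running Batches")
  case object Waiting extends BatchTableType("Waiting Batches")
  case object Completed extends BatchTableType("Completed Batches")
}

// With a single tableType parameter, combinations such as
// "running and completed at once" can no longer be expressed.
final case class BatchTableConfig(subPath: String, tableType: BatchTableType) {
  /** Heading text in the "<title> (<count>)" form used by the batch tables. */
  def heading(count: Int): String = s"${tableType.title} ($count)"
}
```

For example, `BatchTableConfig("streaming", BatchTableType.Waiting).heading(4)` yields `Waiting Batches (4)`.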

} else {
Nil
<tr>
<td id = {batchTimeId} isFailed = {batch.isFailed.toString}>
Member

For attributes, it's natural not to have white space between = and its operands.

{formattedBatchTime}
</a>
</td>
<td> {numRecords.toString} Records </td>
Member

It's better to keep records rather than Records to minimize appearance change.

"table table-bordered table-sm table-striped table-head-clickable table-cell-width-limited"

protected def createOutputOperationProgressBar(batch: BatchUIData): Seq[Node] = {
<td class="progress-cell">
Member

Why do we need to move <td> and </td> outside this method?

Contributor Author

I moved them out to maintain consistency, i.e.

<td> ...... </td>
<td> ...... </td>

Anyway, I will use the original code.

@iRakson
Contributor Author

iRakson commented Jun 1, 2020

@sarutak Thank you for reviewing. I will update PR with test cases and suggested code changes.

@iRakson
Contributor Author

iRakson commented Jun 2, 2020

@sarutak @srowen Kindly take a look.
I added a few tests that check the number of rows in the paginated tables and the sorting.

@iRakson
Contributor Author

iRakson commented Jun 2, 2020

retest this please

@iRakson iRakson requested a review from sarutak June 2, 2020 14:43
@sarutak
Member

sarutak commented Jun 2, 2020

add to whitelist.

@sarutak
Member

sarutak commented Jun 2, 2020

@iRakson From now on, Jenkins starts whenever you push a commit, and you can also trigger it by commenting "retest this please".

@iRakson
Contributor Author

iRakson commented Jun 2, 2020

> @iRakson From now on, Jenkins starts on pushing your commit and you can start Jenkins by saying "retest this please".

@sarutak Thank you for adding me to the whitelist :)

@SparkQA

SparkQA commented Jun 2, 2020

Test build #123439 has finished for PR 28439 at commit eb334a2.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 2, 2020

Test build #123438 has finished for PR 28439 at commit eb334a2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@iRakson
Contributor Author

iRakson commented Jun 4, 2020

retest this please

@iRakson
Contributor Author

iRakson commented Jun 4, 2020

@sarutak Gentle ping.

@SparkQA

SparkQA commented Jun 4, 2020

Test build #123527 has finished for PR 28439 at commit eb334a2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Jun 6, 2020

If you're OK with it @sarutak I'll merge

@sarutak
Member

sarutak commented Jun 7, 2020

I've inspected again and it seems ok to me.

@sarutak
Member

sarutak commented Jun 7, 2020

I'll merge this time.

@sarutak sarutak closed this in e9337f5 Jun 7, 2020
@sarutak
Member

sarutak commented Jun 7, 2020

@iRakson Could you update the screenshots in the description?

@iRakson
Contributor Author

iRakson commented Jun 7, 2020

> @iRakson Could you update the screenshots in the description?

Updated.

@iRakson
Contributor Author

iRakson commented Jun 7, 2020

Thank You. :) @sarutak @srowen

@sarutak
Member

sarutak commented Jun 7, 2020

Merged to master.

@sarutak
Member

sarutak commented Jun 8, 2020

@iRakson
This PR passed the PR builder's tests but does not pass the QA builds, so I've reverted it.

https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7-hive-1.2/lastCompletedBuild/testReport/org.apache.spark.streaming/UISeleniumSuite/attaching_and_detaching_a_Streaming_tab/
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7-hive-2.3/lastCompletedBuild/testReport/org.apache.spark.streaming/UISeleniumSuite/attaching_and_detaching_a_Streaming_tab/
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7-hive-2.3-jdk-11/lastCompletedBuild/testReport/org.apache.spark.streaming/UISeleniumSuite/attaching_and_detaching_a_Streaming_tab/
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-3.2-hive-2.3/lastCompletedBuild/testReport/
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-3.2-hive-2.3-jdk-11/lastCompletedBuild/testReport/org.apache.spark.streaming/UISeleniumSuite/attaching_and_detaching_a_Streaming_tab/
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.7-hive-1.2/lastCompletedBuild/testReport/org.apache.spark.streaming/UISeleniumSuite/attaching_and_detaching_a_Streaming_tab/
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.7-hive-2.3/lastCompletedBuild/testReport/

Could you confirm them?

@sarutak
Member

sarutak commented Jun 8, 2020

The suite passes on my laptop with both sbt and Maven so the suite can be flaky.

// Check batch tables
val h4Text = findAll(cssSelector("h4")).map(_.text).toSeq
h4Text.exists(_.matches("Active Batches \\(\\d+\\)")) should be (true)
h4Text.exists(_.matches("Running Batches \\(\\d+\\)")) should be (true)
Contributor Author

This is causing all the failures. I will remove these tests and raise the PR again.
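
The thread does not say why this assertion fails only on some builds. One possibility, stated here purely as an assumption, is that the heading is absent whenever no batch is running at the moment the page is fetched, because only non-empty tables are rendered. The sketch below shows how the check behaves in both cases; `HeadingCheckSketch` and `hasRunningHeading` are hypothetical names, and plain strings stand in for the Selenium `h4` lookup.

```scala
object HeadingCheckSketch {
  // Same pattern as the assertion in the test above.
  private val runningHeading = "Running Batches \\(\\d+\\)"

  /** True if some heading matches the pattern in full (String.matches is a whole-string match). */
  def hasRunningHeading(h4Text: Seq[String]): Boolean =
    h4Text.exists(_.matches(runningHeading))
}
```

`hasRunningHeading(Seq("Completed Batches (10)"))` is false, so under the assumption above an assertion on the Running Batches heading would depend on timing.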

iRakson added a commit to iRakson/spark that referenced this pull request Jun 8, 2020
### What changes were proposed in this pull request?
* Pagination support is added to all tables on the streaming page of the Spark web UI.
The existing pagination classes from apache#7399 are reused.
* Earlier, the streaming page had two tables, `Active Batches` and `Completed Batches`. Now it has three: `Running Batches`, `Waiting Batches` and `Completed Batches`. With a large number of waiting and running batches, keeping track of them in a single table is difficult. Other pages also use a separate table for each type of data.
* Earlier, empty tables were shown. Now only non-empty tables are shown.
The `Active Batches` table used to show details of waiting batches followed by running batches.

### Why are the changes needed?
Pagination lets users analyse the tables much more easily. Every Spark web UI page except the streaming page already supports pagination, so this change also adds consistency. It might also prevent potential OOM errors.

### Does this PR introduce _any_ user-facing change?
Yes. The `Active Batches` table is split into two tables, `Running Batches` and `Waiting Batches`. Pagination support is added to all the tables. All other functionality is unchanged.

### How was this patch tested?
Manually.

Before changes:
<img width="1667" alt="Screenshot 2020-05-03 at 7 07 14 PM" src="https://user-images.githubusercontent.com/15366835/80915680-8fb44b80-8d71-11ea-9957-c4a3769b8b67.png">

After Changes:
<img width="1669" alt="Screenshot 2020-05-03 at 6 51 22 PM" src="https://user-images.githubusercontent.com/15366835/80915694-a9ee2980-8d71-11ea-8fc5-246413a4951d.png">

Closes apache#28439 from iRakson/streamingPagination.

Authored-by: iRakson <[email protected]>
Signed-off-by: Kousuke Saruta <[email protected]>
srowen pushed a commit that referenced this pull request Jun 12, 2020
### What changes were proposed in this pull request?
#28747 reverted #28439 because of a flaky test case. This PR fixes the flaky test and adds pagination support.

### Why are the changes needed?
To support pagination on the streaming tab.

### Does this PR introduce _any_ user-facing change?
Yes. The streaming tab tables are now paginated.

### How was this patch tested?
Manually.

Closes #28748 from iRakson/fixstreamingpagination.

Authored-by: iRakson <[email protected]>
Signed-off-by: Sean Owen <[email protected]>