@pgandhi999 pgandhi999 commented Jul 18, 2018

It would be nice to have a field in the Stage page UI that maps the current stage ID to the job IDs to which that stage belongs.

What changes were proposed in this pull request?

Added a field in Stage UI to display the corresponding job id for that particular stage.

How was this patch tested?

[Screenshot: screen shot 2018-07-25 at 1 33 07 pm]

Added a field in Stage UI to display the corresponding job id for that particular stage.
@tgravescs
Contributor

ok to test

SparkQA commented Jul 19, 2018

Test build #93247 has finished for PR 21809 at commit 7be0520.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

}.toSeq
}

def getJobIdsAssociatedWithStage(stageId: Int): Seq[Set[Int]] = {
Contributor

We don't need to fetch all the stage attempts; the first one is enough to get all the jobIds.

Author

Good point! Have taken care of it.

{if (!stageJobIds.isEmpty) {
<li>
<strong>Associated Job Ids: </strong>
{stageJobIds}
Member

Make it an href link?

Author

The problem here is that a stage can have multiple job IDs, in which case we get a bunch of them. Would you prefer a generic link that takes you to the jobs page instead? Let me know what you think.

Member

My suggestion is to render each job ID as its own href link.

Author

Did that and updated the screenshot. Thank you.
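
As a rough illustration of the per-job links discussed in this thread, here is a minimal sketch. It assumes the `%s/jobs/job/?id=%s` URL pattern that appears in a later diff hunk of this review; the `JobLinks` object, the `jobLinks` helper, and the `basePath` parameter are hypothetical names, and plain strings stand in for the Scala XML literals used in the real page.

```scala
object JobLinks {
  // Hypothetical helper: builds one anchor tag per job ID.
  // The URL pattern mirrors the "%s/jobs/job/?id=%s" format string seen in the diff;
  // basePath is an assumed stand-in for the UI's base path.
  def jobLinks(basePath: String, jobIds: Seq[Int]): Seq[String] =
    jobIds.map { jobId =>
      val detailUrl = "%s/jobs/job/?id=%s".format(basePath, jobId)
      s"""<a href="$detailUrl">$jobId</a>"""
    }

  def main(args: Array[String]): Unit =
    // A stage shared by two jobs yields two separate links.
    println(jobLinks("/proxy/app-1", Seq(3, 7)).mkString(" "))
}
```

Each job ID gets its own link, so a stage that belongs to several jobs does not need a single generic link to the jobs page.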

}

def getJobIdsAssociatedWithStage(stageId: Int): Seq[Set[Int]] = {
store.view(classOf[StageDataWrapper]).index("stageId").first(stageId).last(stageId)
Member

Can we avoid the store lookup here?

Author

I did not quite get the question. Perhaps you are suggesting an alternative approach, but is there another way to do this? Let me know your thoughts. Thank you.

SparkQA commented Jul 23, 2018

Test build #93458 has finished for PR 21809 at commit 3151a62.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

return UIUtils.headerSparkPage(request, stageHeader, content, parent)
}

val stageJobIds = parent.store.getJobIdsAssociatedWithStage(stageId, stageAttemptId)
Member

In https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala#L109 there is already a query for the stage data. We can reuse it and avoid a second query to the store here.

Member

E.g., we can add a function in AppStatusStore that returns the whole StageDataWrapper.

Author

Makes sense, I have fixed it. Please have a look.

@gengliangwang
Member

@pgandhi999 could you update the title to [SPARK-24851][UI] Map a Stage ID to it's Associated Job ID

@pgandhi999 pgandhi999 changed the title [SPARK-24851] : Map a Stage ID to it's Associated Job ID in UI [SPARK-24851][UI] Map a Stage ID to it's Associated Job ID Jul 24, 2018

SparkQA commented Jul 25, 2018

Test build #93561 has finished for PR 21809 at commit a50e8b1.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Jul 25, 2018

Test build #93563 has finished for PR 21809 at commit d57e6dc.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tgravescs
Contributor

test this please

SparkQA commented Jul 26, 2018

Test build #93607 has finished for PR 21809 at commit d57e6dc.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Jul 27, 2018

Test build #93631 has finished for PR 21809 at commit 3a06b87.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

.asOption(stageDataWrapper.info)
.get
} else {
stageData = {
Member

this code branch is unreachable.

Author

The IDE flags it as unreachable, but at runtime the code does work; I have confirmed that.

{if (!stageJobIds.isEmpty) {
<li>
<strong>Associated Job Ids: </strong>
{for (jobId <- stageJobIds) yield {val detailUrl = "%s/jobs/job/?id=%s".format(
Member

Using map is more readable.

Author

Done

}

def stageAttempt(stageId: Int, stageAttemptId: Int, details: Boolean = false): v1.StageData = {
def stageAttempt(stageId: Int, stageAttemptId: Int,
Member

Changing the return type to (StageData, jobIds) might be simpler.

Author

Done
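
To illustrate the tuple-returning shape suggested above, here is a minimal sketch under stated assumptions. `StageData` and `StageDataWrapper` are simplified stand-ins for Spark's classes, `Store` is a hypothetical in-memory replacement for the real key-value store, and sorting the job IDs is an addition made only to keep the output deterministic.

```scala
object StageLookup {
  // Simplified stand-ins for Spark's v1.StageData and StageDataWrapper;
  // the real classes carry many more fields.
  final case class StageData(stageId: Int, attemptId: Int, name: String)
  final case class StageDataWrapper(info: StageData, jobIds: Set[Int])

  // Hypothetical in-memory store keyed by (stageId, attemptId).
  final class Store(entries: Map[(Int, Int), StageDataWrapper]) {
    // A single read yields both the stage data and its job IDs,
    // so the caller does not need a second lookup.
    // Throws NoSuchElementException when the key is absent.
    def stageAttempt(stageId: Int, attemptId: Int): (StageData, Seq[Int]) = {
      val wrapper = entries((stageId, attemptId))
      (wrapper.info, wrapper.jobIds.toSeq.sorted)
    }
  }

  def main(args: Array[String]): Unit = {
    val store = new Store(
      Map((1, 0) -> StageDataWrapper(StageData(1, 0, "map at App.scala"), Set(5, 2))))
    // Destructure the tuple at the call site.
    val (stage, jobIds) = store.stageAttempt(1, 0)
    println(s"${stage.name}: jobs ${jobIds.mkString(", ")}")
  }
}
```

Returning a tuple keeps the lookup in one place; the trade-off is that every existing caller of the method has to be updated to destructure the result.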

Using map instead of for..yield and returning tuple instead of object

SparkQA commented Jul 28, 2018

Test build #93689 has finished for PR 21809 at commit 52b08b2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

val stageKey = Array(stageId, stageAttemptId)
val stage = store.read(classOf[StageDataWrapper], stageKey).info
if (details) stageWithDetails(stage) else stage
val stageDataWrapper: StageDataWrapper = store.read(classOf[StageDataWrapper], stageKey)
Contributor

You don't really need the type annotation here (: StageDataWrapper).

Author

Done, thanks!


def stageAttempt(stageId: Int, stageAttemptId: Int, details: Boolean = false): v1.StageData = {
def stageAttempt(stageId: Int, stageAttemptId: Int,
details: Boolean = false): (v1.StageData, Seq[Int]) = {
Contributor

Indentation: you might format it like taskSummary.

Author

Done, thanks!

.asOption(parent.store.stageAttempt(stageId, stageAttemptId, details = false))
.getOrElse {
var stageDataTuple: Tuple2[StageData, Seq[Int]] = null
try {
Contributor

I think you could use Option here instead of try and null checks to make this cleaner.

Author

Done, thanks!
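
A minimal sketch of the Option-based approach suggested above, assuming a lookup that throws `NoSuchElementException` for an unknown stage. The `asOption` helper here is modeled on the call of the same name visible in the diff, but its body is an assumption, and `readStage` and `describe` are hypothetical.

```scala
object OptionLookup {
  // Hypothetical lookup that throws when the stage attempt is unknown.
  def readStage(stageId: Int): (String, Seq[Int]) =
    if (stageId == 1) ("stage 1", Seq(4, 9))
    else throw new NoSuchElementException(s"no stage $stageId")

  // Assumed shape of an asOption helper: convert "not found" into None
  // in one place, so callers never hold a null reference.
  def asOption[T](fn: => T): Option[T] =
    try Some(fn)
    catch { case _: NoSuchElementException => None }

  // Callers pattern match on the Option instead of using a var,
  // a try block, and a null check.
  def describe(stageId: Int): String = asOption(readStage(stageId)) match {
    case Some((name, jobIds)) => s"$name -> jobs ${jobIds.mkString(", ")}"
    case None => s"No information to display for Stage $stageId"
  }

  def main(args: Array[String]): Unit = {
    println(describe(1))
    println(describe(2))
  }
}
```

Because the missing case is represented as `None`, there is no nullable `Seq[Int]` on which a later `isEmpty` call could fail.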

SparkQA commented Aug 9, 2018

Test build #94500 has finished for PR 21809 at commit 9eff537.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

@abellina abellina left a comment

@pgandhi999 I had one comment; otherwise LGTM.

{Utils.bytesToString(stageData.diskBytesSpilled)}
</li>
}}
{if (!stageJobIds.isEmpty) {
Contributor

This could throw an NPE if stageDataTuple is None

scala> var x:Seq[Int] = null
x: Seq[Int] = null

scala> x.isEmpty
java.lang.NullPointerException

Author

@pgandhi999 pgandhi999 Sep 28, 2018

Since we return early if it is None, execution should never reach that point, as far as I can tell.

</div>
return UIUtils.headerSparkPage(request, stageHeader, content, parent)
}
var stageDataTuple: Option[Tuple2[StageData, Seq[Int]]] = try {
Contributor

@tgravescs tgravescs Sep 18, 2018

Can't we just simplify this to be close to what it was before, but return the tuple:

val (stageData, stageJobIds) = parent.store
      .asOption(parent.store.stageAttempt(stageId, stageAttemptId, details = false))
      .getOrElse {
        val content =
          <div id="no-info">
            <p>No information to display for Stage {stageId} (Attempt {stageAttemptId})</p>
          </div>
        return UIUtils.headerSparkPage(stageHeader, content, parent)
      }

Author

Makes sense, simplified the code and tested it. Looks good. Thank you.

Simplifying code block in StagePage

SparkQA commented Sep 28, 2018

Test build #96762 has finished for PR 21809 at commit 3b73840.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

if (details) stageWithDetails(stage) else stage
val stageDataWrapper = store.read(classOf[StageDataWrapper], stageKey)
val stage = if (details) stageWithDetails(stageDataWrapper.info) else stageDataWrapper.info
val jobIds = stageDataWrapper.jobIds
Contributor

Instead of having a separate val, just put this in the return:

(stage, stageDataWrapper.jobIds.toSeq)

Author

Done

SparkQA commented Oct 6, 2018

Test build #97008 has finished for PR 21809 at commit 0be099a.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@pgandhi999
Author

test this please

@tgravescs
Contributor

add to whitelist

SparkQA commented Oct 8, 2018

Test build #97120 has finished for PR 21809 at commit 0be099a.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tgravescs
Contributor

test this please

SparkQA commented Oct 9, 2018

Test build #97130 has finished for PR 21809 at commit 0be099a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tgravescs
Contributor

+1

@tgravescs
Contributor

merged to master, thanks @pgandhi999

@asfgit asfgit closed this in deb9588 Oct 9, 2018
jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
It would be nice to have a field in the Stage page UI that maps the current stage ID to the job IDs to which that stage belongs.

## What changes were proposed in this pull request?

Added a field in Stage UI to display the corresponding job id for that particular stage.

## How was this patch tested?

<img width="448" alt="screen shot 2018-07-25 at 1 33 07 pm" src="https://user-images.githubusercontent.com/22228190/43220447-a8e94f80-900f-11e8-8a20-a235bbd5a369.png">

Closes apache#21809 from pgandhi999/SPARK-24851.

Authored-by: pgandhi <[email protected]>
Signed-off-by: Thomas Graves <[email protected]>