Skip to content

Conversation

@erenavsarogullari
Copy link
Member

@erenavsarogullari erenavsarogullari commented Oct 13, 2016

What changes were proposed in this pull request?

TaskSetManager should have unique name to avoid adding duplicate ones to parent Pool via SchedulableBuilder. This problem has been surfaced with following discussion: [PR: Avoid adding duplicate schedulables]

Proposal :
There is 1x1 relationship between stageAttemptId and TaskSetManager so taskSet.Id covering both stageId and stageAttemptId looks to be used for uniqueness of TaskSetManager name instead of just stageId.

Current TaskSetManager Name :
var name = "TaskSet_" + taskSet.stageId.toString
Sample: TaskSet_0

Proposed TaskSetManager Name :
val name = "TaskSet_" + taskSet.Id // taskSet.Id = (stageId + "." + stageAttemptId)
Sample : TaskSet_0.0

How was this patch tested?

Added new Unit Test.

@erenavsarogullari
Copy link
Member Author

cc @kayousterhout @markhamstra

Copy link
Contributor

@kayousterhout kayousterhout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this!

Jenkins, this is OK to test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make this a little more specific and also include the JIRA: "[SPARK-17894] Verify TaskSetManagers for different stage attempts have unique names"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you pass these in as named parameters, to make the tests a little easier to read?

taskSet = FakeTask.createTaskSet(numTasks = 1, stageId = 0, stageAttemptId = 0)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment about what's going on in this chunk -- something like "Make sure a task set for the same stage (but a different attempt) has a different ID"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar here -- "// Make sure a task set with the same attempt ID but different stage ID also has a unique name"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you pass 0 as a named parameter ("stageId = 0") to make it obvious what it's being used for?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@erenavsarogullari
Copy link
Member Author

Thanks @kayousterhout for the comments. They are addressed.

@kayousterhout
Copy link
Contributor

Jenkins, test this please

@erenavsarogullari
Copy link
Member Author

Thanks @kayousterhout for review again. Jenkins did not start to build and run the tests.

@kayousterhout
Copy link
Contributor

Jenkins, this is OK to test

@erenavsarogullari
Copy link
Member Author

I think Jenkins can not be triggered ;) Is there any other way to trigger?

@kayousterhout
Copy link
Contributor

Jenkins test this please

@erenavsarogullari
Copy link
Member Author

erenavsarogullari commented Oct 21, 2016

Thanks @kayousterhout for support but I think problem still continues. Is there any contact from infrastructure/jenkins team to get support?

@shivaram
Copy link
Contributor

Jenkins, retest this please

@markhamstra
Copy link
Contributor

Whoa! Jenkins disrespects @kayousterhout -- bad Jenkins! Or did you actually fix something @shivaram ?

@shivaram
Copy link
Contributor

Hah - I wish I had that much insight into Jenkins. I just think it was just flaky esp. given the github DNS issues.

@SparkQA
Copy link

SparkQA commented Oct 22, 2016

Test build #67349 has finished for PR 15463 at commit cd6d240.

  • This patch fails from timeout after a configured wait of 250m.
  • This patch merges cleanly.
  • This patch adds no public classes.

@erenavsarogullari
Copy link
Member Author

Thanks all for the support. Jenkins looks failed due to timeout.
Jenkins, retest this please

@shivaram
Copy link
Contributor

Jenkins, retest this please

@SparkQA
Copy link

SparkQA commented Oct 23, 2016

Test build #67399 has finished for PR 15463 at commit cd6d240.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@erenavsarogullari
Copy link
Member Author

erenavsarogullari commented Oct 24, 2016

Thanks all for the support.
Patch looks passed on Jenkins so it is ready to merge if there is no extra comments ;)

@kayousterhout
Copy link
Contributor

LGTM; merged to master

@asfgit asfgit closed this in 81d6933 Oct 24, 2016
stageId: Int,
partitionId: Int,
prefLocs: Seq[TaskLocation] = Nil) extends Task[Int](stageId, 0, partitionId) {
prefLocs: Seq[TaskLocation] = Nil) extends Task[Int](stageId, stageAttemptId = 0, partitionId) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not supported in Scala 2.10. Could you submit a follow-up PR to fix the Scala 2.10 build, please? Thanks!

See: https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Compile/job/spark-master-compile-sbt-scala-2.10/2914/console

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just opened #15617 sorry about this!

@erenavsarogullari erenavsarogullari deleted the SPARK-17894 branch October 26, 2016 20:37
robert3005 pushed a commit to palantir/spark that referenced this pull request Nov 1, 2016
`TaskSetManager` should have unique name to avoid adding duplicate ones to parent `Pool` via `SchedulableBuilder`. This problem has been surfaced with following discussion: [[PR: Avoid adding duplicate schedulables]](apache#15326)

**Proposal** :
There is 1x1 relationship between `stageAttemptId` and `TaskSetManager` so `taskSet.Id` covering both `stageId` and `stageAttemptId` looks to be used for uniqueness of `TaskSetManager` name instead of just `stageId`.

**Current TaskSetManager Name** :
`var name = "TaskSet_" + taskSet.stageId.toString`
**Sample**: TaskSet_0

**Proposed TaskSetManager Name** :
`val name = "TaskSet_" + taskSet.Id ` `// taskSet.Id = (stageId + "." + stageAttemptId)`
**Sample** : TaskSet_0.0

Added new Unit Test.

Author: erenavsarogullari <[email protected]>

Closes apache#15463 from erenavsarogullari/SPARK-17894.
uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017
`TaskSetManager` should have unique name to avoid adding duplicate ones to parent `Pool` via `SchedulableBuilder`. This problem has been surfaced with following discussion: [[PR: Avoid adding duplicate schedulables]](apache#15326)

**Proposal** :
There is 1x1 relationship between `stageAttemptId` and `TaskSetManager` so `taskSet.Id` covering both `stageId` and `stageAttemptId` looks to be used for uniqueness of `TaskSetManager` name instead of just `stageId`.

**Current TaskSetManager Name** :
`var name = "TaskSet_" + taskSet.stageId.toString`
**Sample**: TaskSet_0

**Proposed TaskSetManager Name** :
`val name = "TaskSet_" + taskSet.Id ` `// taskSet.Id = (stageId + "." + stageAttemptId)`
**Sample** : TaskSet_0.0

Added new Unit Test.

Author: erenavsarogullari <[email protected]>

Closes apache#15463 from erenavsarogullari/SPARK-17894.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants