[SPARK-7086][Deploy]Do not retry when public service start on port #5657

WangTaoTheTonic · 2015-04-23T09:15:00Z

https://issues.apache.org/jira/browse/SPARK-7086

I have only fixed this on the master side; maybe there are more places to fix?

//cc @andrewor14

SparkQA · 2015-04-23T10:38:53Z

Test build #30828 has finished for PR 5657 at commit a9dbda8.

This patch fails PySpark unit tests.
This patch merges cleanly.
This patch adds no public classes.
This patch does not change any dependencies.

WangTaoTheTonic · 2015-04-23T11:33:55Z

Jenkins, retest this please

SparkQA · 2015-04-23T13:12:23Z

Test build #30835 has finished for PR 5657 at commit a9dbda8.

This patch passes all tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- abstract class NumericType extends NativeType
This patch does not change any dependencies.

srowen · 2015-04-25T12:47:00Z

This is related to #5575. I am not sure this is something we must force on users. There could be decent reasons to retry binding, and it doesn't 'hurt' except to waste a few tries. This is also a little arbitrary to change the behavior on just a few services. I don't think we should do this.

WangTaoTheTonic · 2015-04-26T13:17:37Z

Consider one scenario: users submit apps to the master with a configured port, say spark://somehost:7077, and workers connect to the master the same way. Once the master opens on another port, for instance 7078, it is unavailable for submitting apps or accepting workers' registration.

I know that the retry policy decreases the probability of failure when launching the master, but at the same time it increases the chance that others cannot connect to it.

Besides, I took a look at start-all.sh, and it passes one specific port to the slave nodes when launching workers. Obviously a worker cannot talk to the master if the master takes a port different from the one that was passed.

@srowen I think retrying on a "public" port is bound to cause trouble unless we find another way to solve it.

srowen · 2015-04-27T00:56:49Z

This change merely causes it to never retry. It doesn't cause the master to use another port, right? That would be bad for the reason you give, but this is changing the retry property.

WangTaoTheTonic · 2015-04-27T01:08:32Z

If it retries, the master will use another port. We can see this in Utils.scala:

    for (offset <- 0 to maxRetries) {
      ...
      ((startPort + offset - 1024) % (65536 - 1024)) + 1024
      ...
      logWarning(s"Service$serviceString could not bind on port $tryPort. " +
        s"Attempting port ${tryPort + 1}.")
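
To make the arithmetic in the excerpt above concrete, here is a minimal standalone sketch of how the candidate port moves with each retry. The object and method names are hypothetical and not part of Utils.scala, and the sketch assumes startPort is at least 1024; only the port formula is taken from the excerpt.

```scala
// Hypothetical helper for illustration only; not Spark's API.
object PortRetrySketch {
  // Same formula as in the excerpt: advance by `offset`, wrapping within
  // [1024, 65535] so that a privileged port is never tried.
  def candidatePort(startPort: Int, offset: Int): Int = {
    require(startPort >= 1024 && startPort <= 65535, "sketch assumes a non-privileged start port")
    ((startPort + offset - 1024) % (65536 - 1024)) + 1024
  }

  // Every port that would be attempted for a given retry budget.
  def candidatePorts(startPort: Int, maxRetries: Int): Seq[Int] =
    (0 to maxRetries).map(offset => candidatePort(startPort, offset))
}
```

With a start port of 7077 and a retry budget of 3, the candidates are 7077, 7078, 7079 and 7080, so a master that fails to bind 7077 can end up on a port that workers and clients were never told about. With a budget of 0, the only candidate is 7077 itself.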

srowen · 2015-04-27T01:21:42Z

Ah, we don't have this change committed yet: #3314 (Or, a variant on this.) The right-er way to fix this is to be able to express a range of ports, which might only include 1 port, in which case there would be no more retries anyway. I suggest focusing on resolving SPARK-4449 as a way to fix this.
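
As a rough illustration of the port-range idea, the sketch below models a range that may contain a single port. The PortRange type, its string format ("7077" or "7077-7080") and its method names are assumptions made up for this example; they are not taken from #3314 or from Spark.

```scala
// Hypothetical type for illustration; not the design proposed in #3314.
final case class PortRange(first: Int, last: Int) {
  require(first >= 1 && last <= 65535 && first <= last, s"invalid port range $first-$last")

  // Ports to try, in order. A single-port range yields exactly one
  // candidate, so there is nothing left to retry on.
  def candidates: Seq[Int] = first to last
}

object PortRange {
  // Accepts "7077" (one port) or "7077-7080" (inclusive range).
  def parse(spec: String): PortRange = spec.trim.split("-", 2) match {
    case Array(single) => PortRange(single.trim.toInt, single.trim.toInt)
    case Array(lo, hi) => PortRange(lo.trim.toInt, hi.trim.toInt)
  }
}
```

Under this model a "public" service such as the master could be given a one-port range, while services that tolerate moving could be given a wider one.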

WangTaoTheTonic · 2015-04-27T02:23:08Z

After taking a look at #3314 and discussing with @scwf offline, we both think the "specify port range for each" idea is better for issues SPARK-7086 and SPARK-4449.

So I will close this and keep track of it at #3314.

@srowen Thanks for your comments and nice idea. 😃

do not retry when service starts on public port

a9dbda8

WangTaoTheTonic closed this Apr 27, 2015

WangTaoTheTonic mentioned this pull request Jul 3, 2015

[SPARK-4449][Core]Specify port range in spark #5722

Closed
