Skip to content

Conversation

@CodingCat
Copy link
Contributor

https://issues.apache.org/jira/browse/SPARK-4011

Currently, most of the members in Master/Worker are with public accessibility. We might wish to tighten the accessibility of them

a bit more discussion is here:

#2828

@CodingCat
Copy link
Contributor Author

Hi, @srowen, would you mind taking the review?

Also, though the current fix is only for Master/Worker class, I found that "too much visibility" problem is prevalent in the deploy package

do we want to fix it in a shot?

@SparkQA
Copy link

SparkQA commented Mar 1, 2015

Test build #28147 has started for PR 4844 at commit b9227e0.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Mar 1, 2015

Test build #28147 has finished for PR 4844 at commit b9227e0.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28147/
Test FAILed.

@SparkQA
Copy link

SparkQA commented Mar 1, 2015

Test build #28148 has started for PR 4844 at commit 1dca395.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Mar 1, 2015

Test build #28148 has finished for PR 4844 at commit 1dca395.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28148/
Test PASSed.

@srowen
Copy link
Member

srowen commented Mar 1, 2015

This follows discussion at #2828 (comment)

I also agree that these many fields and members can be private, and do not seem to have been intended to be non-private. (They can be made so later; this is not a public API.)

So I support this and clearly it compiles and passes tests.

I might like a yea/nay from @JoshRosen or @markhamstra who were part of the original discussion though.

@JoshRosen
Copy link
Contributor

I'm in favor of this change. Even if this class is private within the deploy package, keeping everything at the minimum visibility is a good programming practice and makes the code easier to reason about.

@CodingCat
Copy link
Contributor Author

Thanks, @srowen and @JoshRosen ,

So, shall we propogate this change to other classes in the deploy package, in this PR or a different PR?

e.g.

https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/master/DriverInfo.scala#L33 can be changed to private[deploy]

@srowen
Copy link
Member

srowen commented Mar 2, 2015

I imagine there are lots of fields that could and even should be private. The question is where to draw the line, since I don't think we want to hit the whole code base in one go. What other classes in this package logically go with the changed files? I think it would be reasonable to lock them down too as part of this PR, unless it would make this change much larger. That seems like one reasonable bright line to draw.

@CodingCat
Copy link
Contributor Author

sure, thanks @srowen , I will try to address it today

@SparkQA
Copy link

SparkQA commented Mar 2, 2015

Test build #28181 has started for PR 4844 at commit 535065d.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Mar 2, 2015

Test build #28186 has started for PR 4844 at commit d5d0e1c.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Mar 2, 2015

Test build #28181 has finished for PR 4844 at commit 535065d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class MavenCoordinate(groupId: String, artifactId: String, version: String)

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28181/
Test PASSed.

@SparkQA
Copy link

SparkQA commented Mar 2, 2015

Test build #28186 has finished for PR 4844 at commit d5d0e1c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class MavenCoordinate(groupId: String, artifactId: String, version: String)

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28186/
Test PASSed.

@CodingCat
Copy link
Contributor Author

Hi, @srowen and @JoshRosen

I made further changes. Basically, I got 2 kinds of strategies to tighten the accessibility

  1. modify individual variables/methods
  2. change the accessibility of the classes directly

One more thing to mention is that

I didn't apply the most restrict permission to all places, because I guess in some of the places, the original authors intentionally made more loose accessibility for easier extension in future, e.g. in https://github.com/apache/spark/pull/4844/files#diff-829a8674171f92acd61007bedb1bfa4fR40 (DriverRunner), some of the public variables declared in the constructor is not necessary (e.g. workerUrl), but some of the public variables declared together is read in other classes (e.g. driverId is read in WorkerPage), in such cases, I didn't limit workerUrl as private, since in future we might add more rich content to WorkerPage....

@srowen
Copy link
Member

srowen commented Mar 2, 2015

Oh my, I didn't realize this would expand to change 50 files. In many cases you're removing visibility restrictions. I understand that the class-level visibility still constrains it but it's part of why the change is large now.

I don't feel qualified to judge whether this is too much change or not. It seems like the original much smaller commit was certainly a good change, at the least. I agree that it does feel right to lock down visibility to improve reasoning about the code.

@CodingCat
Copy link
Contributor Author

yeah, in many cases, the class is exposed as private[spark] unnecessarily with/without proper restrictions for members

I scan the code base, found that this is not an intentional design , as in some classes like https://github.com/CodingCat/spark/blob/SPARK-4011/core/src/main/scala/org/apache/spark/deploy/history/HistoryServerArguments.scala#L26

the class is restricted in its local package, which I thought to be the right choice

so I followed this idea and made the changes

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might be going a little too far. It's probably OK for all of these to just be private[deploy]

@andrewor14
Copy link
Contributor

@CodingCat I'm in full support for the changes that involve changing var x to private var x where possible. I think it's fine to tighten private[spark] class X to private[deploy] class X, but doing it on a per-method or per-field basis seems a little too much without obvious readability benefits. One thing I'm not so sure about is changing otherwise public APIs class X to private[spark] X, which technically breaks backward compatibility for those classes or objects (e.g. o.a.s.deploy.Client).

@SparkQA
Copy link

SparkQA commented Mar 4, 2015

Test build #28247 has started for PR 4844 at commit 4ecfedc.

  • This patch merges cleanly.

@CodingCat
Copy link
Contributor Author

Hi, @andrewor14 , thank you very much for the comments

  • For your first concern, it's quite reasonable, so I just changed the code again to loose some accessibility restrictions of the fields/methods.

But I still keep some of them, which mainly stay in master.ApplicationInfo and deploy.SparkSubmit

The reason is that 1) for SparkSubmit, those methods are just accessed by the object itself as well as the unit test suite...I guess leaving them to be accessible across the whole code base is not that "elegant"? 2) for ApplicationInfo...it's more than "elegance", I think we may not wish the classes from outside of master package can change the internal status of ApplicationInfo instance?

@SparkQA
Copy link

SparkQA commented Mar 4, 2015

Test build #28247 has finished for PR 4844 at commit 4ecfedc.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class MavenCoordinate(groupId: String, artifactId: String, version: String)

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28247/
Test PASSed.

@CodingCat
Copy link
Contributor Author

@srowen @andrewor14 @JoshRosen do you have further comments about the patch?

@SparkQA
Copy link

SparkQA commented Mar 12, 2015

Test build #28526 has started for PR 4844 at commit 8d5b0c0.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Mar 12, 2015

Test build #28526 has finished for PR 4844 at commit 8d5b0c0.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class MavenCoordinate(groupId: String, artifactId: String, version: String)

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28526/
Test FAILed.

@SparkQA
Copy link

SparkQA commented Mar 12, 2015

Test build #28533 has started for PR 4844 at commit f5034a4.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Mar 12, 2015

Test build #28533 has finished for PR 4844 at commit f5034a4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class MavenCoordinate(groupId: String, artifactId: String, version: String)

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28533/
Test PASSed.

@srowen
Copy link
Member

srowen commented Mar 12, 2015

I like this. It is tightening visibility only on Spark-private classes, so there should be no API impact. MiMa doesn't report problems, except:

  • I think the MavenCoordinate warning is spurious since it's an inner class of a private class. Worth double-checking and if it's unclear, it could be left declared private[...] itself
  • Isn't PythonRunner technically public? but one of its methods was made private. Hm. Also worth double-checking and backing out if it's not obviously safe

@SparkQA
Copy link

SparkQA commented Mar 12, 2015

Test build #28536 has started for PR 4844 at commit e7fd375.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Mar 12, 2015

Test build #28536 has finished for PR 4844 at commit e7fd375.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28536/
Test FAILed.

@SparkQA
Copy link

SparkQA commented Mar 12, 2015

Test build #28537 has started for PR 4844 at commit 1a64175.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Mar 12, 2015

Test build #28537 has finished for PR 4844 at commit 1a64175.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28537/
Test PASSed.

@CodingCat
Copy link
Contributor Author

@srowen you're right, I addressed the comments

@srowen
Copy link
Member

srowen commented Mar 13, 2015

Let me let @andrewor14 have a couple more days to weigh in, but this looks good to me. I believe he's broadly in favor of the change, and you have backed out any changes that potentially make something public into non-public, and removed some of the per-method changes too. MiMa is happy.

@CodingCat
Copy link
Contributor Author

sure, thanks

@srowen
Copy link
Member

srowen commented Mar 16, 2015

Going once, going twice for more comments on this change. It touches a number of files but is a good tightening of visibility.

@asfgit asfgit closed this in 25f3580 Mar 17, 2015
@CodingCat
Copy link
Contributor Author

thanks @srowen

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants