SPARK-3580: New public method for RDD's to have consistent way of obtain... #2447
…aining the number of RDD partitions across languages
|
Can one of the admins verify this patch? |
|
fyi: I think you referenced the wrong JIRA |
|
thanks for catching that @laserson! Fixed. |
Minor style comment, but you can just write this as:
def getNumPartitions: Int = partitions.size
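
To illustrate the suggestion, here is a minimal sketch that assumes a stand-in class rather than Spark's actual `RDD` (`ToyRDD`, its constructor, and `getNumPartitionsVerbose` are hypothetical names used only for illustration). It shows that the block-bodied and expression-bodied definitions return the same value, so the shorter form is purely a style choice:

```scala
// Hypothetical stand-in for an RDD; not Spark's actual class.
final class ToyRDD[T](val partitions: Array[Seq[T]]) {
  // Verbose form: a block body wrapped around a single expression.
  def getNumPartitionsVerbose: Int = {
    partitions.size
  }

  // Form suggested in the review: expression body, no braces.
  def getNumPartitions: Int = partitions.size
}

object ToyRDDExample {
  def main(args: Array[String]): Unit = {
    // Three partitions, one of them empty.
    val rdd = new ToyRDD[Int](Array(Seq(1, 2), Seq(3), Seq.empty[Int]))
    println(rdd.getNumPartitions)        // 3
    println(rdd.getNumPartitionsVerbose) // 3
  }
}
```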
|
LGTM pending a minor comment and tests. Jenkins, test this please. |
|
QA tests have started for PR 2447 at commit
|
Including a (pretty obvious) spark-shell example in the scaladoc of a simple RDD method isn't really consistent with the rest of the RDD API documentation.
yeah I can clean this up along with the other style issue on merge.
Good point, although it's worth noting this was essentially ported directly from the Python API (including the doc). Any doc changes should be consistent across both versions if possible.
|
QA tests have finished for PR 2447 at commit
|
|
Jenkins, retest this please. |
|
QA tests have started for PR 2447 at commit
|
|
QA tests have finished for PR 2447 at commit
|
|
Jenkins, retest this please. |
|
QA tests have started for PR 2447 at commit
|
|
QA tests have finished for PR 2447 at commit
|
|
The changes here look fine. @pwendell @markhamstra any additional thoughts? This PR hasn't had much activity for a while. |
|
FWIW I think:
The one bummer here is that there are 20-30 usages of |
|
Yeah, I think it's still worth doing just to enforce consistent APIs across Python and Scala. Merge conflicts may be a little annoying, but they shouldn't stop us from bringing the Scala API to parity. Also, I think it's fine to continue to use the old way internally, since the motivation here is simply to expose it as a public API. I agree that we should also do the same for Java while we're at it. |
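
One way to picture that approach, as a sketch only: keep the existing `partitions` member for internal callers, add the public accessor alongside it, and let a Java-facing wrapper delegate to it. `SketchRDD` and `JavaSketchRDD` are hypothetical stand-ins, not Spark classes, and the wrapper shape is an assumption about how the Java side could be handled.

```scala
// Hypothetical stand-ins; not Spark's actual classes.
class SketchRDD[T](val partitions: Array[Seq[T]]) {
  // New public accessor; internal code can keep calling partitions.size.
  def getNumPartitions: Int = partitions.size
}

// Java-facing wrapper that delegates instead of re-implementing the logic.
class JavaSketchRDD[T](val rdd: SketchRDD[T]) {
  def getNumPartitions: Int = rdd.getNumPartitions
}

object ParitySketch {
  def main(args: Array[String]): Unit = {
    val rdd = new SketchRDD[String](Array(Seq("a"), Seq("b", "c")))
    val javaView = new JavaSketchRDD[String](rdd)
    // The old internal style and the new public accessors agree.
    println(rdd.partitions.size == rdd.getNumPartitions)
    println(javaView.getNumPartitions == rdd.getNumPartitions)
  }
}
```

Delegating from the wrapper keeps a single definition of the partition count, so the language bindings cannot drift apart.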
|
@patmcdonough are you in a position to follow up on the comments above? I'm wondering if this is alive or not or whether it should be closed. |
|
Thanks for following up on this @srowen - I didn't even realize it's still open. I'll close this out in favor of somebody issuing a new patch as I'm not in a position to address the comments right now (a lot can happen in 6 months I guess). Please shout at me and suggest otherwise if necessary. CC: @andrewor14 @pwendell |
…ons Across Different Languages
I have tried to address all the comments in pull request apache#2447. Note that the second commit (using the new method in all internal code of all components) is quite intrusive and could be omitted.
Author: Jeroen Schot <[email protected]>
Closes apache#9767 from schot/master.
…ons Across Different Languages
I have tried to address all the comments in pull request #2447. Note that the second commit (using the new method in all internal code of all components) is quite intrusive and could be omitted.
Author: Jeroen Schot <[email protected]>
Closes #9767 from schot/master.
(cherry picked from commit 128c290)
Signed-off-by: Sean Owen <[email protected]>
...ing the number of RDD partitions across languages.