Added setMinCount to Word2Vec.scala #3693
Wanted to customize the minCount variable in the Word2Vec class. Added a method to do so.
Can one of the admins verify this patch?
2-space indentation
Added javadoc
This should be formatted as:
/**
 * The minimum number of times a token must occur in the training corpus to be
 * included in the word2vec model (default: 5).
 */
Sorry (first time contributing), do you mean I should use 4 spaces instead of a tab, per the convention? When I do this, my additions appear out of alignment with the rest of the code...
@ganonp No problem (but we are pretty strict about style). When in doubt, check other code in the project for examples. Also, I'd recommend checking out these resources:
Oh, and yes, I meant to use 4 spaces inside the function.
Oh wow, I just didn't see that the function and everything inside it was lining up... Hurts to look at. Thanks for those links and your patience. Spark now makes up about 70% of my job, so I'll definitely be contributing more.
Sorry, I just noticed that the other options are grouped at the top of the Word2Vec class. Would you mind moving this there? Thanks a lot!
Sounds good!
Moved the minCount variable to the top of the class, removed its javadoc, and moved setMinCount below the other set methods.
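For readers following the review, here is a minimal sketch of the layout the comments above converge on: the option declared with the other options at the top of the class, a documented setter that returns `this.type` for chaining, and 2-space indentation. `Word2VecSketch`, `buildVocab`, and the demo object are hypothetical stand-ins so the example runs without Spark; only the names `minCount` and `setMinCount` and the default of 5 come from this thread.

```scala
// Hypothetical stand-in for the Word2Vec class; only the minCount option is modeled.
class Word2VecSketch {
  // Options are grouped at the top of the class, as requested in review.
  private var minCount = 5

  /**
   * Sets the minimum number of times a token must occur in the training corpus to be
   * included in the vocabulary (default: 5).
   */
  def setMinCount(minCount: Int): this.type = {
    this.minCount = minCount
    this
  }

  /** Illustrative helper (not from the PR): counts tokens and drops the rare ones. */
  def buildVocab(tokens: Seq[String]): Map[String, Int] = {
    tokens
      .groupBy(identity)
      .map { case (word, occurrences) => (word, occurrences.size) }
      .filter { case (_, count) => count >= minCount }
  }
}

object Word2VecSketchDemo {
  def main(args: Array[String]): Unit = {
    val tokens = Seq("a", "b", "a", "c", "a", "b")
    // With minCount = 2, "c" (seen once) is dropped from the vocabulary.
    val vocab = new Word2VecSketch().setMinCount(2).buildVocab(tokens)
    println(vocab)
  }
}
```

Returning `this.type` rather than the class name keeps chained setter calls working if the class is ever subclassed.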
Bumping this, since it looks like it might be good to go. Looks like there's a new change here related to making
Sorry, I didn't mean to commit that norm method for this pull request. That said, I think it makes sense for norm to be public, or at least a d=2 version of norm.
@ganonp Could you update the branch and remove the last commit?
ok to test
Test build #24865 has started for PR 3693 at commit
Test build #24865 has finished for PR 3693 at commit
Test PASSed.
LGTM (including the change to
Wanted to customize the private minCount variable in the Word2Vec class. Added
a method to do so.