[SPARK-9788][MLlib] Fix LDA Binary Compatibility #8077
reviewing now
Copy doc here:
Concentration parameter (commonly named "alpha") for the prior placed on documents' distributions over topics ("theta").
OK
Note (not requiring changes): I had meant for the default gammaShape to be defined in the abstract LDAModel, in order to improve (but not fix) binary incompatibility issues for anyone who extended LDAModel. However, I just noticed that LDAModel has a private constructor, so we don't need to worry about binary incompatibility. So feel free to arrange gammaShape however you like. I think the current setup is fine.
Test build #1422 has finished for PR 8077 at commit
This will need an update to PySpark as well.
LGTM pending tests
(Are you updating PySpark in a separate PR?)
@jkbradley This PR doesn't need a PySpark update. Eventually, |
True, it's outside this PR. OK we'll see if tests will ever pass
Jenkins test this please
Test build #40494 has finished for PR 8077 at commit
Jenkins test this please
Test build #40496 has finished for PR 8077 at commit
Merging with master and branch-1.5
1. Add “asymmetricDocConcentration” and revert docConcentration changes. If the (internal) doc concentration vector is a single value, “getDocConcentration” returns it. If it is a constant vector, getDocConcentration returns the first item, and fails otherwise.
2. Give `LDAModel.gammaShape` a default value in `LDAModel` concrete class constructors.

jkbradley

Author: Feynman Liang <[email protected]>

Closes #8077 from feynmanliang/SPARK-9788 and squashes the following commits:

6b07bc8 [Feynman Liang] Code review changes
9d6a71e [Feynman Liang] Add asymmetricAlpha alias
bf4e685 [Feynman Liang] Asymmetric docConcentration
4cab972 [Feynman Liang] Default gammaShape

(cherry picked from commit be3e271)
Signed-off-by: Joseph K. Bradley <[email protected]>
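The getDocConcentration behavior described in the commit message (return the single value, return the first item of a constant vector, fail otherwise) can be sketched roughly as follows. This is a standalone Python illustration, not the Spark code: the function names `get_doc_concentration` and `get_asymmetric_doc_concentration`, the use of `ValueError`, and the rejection of an empty vector are all assumptions made for the example.

```python
from typing import List, Sequence


def get_doc_concentration(doc_concentration: Sequence[float]) -> float:
    """Collapse the internal concentration vector to one scalar, or fail.

    Hypothetical helper mirroring the rule in the commit message; the
    real MLlib method may differ in naming and error handling.
    """
    if len(doc_concentration) == 0:
        # Assumption: an empty vector is treated as invalid input.
        raise ValueError("docConcentration must not be empty")
    first = doc_concentration[0]
    # A single-value vector and a constant vector both reduce to the first item.
    if all(value == first for value in doc_concentration):
        return first
    # Any other vector is asymmetric and has no single scalar representation.
    raise ValueError("docConcentration is asymmetric; read the full vector instead")


def get_asymmetric_doc_concentration(doc_concentration: Sequence[float]) -> List[float]:
    """Return the full vector unchanged (hypothetical vector accessor)."""
    return list(doc_concentration)
```

Under these assumptions, a caller who only ever set a symmetric prior keeps a scalar getter, while an asymmetric prior has to be read through the vector accessor.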
Give `LDAModel.gammaShape` a default value in `LDAModel` concrete class constructors. @jkbradley