[SPARK-11207][ML] Add test cases for solver selection of LinearRegres… #9180
…sion as followup.
|
Test build #43981 has finished for PR 9180 at commit
|
|
Test build #43983 has finished for PR 9180 at commit
|
WithBigFeature -> WithManyFeatures or WithLargeFeatureSize
|
@Lewuathe The added test took 30 seconds to run, which might be too long. Shall we try to reduce the number of iterations? |
|
Or you can make them sparse by randomly setting most of the features to zero. |
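As a rough illustration of that suggestion, the sketch below zeroes out entries of a dense feature array at random. The `SparsifySketch` object, the `sparsify` helper, and the `keepFraction` parameter are hypothetical names for this example, not part of the PR or of Spark.

```scala
import scala.util.Random

object SparsifySketch {
  // Hypothetical helper: keep each feature with probability `keepFraction`,
  // otherwise replace it with 0.0.
  def sparsify(features: Array[Double], keepFraction: Double, rnd: Random): Array[Double] = {
    require(keepFraction >= 0.0 && keepFraction <= 1.0, "keepFraction must be in [0, 1]")
    // nextDouble() is in [0, 1), so keepFraction = 1.0 keeps every entry
    // and keepFraction = 0.0 zeroes every entry.
    features.map(v => if (rnd.nextDouble() < keepFraction) v else 0.0)
  }
}
```

With a small `keepFraction` most entries become zero, which should make a test over many features cheaper when the result is stored as a sparse vector.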
|
+1 on @dbtsai's suggestion |
|
Test build #44067 has finished for PR 9180 at commit
|
extra line.
|
Test build #44116 has finished for PR 9180 at commit
|
|
@dbtsai Could you check again please? |
How about consolidating this with LinearDataGenerator and adding sparsity = 1.0 as a param to control whether the features are sparse?
Yes, I also thought it was a good idea. But LinearDataGenerator is used as a static object, so we would have to pass sparsity as a parameter to generateLinearInput. This method seems to be used in a lot of suites, so many call sites would need to change.
Therefore it might be better to do this in a separate JIRA. What do you think?
Let's modify the JIRA and do it here. Basically, you can keep a LinearDataGenerator method with the old signature that calls the new API, for compatibility.
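The compatibility pattern described here could look roughly like the following. This is a self-contained sketch, not the code from the PR: `Point` stands in for Spark's `LabeledPoint`, the data model is simplified, and `sparsity` is assumed to mean the probability that a feature stays non-zero, so that 1.0 reproduces the dense behaviour. The thread itself does not define the parameter.

```scala
import scala.util.Random

object LinearDataSketch {
  // Stand-in for Spark's LabeledPoint so the sketch has no Spark dependency.
  final case class Point(label: Double, features: Array[Double])

  // New overload with the extra (hypothetical) `sparsity` parameter.
  def generateLinearInput(
      intercept: Double,
      weights: Array[Double],
      nPoints: Int,
      seed: Int,
      eps: Double,
      sparsity: Double): Seq[Point] = {
    require(sparsity >= 0.0 && sparsity <= 1.0, "sparsity must be in [0, 1]")
    val rnd = new Random(seed)
    Seq.fill(nPoints) {
      // Draw a dense value, then keep it with probability `sparsity` (assumed semantics).
      val x = Array.fill(weights.length) {
        val v = rnd.nextDouble()
        if (rnd.nextDouble() < sparsity) v else 0.0
      }
      val dot = x.zip(weights).map { case (a, b) => a * b }.sum
      Point(dot + intercept + eps * rnd.nextGaussian(), x)
    }
  }

  // Old signature, kept so existing call sites compile unchanged.
  // It simply delegates to the new overload with the dense setting.
  def generateLinearInput(
      intercept: Double,
      weights: Array[Double],
      nPoints: Int,
      seed: Int,
      eps: Double = 0.1): Seq[Point] =
    generateLinearInput(intercept, weights, nPoints, seed, eps, 1.0)
}
```

Existing suites keep calling the five-argument form, while new tests can pass a sixth argument to request sparse features.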
|
Test build #44279 has finished for PR 9180 at commit
|
|
Test build #44308 has finished for PR 9180 at commit
|
|
Test build #44313 has finished for PR 9180 at commit
|
|
Test build #44317 has finished for PR 9180 at commit
|
|
@dbtsai Sorry for bothering you so many times, but could you check again please? |
The formatting in the previous method
def generateLinearInput(
    intercept: Double,
    weights: Array[Double],
    nPoints: Int,
    seed: Int,
    eps: Double = 0.1): Seq[LabeledPoint] = {
  generateLinearInput(intercept, weights,
    Array.fill[Double](weights.length)(0.0),
    Array.fill[Double](weights.length)(1.0 / 3.0),
    nPoints, seed, eps)}
looks weird to me. Can you fix it in this PR? Thanks.
|
Sorry for the delay. The current implementation of creating sparse features is not efficient, since we need to create the dense features first. Let's leave it as is. But if you are interested, let's create another JIRA so that sparse features can be generated without creating dense ones first. Thanks. |
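One possible shape for that follow-up, sketched under assumptions: pick the non-zero positions first and draw values only for them, so no dense value array is ever allocated. `SparseFeatures`, `generate`, and `keepFraction` are hypothetical names, and the index selection still scans every position once.

```scala
import scala.util.Random

object SparseDirectSketch {
  // Minimal stand-in for a sparse vector: only non-zero positions are stored.
  final case class SparseFeatures(size: Int, indices: Array[Int], values: Array[Double])

  def generate(size: Int, keepFraction: Double, rnd: Random): SparseFeatures = {
    require(keepFraction >= 0.0 && keepFraction <= 1.0, "keepFraction must be in [0, 1]")
    // Decide per index whether it is non-zero; indices come out in increasing order.
    val indices = (0 until size).filter(_ => rnd.nextDouble() < keepFraction).toArray
    // Draw values only for the selected positions.
    val values = Array.fill(indices.length)(rnd.nextDouble())
    SparseFeatures(size, indices, values)
  }
}
```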
|
@dbtsai Thank you so much for reviewing even though you must be busy at Spark Summit. I'll update. |
|
Test build #44667 has finished for PR 9180 at commit
|
|
Test build #44669 has finished for PR 9180 at commit
|
make seed = seed into just seed
|
LGTM except for the small styling issues. |
|
Test build #44673 has finished for PR 9180 at commit
|
|
Thanks. Merged into master. |
…sion as followup. This is follow-up work for SPARK-10668.