[SPARK-25623][SPARK-25624][SPARK-25625][TEST] Reduce test time of LogisticRegressionSuite #22659
ok to test
Test build #97087 has finished for PR 22659 at commit
2040ada to 3d9673e
… with intercept with L1 regularization 1 min 10 sec
3d9673e to c28fd05
Before the changes:
cc @srowen @HyukjinKwon. Kindly review.
Test build #97093 has finished for PR 22659 at commit
Test build #97094 has finished for PR 22659 at commit
In Jenkins CI, the testing time of LogisticRegressionSuite is 5 min 10 sec without the PR and 4 min 21 sec with it.
Merged to master
Thank you @srowen for merging.
…isticRegressionSuite ...with intercept with L1 regularization

## What changes were proposed in this pull request?

The test "multinomial logistic regression with intercept with L1 regularization" in LogisticRegressionSuite takes more than a minute because it trains two logistic regression models. Analysing the training cost per iteration shows that the computation time can be reduced by about 50%.

[Figure: training cost vs iteration for model 1]

model1 converges after iteration 150.

[Figure: training cost vs iteration for model 2]

model2 converges after around 100 iterations.

Setting the maximum number of iterations to 175 for model1 and 125 for model2 therefore cuts the computation time roughly in half.

## How was this patch tested?

Computation time in local setup:
Before change: ~53 sec
After change: ~26 sec

Closes apache#22659 from shahidki31/SPARK-25623.

Authored-by: Shahid <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
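As a rough illustration of the reasoning in the commit message above, the following Python sketch picks an iteration cap from a per-iteration training cost history. It is not the code used in this PR: the function names, the `rel_tol` threshold, and the synthetic cost curve are all assumptions made for the example. The default margin of 25 only mirrors the gap between the reported convergence points (150 and 100) and the chosen caps (175 and 125). The source does not say how the cost curves were obtained; in Spark ML a history of this kind is, to my knowledge, exposed as `objectiveHistory` on the training summary.

```python
from typing import Sequence


def last_significant_iteration(costs: Sequence[float], rel_tol: float = 1e-4) -> int:
    """Return the last iteration whose relative cost change is >= rel_tol.

    costs[0] is the cost before training and costs[i] the cost after
    iteration i. Returns 0 if no iteration changes the cost significantly.
    (Hypothetical helper, not part of Spark.)
    """
    last = 0
    for i in range(1, len(costs)):
        prev = costs[i - 1]
        # Relative change, guarded against division by zero.
        rel_change = abs(prev - costs[i]) / max(abs(prev), 1e-12)
        if rel_change >= rel_tol:
            last = i
    return last


def suggest_max_iter(costs: Sequence[float], rel_tol: float = 1e-4, margin: int = 25) -> int:
    """Suggest an iteration cap: the convergence point plus a safety margin."""
    return last_significant_iteration(costs, rel_tol) + margin


# Synthetic curve (an assumption, not data from the PR): the cost drops
# steadily for 150 iterations and is flat afterwards.
synthetic_costs = [2.0 - 0.005 * min(i, 150) for i in range(301)]
print(suggest_max_iter(synthetic_costs))  # 175 for this synthetic curve
```

The margin keeps the cap slightly above the observed convergence point, so a small shift in the convergence behaviour does not cut training short.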





