LightGBM parameter changes to match Python implementation results #6064
keep lightgbm default seed if it has not been specified in Seed
LightGBM: map NumberOfIterations to num_trees
LightGBM: Sigmoid to default of LightGBM (0.5 => 1)
These names are valid aliases. Defaults are not yet considered, but at least metric should be "" per default, not logloss, but it might not matter too much. Missing:
Is there any reason to use aliases? If not, I suggest we update
TODO: Check that the names are valid for 2.3.1; the list above is from the current documentation.
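To make the alias question concrete, here is a minimal Python sketch of how aliased names collapse to one canonical parameter. The `ALIASES` table and the `canonicalize` helper are hypothetical: the table is a small subset copied by hand from the current LightGBM parameter documentation, it is not an API of LightGBM or ML.NET, and the names may differ for 2.3.1.

```python
# Hypothetical alias table: a hand-copied subset of the current LightGBM
# parameter docs. Not exported by LightGBM or ML.NET; verify against 2.3.1.
ALIASES = {
    "num_iterations": {"num_trees", "num_round", "n_estimators", "num_boost_round"},
    "learning_rate": {"eta", "shrinkage_rate"},
    "num_leaves": {"num_leaf", "max_leaves", "max_leaf"},
    "min_data_in_leaf": {"min_child_samples", "min_data"},
    "seed": {"random_seed", "random_state"},
}


def canonicalize(params):
    """Return a copy of params with every known alias replaced by its canonical name."""
    lookup = {alias: name for name, aliases in ALIASES.items() for alias in aliases}
    result = {}
    for key, value in params.items():
        name = lookup.get(key, key)  # unknown keys pass through unchanged
        if name in result:
            raise ValueError(f"parameter {name!r} given more than once")
        result[name] = value
    return result


# Two spellings of the same configuration compare equal after canonicalization.
python_style = {"n_estimators": 100, "eta": 0.1, "random_state": 42}
alias_style = {"num_trees": 100, "learning_rate": 0.1, "seed": 42}
print(canonicalize(python_style) == canonicalize(alias_style))
```

Comparing canonical names rather than aliases makes it easier to see whether a Python configuration and an ML.NET configuration really request the same thing.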
Iterations actually seem to be run inside .NET, so the iteration count does not need to be passed anywhere.
I will close this PR, as it is better not to change the defaults if doing so does not improve performance. Changing a default may be a breaking change for some developers. However, for some reason Python seems to provide better speed and accuracy for LightGBM (and therefore for many applications), at least on the few datasets I have tried.


Suggested changes so that LightGBM results through ML.NET are similar to those through Python:
A project that can be used to compare LightGBM in Python with Microsoft.ML.LightGBM, and also ModelBuilder with python-FLAML: https://github.com/torronen/lightgbm-comparison
Rationale: microsoft/FLAML#409 (comment)
Reasons for changes explained in the issues:
I suggest that results should be equal through Python and ML.NET so that developers can discuss and share best practices about hyperparameters. It would also make it possible to reuse hyperparameter tuning done in Python.
The sigmoid value change has been proposed before but was not implemented.
It may need more consideration: #667
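For context on why the sigmoid value matters, here is a small Python sketch. It assumes the binary objective maps a raw score to a probability as 1 / (1 + exp(-sigmoid * score)), as described in the LightGBM documentation; `raw_to_probability` is a hypothetical helper name, and whether ML.NET applies the value in exactly this way is an assumption.

```python
import math


def raw_to_probability(raw_score, sigmoid=1.0):
    """Map a raw binary-classification score to a probability.

    Assumes the LightGBM-style link 1 / (1 + exp(-sigmoid * raw_score)).
    """
    return 1.0 / (1.0 + math.exp(-sigmoid * raw_score))


# The same raw score yields different probabilities under the two defaults.
raw = 2.0
print(round(raw_to_probability(raw, sigmoid=1.0), 4))  # LightGBM default
print(round(raw_to_probability(raw, sigmoid=0.5), 4))  # previous ML.NET default
```

Because the sigmoid value also enters the gradients during training, two runs with different values are not expected to produce identical trees, which is one reason the defaults need to match before results can be compared.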
This PR is for comments and discussion for now. Results are not yet equal through ML.NET and Python.