LightGBM should use default seed if not set explicitly #6062

@torronen

System Information (please complete the following information):

  • OS & Version: Windows 11
  • ML.NET Version: ML.NET 1.6.0
  • .NET Version: .NET 5.0

Describe the bug
LightGBM must be given a random seed when it is called from ML.NET; if none is provided, ML.NET sets a random seed generated from MLContext.
However, LightGBM has its own default values for its seeds. If random_seed (seed) is set, the other seeds are derived from it, so their default values get overridden. The ML.NET interface does not provide an option to set the other seeds explicitly.

Links:
https://lightgbm.readthedocs.io/en/latest/Parameters.html#seed

/// If not specified, <see cref="MLContext"/> will generate a random seed to be used.

If features that rely on randomness are used, this may cause issues:
- not being able to reproduce results obtained from running LightGBM in Python.
- not being able to reproduce results in ML.NET, because the MLContext random generator may already have been called a different number of times.
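
The second point can be illustrated with a small sketch. It is not ML.NET code: Python's random.Random stands in for the shared MLContext generator, and next_lightgbm_seed is a hypothetical helper that draws a seed the way a trainer without an explicit seed would. Two contexts start from the same context-level seed, but one has already consumed a random number elsewhere, so the seed handed to LightGBM differs.

```python
import random


def next_lightgbm_seed(shared_rng: random.Random) -> int:
    """Hypothetical stand-in: draw a LightGBM seed from a shared context RNG."""
    return shared_rng.randrange(2**31 - 1)


# Two contexts created with the same context-level seed.
ctx_a = random.Random(42)
ctx_b = random.Random(42)

# Context B makes one extra draw first (for example a shuffle or a
# train/test split elsewhere in the pipeline).
ctx_b.random()

seed_a = next_lightgbm_seed(ctx_a)
seed_b = next_lightgbm_seed(ctx_b)

# The two seeds differ even though both contexts started from 42.
print(seed_a, seed_b, seed_a == seed_b)
```

Under these assumptions, the seed passed to the trainer depends on the whole call history of the shared generator, not only on the context seed.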

Expected behavior
The seed should not be set for LightGBM unless it is specified explicitly.
If that is not possible, an option to set the specific seeds (data_random_seed, feature_fraction_seed, ...) would be useful.

Possible fix
Do not set res["seed"] here if Seed == null:
res["seed"] = (Seed.HasValue) ? Seed : host.Rand.Next();
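
As a rough illustration of the proposed change, here is a language-neutral sketch written in Python. The function names current_params and proposed_params are hypothetical and do not exist in ML.NET; current_params mirrors the quoted line, while proposed_params leaves "seed" out of the parameter dictionary when no seed was given, so that LightGBM would fall back to its own defaults.

```python
import random
from typing import Dict, Optional


def current_params(seed: Optional[int], host_rng: random.Random) -> Dict[str, int]:
    """Mirrors the quoted line: "seed" is always set, drawn from the host RNG if missing."""
    res: Dict[str, int] = {}
    res["seed"] = seed if seed is not None else host_rng.randrange(2**31 - 1)
    return res


def proposed_params(seed: Optional[int]) -> Dict[str, int]:
    """Proposed behaviour: only pass "seed" through when the caller set it."""
    res: Dict[str, int] = {}
    if seed is not None:
        res["seed"] = seed
    return res


host_rng = random.Random(0)  # stand-in for host.Rand
print(current_params(None, host_rng))  # always contains a "seed" entry
print(proposed_params(None))           # empty: no "seed" entry is passed on
print(proposed_params(7))              # explicit seed is passed on unchanged
```

With the proposed variant, an explicit seed still behaves as before, and only the implicit, context-generated seed is dropped.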

Additional context
Related to this discussion about not being able to reproduce Python LightGBM results in ML.NET:
microsoft/FLAML#409 (comment)
