Skip to content

Conversation

@LittleLittleCloud
Copy link
Member

@LittleLittleCloud LittleLittleCloud commented May 26, 2023

We are excited to review your PR.

So we can do the best job, please check:

  • There's a descriptive title that will make sense to other developers some time from now.
  • There's associated issues. All PR's should have issue(s) associated - unless a trivial self-evident change such as fixing a typo. You can use the format Fixes #nnnn in your description to cause GitHub to automatically close the issue(s) when your PR is merged.
  • Your change description explains what the change does, why you chose your approach, and anything else that reviewers should know.
  • You have included any necessary tests in the same PR.

Training on a sub-set of train dataset will help mitigate OOM error.
This will be helpful if the training dataset is huge, in which case subsampling won't hurt metric a lot.
Because this feature is mostly useful for massive training dataset, crossValidationDatasetManger won't need it so the subsampling strategy is only added to trainValidationDatasetManager.

fix #dotnet/machinelearning-modelbuilder#2645

@LittleLittleCloud LittleLittleCloud changed the title Continue training on OOM error Continue training on OOM error && add subsampling support for trainValidationDatasetManager May 26, 2023
@LittleLittleCloud
Copy link
Member Author

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

}

internal interface ICrossValidateDatasetManager
public interface ICrossValidateDatasetManager
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some documentation for these since they are now public?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

internal class CrossValidateDatasetManager : IDatasetManager, ICrossValidateDatasetManager
{
public IDataView Dataset { get; set; }
public IDataView? Dataset { get; set; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need this to be a nullable type? Can an IDataView directly have null as a value that you can just check?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be nullable. I'll make the change

@codecov
Copy link

codecov bot commented Jun 14, 2023

Codecov Report

Merging #6714 (eac6879) into main (0278f43) will increase coverage by 0.04%.
The diff coverage is 100.00%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6714      +/-   ##
==========================================
+ Coverage   68.84%   68.88%   +0.04%     
==========================================
  Files        1215     1216       +1     
  Lines      250691   250890     +199     
  Branches    26256    26258       +2     
==========================================
+ Hits       172580   172823     +243     
+ Misses      71286    71245      -41     
+ Partials     6825     6822       -3     
Flag Coverage Δ
Debug 68.88% <100.00%> (+0.04%) ⬆️
production 63.38% <100.00%> (+0.04%) ⬆️
test 88.90% <100.00%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...t.ML.Fairlearn/AutoML/AutoMLExperimentExtension.cs 70.73% <100.00%> (+4.06%) ⬆️
...t.ML.Fairlearn/Reductions/GridSearchTrialRunner.cs 100.00% <100.00%> (ø)
...Microsoft.ML.AutoML.Tests/AutoMLExperimentTests.cs 84.32% <100.00%> (+0.39%) ⬆️
...L.AutoML.Tests/TrainValidaionDatasetManagerTest.cs 100.00% <100.00%> (ø)

... and 26 files with indirect coverage changes

@LittleLittleCloud LittleLittleCloud merged commit 22342df into main Jun 14, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Jul 14, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants