Description
System Information (please complete the following information):
- OS & Version: Microsoft Windows 10 Pro
- ML.NET Version: ML.NET 2.0.1
- AutoML version: AutoML 0.20.1
- .NET Version: .NET 6.0
Describe the bug
As outlined in this tutorial, a model can be refitted to include the latest data used in the validation phase of the AutoML experiment.
When calling this method, as in my case:
    private static ITransformer RefitBestPipeline(ExperimentResult<RegressionMetrics> experimentResult, IDataView final)
    {
        RunDetail<RegressionMetrics> bestRun = experimentResult.BestRun;
        return bestRun.Estimator.Fit(final);
    }
the return line causes the exception.
To Reproduce
- Create a regression AutoML experiment in C#.
- Run the experiment on a train/validation data split.
- After the experiment has finished, call the RefitBestPipeline function with the best model and the entire IDataView (train + validation).
- The error should occur on the return line.
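The steps above could be sketched roughly as follows. This is a minimal, untested reproduction attempt, not the reporter's actual project: the `Row` type, the synthetic data from `MakeRows`, the 30-second time budget and the 80/20 split are all assumptions made for illustration, and a single `MLContext` is used for simplicity. Whether the exception surfaces may depend on which trainer AutoML picks for the best run.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.AutoML;
using Microsoft.ML.Data;

// Hypothetical row type, used only for this sketch.
public class Row
{
    public float X1 { get; set; }
    public float X2 { get; set; }
    public float Label { get; set; }
}

public static class Repro
{
    // Generates synthetic regression data (illustrative; not from the report).
    public static List<Row> MakeRows(int count)
    {
        var rng = new Random(0);
        var rows = new List<Row>(count);
        for (int i = 0; i < count; i++)
        {
            float x1 = (float)rng.NextDouble();
            float x2 = (float)rng.NextDouble();
            rows.Add(new Row { X1 = x1, X2 = x2, Label = 2f * x1 - x2 });
        }
        return rows;
    }

    // Same logic as the method quoted in the report.
    public static ITransformer RefitBestPipeline(
        ExperimentResult<RegressionMetrics> experimentResult, IDataView final)
    {
        RunDetail<RegressionMetrics> bestRun = experimentResult.BestRun;
        return bestRun.Estimator.Fit(final);
    }

    public static void Main()
    {
        var mlContext = new MLContext(seed: 0);

        // Load everything, then split into train and validation sets.
        IDataView all = mlContext.Data.LoadFromEnumerable(MakeRows(1000));
        var split = mlContext.Data.TrainTestSplit(all, testFraction: 0.2);

        // Run the AutoML regression experiment (30 s budget is an assumption).
        var experiment = mlContext.Auto()
            .CreateRegressionExperiment(maxExperimentTimeInSeconds: 30);
        ExperimentResult<RegressionMetrics> result =
            experiment.Execute(split.TrainSet, split.TestSet, labelColumnName: "Label");

        // Refit the best pipeline on train + validation; per the report,
        // this is the call that throws OperationCanceledException.
        ITransformer refitted = RefitBestPipeline(result, all);
        Console.WriteLine(refitted != null ? "Refit succeeded." : "Refit returned null.");
    }
}
```

To match the setup in the report more closely, the data could instead be loaded with a second `MLContext` instance.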
Expected behavior
The function should return the new ITransformer without throwing.
Screenshots, Code, Sample Projects
System.OperationCanceledException
HResult=0x8013153B
Message=Operation was canceled.
Source=Microsoft.ML.Core
StackTrace:
at Microsoft.ML.Runtime.Contracts.CheckAlive(IHostEnvironment env)
at Microsoft.ML.Trainers.FastTree.DataConverter.MemImpl.InitializeBins(Int32 maxBins, IParallelTraining parallelTraining)
at Microsoft.ML.Trainers.FastTree.DataConverter.MemImpl..ctor(RoleMappedData data, IHost host, Int32 maxBins, Single maxLabel, Boolean noFlocks, Int32 minDocsPerLeaf, PredictionKind kind, IParallelTraining parallelTraining, Int32[] categoricalFeatureIndices, Boolean categoricalSplit)
at Microsoft.ML.Trainers.FastTree.DataConverter.Create(RoleMappedData data, IHost host, Int32 maxBins, Single maxLabel, Boolean diskTranspose, Boolean noFlocks, Int32 minDocsPerLeaf, PredictionKind kind, IParallelTraining parallelTraining, Int32[] categoricalFeatureIndices, Boolean categoricalSplit)
at Microsoft.ML.Trainers.FastTree.ExamplesToFastTreeBins.FindBinsAndReturnDataset(RoleMappedData data, PredictionKind kind, IParallelTraining parallelTraining, Int32[] categoricalFeaturIndices, Boolean categoricalSplit)
at Microsoft.ML.Trainers.FastTree.FastTreeTrainerBase`3.ConvertData(RoleMappedData trainData)
at Microsoft.ML.Trainers.FastTree.FastForestRegressionTrainer.TrainModelCore(TrainContext context)
at Microsoft.ML.Trainers.TrainerEstimatorBase`2.TrainTransformer(IDataView trainSet, IDataView validationSet, IPredictor initPredictor)
at Microsoft.ML.Trainers.TrainerEstimatorBase`2.Fit(IDataView input)
at Microsoft.ML.Data.EstimatorChain`1.Fit(IDataView input)
at mlnet.AutoMLHelper.RefitBestPipeline(ExperimentResult`1 experimentResult, IDataView final)
Additional context
Note: I have read all of the other open issues on this exception type, and none of the proposed workarounds worked.
If relevant: in my case, the mlContext used to load the data is different from the one that runs the experiment.
Perhaps the code example in the tutorial is too old by now (4 years) and no longer applies?