System.OperationCanceledException when calling experimentResult.BestRun.Estimator.Fit #6565

@superichmann

Description

System Information:

  • OS & Version: Microsoft Windows 10 Pro
  • ML.NET Version: ML.NET 2.0.1
  • AutoML version: AutoML 0.20.1
  • .NET Version: .NET 6.0

Describe the bug
As outlined in this tutorial, a model can be refitted to include the latest data used in the validation phase of the AutoML experiment.

When calling this method, as in my case:
    private static ITransformer RefitBestPipeline(ExperimentResult<RegressionMetrics> experimentResult, IDataView final)
    {
        RunDetail<RegressionMetrics> bestRun = experimentResult.BestRun;
        return bestRun.Estimator.Fit(final);
    }
the return line causes the exception.

To Reproduce

  1. Create a regression AutoML experiment in C#.
  2. Run the experiment on a train/validation data split.
  3. After the experiment has finished, call the RefitBestPipeline function with the experiment result and the entire IDataView (train + validation).
  4. The exception is thrown on the return line.
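
The steps above can be sketched as a small console program. This is only an illustrative outline, not the reporter's actual code: the `Row` type, the synthetic data, and the 30-second experiment budget are all assumptions made for the sketch, and whether the exception reproduces with this placeholder data has not been verified.

```csharp
using System;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.AutoML;
using Microsoft.ML.Data;

// Hypothetical row type; the issue does not show the real schema.
public class Row
{
    public float Feature { get; set; }
    public float Label { get; set; }
}

public static class Repro
{
    public static void Main()
    {
        var mlContext = new MLContext(seed: 0);

        // Synthetic placeholder data standing in for the reporter's dataset.
        var rows = Enumerable.Range(0, 1000)
            .Select(i => new Row { Feature = i, Label = 2f * i + 1f });
        IDataView all = mlContext.Data.LoadFromEnumerable(rows);

        // Step 2: split into train and validation sets.
        var split = mlContext.Data.TrainTestSplit(all, testFraction: 0.2);

        // Steps 1-2: create and run a regression AutoML experiment
        // (the 30-second budget is an arbitrary choice for this sketch).
        ExperimentResult<RegressionMetrics> result = mlContext.Auto()
            .CreateRegressionExperiment(maxExperimentTimeInSeconds: 30)
            .Execute(split.TrainSet, split.TestSet, labelColumnName: "Label");

        // Steps 3-4: refit the best pipeline on train + validation data.
        // In the report, this call throws System.OperationCanceledException.
        ITransformer refit = result.BestRun.Estimator.Fit(all);
        Console.WriteLine(refit != null ? "Refit succeeded" : "Refit returned null");
    }
}
```

The sketch uses a single `MLContext` throughout; the report below notes that the reporter used separate contexts for loading data and running the experiment, which this outline does not replicate.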

Expected behavior
The function should return the new ITransformer without throwing.

Screenshots, Code, Sample Projects

System.OperationCanceledException
  HResult=0x8013153B
  Message=Operation was canceled.
  Source=Microsoft.ML.Core
  StackTrace:
   at Microsoft.ML.Runtime.Contracts.CheckAlive(IHostEnvironment env)
   at Microsoft.ML.Trainers.FastTree.DataConverter.MemImpl.InitializeBins(Int32 maxBins, IParallelTraining parallelTraining)
   at Microsoft.ML.Trainers.FastTree.DataConverter.MemImpl..ctor(RoleMappedData data, IHost host, Int32 maxBins, Single maxLabel, Boolean noFlocks, Int32 minDocsPerLeaf, PredictionKind kind, IParallelTraining parallelTraining, Int32[] categoricalFeatureIndices, Boolean categoricalSplit)
   at Microsoft.ML.Trainers.FastTree.DataConverter.Create(RoleMappedData data, IHost host, Int32 maxBins, Single maxLabel, Boolean diskTranspose, Boolean noFlocks, Int32 minDocsPerLeaf, PredictionKind kind, IParallelTraining parallelTraining, Int32[] categoricalFeatureIndices, Boolean categoricalSplit)
   at Microsoft.ML.Trainers.FastTree.ExamplesToFastTreeBins.FindBinsAndReturnDataset(RoleMappedData data, PredictionKind kind, IParallelTraining parallelTraining, Int32[] categoricalFeaturIndices, Boolean categoricalSplit)
   at Microsoft.ML.Trainers.FastTree.FastTreeTrainerBase`3.ConvertData(RoleMappedData trainData)
   at Microsoft.ML.Trainers.FastTree.FastForestRegressionTrainer.TrainModelCore(TrainContext context)
   at Microsoft.ML.Trainers.TrainerEstimatorBase`2.TrainTransformer(IDataView trainSet, IDataView validationSet, IPredictor initPredictor)
   at Microsoft.ML.Trainers.TrainerEstimatorBase`2.Fit(IDataView input)
   at Microsoft.ML.Data.EstimatorChain`1.Fit(IDataView input)
   at mlnet.AutoMLHelper.RefitBestPipeline(ExperimentResult`1 experimentResult, IDataView final)

Additional context
Note: I have read all of the other open issues on this exception type, and none of the proposed workarounds worked.
If relevant: in my case, the MLContext used to load the data is different from the one that runs the experiment.
Perhaps the tutorial's code example (about 4 years old) is outdated and no longer applies?

Labels

AutoML.NET: Automating various steps of the machine learning process