Closed
Description
System Information (please complete the following information):
- Model Builder Version (available in Manage Extensions dialog): 17.16.0.2326802
- Visual Studio Version: Community 2022 v 4.8.04084
Describe the bug
- On which step of the process did you run into an issue: About 1 hour into model training, Model Builder displays the following exception:
2023-05-21 13:50:16.8911 DEBUG Array dimensions exceeded supported range.
   at System.Collections.Generic.List`1.set_Capacity(Int32 value)
   at System.Collections.Generic.List`1.EnsureCapacity(Int32 min)
   at System.Collections.Generic.List`1.Add(T item)
   at Microsoft.ML.Trainers.FastTree.DataConverter.ValuesList.Add(Int32 index, Double value)
   at Microsoft.ML.Trainers.FastTree.DataConverter.MemImpl.MakeBoundariesAndCheckLabels(Int64& missingInstances, Int64& totalInstances)
   at Microsoft.ML.Trainers.FastTree.DataConverter.MemImpl..ctor(RoleMappedData data, IHost host, Double[][] binUpperBounds, Single maxLabel, Boolean dummy, Boolean noFlocks, PredictionKind kind, Int32[] categoricalFeatureIndices, Boolean categoricalSplit)
   at Microsoft.ML.Trainers.FastTree.DataConverter.Create(RoleMappedData data, IHost host, Int32 maxBins, Single maxLabel, Boolean diskTranspose, Boolean noFlocks, Int32 minDocsPerLeaf, PredictionKind kind, IParallelTraining parallelTraining, Int32[] categoricalFeatureIndices, Boolean categoricalSplit)
   at Microsoft.ML.Trainers.FastTree.ExamplesToFastTreeBins.FindBinsAndReturnDataset(RoleMappedData data, PredictionKind kind, IParallelTraining parallelTraining, Int32[] categoricalFeaturIndices, Boolean categoricalSplit)
   at Microsoft.ML.Trainers.FastTree.FastTreeTrainerBase`3.ConvertData(RoleMappedData trainData)
   at Microsoft.ML.Trainers.FastTree.FastTreeBinaryTrainer.TrainModelCore(TrainContext context)
   at Microsoft.ML.Trainers.TrainerEstimatorBase`2.TrainTransformer(IDataView trainSet, IDataView validationSet, IPredictor initPredictor)
   at Microsoft.ML.Data.EstimatorChain`1.Fit(IDataView input)
   at Microsoft.ML.AutoML.SweepablePipelineRunner.Run(TrialSettings settings)
   at Microsoft.ML.AutoML.SweepablePipelineRunner.RunAsync(TrialSettings settings, CancellationToken ct)
   at Microsoft.ML.AutoML.AutoMLExperiment.<RunAsync>d__24.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.ML.ModelBuilder.AutoMLService.Experiments.BinaryClassificationExperiment.<ExecuteAsync>d__14.MoveNext() in /_/src/Microsoft.ML.ModelBuilder.AutoMLService/Experiments/BinaryClassificationExperiment.cs:line 148
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.ML.ModelBuilder.AutoMLEngine.<StartTrainingAsync>d__21.MoveNext() in /_/src/Microsoft.ML.ModelBuilder.AutoMLService/AutoMLEngineService/AutoMLEngine.cs:line 169 (Microsoft.ML.ModelBuilder.Utils.Logger.Debug)
- Clear description of the problem: I am attempting to train a model with Model Builder on a large dataset of 468,079,637 rows with 8 columns each. Five of the columns are strings of at most 255 characters. The remaining columns are boolean (0, 1) values. The total dataset size is 26.3 GB. The label is one of the boolean columns, and the dataset is unbalanced: the majority of rows are of class 0. One column is an index and is ignored in Data -> Advanced data options. I set the training time to 7200 seconds. Memory usage on my system stays relatively low at first (maximum 31.7 GB). After about 50 minutes of training, memory usage spikes until all remaining memory is used, and the above exception is displayed.
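As a rough sanity check on the row count against the exception, here is a back-of-the-envelope sketch in Python. It is not a confirmed diagnosis. It assumes the default .NET per-object limit of about 2 GB, a list that doubles its capacity from 4 as `List<T>` does, and 16 bytes per stored entry. The entry size is a guess for an (Int32 index, Double value) pair; the actual layout inside `ValuesList` is not verified here, and the function name is purely illustrative.

```python
# Sketch under stated assumptions; not taken from the ML.NET source.
ROWS = 468_079_637               # row count reported in this issue
MAX_OBJECT_BYTES = 2 * 1024**3   # assumption: default .NET per-object limit (~2 GB)
ELEMENT_BYTES = 16               # assumption: one (Int32, Double) entry, padded to 16 bytes


def first_failing_capacity(rows, element_bytes, max_bytes, initial=4):
    """Return the first doubled capacity whose backing array would exceed
    max_bytes, or None if `rows` elements fit before that happens."""
    capacity = initial
    while capacity < rows:
        capacity *= 2  # capacity doubles whenever the list is full
        if capacity * element_bytes > max_bytes:
            return capacity
    return None


# Largest element count that fits under the assumed per-object limit.
max_elements = MAX_OBJECT_BYTES // ELEMENT_BYTES
failing = first_failing_capacity(ROWS, ELEMENT_BYTES, MAX_OBJECT_BYTES)

print(f"elements that fit under the limit: {max_elements:,}")
print(f"first capacity that would exceed it: {failing}")
```

Under these assumptions, a single list holding one entry per row would need a backing array larger than the limit well before all 468,079,637 rows are added, which would be consistent with the failure occurring in `List.set_Capacity`.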
To Reproduce
Steps to reproduce the behavior:
N/A
Expected behavior
Training completes without throwing an "Array dimensions exceeded supported range" exception.
Additional context
This is using the Data classification scenario. In Advanced training options, all trainers are enabled.