Add resource (CPU, RAM, GPU, thread count) monitoring to AutoML experiments #6320

**Is your feature request related to a problem? Please describe.**
As others have also experienced, AutoML training is heavy on CPU and RAM, and it can cause slowdowns and crashes (#6175, #6286, #6288, #6297). Some of my trials run longer than expected, possibly because the system ran out of one of its resources. I have also had a few system crashes, where running AutoML forced Windows to start closing other applications.

**Describe the solution you'd like**
It would be great to have more information about the running AutoML trials, including how much CPU, RAM, and GPU they are using and on how many threads. Ideally, this would be reported through a new, periodically called method on AutoML's [IMonitor interface](https://github.com/dotnet/machinelearning/blob/main/src/Microsoft.ML.AutoML/AutoMLExperiment/IMonitor.cs).
If this were combined with extended experiment control ([#5736](https://github.com/dotnet/machinelearning/issues/5736#issuecomment-1243329798)), we could make informed decisions about a trial or experiment based on its resource usage. For example, we could pause the experiment if the system is out of resources, or even cancel a trial that uses a suspiciously high amount of RAM to prevent a system failure (as sometimes happens with my experiments).

**Describe alternatives you've considered**
In theory, I could monitor my system resources constantly on a separate thread, but I still couldn't tell whether elevated CPU, RAM, or GPU usage is caused by AutoML or by something else running on the system independently of AutoML.

**Additional context**
This issue is related to AutoML experiment resource usage limiting (#6061) and AutoML experiment control (#5736).

