|
1 | 1 | # The ML.NET Roadmap |
2 | 2 |
|
3 | | -The goal of ML.NET project is to provide an easy to use, .NET-friendly ML platform. This document describes the tentative plan for the project in the short and long-term. |
| 3 | +The goal of the ML.NET project is to make .NET developers great at machine learning. This document describes the plan for the project. |
4 | 4 |
|
5 | | -ML.NET is a community effort and we welcome community feedback on our plans. The best way to give feedback is to open an issue in this repo. It's always a good idea to have a discussion before embarking on a large code change to make sure there is not duplicated effort. |
6 | | -Many of the features listed on the roadmap already exist in the internal version of the code-base. They are marked with (*). We plan to release more and more internal features to Github over time. |
| 5 | +ML.NET is a community effort and we welcome community feedback on our plans. The best way to give feedback is to open an issue in this repo. |
7 | 6 |
|
8 | | -In the meanwhile, we are looking for contributions. An easy place to start is to look at _up-for-grabs_ issues on [Github](https://github.com/dotnet/machinelearning/issues?q=is%3Aopen+is%3Aissue+label%3Aup-for-grabs) |
| 7 | +We also invite contributions. The [up-for-grabs issues](https://github.com/dotnet/machinelearning/issues?q=is%3Aopen+is%3Aissue+label%3Aup-for-grabs) on GitHub are a good place to start. |
9 | 8 |
|
10 | | -## Short Term |
11 | | -### Training Improvements |
12 | | -* Deep Learning Training Support |
13 | | - * Integrate with leading DNN package(s) |
14 | | - * Support for transfer learning. |
15 | | - * Hybrid training of pipelines containing both DNN and non-DNN predictors. |
16 | | - * Fast.ai like APIs. |
| 9 | +## Goals through June 30, 2020 |
| 10 | +### Test stability |
| 11 | +Continuous integration builds currently have a 30% pass rate. We aim to get this pass rate up to at least 80%. |
17 | 12 |
|
18 | | -### Trained Model Management |
19 | | -* Export models to [ONNX](https://github.com/onnx/models) (*) |
| 13 | +### Streaming metrics |
| 14 | +Currently, the way ML.NET computes [metrics](https://docs.microsoft.com/dotnet/machine-learning/resources/metrics) is memory-intensive. We will compute metrics in a streaming fashion instead, thereby reducing memory consumption. |
20 | 15 |
|
21 | | -## Longer Term |
| 16 | +### Multivariate anomaly detection |
| 17 | +ML.NET already supports [univariate anomaly detection](https://docs.microsoft.com/dotnet/api/microsoft.ml.timeseriescatalog.detectanomalybysrcnn?view=ml-dotnet), but we will add the ability to detect anomalies in multiple variables over time. |
22 | 18 |
|
23 | | -### Training Improvements |
24 | | -* Add more learners, perhaps, including: (*) |
25 | | - * [ProtoNN and Bonsaii](https://www.microsoft.com/en-us/research/project/resource-efficient-ml-for-the-edge-and-endpoint-iot-devices/) for compact and efficient models. |
26 | | -* Integration with other ML packages |
27 | | - * Accord.NET |
28 | | - * etc. |
29 | | -* Additional ML tasks (*) |
30 | | - * _Sequence Classification_ - learns from a series of examples in a sequence, and each item is assigned a distinct label, akin to a multiclass classification task |
31 | | -* Additional Data source support |
32 | | - * Data from SQL Databases, such as SQL Server |
33 | | - * Data located on the cloud |
34 | | - * Apache Parquet |
35 | | - * Native Binary high-performance format |
36 | | -* Distributed Training |
37 | | - * Easily train models on the cloud |
38 | | -* Whole-pipeline optimizations for both training and inference |
39 | | -* Automation of more data science tasks |
40 | | -* Additional Trainers |
41 | | -* Additional tasks |
42 | | - |
43 | | -### Featurization Improvements |
44 | | -* Improved data wrangling support |
45 | | -* Add auto-suggestion of training pipelines. The technology will provide intelligent ```LearningPipeline``` suggestions based on training data attributes (*) |
46 | | -* Additional natural language text preprocessing |
47 | | -* Time series and forecasting |
48 | | -* Support for Video, audio, and other data types |
49 | | - |
50 | | -### Trained Model Management |
51 | | -* Model operationalization in the Cloud |
52 | | -* Model deployment on mobile platforms |
53 | | -* Ability to run [ONNX](https://github.com/onnx/models) models in the ```LearningPipeline``` |
54 | | -* Support for the next version of ONNX |
55 | | -* Model deployment to IOT devices |
56 | | - |
57 | | -### GUI Improvements |
58 | | -* Usability improvements |
59 | | -* Support of additional ML.NET features |
60 | | -* Improved code generation for training and inference |
61 | | -* Run the pipelines rather than just suggesting them; present to the user the pipelines and the metrics generated from running. |
62 | | -* Distributed runs, rather than sequential. |
63 | | - |
64 | | -### Other |
65 | | -* Support for additional languages |
66 | | -* Published reproducible benchmarks against industry-leading ML toolkits on a variety of tasks and datasets |
| 19 | +### ONNX Runtime exportability |
67 | 20 |
|
| 21 | +We will expand the number of ML.NET transforms and estimators that are exportable to the [ONNX Runtime](https://github.com/Microsoft/onnxruntime). |