 ML.NET runs on Windows, Linux, and macOS using [.NET Core](https://github.com/dotnet/core), or Windows using .NET Framework. 64 bit is supported on all platforms. 32 bit is supported on Windows, except for TensorFlow, LightGBM, and ONNX related functionality.

-The current release is 1.0.0. Check out the [release notes](docs/release-notes/1.0.0/release-1.0.0.md) to see what's new.
+Check out the [release notes](docs/release-notes) to see what's new.

 First, ensure you have installed [.NET Core 2.1](https://www.microsoft.com/net/learn/get-started) or later. ML.NET also works on the .NET Framework 4.6.1 or later, but 4.7.2 or later is recommended.
ROADMAP.md: 8 additions & 37 deletions
@@ -9,59 +9,30 @@ In the meanwhile, we are looking for contributions. An easy place to start is t

 ## Short Term
 ### Training Improvements
-* Improved public API for training and inference
-* Enhanced tests and scenarios
-* Additional Learners
-* [LibSVM](https://www.csie.ntu.edu.tw/~cjlin/libsvm/) for anomaly detection (*)
-* [LightGBM](https://github.com/Microsoft/LightGBM) - a high-performance boosted decision tree (*)
-* Additional Learning Tasks (*)
-* _Ranking_ - problem where the goal is to automatically sort (rank) instances within a group based on ranked examples in training data
-* _Anomaly Detection_ - is also known as _outlier detection_. It is a task to identify items, events or observations which do not conform to an expected pattern in the dataset.
-* _Quantile Regression_ is a type of regression analysis. Whereas regression results in estimates that approximate the conditional mean of the response variable given certain values of the predictor variables, quantile regression aims at estimating either the conditional median or other quantiles of the response variable
-* Additional Data Source support (*)
-* Apache Parquet
-* Native Binary high-performance format
-
-### Featurization Improvements
-We already provide text/NLP and image processing functionalities that will be expanded
-* Text (*)
-* Natural language text preprocessing such as improving tokenization features, adding part-of-speech tagging, and sentence boundary disambiguation
-* Pre-trained text models (beyond current n-gram and pre-trained WordEmbedding text handling) that can further improve the extraction of semantic or sentiment features from text
-* Image (*)
-* Image preprocessing such as loading, resizing, and normalization of images
-* Image featurization, including industry-standard pre-trained ImageNet neural models, such as ResNet and AlexNet
+* Deep Learning Training Support
+* Integrate with leading DNN package(s)
+* Support for transfer learning.
+* Hybrid training of pipelines containing both DNN and non-DNN predictors.
+* Fast.ai like APIs.

 ### Trained Model Management
 * Export models to [ONNX](https://github.com/onnx/models) (*)

-### GUI
-* Release the Model Builder tool to ease model development (*)
-* Design improvements to make the design adhere better to Fluent principles
-* Add a view for an easier comparison of several experiments
-* Ability to select the best performing pipeline, by sweeping transforms, the same way learners are swept.
-
 ## Longer Term

 ### Training Improvements
 * Add more learners, perhaps, including: (*)
-* Generative Additive Models
-* [SymSGD](https://arxiv.org/pdf/1705.08030.pdf) -a fast linear SGD learner
-* Factorization Machines
-* [ProtoNN and Bonsaii](https://www.microsoft.com/en-us/research/project/resource-efficient-ml-for-the-edge-and-endpoint-iot-devices/) for compact and efficient models
+* [ProtoNN and Bonsaii](https://www.microsoft.com/en-us/research/project/resource-efficient-ml-for-the-edge-and-endpoint-iot-devices/) for compact and efficient models.
 * Integration with other ML packages
 * Accord.NET
 * etc.
-* Deep Learning Support
-* Integrate with leading DNN package(s)
-* Support for transfer learning
-* Hybrid training of pipelines containing both DNN and non-DNN predictors
 * Additional ML tasks (*)
-* _Recommendation_ - Is a problem that can be phrased a: "For a given user, predict the ratings this user would give to the items that they have not explicitly rated yet"
-* _Anomaly Detection_, also known as _outlier detection_. It is a task to identify items, events or observations which do not conform to an expected pattern in the dataset. Typical examples are: detecting credit card fraud, medical problems or errors in text. Anomalies are also referred to as outliers, novelties, noise, deviations and exceptions
 * _Sequence Classification_ - learns from a series of examples in a sequence, and each item is assigned a distinct label, akin to a multiclass classification task
 * Additional Data source support
 * Data from SQL Databases, such as SQL Server
 * Data located on the cloud
+* Apache Parquet
+* Native Binary high-performance format
 * Distributed Training
 * Easily train models on the cloud
 * Whole-pipeline optimizations for both training and inference
docs/building/unix-instructions.md: 2 additions & 0 deletions
@@ -26,6 +26,7 @@ The following components are needed:
 * clang-3.9
 * cmake 2.8.12
 * libunwind8
+* libomp-dev
 * curl
 * All the requirements necessary to run .NET Core 2.0 applications: libssl1.0.0 (1.0.2 for Debian 9) and libicu5x (libicu52 for ubuntu 14.x, libicu55 for ubuntu 16.x, and libicu57 for ubuntu 17.x). For more information on prerequisites in different linux distributions click [here](https://docs.microsoft.com/en-us/dotnet/core/linux-prerequisites?tabs=netcore2x).
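
The component list in this hunk, including the newly added `libomp-dev`, can be checked on a build machine before compiling. The following is a minimal sketch, assuming a Debian or Ubuntu system where the listed names are also the apt package names. The `missing_packages` helper and the `REQUIRED` variable are hypothetical names introduced for illustration and are not part of the repository's build scripts.

```shell
#!/bin/sh
# Components named in the list above. Assumption: on Debian/Ubuntu these are
# also the apt package names (cmake's minimum version is not checked here).
REQUIRED="clang-3.9 cmake libunwind8 libomp-dev curl"

# Hypothetical helper: print each package that dpkg-query does not report
# as installed. A missing dpkg-query also counts the package as missing.
missing_packages() {
    for pkg in "$@"; do
        if ! dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null | grep -q "install ok installed"; then
            echo "$pkg"
        fi
    done
}

# Collapse the missing names onto one line; empty if everything is present.
MISSING=$(missing_packages $REQUIRED | tr '\n' ' ')
if [ -n "$MISSING" ]; then
    echo "Missing: $MISSING"
    echo "Install with: sudo apt-get install -y $MISSING"
else
    echo "All listed components are installed."
fi
```

The script only reports and suggests a command rather than installing anything, so it can be run without root. The distribution-specific `libssl` and `libicu` packages are left out because their names depend on the release, as the last list item notes.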
docs/code/SchemaComprehension.md: 3 additions & 3 deletions
@@ -144,7 +144,7 @@ If you ever see the error message that says: `An attempt was made to keep iterat
 `IDataView` [type system](IDataViewTypeSystem.md) differs slightly from the C# type system, so a 1-1 mapping between column types and C# types is not always feasible.
 Below are the most notable examples of the differences:

-* `IDataView` vector columns often have a fixed (and known) size. The C# array type best corresponds to a 'variable size' vector: the one that can have different number of slots on every row. You can use `[VectorType(N)]` attribute to an array field to specify that the column is a vector of fixed size N. This is often necessary: most ML components don't work with variable-size vectors, they require fixed-size ones.
+* `IDataView` vector columns often have a fixed (and known) size. The C# array type best corresponds to a 'variable size' vector: the one that can have different number of slots on every row. You can use `[VectorDataViewType(N)]` attribute to an array field to specify that the column is a vector of fixed size N. This is often necessary: most ML components don't work with variable-size vectors, they require fixed-size ones.
 * `IDataView`'s [key types](IDataViewTypeSystem.md#key-types) don't have a natural underlying C# type either. To declare a key-type column, you need to make your field an `uint`, and decorate it with `[KeyType]` to denote that the field is a key, and not a regular unsigned integer.

 ### Full list of type mappings
@@ -169,7 +169,7 @@ The below table illustrates what C# types are mapped to what `IDataView` types:
 |`DT`|`DvDateTime`||
 |`DZ`|`DvDateTimeZone`||
 | Variable-size vector |`VBuffer<T>`|`T[]`, and the vector is always dense |
-| Fixed-size vector |`VBuffer<T>` with `[VectorType(N)]`|`T[]` with `VectorType(N)`, and the vector is always dense |
+| Fixed-size vector |`VBuffer<T>` with `[VectorDataViewType(N)]`|`T[]` with `VectorDataViewType(N)`, and the vector is always dense |