Description
Add documentation to the following doc: https://docs.microsoft.com/en-us/dotnet/machine-learning/how-to-guides/load-data-ml-net
Describe ways that you can load data into an IDataView without defining input and output schema classes. These include:
- LoadFromTextFile method
- LoadFromEnumerable with anonymous types
LoadFromTextFile method
Given a dataset similar to the following:
Iris-setosa,5.1,3.5,1.4,0.2
Iris-setosa,4.9,3.0,1.4,0.2
Iris-setosa,4.7,3.2,1.3,0.2
You can use the following code to load the data into an IDataView:
open Microsoft.ML
open Microsoft.ML.Data
let ctx = new MLContext()
let options = new TextLoader.Options()
options.Separators <- [|','|]
let idv = ctx.Data.LoadFromTextFile("iris.data.txt", options)
There are a few assumptions made:
- Your first column is your label / target variable
- All your features are floats. If a column holds a different type (e.g. a string), its values cannot be parsed as floats and are loaded as NaN
Once loaded, an IDataView is created with two columns:
- Label
- Features
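To check which columns were actually inferred, one option is to print the schema of the loaded IDataView. The following is a minimal sketch, assuming the same iris.data.txt file and separator settings as above; it only reads the Name and Type of each schema column and is not part of the proposed documentation text.

```fsharp
open Microsoft.ML
open Microsoft.ML.Data

let ctx = new MLContext()
let options = new TextLoader.Options()
options.Separators <- [|','|]

// Assumes iris.data.txt sits in the working directory, as in the example above.
let idv = ctx.Data.LoadFromTextFile("iris.data.txt", options)

// Print the name and type of every column in the inferred schema.
for column in idv.Schema do
    printfn "%s: %O" column.Name column.Type
```

Printing the schema is a quick way to confirm the column layout before building a pipeline on top of it.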
LoadFromEnumerable Anonymous Types
When you have a collection of anonymous types, you can use the LoadFromEnumerable method and the schema is inferred. For example:
open Microsoft.ML
let ctx = new MLContext()
let reviews =
    seq {
        {|SentimentText = "This is a great steak"; Label = true|}
        {|SentimentText = "Service was bad"; Label = false|}
        {|SentimentText = "I did not like the green eggs and ham"; Label = false|}
    }
let idvAnonIEnumerable = ctx.Data.LoadFromEnumerable(reviews)