Closed
Labels
API: Issues pertaining the friendly API
Description
FeaturizeText was upgraded to allow specification of n-grams for words and characters. However, it is now awkward to use FeaturizeText without specifying n-grams: it is necessary to explicitly set CharFeatureExtractor to null.
This is how to compose a bag-of-words with the current API:

    var pipeline = mlContext.Transforms.Text.FeaturizeText(
        "Features",
        new TextFeaturizingEstimator.Options
        {
            KeepPunctuations = false,
            OutputTokens = true,
            CharFeatureExtractor = null,
            WordFeatureExtractor = new WordBagEstimator.Options { NgramLength = 1 },
            VectorNormalizer = TextFeaturizingEstimator.NormFunction.None
        },
        "SentimentText");

I would expect to be able to do something like

    CharFeatureExtractor = new WordBagEstimator.Options { NgramLength = 0 },

But this throws an error that Skipgrams is not less-than NgramLength, and Skipgrams must be positive.
Overall, it is a bit awkward, and not obvious, that you have to manually set an option to null. Is this the API we want to ship in v1.0?