Closed
Labels
documentation (Related to documentation of ML.NET)
Description
TokenizeIntoCharactersAsKeys:
- The description of TokenizingByCharactersEstimator should be corrected to:
"Create a TokenizingByCharactersEstimator, which tokenizes words by splitting text into sequences of characters using a sliding window."
- Should the outputColumnName description state that the outputs are UInts rather than keys? I think users might be confused that those are KeyDataViewTypes. Or should the name of this method be changed? @artidoro @Ivanidzo4ka @zeahmed ?
"Name of the column resulting from the transformation of inputColumnName. This column's data type will be a variable-sized vector of UInt."
- useMarkerCharacters needs a better description.
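To illustrate why the output type matters for the outputColumnName description, here is a minimal Python sketch (not ML.NET code) of character-level tokenization that yields a variable-sized vector of unsigned integer codes. The function name `tokenize_chars`, the `use_markers` parameter, and the marker values 0x02 and 0x03 are assumptions made for illustration only; the actual behavior of useMarkerCharacters should be checked against the ML.NET source.

```python
# Hypothetical illustration only: this is not the ML.NET implementation.
START_MARKER = 0x02  # assumed start-of-text marker value
END_MARKER = 0x03    # assumed end-of-text marker value


def tokenize_chars(text, use_markers=True):
    """Map each character of `text` to its integer code point.

    The result is a variable-length list of unsigned integers, one per
    character, optionally wrapped in start and end marker values.
    """
    codes = [ord(ch) for ch in text]
    if use_markers:
        codes = [START_MARKER] + codes + [END_MARKER]
    return codes


if __name__ == "__main__":
    print(tokenize_chars("ab"))
    print(tokenize_chars("ab", use_markers=False))
```

The output length depends on the input length, which is why the description calls it a variable-sized vector.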
RemoveStopWords
- inputColumnName:
"This estimator operates over a vector of text."
CustomStopWordsRemovingEstimator
- Output column data type:
"Variable-sized vector of Text"
Replace "Unknown-sized vector" with "Variable-sized vector".
- xref not resolving:
xref:Microsoft.ML.Transforms.Text.CustomStopWordsRemovingTransformer/
WordHashBagEstimator
- Output column data type
Known-size vector of of Single
- Replace "metadata" with "annotations" in the documentation references.
NgramHashingEstimator
- broken xref:Microsoft.ML.Transforms.Text.NgramHashingTransformer/ link.
- casing: "in a way that the former takes "
NormalizeText
- outputColumnName
"This column's data type is a scalar of text or "
WordEmbeddingEstimator
- Add links for Glove50D, dimensionality of the embedding model used.
- Rephrase here and everywhere: "See the See Also section for links to usage examples."