Improve VectorTypeAttribute(dims) docs #5301
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Addresses #5273 : On the confusion of users about how to use the VectorTypeAttribute.
As confirmed offline by @yaeldekel , users currently can't provide a multidimensional array with a
VectorTypeAttribute(dims). They should flatten their arrays into 1-D arrays and then use theVectorTypeAttribute(dims)to describe the shape ML.NET will use internally.Note: Originally I was planning on throwing a readable exception if the user added the
VectorTypeAttributeto a multidimensional array, but some discussion would be needed to decide where to do this (either inInternalSchemaDefinition.Create, or somewhere else onInternalSchemaDefinitionorSchemaDefinitionitself). Furthermore, I'm thinking that maybe there's a bug on these classes, where if a user adds a customDataViewType(see #3263 ) and adds the attribute of this type to an array, then some unexpected behavior will happen asInternalSchemaDefinition.GetMappedType()assumes that any array (multidimensional or otherwise) isisVector = true, and then elsewhere in the code we assume that if a typeisVector == truethen it should be aVectorDataViewType. More discussion and inspection would be needed on this regard and it might, or not, affect the decision of where to throw an exception for multidimensionals array withVectorTypeAttribute. And these may be addressed on a separate PR.So, for the time being, and the fact that this might not be a common problem anyway, I think that simply changing slightly the docs is the best option for the next release.