Skip to content

Conversation

@Zruty0
Copy link
Contributor

@Zruty0 Zruty0 commented Sep 14, 2018

Added tests for metadata propagation on existing trainers, also fixed SDCA to pass metadata correctly.

@Zruty0 Zruty0 added the API Issues pertaining the friendly API label Sep 14, 2018
@Zruty0 Zruty0 self-assigned this Sep 14, 2018
/// <summary>
/// Normal metadata that we produce for score columns.
/// </summary>
protected static IEnumerable<SchemaShape.Column> MetadataForScoreColumn(bool isNormalized = false)
Copy link
Member

@sfilipi sfilipi Sep 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

protected static IEnumerable<SchemaShape.Column> MetadataForScoreColu [](start = 8, length = 69)

should this live under some utilities class, since we'll need to replicate for the trainers that don't extend this base class? #Resolved

/// <summary>
/// Normal metadata that we produce for score columns.
/// </summary>
protected static IEnumerable<SchemaShape.Column> MetadataForScoreColumn(bool isNormalized = false)
Copy link
Member

@sfilipi sfilipi Sep 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isNormalized [](start = 85, length = 12)

will this come from the trainerInfo, or it is always true for the Propability type column? #Resolved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's always true for Probability


In reply to: 217573257 [](ancestors = 217573257)

new SchemaShape.Column(DefaultColumnNames.PredictedLabel, SchemaShape.Column.VectorKind.Scalar, BoolType.Instance, false)
new SchemaShape.Column(DefaultColumnNames.Score, SchemaShape.Column.VectorKind.Scalar, NumberType.R4, false, new SchemaShape(MetadataForScoreColumn())),
new SchemaShape.Column(DefaultColumnNames.Probability, SchemaShape.Column.VectorKind.Scalar, NumberType.R4, false, new SchemaShape(MetadataForScoreColumn(true))),
new SchemaShape.Column(DefaultColumnNames.PredictedLabel, SchemaShape.Column.VectorKind.Scalar, BoolType.Instance, false, new SchemaShape(MetadataForScoreColumn()))
Copy link
Member

@sfilipi sfilipi Sep 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shape.Column(DefaultColumnNames.PredictedLabel [](start = 26, length = 46)

is it intentional to not use also the input label metadata? #Resolved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, the input label metadata is not passed through. I'm merely trying to codify existing behavior of the trainer, which has not been codified before


In reply to: 217574282 [](ancestors = 217574282)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious: why do you concat to the label metadata of SDCA multiclass, then? To propagate the info about the string - categories conversion?


In reply to: 217769414 [](ancestors = 217769414,217574282)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SDCA propagates key value metadata, that's what I'm doing there. And I think all the multiclass learners do (or should do)


In reply to: 217781701 [](ancestors = 217781701,217769414,217574282)

new SchemaShape.Column(DefaultColumnNames.PredictedLabel, SchemaShape.Column.VectorKind.Scalar, BoolType.Instance, false)
new SchemaShape.Column(DefaultColumnNames.Score, SchemaShape.Column.VectorKind.Scalar, NumberType.R4, false, new SchemaShape(MetadataForScoreColumn())),
new SchemaShape.Column(DefaultColumnNames.Probability, SchemaShape.Column.VectorKind.Scalar, NumberType.R4, false, new SchemaShape(MetadataForScoreColumn(true))),
new SchemaShape.Column(DefaultColumnNames.PredictedLabel, SchemaShape.Column.VectorKind.Scalar, BoolType.Instance, false, new SchemaShape(MetadataForScoreColumn()))
Copy link
Member

@sfilipi sfilipi Sep 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MetadataForScoreColumn [](start = 154, length = 22)

shoudl this just be called Metadata, since it is used for all output columns? #Resolved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe


In reply to: 217574324 [](ancestors = 217574324)

cols.Add(new SchemaShape.Column(MetadataUtils.Kinds.ScoreColumnKind, SchemaShape.Column.VectorKind.Scalar, TextType.Instance, false));
cols.Add(new SchemaShape.Column(MetadataUtils.Kinds.ScoreValueKind, SchemaShape.Column.VectorKind.Scalar, TextType.Instance, false));
if (isNormalized)
cols.Add(new SchemaShape.Column(MetadataUtils.Kinds.IsNormalized, SchemaShape.Column.VectorKind.Scalar, BoolType.Instance, false));
Copy link
Member

@sfilipi sfilipi Sep 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Score column generated from the OVA tests, i see yet another column: "Slot Names"
Dees that need to be added to the list of metadata here? #Resolved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if it's only OVA, the answer is no


In reply to: 217576812 [](ancestors = 217576812)

Copy link
Member

@sfilipi sfilipi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@artidoro artidoro self-requested a review September 14, 2018 20:00
@Zruty0 Zruty0 merged commit a901048 into dotnet:master Sep 14, 2018
@Zruty0 Zruty0 deleted the feature/trainer-metadata branch September 14, 2018 20:59
@ghost ghost locked as resolved and limited conversation to collaborators Mar 29, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

API Issues pertaining the friendly API

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants