Exception is thrown if NDCG > 10 is used with LightGbm for evaluating ranking #3993

@nicolehaugen

Description

  • Version: ML.NET 1.2.0

The current code in the RankingEvaluator.cs file sets the MaxTruncationLevel for NDCG (Normalized Discounted Cumulative Gain) to 10, and throws an exception if the NDCG truncation level is set to a value > 10. This is a blocking issue for ranking because it prevents measuring ranking quality for result sets larger than 10. For example, if you rank a group of 100 results with a MaxTruncationLevel of 10, you can only measure whether the first 10 results were ranked correctly.
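To make the limitation concrete, here is a minimal sketch of NDCG@k. It is not ML.NET's implementation: it assumes the common gain of 2^label - 1 and a log2 position discount, and the names NdcgSketch, DcgAt and NdcgAt are hypothetical. The two rankings agree on their first 10 positions and differ only in positions 11 to 20, so a truncation level of 10 cannot tell them apart, while a truncation level of 20 can.

```csharp
using System;
using System.Linq;

public static class NdcgSketch
{
    // DCG@k over labels listed in ranked order.
    // Assumed gain: 2^label - 1; assumed discount: log2(position + 1), positions starting at 1.
    public static double DcgAt(int k, int[] labelsInRankedOrder)
    {
        double dcg = 0;
        int n = Math.Min(k, labelsInRankedOrder.Length);
        for (int i = 0; i < n; i++)
            dcg += (Math.Pow(2, labelsInRankedOrder[i]) - 1) / Math.Log(i + 2, 2);
        return dcg;
    }

    // NDCG@k: DCG@k divided by the DCG@k of the ideal (label-descending) ordering.
    public static double NdcgAt(int k, int[] labelsInRankedOrder)
    {
        int[] ideal = labelsInRankedOrder.OrderByDescending(l => l).ToArray();
        double idealDcg = DcgAt(k, ideal);
        return idealDcg == 0 ? 0 : DcgAt(k, labelsInRankedOrder) / idealDcg;
    }

    public static void Main()
    {
        int[] top10 = { 3, 3, 2, 2, 2, 1, 1, 1, 1, 1 };
        // Same first 10 results; the remaining 10 are ordered well in A and badly in B.
        int[] a = top10.Concat(new[] { 1, 1, 1, 1, 1, 0, 0, 0, 0, 0 }).ToArray();
        int[] b = top10.Concat(new[] { 0, 0, 0, 0, 0, 1, 1, 1, 1, 1 }).ToArray();

        Console.WriteLine($"NDCG@10: A={NdcgAt(10, a):F4} B={NdcgAt(10, b):F4}");
        Console.WriteLine($"NDCG@20: A={NdcgAt(20, a):F4} B={NdcgAt(20, b):F4}");
    }
}
```

With these inputs the NDCG@10 values are identical for A and B, and only NDCG@20 ranks A above B, which is the measurement the current limit rules out.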

Here's the code:

public RankingEvaluator(IHostEnvironment env, Arguments args)
    : base(env, LoadName)
{
    // REVIEW: What kind of checking should be applied to labelGains?
    if (args.DcgTruncationLevel <= 0 || args.DcgTruncationLevel > Aggregator.Counters.MaxTruncationLevel)
        throw Host.ExceptUserArg(nameof(args.DcgTruncationLevel), "DCG Truncation Level must be between 1 and {0}", Aggregator.Counters.MaxTruncationLevel);
    Host.CheckUserArg(args.LabelGains != null, nameof(args.LabelGains), "Label gains cannot be null");
    ...
}

Judging by the // REVIEW comment in the above code, this functionality doesn't appear to have been fully completed.

While I'm unsure what the MaxTruncationLevel value should be, I have seen a ranking contest on Kaggle.com that measured NDCG with a truncation level of up to 38.

I also noticed that another part of this file accepts a wider range, allowing any truncation level from 1 to 99:

public Transform(IHostEnvironment env, IDataView input, string labelCol, string scoreCol, string groupCol,
    int truncationLevel, Double[] labelGains)
    : base(env, input, labelCol, scoreCol, groupCol, RegistrationName)
{
    Host.CheckParam(0 < truncationLevel && truncationLevel < 100, nameof(truncationLevel),
        "Truncation level must be between 1 and 99");
    ...
}

Also, see the related issue for this scenario: Ranker Evaluate doesn't allow you specify metric parameters (#2728)
