Conversation

SolyarA (Contributor) commented Jul 24, 2018

Fixes #567

dnfclas commented Jul 24, 2018

CLA assistant check
All CLA requirements met.

TomFinley (Contributor) left a comment

Thank you @SolyarA ... this looks OK to me. @justinormont, is it along the lines of what you wanted in #567?

justinormont (Contributor) commented

@TomFinley, @Zruty0: What are the implications of missing the range of 0.4 to 0.5 for L2RegularizerWeight in AveragedPerceptron?

A better fix would be to add a parameter to the sweep range noting whether the range boundaries are inclusive or exclusive.
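
As a rough sketch of that idea (the type and member names below are hypothetical, not the existing ML.NET sweeper API), a sweep range could carry explicit inclusivity flags and respect them when sampling:

using System;

// Hypothetical sketch only -- not the actual ML.NET sweeper API.
public sealed class FloatSweepRange
{
    public float Min { get; }
    public float Max { get; }
    public bool MinIsInclusive { get; }
    public bool MaxIsInclusive { get; }

    public FloatSweepRange(float min, float max, bool minIsInclusive = true, bool maxIsInclusive = true)
    {
        Min = min;
        Max = max;
        MinIsInclusive = minIsInclusive;
        MaxIsInclusive = maxIsInclusive;
    }

    // Draw a value uniformly, re-drawing in the rare case an excluded boundary is hit.
    public float Sample(Random rng)
    {
        float value;
        do
        {
            value = Min + (float)rng.NextDouble() * (Max - Min);
        }
        while ((!MinIsInclusive && value == Min) || (!MaxIsInclusive && value == Max));
        return value;
    }
}

// Usage: the trainer's valid range for L2RegularizerWeight is [0, 0.5), so:
// var range = new FloatSweepRange(0f, 0.5f, minIsInclusive: true, maxIsInclusive: false);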

Zruty0 (Contributor) left a comment

:shipit:

Zruty0 (Contributor) commented Jul 25, 2018

I cannot say what the implications are; I don't have an intuition for how the optimizer would behave close to the 0.5 boundary.

As for inclusive vs. exclusive boundaries: I believe they were originally included in the code (as Min/Max and also Inf/Lim, or something along those lines), but I assume that was deemed unnecessary complexity and removed? I can't find any trace of it anymore.


In reply to: 407652667

TomFinley (Contributor) commented

I'm not sure "close" to 0.5 is actually a completely sensible value. See here:

// Weights are scaled down by 2 * L2 regularization on each update step, so 0.5 would scale all weights to 0, which is not sensible.
Contracts.CheckUserArg(0 <= args.L2RegularizerWeight && args.L2RegularizerWeight < 0.5, nameof(args.L2RegularizerWeight), "must be in range [0, 0.5)");

and here:

WeightsScale *= 1 - 2 * Args.L2RegularizerWeight; // L2 regularization.

Indeed, I feel like this is all somewhat haphazard, and whoever introduced this sweep range made the mistake of confusing the sweep range with defining the valid values... which is not the point at all. Anyway, I'm inclined to just accept this, @justinormont, if that is all right. It does seem, though, that if we are going to have continuous values, the notion of inclusive vs. exclusive bounds needs to be accounted for somehow; I'm not sure why such a concept would be removed. 😦
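
To make that arithmetic concrete, here is a small standalone sketch (not trainer code; the constant below is chosen only for illustration) that applies the same per-update scaling:

using System;

float weightsScale = 1f;
const float l2RegularizerWeight = 0.4f; // illustrative value at the upper end of the new sweep range

for (int step = 1; step <= 5; step++)
{
    weightsScale *= 1f - 2f * l2RegularizerWeight; // same form as the trainer's update step
    Console.WriteLine($"step {step}: WeightsScale = {weightsScale}");
}

// With 0.4 the scale is about 0.2 after one step and about 0.0003 after five;
// with 0.5 it would be exactly 0 after the first step, which is why 0.5 is excluded.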

@TomFinley TomFinley merged commit 7fea0af into dotnet:master Jul 25, 2018
justinormont (Contributor) commented

Thanks for pushing this in. And thanks, @SolyarA, for your PR.

@TomFinley: Agreed; we will want to reduce the sweep parameter ranges from the valid ranges down to the useful ranges. This will speed up hyperparameter optimization. The only reason I see to keep the ranges as wide as the valid ranges is if we can find examples where the extreme values led to good scores. We have further ideas on how to focus the sweeper's energy on the useful ranges of hyperparameters, so perhaps the work of figuring out the useful ranges won't be needed.

eerhardt pushed a commit to eerhardt/machinelearning that referenced this pull request Jul 27, 2018
* Changed range of L2RegularizerWeight parameter in AveragedPerceptron
codemzs pushed a commit to codemzs/machinelearning that referenced this pull request Aug 1, 2018
* Changed range of L2RegularizerWeight parameter in AveragedPerceptron
ghost locked as resolved and limited conversation to collaborators Mar 29, 2022