@zhengruifeng
Contributor

@zhengruifeng zhengruifeng commented May 16, 2017

What changes were proposed in this pull request?

Make String Params Case-Insensitive:
solver, modelType, initMode, metricName, handleInvalid, strategy, stringOrderType, coldStartStrategy, impurity, lossType, featureSubsetStrategy, intermediateStorageLevel, finalStorageLevel

ChiSqSelector and StringIndexer are left unchanged, because they are not as easy to handle as the others.

How was this patch tested?

Existing tests and newly added tests.

@SparkQA
SparkQA commented May 16, 2017

Test build #76962 has finished for PR 17995 at commit abac904.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
SparkQA commented May 17, 2017

Test build #76999 has finished for PR 17995 at commit eecc1b0.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
SparkQA commented May 17, 2017

Test build #77004 has finished for PR 17995 at commit 97b8df6.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
SparkQA commented May 17, 2017

Test build #77009 has started for PR 17995 at commit bed4c41.

@SparkQA
SparkQA commented May 17, 2017

Test build #77005 has finished for PR 17995 at commit 6b1fed5.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zhengruifeng
Contributor Author

Jenkins, retest this please

@SparkQA
SparkQA commented May 18, 2017

Test build #77038 has finished for PR 17995 at commit bed4c41.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zhengruifeng
Contributor Author

ping @yanboliang

@SparkQA
SparkQA commented May 31, 2017

Test build #77568 has finished for PR 17995 at commit 0db3a52.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

Could we organize this as

val AreaUnderROC: String = "areaUnderROC".toLowerCase
val AreaUnderPR: String = "areaUnderPR".toLowerCase
val supportedMetricNames = Set(AreaUnderROC, AreaUnderPR)

in object BinaryClassificationEvaluator? That would be clearer.
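
For illustration, a minimal, self-contained sketch of how constants organized this way might be used for a case-insensitive lookup. The object name `MetricNamesSketch` and the helper `isSupported` are hypothetical and not part of the PR; the sketch uses `Locale.ROOT` on the assumption that lower-casing should not depend on the JVM's default locale.

```scala
import java.util.Locale

// Hypothetical stand-in for the companion object; not the actual Spark code.
object MetricNamesSketch {
  // Constants are stored in lower case so they can be compared
  // directly against lower-cased user input.
  val AreaUnderROC: String = "areaUnderROC".toLowerCase(Locale.ROOT)
  val AreaUnderPR: String = "areaUnderPR".toLowerCase(Locale.ROOT)
  val supportedMetricNames: Set[String] = Set(AreaUnderROC, AreaUnderPR)

  // Hypothetical helper: case-insensitive membership test.
  def isSupported(metricName: String): Boolean =
    supportedMetricNames.contains(metricName.toLowerCase(Locale.ROOT))

  def main(args: Array[String]): Unit = {
    println(isSupported("areaUnderROC"))
    println(isSupported("AREAUNDERPR"))
    println(isSupported("accuracy"))
  }
}
```

With this layout, the set of supported names is defined in one place and every check goes through the same normalization.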

Contributor

Ditto.

Contributor

Ditto.

Contributor

Supported selector types should always be stored in lower case. Please update the corresponding code in mllib.feature.ChiSqSelector from:

private[spark] val NumTopFeatures: String = "numTopFeatures"
......

to

private[spark] val NumTopFeatures: String = "numTopFeatures".toLowerCase
......
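
To illustrate why the case of the stored constant matters, here is a minimal sketch under the assumption that user input is lower-cased before comparison. The object name, the two constant names, and the `normalize` helper are hypothetical and only mirror the shape of the snippet above.

```scala
import java.util.Locale

// Hypothetical illustration; not the actual ChiSqSelector code.
object SelectorTypeSketch {
  // Stand-ins for the constant before and after the suggested change.
  val numTopFeaturesOriginal: String = "numTopFeatures"
  val numTopFeaturesLower: String = "numTopFeatures".toLowerCase(Locale.ROOT)

  // Assumed normalization applied to user input before comparison.
  def normalize(value: String): String = value.toLowerCase(Locale.ROOT)

  def main(args: Array[String]): Unit = {
    val userInput = "NumTopFeatures"
    // A lower-cased input can never equal a mixed-case constant.
    println(normalize(userInput) == numTopFeaturesOriginal)
    // It does equal the lower-cased constant.
    println(normalize(userInput) == numTopFeaturesLower)
  }
}
```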

Contributor

Is this really necessary?

Contributor

Ditto.

Contributor

Actually, we can't do this, since MLlib supports setting params via other entry points. For now we can leave it as it is, until #16028 is resolved.

Contributor

@yanboliang yanboliang Jun 26, 2017

What do you think of adding a new function in object ParamValidators, such as

def inStringArray(allowed: Array[String]): String => Boolean = { (value: String) =>
  allowed.contains(value.toLowerCase(java.util.Locale.ROOT))
}

to facilitate similar checks here and in other places? cc @hhbyyh @sethah
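
As a rough sketch of how such a validator could behave outside of Spark: the function below has the same shape as the proposal, while the wrapping object, the `supportedSolvers` array, and `isValidSolver` are hypothetical names chosen for illustration. It assumes the entries of `allowed` are already lower case, since only the incoming value is normalized.

```scala
import java.util.Locale

// Hypothetical standalone sketch; not the actual ParamValidators code.
object InStringArraySketch {
  // Same shape as the proposed validator. Only `value` is lower-cased,
  // so `allowed` is assumed to contain lower-case entries.
  def inStringArray(allowed: Array[String]): String => Boolean = { (value: String) =>
    allowed.contains(value.toLowerCase(Locale.ROOT))
  }

  // Illustrative list of allowed values (an assumption for this sketch).
  val supportedSolvers: Array[String] = Array("auto", "normal", "l-bfgs")

  val isValidSolver: String => Boolean = inStringArray(supportedSolvers)

  def main(args: Array[String]): Unit = {
    println(isValidSolver("L-BFGS"))
    println(isValidSolver("Normal"))
    println(isValidSolver("sgd"))
  }
}
```

Using `Locale.ROOT` keeps the result independent of the JVM's default locale; in a Turkish locale, for example, lower-casing "I" yields a dotless "ı", which would not match an ASCII "i" in the allowed list.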

Contributor Author

I think it's a good idea.

@zhengruifeng
Contributor Author

@yanboliang I updated this PR and reverted the changes to setSolver in GLR and LiR. Thanks for reviewing.

@SparkQA
SparkQA commented Jun 26, 2017

Test build #78607 has finished for PR 17995 at commit 28941f3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
SparkQA commented Jun 26, 2017

Test build #78619 has finished for PR 17995 at commit 70e479b.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
SparkQA commented Jun 26, 2017

Test build #78623 has finished for PR 17995 at commit a4fa44e.

  • This patch fails PySpark pip packaging tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
SparkQA commented Jun 26, 2017

Test build #78621 has finished for PR 17995 at commit c377208.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

yanboliang referenced this pull request Jul 1, 2017
## What changes were proposed in this pull request?
1, make param support non-final with `finalFields` option
2, generate `HasSolver` with `finalFields = false`
3, override `solver` in LiR, GLR, and make MLPC inherit `HasSolver`

## How was this patch tested?
existing tests

Author: Ruifeng Zheng <[email protected]>
Author: Zheng RuiFeng <[email protected]>

Closes #16028 from zhengruifeng/param_non_final.
@SparkQA
SparkQA commented Jul 3, 2017

Test build #79060 has finished for PR 17995 at commit c58614f.

  • This patch fails PySpark pip packaging tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
SparkQA commented Jul 3, 2017

Test build #79066 has finished for PR 17995 at commit 1715131.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
SparkQA commented Jul 3, 2017

Test build #79065 has finished for PR 17995 at commit 6557b37.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
SparkQA commented Jul 3, 2017

Test build #79068 has finished for PR 17995 at commit 1997cd1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

${getModelType} -> $getModelType

Contributor

It's better to keep the original style, since we may add a new metric that does not comply with isLargerBetter.

Contributor

Can we make it private[BinaryClassificationEvaluator]?

Contributor

Check for string value in an allowed set of string values in a case-insensitive way.

Contributor

I think this function is unnecessary and adds extra computation cost.

Contributor

Please keep the original output format for supported family names.

Contributor

Please revert this change.

Contributor

NORMAL -> Normal

@SparkQA
SparkQA commented Jul 6, 2017

Test build #79258 has finished for PR 17995 at commit 4a6682f.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
SparkQA commented Jul 6, 2017

Test build #79260 has finished for PR 17995 at commit 8256530.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
SparkQA commented Jul 6, 2017

Test build #79262 has finished for PR 17995 at commit 6d89c00.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zhengruifeng zhengruifeng deleted the str_get_set branch August 10, 2017 05:33