@feynmanliang
Contributor

Implementation of significance testing using Streaming API.

@feynmanliang feynmanliang changed the title A/B testing [MLLib] A/B testing Feb 21, 2015
@feynmanliang feynmanliang changed the title [MLLib] A/B testing [SPARK-3147] [MLLib] A/B testing Feb 21, 2015
@feynmanliang feynmanliang changed the title [SPARK-3147] [MLLib] A/B testing [SPARK-3147][MLLib] A/B testing Feb 21, 2015
@mengxr
Contributor

mengxr commented Feb 23, 2015

add to whitelist

@mengxr
Contributor

mengxr commented Feb 23, 2015

ok to test

@SparkQA

SparkQA commented Feb 23, 2015

Test build #27851 has finished for PR 4716 at commit 4bb8636.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@feynmanliang
Contributor Author

[error] * abstract method numDim()Int in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary does not have a correspondent in old version

Should we keep numDim, or remove it from this patch completely, since MultivariateStatisticalSummary is no longer used by OnlineABTest (it was replaced by the univariate StatsCounter)?

@mengxr
Contributor

mengxr commented Feb 23, 2015

Let's remove numDim.

@SparkQA

SparkQA commented Feb 24, 2015

Test build #27880 has finished for PR 4716 at commit 7ce63c8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mengxr
Contributor

mengxr commented Apr 7, 2015

@freeman-lab Do you want to make a pass on this PR?

@freeman-lab
Contributor

@mengxr @feynmanliang sure thing! This looks really cool, will try to go through it in the next couple days.

@SparkQA

SparkQA commented May 11, 2015

Test build #32412 has finished for PR 4716 at commit 80c2211.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rxin
Contributor

rxin commented May 11, 2015

@feynmanliang can you close this PR?

@mengxr
Contributor

mengxr commented May 11, 2015

@feynmanliang GitHub messed up the diff. Could you merge the current master and push an update? Another way to refresh the diff is to close this PR and then re-open it, as @rxin suggested. Thanks!

@SparkQA

SparkQA commented May 11, 2015

Test build #32432 has finished for PR 4716 at commit a36ba79.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented May 12, 2015

Test build #32433 has finished for PR 4716 at commit f4de414.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 31, 2015

Test build #39295 has finished for PR 4716 at commit 93a0d34.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Aug 1, 2015

Test build #39300 has finished for PR 4716 at commit 76f40ff.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Aug 1, 2015

Test build #39308 has finished for PR 4716 at commit c572417.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@feynmanliang feynmanliang changed the title [SPARK-3147][MLLib] A/B testing [SPARK-3147][MLLib][Streaming] Streaming A/B testing Aug 18, 2015
feynmanliang and others added 2 commits September 11, 2015 12:18
Fix AB testing implementation and add unit tests.

Extract t-testing code out of OnlineABTesting.

Add peace period for dropping first k entries of each A/B group.

Add numDim to MultivariateOnlineSummarizer.

Refactored ABTestingMethod into sealed trait.

Add (non-sliding) testing window functionality.

Fix peace period implementation.

Fix test window batching.

Handle (inelegantly) closure capture for ABTestMethod

Improve handling of OnlineABTestMethod closure by moving DStream processing method into Serializable class.

Fixed flaky peacePeriod test.

Add ScalaDocs and format to style guide.

Add OnlineABTestExample.

Format code to style guide.

Switch MultivariateOnlineSummarizer to univariate StatsCounter.

Reduce number of passes in pairSummaries.

Add test for behavior when missing data from one group.

Remove numDim from MultivariateOnlineSummarizer.

Style guide in OnlineABTestSuite

Fix broken tests

Style fix

Fix runStream expectedOutput
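
The commit messages above mention three pieces: t-testing on univariate running statistics, a peace period that drops the first k entries of each A/B group, and a non-sliding testing window. The sketch below illustrates how those pieces could fit together. It is not the code from this patch: the names `RunningStats`, `welch_t`, and `summarize_windows` are hypothetical, the choice of Welch's t-test is an assumption, and the window is assumed to be counted in batches. It returns only the t statistic and degrees of freedom, not a p-value.

```python
import math
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Univariate running count, mean, and variance (Welford's algorithm)."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the running mean

    def add(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        # Unbiased sample variance; undefined for fewer than two points.
        return self.m2 / (self.n - 1) if self.n > 1 else float("nan")


def welch_t(a: RunningStats, b: RunningStats):
    """Return (t, degrees of freedom), or None if the test is undefined."""
    if a.n < 2 or b.n < 2:
        return None  # e.g. data missing from one group
    va, vb = a.variance / a.n, b.variance / b.n
    if va + vb == 0.0:
        return None  # both groups are constant
    t = (a.mean - b.mean) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (a.n - 1) + vb ** 2 / (b.n - 1))
    return t, df


def summarize_windows(batches, peace_period=0, window_size=1):
    """Drop the first `peace_period` entries of each group, then yield one
    (stats_a, stats_b) pair per non-overlapping window of `window_size`
    batches. Each batch is a list of (group, value) with group "A" or "B"."""
    seen = {"A": 0, "B": 0}
    kept = []
    for batch in batches:
        kept_batch = []
        for group, value in batch:
            seen[group] += 1
            if seen[group] > peace_period:
                kept_batch.append((group, value))
        kept.append(kept_batch)
    # Step by window_size so that windows never overlap (non-sliding).
    for start in range(0, len(kept) - window_size + 1, window_size):
        stats = {"A": RunningStats(), "B": RunningStats()}
        for batch in kept[start:start + window_size]:
            for group, value in batch:
                stats[group].add(value)
        yield stats["A"], stats["B"]


if __name__ == "__main__":
    demo = [
        [("A", 100.0), ("B", 100.0)],  # dropped by the peace period
        [("A", 1.0), ("A", 2.0), ("B", 2.0)],
        [("A", 3.0), ("B", 4.0), ("B", 6.0)],
    ]
    for stats_a, stats_b in summarize_windows(demo, peace_period=1, window_size=3):
        print(welch_t(stats_a, stats_b))
```

Keeping only a count, mean, and sum of squared deviations per group means each window can be summarized in a single pass, which is in the spirit of the switch to a univariate StatsCounter noted in the commits.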
@SparkQA

SparkQA commented Sep 11, 2015

Test build #42351 has finished for PR 4716 at commit 2493418.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class MultilayerPerceptronClassifier(JavaEstimator, HasFeaturesCol, HasLabelCol, HasPredictionCol,
    • class MultilayerPerceptronClassificationModel(JavaModel):
    • class MinMaxScaler(JavaEstimator, HasInputCol, HasOutputCol):
    • class MinMaxScalerModel(JavaModel):
    • ("thresholds", "Thresholds in multi-class classification to adjust the probability of " +
    • class HasElasticNetParam(Params):
    • class HasFitIntercept(Params):
    • class HasStandardization(Params):
    • class HasThresholds(Params):
    • thresholds = Param(Params._dummy(), "thresholds", "Thresholds in multi-class classification to adjust the probability of predicting each class. Array must have length equal to the number of classes, with values >= 0. The class with largest value p/t is predicted, where p is the original probability of that class and t is the class' threshold.")
    • self.thresholds = Param(self, "thresholds", "Thresholds in multi-class classification to adjust the probability of predicting each class. Array must have length equal to the number of classes, with values >= 0. The class with largest value p/t is predicted, where p is the original probability of that class and t is the class' threshold.")

@SparkQA

SparkQA commented Sep 11, 2015

Test build #42353 has finished for PR 4716 at commit 60b2e57.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class MinMaxScaler(JavaEstimator, HasInputCol, HasOutputCol):
    • class MinMaxScalerModel(JavaModel):

Contributor

Use {{{ ... }}} for example code.

Contributor Author

OK

@mengxr
Contributor

mengxr commented Sep 18, 2015

test this please

@SparkQA

SparkQA commented Sep 18, 2015

Test build #42642 has finished for PR 4716 at commit 60b2e57.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 18, 2015

Test build #42689 has finished for PR 4716 at commit ba71bfa.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mengxr
Contributor

mengxr commented Sep 21, 2015

LGTM. Merged into master. Thanks! Sorry for the long delay!

@asfgit asfgit closed this in aeef44a Sep 21, 2015
@feynmanliang feynmanliang deleted the ab_testing branch January 13, 2016 19:31

5 participants