Conversation

@yanboliang
Contributor

What changes were proposed in this pull request?

Add MinHash and RandomProjection Python API.

How was this patch tested?

Add doc tests.
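For readers unfamiliar with the technique, the idea behind MinHash can be sketched in plain Python. This is a minimal illustration only, not the Spark implementation or the API added by this PR: the names `make_hash_functions`, `minhash_signature`, `estimate_jaccard`, and `exact_jaccard`, the prime modulus, and the linear hash family are all assumptions made for the example.

```python
import random

# Modulus for the hash family h(x) = (a * x + b) mod PRIME.
# The specific prime is an assumption of this sketch.
PRIME = 2_147_483_647


def make_hash_functions(num_hashes, seed=42):
    """Draw (a, b) coefficient pairs for num_hashes random linear hashes."""
    rng = random.Random(seed)
    return [(rng.randint(1, PRIME - 1), rng.randint(0, PRIME - 1))
            for _ in range(num_hashes)]


def minhash_signature(items, hash_functions):
    """Return one minimum hash value per hash function.

    items is a non-empty set of non-negative integers, for example the
    indices of the non-zero entries of a sparse binary vector.
    """
    if not items:
        raise ValueError("MinHash is undefined for an empty set")
    return [min((a * x + b) % PRIME for x in items)
            for a, b in hash_functions]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree; estimates Jaccard similarity."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


def exact_jaccard(set_a, set_b):
    """Exact Jaccard similarity, for comparison with the estimate."""
    return len(set_a & set_b) / len(set_a | set_b)


hashes = make_hash_functions(num_hashes=64)
doc_a = {0, 1, 2, 5, 8}
doc_b = {1, 2, 5, 8, 13}
sig_a = minhash_signature(doc_a, hashes)
sig_b = minhash_signature(doc_b, hashes)
print("estimated:", estimate_jaccard(sig_a, sig_b))
print("exact:", exact_jaccard(doc_a, doc_b))
```

With more hash functions the estimate tends to move closer to the exact value, at the cost of longer signatures.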

@SparkQA

SparkQA commented Nov 4, 2016

Test build #68135 has finished for PR 15768 at commit 85d22c3.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class LSHParams(Params):
    • class LSHModel():
    • class MinHash(JavaEstimator, LSHParams, HasInputCol, HasOutputCol, HasSeed,
    • class MinHashModel(JavaModel, LSHModel, JavaMLReadable, JavaMLWritable):
    • class RandomProjection(JavaEstimator, LSHParams, HasInputCol, HasOutputCol, HasSeed,
    • class RandomProjectionModel(JavaModel, LSHModel, JavaMLReadable, JavaMLWritable):

@SparkQA

SparkQA commented Nov 4, 2016

Test build #68136 has finished for PR 15768 at commit cdeca1c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yanboliang
Contributor Author

cc @jkbradley @sethah

@sethah
Contributor

sethah commented Nov 7, 2016

I began to review this, but got sidetracked with a lot of the details we are currently discussing on the original LSH PR.

@jkbradley
Member

jkbradley commented Nov 28, 2016

This can now proceed since http://github.com/apache/spark/pull/15874 is ready to be merged. Sorry for the delay! This will need to slip to 2.2.

@jkbradley
Member

Pinging on this: What's a reasonable ETA for updating the PR? Thanks @yanboliang !

@yanboliang
Contributor Author

@jkbradley I just came back from vacation, and will update this PR before the weekend. Thanks.

@SparkQA

SparkQA commented Jan 24, 2017

Test build #3550 has finished for PR 15768 at commit cdeca1c.

  • This patch passes all tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@jkbradley
Member

Btw, @yanboliang and @Yunni did you sync? I'm fine with the takeover, but don't want to stomp on toes. Both can be listed as authors when this gets merged. Should we close this issue with the other taking its place?

@yanboliang
Contributor Author

I'm OK with closing this one and glad to help review #16715, but I don't think I will have time until next week.

@yanboliang yanboliang closed this Jan 28, 2017
@yanboliang yanboliang deleted the spark-18080 branch January 28, 2017 15:41
ghost pushed a commit to dbtsai/spark that referenced this pull request Feb 16, 2017
…e Hashing

## What changes were proposed in this pull request?
This pull request includes the Python API and examples for LSH. The API changes are based on yanboliang's PR apache#15768, with conflicts resolved and updates made to match API changes in the Scala API. The examples are consistent with the Scala examples of MinHashLSH and BucketedRandomProjectionLSH.

## How was this patch tested?
API and examples are tested using spark-submit:
`bin/spark-submit examples/src/main/python/ml/min_hash_lsh.py`
`bin/spark-submit examples/src/main/python/ml/bucketed_random_projection_lsh.py`

User guide changes are generated and manually inspected:
`SKIP_API=1 jekyll build`

Author: Yun Ni <[email protected]>
Author: Yanbo Liang <[email protected]>
Author: Yunni <[email protected]>

Closes apache#16715 from Yunni/spark-18080.
cmonkey pushed a commit to cmonkey/spark that referenced this pull request Feb 16, 2017