mengxr (Contributor) commented May 2, 2016

What changes were proposed in this pull request?

This PR continues the work from #11871 with the following changes:

  • load English stopwords as default
  • convert stopwords to a list in Python
  • update some tests and doc

How was this patch tested?

Unit tests.

Closes #11871

cc: @burakkose @srowen

SparkQA commented May 2, 2016

Test build #57535 has finished for PR 12843 at commit 42b54ca.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented May 2, 2016

Test build #57536 has finished for PR 12843 at commit 9f488fb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

   */
  val caseSensitive: BooleanParam = new BooleanParam(this, "caseSensitive",
-   "whether to do case-sensitive comparison during filtering")
+   "whether to do a case-sensitive comparison over the stop stop words")
Member

"stop stop" --> "stop"

jkbradley (Member) commented:

Should there be a unit test that iterates through StopWordsRemover.supportedLanguages, loads the stop words for each language, and checks that each list is non-empty?

Other than those small items, this looks good to me.
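
The check suggested above could look roughly like the following. This is only a minimal sketch, not the test that was added in this PR. It assumes Spark MLlib is on the classpath and uses `StopWordsRemover.loadDefaultStopWords`. The object name `StopWordsLoadCheck` and the hard-coded `languages` list are hypothetical: the list stands in for `StopWordsRemover.supportedLanguages`, which may not be accessible outside Spark's own packages.

```scala
import org.apache.spark.ml.feature.StopWordsRemover

object StopWordsLoadCheck {
  // Assumption: a hand-picked subset standing in for
  // StopWordsRemover.supportedLanguages, which may not be visible
  // to code outside Spark's own packages.
  val languages: Seq[String] = Seq("english", "french", "german")

  def main(args: Array[String]): Unit = {
    languages.foreach { lang =>
      // Load the bundled default stop words for this language.
      val words: Array[String] = StopWordsRemover.loadDefaultStopWords(lang)
      // Fail loudly if a resource file is missing or empty.
      assert(words.nonEmpty, s"default stop word list for '$lang' is empty")
    }
  }
}
```

Inside Spark's own test suite the loop could presumably run over `supportedLanguages` directly, so that newly added languages are covered without touching the test.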

SparkQA commented May 4, 2016

Test build #57769 has finished for PR 12843 at commit e2d0aba.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

jkbradley (Member) commented:

LGTM pending tests

SparkQA commented May 5, 2016

Test build #2974 has finished for PR 12843 at commit e2d0aba.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented May 5, 2016

Test build #57923 has finished for PR 12843 at commit df2d98f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

mengxr (Contributor, Author) commented May 6, 2016

Merged into master and branch-2.0.

@asfgit asfgit closed this in e20cd9f May 6, 2016
asfgit pushed a commit that referenced this pull request May 6, 2016
…ds for Stop Words Remover

## What changes were proposed in this pull request?

This PR continues the work from #11871 with the following changes:
* load English stopwords as default
* convert stopwords to a list in Python
* update some tests and doc

## How was this patch tested?

Unit tests.

Closes #11871

cc: burakkose srowen

Author: Burak Köse <[email protected]>
Author: Xiangrui Meng <[email protected]>
Author: Burak KOSE <[email protected]>

Closes #12843 from mengxr/SPARK-14050.

(cherry picked from commit e20cd9f)
Signed-off-by: Xiangrui Meng <[email protected]>