@erikerlandson
Contributor

drop, dropRight and dropWhile take an RDD as input and return a new RDD with elements dropped.

These methods are now implemented as lazy RDD transforms.
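
To illustrate the intended semantics, here is a minimal plain-Scala sketch of drop over partitioned data. It is not the code from this patch and does not use Spark: the `DropModel` object and the `Vector[Vector[T]]` stand-in for an RDD's ordered partitions are assumptions made purely for illustration, and the sketch is eager rather than lazy.

```scala
object DropModel {
  /** Drop the first n elements of a dataset modelled as ordered partitions.
    * Hypothetical helper: a Vector of Vectors stands in for an RDD here.
    */
  def drop[T](partitions: Vector[Vector[T]], n: Int): Vector[Vector[T]] = {
    // Running count of the elements that precede each partition.
    val offsets = partitions.scanLeft(0)(_ + _.size)
    partitions.zip(offsets).map { case (part, before) =>
      // How many elements are still to be dropped when this partition starts.
      val remaining = math.max(n - before, 0)
      part.drop(remaining)
    }
  }
}
```

In this model the number of partitions is preserved and only the partitions that overlap the first n elements are trimmed; whether the patch behaves the same way is not stated in this thread.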

@AmplabJenkins

Can one of the admins verify this patch?

@erikerlandson
Contributor Author

This is a reboot of:
#1254

@concretevitamin
Contributor

Jenkins, this is okay to test.

@erikerlandson
Contributor Author

Jenkins is still not getting the memo. How strict is Jenkins with commands? Is 'okay' the same as 'ok'?

@erikerlandson
Contributor Author

Assuming this is correct, "okay" is not the same as "ok":

The following regex checks that: .ok\W+to\W+test.
So I think you should be able to use it in a sentence or whatever.

https://groups.google.com/forum/#!msg/quicksilver---development/Bn7RPYqAfTI/cQ-_u1BbMEQJ
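
A quick way to check the "ok" versus "okay" question is to run the quoted pattern against both comments. This is only a sketch: the pattern is copied exactly as it appears above (it may have lost characters in formatting), the `OkToTest` object is a hypothetical name, and treating the check as an unanchored search is an assumption about how Jenkins applies it.

```scala
object OkToTest {
  // Pattern copied verbatim from the quoted message; not verified against Jenkins.
  private val Trigger = """.ok\W+to\W+test.""".r

  /** True when the pattern matches anywhere inside the comment (assumed behaviour). */
  def triggers(comment: String): Boolean =
    Trigger.findFirstIn(comment).isDefined
}
```

Under these assumptions, `OkToTest.triggers("Jenkins, this is ok to test.")` is true, while `OkToTest.triggers("Jenkins, this is okay to test.")` is false because `\W+` requires a non-word character immediately after "ok".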

@JoshRosen
Contributor

Jenkins, this is ok to test. Jenkins, test this please.

@SparkQA

SparkQA commented Sep 5, 2014

Can one of the admins verify this patch?

@SparkQA

SparkQA commented Oct 11, 2014

QA tests have started for PR 1839 at commit af73e1f.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Oct 11, 2014

QA tests have finished for PR 1839 at commit af73e1f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class FanInDep[T: ClassTag](rdd: RDD[T]) extends NarrowDependency[T](rdd)
    • class DropRDDFunctions[T : ClassTag](self: RDD[T]) extends Logging with Serializable
    • class FanOutDep[T: ClassTag](rdd: RDD[T]) extends NarrowDependency[T](rdd)
    • class PromisePartition extends Partition
    • class PromiseRDD[V: ClassTag](expr: => (TaskContext => V),
    • class PromiseArgPartition(p: Partition, argv: Seq[PromiseRDD[_]]) extends Partition
    • class PromiseRDDFunctions[T : ClassTag](self: RDD[T]) extends Logging with Serializable

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21648/
Test PASSed.

@SparkQA

SparkQA commented Oct 31, 2014

Test build #22578 has started for PR 1839 at commit af73e1f.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Oct 31, 2014

Test build #22578 has finished for PR 1839 at commit af73e1f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class FanInDep[T: ClassTag](rdd: RDD[T]) extends NarrowDependency[T](rdd)
    • class DropRDDFunctions[T : ClassTag](self: RDD[T]) extends Logging with Serializable
    • class FanOutDep[T: ClassTag](rdd: RDD[T]) extends NarrowDependency[T](rdd)
    • class PromisePartition extends Partition
    • class PromiseRDD[V: ClassTag](expr: => (TaskContext => V),
    • class PromiseArgPartition(p: Partition, argv: Seq[PromiseRDD[_]]) extends Partition
    • class PromiseRDDFunctions[T : ClassTag](self: RDD[T]) extends Logging with Serializable

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22578/
Test PASSed.

@AlexNisnevich

Have any admins verified this patch? drop functionality in RDDs would be very useful to have.

@JoshRosen
Contributor

@erikerlandson What do you think about releasing this (and maybe #1909) as a library on Maven or http://spark-packages.org? I'm not sure that this is an API that we necessarily want to put in core yet, but if you publish it as a package then folks would be able to use it with their existing Spark deployments without having to upgrade. The interface for users could still be pretty nice: just add an implicit class / object or set of implicit conversions, then have users import that.

Spark Packages has a helpful command line tool for creating a project template, which might be a timesaver if you decide to go this route: http://spark-packages.org/package/databricks/spark-package-cmd-tool.
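
The implicit-class approach suggested above could look roughly like the following. This is a hedged sketch in plain Scala 2, not Spark or Silex code: `DropSyntax`, `Partitioned` and `DropOps` are hypothetical names, and `Partitioned` is only a stand-in for an RDD so that the example runs without a Spark dependency.

```scala
object DropSyntax {
  // Hypothetical stand-in for an RDD: an ordered list of partitions.
  final case class Partitioned[T](parts: Vector[Vector[T]]) {
    def collect: Vector[T] = parts.flatten
  }

  // Enrichment class: importing DropSyntax._ makes dropWhile available on Partitioned.
  implicit class DropOps[T](self: Partitioned[T]) {
    def dropWhile(p: T => Boolean): Partitioned[T] = {
      // Index of the first partition holding an element that fails the predicate.
      val first = self.parts.indexWhere(_.exists(x => !p(x)))
      if (first < 0) {
        // Every element satisfies p, so everything is dropped.
        Partitioned(self.parts.map(_ => Vector.empty[T]))
      } else {
        Partitioned(self.parts.zipWithIndex.map { case (part, i) =>
          if (i < first) Vector.empty[T]          // entirely before the boundary
          else if (i == first) part.dropWhile(p)  // contains the boundary
          else part                               // entirely after the boundary
        })
      }
    }
  }
}
```

A user would write `import DropSyntax._` and then call `dropWhile` as if it were a built-in method, which is the kind of interface a separately published package could offer without changes to core.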

@erikerlandson
Contributor Author

Hi @JoshRosen, publishing some of these odds and ends in some form has been on my to-do list for a while. If there's interest, I can bump it up in priority.

@AlexNisnevich

@JoshRosen @erikerlandson That would be great.

@erikerlandson
Contributor Author

@AlexNisnevich
drop, dropRight and dropWhile are now available in the silex project:
http://silex.freevariable.com/latest/api/#com.redhat.et.silex.rdd.drop.DropRDDFunctions

@JoshRosen
Contributor

Hey @erikerlandson, since I don't think we're going to merge this functionality into core right now, do you mind closing this issue? BTW, it would be cool to list Silex on http://spark-packages.org, since that would put the library in front of a lot more users / eyeballs.

@erikerlandson
Contributor Author

@JoshRosen Yes, that's fine. I'll ping @willb about listing silex on spark-packages.org.
