Skip to content

Conversation

@cfregly
Copy link
Contributor

@cfregly cfregly commented Jul 16, 2014

No description provided.

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

Resolved conflict:
	project/SparkBuild.scala

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a copy-paste error? Did you mean to use "spark-kinesis-asl" here instead of "spark-ganglia-lgpl"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, stephen. thanks!

@mateiz
Copy link
Contributor

mateiz commented Jul 26, 2014

Jenkins, this is ok to test

@mateiz
Copy link
Contributor

mateiz commented Jul 26, 2014

Chris, is this new code from scratch, or is it based on Parviz's old pull request?

@SparkQA
Copy link

SparkQA commented Jul 26, 2014

QA tests have started for PR 1434. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17230/consoleFull

@cfregly
Copy link
Contributor Author

cfregly commented Jul 27, 2014

@mateiz -

this is a completely brand-new, from-scratch implementation.

parviz's old code was actually a Scala port of the Java-based Kinesis sample app found here: https://github.com/aws/aws-sdk-java/blob/master/src/samples/AmazonKinesisApplication/SampleRecordProcessor.java

this was fine for a quick/dirty sample of kinesis functionality, but my goal was to make this code more reusable, testable, readable, configurable, production-ready, and well-documented. the old code did not support the new Streaming 1.0 API and extras/ build structure (due to ASL-license). i also updated the AWS Java SDK and Kinesis Client Libraries to their latest versions.

here is parviz's PR for reference/comparison: #223.
i've addressed all of the comments provided in the old PR to speed up the acceptance of this new PR.

thanks and i look forward to getting this merged!

-chris

@cfregly
Copy link
Contributor Author

cfregly commented Jul 27, 2014

also, can someone address the questions i have here regarding the ec2 scripts and other peripheral aspects of this PR: https://issues.apache.org/jira/browse/SPARK-1981?focusedCommentId=14072761

thanks!

-chris

@cfregly
Copy link
Contributor Author

cfregly commented Jul 27, 2014

this PR worked for @srosenthal , btw.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is there a need for a complete script to run the kinesis example? Shouldnt the standard bin/run-example be sufficient?
The thing is that we want to have one streamlined way for running all the examples. The more special cases we make, the more we have to explicitly maintain in the future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pdeyhim explained me that this script is necessary as the jar for kinesis is not included in the spark jar. That makes sense.
But its best not to put such module-specific scripts in the main spark/bin directory. Its best to put this in extras/kinesis/bin/ .

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup, yup. this is a special circumstance due to the licensing. i didn't want to complicate the examples build by depending on an optional package, so this was the best workaround.

i put the scripts alongside the other scripts to reduce confusion, but i can move it to extras/kinesis/bin for sure. i'll update the docs accordingly.

@mateiz
Copy link
Contributor

mateiz commented Jul 29, 2014

@pdeyhim can you take a look over this too when you have a chance?

@tdas
Copy link
Contributor

tdas commented Jul 29, 2014

Jenkins, this is ok to test.

@tdas
Copy link
Contributor

tdas commented Jul 29, 2014

Jenkins, test this please.

@SparkQA
Copy link

SparkQA commented Jul 29, 2014

QA tests have started for PR 1434. This patch DID NOT merge cleanly!
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17365/consoleFull

@SparkQA
Copy link

SparkQA commented Jul 29, 2014

QA results for PR 1434:
- This patch PASSES unit tests.

For more information see test ouptut:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17365/consoleFull

@tdas
Copy link
Contributor

tdas commented Jul 29, 2014

Hey @cfregly,

My apologies for being so late to review this. Seems like the PR requires merging with the master, probably because of changes to pom.xml. Would you get a chance to merge with the master once again?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you rename is the directory spark-kinesis-asl to just kinesis. Its already in spark code base, so does not quite need another spark in the directory name. :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aah, I understand, you must have followed the existing spark-ganglia-gpl directory name. Let me figure out why it was named so.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I spoke to @pwendell and there is not need to have spark- as part of the name directory name and profile name. So can you make it kinesis-asl at all places.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense. i'll update.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inline comment style should /* ... / .... /* is reserved only for scala docs.

updated the build to only include kinesis-asl inside the examples jar
when -Pkinesis-asl is specified
@SparkQA
Copy link

SparkQA commented Aug 2, 2014

QA tests have started for PR 1434. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17736/consoleFull

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Change this, this will be super confusing in future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh yikes. yeah, that's just plain wrong.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correction, this is not a nit. Its already confusing ... please change this!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha, didnt see your response. Good 👍

@SparkQA
Copy link

SparkQA commented Aug 2, 2014

QA results for PR 1434:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds the following public classes (experimental):
public final class JavaKinesisWordCountASL {

For more information see test ouptut:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17736/consoleFull

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not need to extend Logging.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed and cleaned up imports

@SparkQA
Copy link

SparkQA commented Aug 2, 2014

QA tests have started for PR 1434. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17760/consoleFull

@SparkQA
Copy link

SparkQA commented Aug 2, 2014

QA results for PR 1434:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds the following public classes (experimental):
public final class JavaKinesisWordCountASL {

For more information see test ouptut:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17760/consoleFull

@tdas
Copy link
Contributor

tdas commented Aug 2, 2014

Thanks you very much @cfregly! I have merged this!!

asfgit pushed a commit that referenced this pull request Aug 2, 2014
Author: Chris Fregly <[email protected]>

Closes #1434 from cfregly/master and squashes the following commits:

4774581 [Chris Fregly] updated docs, renamed retry to retryRandom to be more clear, removed retries around store() method
0393795 [Chris Fregly] moved Kinesis examples out of examples/ and back into extras/kinesis-asl
691a6be [Chris Fregly] fixed tests and formatting, fixed a bug with JavaKinesisWordCount during union of streams
0e1c67b [Chris Fregly] Merge remote-tracking branch 'upstream/master'
74e5c7c [Chris Fregly] updated per TD's feedback.  simplified examples, updated docs
e33cbeb [Chris Fregly] Merge remote-tracking branch 'upstream/master'
bf614e9 [Chris Fregly] per matei's feedback:  moved the kinesis examples into the examples/ dir
d17ca6d [Chris Fregly] per TD's feedback:  updated docs, simplified the KinesisUtils api
912640c [Chris Fregly] changed the foundKinesis class to be a publically-avail class
db3eefd [Chris Fregly] Merge remote-tracking branch 'upstream/master'
21de67f [Chris Fregly] Merge remote-tracking branch 'upstream/master'
6c39561 [Chris Fregly] parameterized the versions of the aws java sdk and kinesis client
338997e [Chris Fregly] improve build docs for kinesis
828f8ae [Chris Fregly] more cleanup
e7c8978 [Chris Fregly] Merge remote-tracking branch 'upstream/master'
cd68c0d [Chris Fregly] fixed typos and backward compatibility
d18e680 [Chris Fregly] Merge remote-tracking branch 'upstream/master'
b3b0ff1 [Chris Fregly] [SPARK-1981] Add AWS Kinesis streaming support

(cherry picked from commit 91f9504)
Signed-off-by: Tathagata Das <[email protected]>
@asfgit asfgit closed this in 91f9504 Aug 2, 2014
@nchammas
Copy link
Contributor

nchammas commented Aug 2, 2014

This was an epic pull request. Nice work, people.

@pdeyhim
Copy link

pdeyhim commented Aug 2, 2014

Awsome work!! @tdas @cfregly

xiliu82 pushed a commit to xiliu82/spark that referenced this pull request Sep 4, 2014
Author: Chris Fregly <[email protected]>

Closes apache#1434 from cfregly/master and squashes the following commits:

4774581 [Chris Fregly] updated docs, renamed retry to retryRandom to be more clear, removed retries around store() method
0393795 [Chris Fregly] moved Kinesis examples out of examples/ and back into extras/kinesis-asl
691a6be [Chris Fregly] fixed tests and formatting, fixed a bug with JavaKinesisWordCount during union of streams
0e1c67b [Chris Fregly] Merge remote-tracking branch 'upstream/master'
74e5c7c [Chris Fregly] updated per TD's feedback.  simplified examples, updated docs
e33cbeb [Chris Fregly] Merge remote-tracking branch 'upstream/master'
bf614e9 [Chris Fregly] per matei's feedback:  moved the kinesis examples into the examples/ dir
d17ca6d [Chris Fregly] per TD's feedback:  updated docs, simplified the KinesisUtils api
912640c [Chris Fregly] changed the foundKinesis class to be a publically-avail class
db3eefd [Chris Fregly] Merge remote-tracking branch 'upstream/master'
21de67f [Chris Fregly] Merge remote-tracking branch 'upstream/master'
6c39561 [Chris Fregly] parameterized the versions of the aws java sdk and kinesis client
338997e [Chris Fregly] improve build docs for kinesis
828f8ae [Chris Fregly] more cleanup
e7c8978 [Chris Fregly] Merge remote-tracking branch 'upstream/master'
cd68c0d [Chris Fregly] fixed typos and backward compatibility
d18e680 [Chris Fregly] Merge remote-tracking branch 'upstream/master'
b3b0ff1 [Chris Fregly] [SPARK-1981] Add AWS Kinesis streaming support
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: why do we need this log4j file here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants