@tmalaska

Add common solution for sending upsert actions to HBase (put, deletes,
and increment)

This is the first pull request: mainly to test the review process, but there are still a number of things that I plan to add this week.

  1. Clean up the pom file
  2. Add unit tests for the HConnectionStaticCache

If I have time I will also add the following:

  1. Support for Java
  2. Additional unit tests for Java
  3. Additional unit tests for Spark Streaming

Contributor

Two spaces for indentation

@SparkQA

SparkQA commented Jul 27, 2014

QA tests have started for PR 1608. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17237/consoleFull

@SparkQA

SparkQA commented Jul 27, 2014

QA results for PR 1608:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds the following public classes (experimental):
@serializable class HBaseContext(@transient sc: SparkContext,
protected class hconnectionCleanerTask extends TimerTask {

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17237/consoleFull

Member

You shouldn't specify versions in the child POMs. In fact, you can't here, since this one in particular has to be overridden by hadoop.version. You can remove every <version> element in this file. There is already an hbase.version defined in the parent, which also affects the code examples, so you may have to harmonize the examples with the newer HBase.
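
To illustrate the point, a child module's dependency section might look roughly like the fragment below. This is only a sketch, not the actual file from this PR: the artifacts shown are assumptions, and it presumes the parent POM either manages the dependency version or exposes it as a property such as hbase.version.

```xml
<!-- Sketch of a child module's pom.xml; artifact choices are illustrative assumptions. -->
<dependencies>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <!-- No <version> here: assumed to be managed by the parent and driven by hadoop.version. -->
  </dependency>
  <dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase</artifactId>
    <!-- If the parent only defines the property, reference it instead of a literal version. -->
    <version>${hbase.version}</version>
  </dependency>
</dependencies>
```

Either way, no literal version number appears in the child, so a build that overrides hadoop.version or hbase.version on the command line changes every module consistently.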

@SparkQA

SparkQA commented Jul 28, 2014

QA tests have started for PR 1608. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17316/consoleFull

@SparkQA

SparkQA commented Jul 29, 2014

QA results for PR 1608:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds the following public classes (experimental):
@serializable class HBaseContext(@transient sc: SparkContext,
protected class hconnectionCleanerTask extends TimerTask {

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17316/consoleFull

@rxin
Contributor

rxin commented Jul 29, 2014

Does HBase have some sort of schema information? If yes, maybe we can add it as a data source in SchemaRDD?

@SparkQA

SparkQA commented Jul 29, 2014

QA tests have started for PR 1608. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17382/consoleFull

@SparkQA

SparkQA commented Jul 29, 2014

QA results for PR 1608:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds the following public classes (experimental):
@serializable class HBaseContext(@transient sc: SparkContext,
protected class hconnectionCleanerTask extends TimerTask {

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17382/consoleFull

Contributor

Is there a need for this constructor? Better to do the following.

class HBaseContext(@transient sc: SparkContext, @transient config: Configuration) extends Serializable {
    val broadcastConf = sc.broadcast(new SerializableWritable(config))
}
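
For context, here is a slightly fuller sketch of how the broadcast configuration could be read back inside a task. It is an illustration under assumptions, not the code from this PR: `HBaseContextSketch`, `confValue`, the demo object, and the `example.key` setting are hypothetical names, and it assumes Spark and Hadoop are on the classpath.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.spark.{SerializableWritable, SparkConf, SparkContext}

// Hypothetical, simplified stand-in for the HBaseContext discussed above.
// The @transient annotations mirror the snippet above; sc and config are only
// used while constructing the object, so they are not shipped to executors.
class HBaseContextSketch(@transient sc: SparkContext,
                         @transient config: Configuration) extends Serializable {
  // Hadoop's Configuration is a Writable but not java.io.Serializable,
  // so it is wrapped in SerializableWritable before being broadcast.
  val broadcastConf = sc.broadcast(new SerializableWritable(config))

  // Safe to call inside a task: unwraps the broadcast, then the wrapper.
  def confValue(key: String): String = broadcastConf.value.value.get(key)
}

object HBaseContextSketchDemo {
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setAppName("broadcast-conf-sketch").setMaster("local[2]")
    val sc = new SparkContext(sparkConf)
    try {
      val conf = new Configuration()
      conf.set("example.key", "example-value") // hypothetical setting for the demo
      val ctx = new HBaseContextSketch(sc, conf)
      // Each task reads the setting from the broadcast copy of the configuration.
      val seen = sc.parallelize(1 to 4, 2).map(_ => ctx.confValue("example.key")).distinct().collect()
      println(seen.mkString(","))
    } finally {
      sc.stop()
    }
  }
}
```

The design point is that only the broadcast handle travels with the closure, so the configuration is sent to each executor once rather than being serialized with every task.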

@pwendell
Contributor

Just FYI - there is already an outstanding patch and JIRA for HBase support on Spark:
#194
https://issues.apache.org/jira/browse/SPARK-1127

@SparkQA

SparkQA commented Aug 1, 2014

QA tests have started for PR 1608. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17682/consoleFull

@SparkQA

SparkQA commented Aug 1, 2014

QA tests have started for PR 1608. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17684/consoleFull

@tmalaska closed this Aug 1, 2014

@SparkQA

SparkQA commented Aug 1, 2014

QA results for PR 1608:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds the following public classes (experimental):
@serializable class HBaseContext(@transient sc: SparkContext,
protected class hconnectionCleanerTask extends TimerTask {

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17682/consoleFull

@SparkQA

SparkQA commented Aug 1, 2014

QA results for PR 1608:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds the following public classes (experimental):
@serializable class HBaseContext(@transient sc: SparkContext,
protected class hconnectionCleanerTask extends TimerTask {

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17684/consoleFull

Pardon my ignorance, but what is the _2.10 about?

Contributor

That's the Scala version

Thanks
