Spark-2447 : Spark on HBase #1608
Add common solution for sending upsert actions to HBase (puts, deletes, and increments)
external/hbase/pom.xml
Use two spaces for indentation.
QA tests have started for PR 1608. This patch merges cleanly.
QA results for PR 1608:
external/hbase/pom.xml
You shouldn't specify versions in the child POMs. In this case you can't, since this one in particular has to be overridden by hadoop.version. You can remove every <version> element in this file. There is already an hbase.version property defined in the parent, which also affects the code examples, so you may have to harmonize the examples with the newer HBase.
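As a rough illustration of the suggestion above, a dependency entry in the child POM could omit <version> and inherit it from the parent. The artifact shown and the parent's dependencyManagement entry are assumptions for illustration only, not taken from this patch:

```xml
<!-- external/hbase/pom.xml (child): no <version> element here. The parent POM
     is assumed to manage the version, e.g. with ${hbase.version} inside its
     <dependencyManagement> section. -->
<dependency>
  <groupId>org.apache.hbase</groupId>
  <!-- illustrative artifact; substitute whichever HBase artifacts the module needs -->
  <artifactId>hbase-client</artifactId>
</dependency>
```

With this layout, changing hbase.version in the parent (or on the command line with -Dhbase.version=...) would apply to every module that declares the dependency without a version.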
Does HBase have some sort of schema information? If yes, maybe we can add it as a data source in SchemaRDD?
Is there a need for this constructor? Better to do the following.
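import org.apache.hadoop.conf.Configuration
import org.apache.spark.{SerializableWritable, SparkContext}

class HBaseContext(@transient sc: SparkContext, @transient config: Configuration) extends Serializable {
  // Hadoop's Configuration is not Serializable, so wrap it before broadcasting.
  val broadcastConf = sc.broadcast(new SerializableWritable(config))
}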
Just FYI - there is already an outstanding patch and JIRA for HBase support on Spark:
Pardon my ignorance, but what is the _2.10 about?
That's the Scala version the artifact is built against (Scala 2.10).
Thanks
Add common solution for sending upsert actions to HBase (puts, deletes, and increments)
This is the first pull request: mainly to test the review process, but there are still a number of things that I plan to add this week.
If I have time I will also add the following: