@yangBottle

Fix Spark 3.0 access to Hive tables whose data is stored in HBase

What changes were proposed in this pull request?

This PR modifies TableReader.scala to create an OldHadoopRDD when the input format is 'org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat'. NewHadoopRDD uses the new MapReduce API (org.apache.hadoop.mapreduce), but some of the HBase initialization is implemented with the older MapReduce API (org.apache.hadoop.mapred), so the default NewHadoopRDD cannot access the HBase table. To stay compatible with implementation classes such as 'org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat' that implement both the old and the new API, TableReader should check first whether an OldHadoopRDD can be created and prefer it when it can.
This PR is similar to #29178

Reference link: https://issues.apache.org/jira/browse/SPARK-32380
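The selection rule described above can be illustrated with a small, self-contained sketch. This is not the diff from this PR: the traits and classes below are hypothetical stand-ins for org.apache.hadoop.mapred.InputFormat, org.apache.hadoop.mapreduce.InputFormat, and an input format that implements both, and the function only returns a marker instead of building an RDD. It assumes the check is done with isAssignableFrom on the input format class.

```scala
// Hypothetical stand-ins for the two Hadoop InputFormat APIs.
trait OldApiInputFormat // plays the role of org.apache.hadoop.mapred.InputFormat
trait NewApiInputFormat // plays the role of org.apache.hadoop.mapreduce.InputFormat

// Implements both APIs, as the PR describes for HiveHBaseTableInputFormat.
class BothApisInputFormat extends OldApiInputFormat with NewApiInputFormat
class NewOnlyInputFormat extends NewApiInputFormat
class OldOnlyInputFormat extends OldApiInputFormat

// Marker for which kind of RDD the reader would create.
sealed trait RddChoice
case object UseOldHadoopRDD extends RddChoice
case object UseNewHadoopRDD extends RddChoice

object RddSelection {
  // Check the old API first, so a class that implements both APIs
  // is read through the old-API path.
  def chooseRdd(inputFormat: Class[_]): RddChoice =
    if (classOf[OldApiInputFormat].isAssignableFrom(inputFormat)) UseOldHadoopRDD
    else UseNewHadoopRDD
}
```

With this ordering, only an input format that implements the new API alone falls through to the new-API path; checking the new API first would instead route a dual-API class to NewHadoopRDD, which is the situation the description reports as failing.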

Why are the changes needed?

Spark SQL cannot access a Hive table whose data is stored in HBase. This PR fixes that bug.

Does this PR introduce any user-facing change?

No

How was this patch tested?

  • Step 1: create the HBase table and insert one row

hbase(main):001:0> create 'hive_hbase_test', 'f'
hbase(main):001:0> put 'hive_hbase_test', 'r1', 'f:c1', '123'

  • Step 2: create a Hive table that maps to the HBase table

CREATE EXTERNAL TABLE test.hbase_test(
  key string COMMENT '',
  value string COMMENT '')
STORED BY
  'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  'hbase.columns.mapping'=':key,f:c1',
  'serialization.format'='1')
TBLPROPERTIES (
  'hbase.table.name'='hive_hbase_test');

  • Step 3: use the spark-sql CLI to query the data through the Hive table

spark-sql> select * from test.hbase_test limit 1;

@github-actions github-actions bot added the SQL label Jan 12, 2021
@AmplabJenkins

Can one of the admins verify this patch?

@yangBottle yangBottle closed this Jan 12, 2021