Update TableReader.scala #31141

yangBottle · 2021-01-12T03:41:30Z

Fix Spark 3.0 access to Hive tables whose data is stored in HBase

What changes were proposed in this pull request?

This PR modifies TableReader.scala to create an OldHadoopRDD when the input format is 'org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat'. NewHadoopRDD uses the new MapReduce API (org.apache.hadoop.mapreduce), but some HBase initialization is implemented with the older MapReduce API (org.apache.hadoop.mapred), so the default NewHadoopRDD cannot access the HBase table. To stay compatible with implementation classes like 'org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat' that use both the old and the new API, the check for whether to create an OldHadoopRDD should take priority.
This PR is similar to #29178

Reference link: https://issues.apache.org/jira/browse/SPARK-32380
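
The ordering issue described above can be illustrated with a minimal, dependency-free sketch. It is not the actual diff from this PR: `OldApiInputFormat`, `NewApiInputFormat`, `DualApiFormat`, `RddKind`, and `RddSelection` are hypothetical stand-ins for the Hadoop `mapred`/`mapreduce` input format types and for the RDD choice made in TableReader.scala, and it assumes the choice is driven by which API the input format class implements.

```scala
// Hypothetical stand-ins; no Hadoop or Spark dependency is needed to run this.
trait OldApiInputFormat            // plays the role of org.apache.hadoop.mapred.InputFormat
abstract class NewApiInputFormat   // plays the role of org.apache.hadoop.mapreduce.InputFormat

// A class that belongs to both hierarchies, as the description says
// HiveHBaseTableInputFormat-like classes do.
class DualApiFormat extends NewApiInputFormat with OldApiInputFormat
class NewOnlyFormat extends NewApiInputFormat
class OldOnlyFormat extends OldApiInputFormat

sealed trait RddKind
case object OldHadoopRDDKind extends RddKind
case object NewHadoopRDDKind extends RddKind

object RddSelection {
  // Checks the new API first: a dual-API class is routed to the new-API RDD.
  def newApiFirst(clazz: Class[_]): RddKind =
    if (classOf[NewApiInputFormat].isAssignableFrom(clazz)) NewHadoopRDDKind
    else OldHadoopRDDKind

  // Checks the old API first: a dual-API class is routed to the old-API RDD,
  // while new-API-only classes still get the new-API RDD.
  def oldApiFirst(clazz: Class[_]): RddKind =
    if (classOf[OldApiInputFormat].isAssignableFrom(clazz)) OldHadoopRDDKind
    else NewHadoopRDDKind
}
```

With this sketch, `RddSelection.newApiFirst(classOf[DualApiFormat])` selects the new-API path, while `RddSelection.oldApiFirst(classOf[DualApiFormat])` selects the old-API path; formats that implement only one API are routed the same way by both functions.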

Why are the changes needed?

Spark SQL cannot access a Hive table whose data is stored in HBase. This PR fixes that bug.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Step 1: create the HBase table

hbase(main):001:0> create 'hive_hbase_test', 'f'
hbase(main):001:0> put 'hive_hbase_test', 'r1', 'f:c1', '123'

Step 2: create the Hive table mapped to the HBase table

CREATE EXTERNAL TABLE test.hbase_test(
  key string COMMENT '',
  value string COMMENT '')
STORED BY
  'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  'hbase.columns.mapping'=':key,f:c1',
  'serialization.format'='1')
TBLPROPERTIES (
  'hbase.table.name'='hive_hbase_test')

Step 3: query the data in Hive using the spark-sql CLI

spark-sql> select * from test.hbase_test limit 1;


AmplabJenkins · 2021-01-12T04:29:56Z

Can one of the admins verify this patch?

Update TableReader.scala

5634129

fixed spark3.0 access hive table while data in hbase problem

github-actions bot added the SQL label Jan 12, 2021

yangBottle closed this Jan 12, 2021
