[SPARK-6910] [SQL] Support for pushing predicates down to metastore for partition pruning #7421

piaozhexiu · 2015-07-15T15:51:24Z

@marmbrus @liancheng per request, I am reopening the PR that contains #7216 and #7386.

Can you help me understand the unit test failures?

SparkQA · 2015-07-15T16:38:35Z

Test build #37372 has finished for PR 7421 at commit 69eb136.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

piaozhexiu · 2015-07-15T16:54:12Z

Hmm, Jenkins failed for an unknown reason:

[error] running /home/jenkins/workspace/SparkPullRequestBuilder@2/build/sbt -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0 -Phive-thriftserver -Phive sql/test mllib/test hive-thriftserver/test hive/test catalyst/test examples/test ; received return code 143
Archiving unit tests logs...
> Send successful.
Attempting to post to Github...
 > Post successful.
Build step 'Execute shell' marked build as failure
Archiving artifacts
Recording test results
Finished: FAILURE

marmbrus · 2015-07-15T17:46:25Z

ok to test

SparkQA · 2015-07-16T00:41:58Z

Test build #37421 has finished for PR 7421 at commit 5599cc4.

This patch passes all tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- abstract class StandaloneRecoveryModeFactory(conf: SparkConf, serializer: Serializer)
- class LDAModel(JavaModelWrapper):
- class LDA(object):
- trait ImplicitCastInputTypes extends ExpectsInputTypes
- abstract class BinaryOperator extends BinaryExpression with ExpectsInputTypes
- case class UnaryMinus(child: Expression) extends UnaryExpression with ExpectsInputTypes
- case class UnaryPositive(child: Expression) extends UnaryExpression with ExpectsInputTypes
- case class Abs(child: Expression) extends UnaryExpression with ExpectsInputTypes
- case class Pmod(left: Expression, right: Expression) extends BinaryArithmetic
- case class BitwiseNot(child: Expression) extends UnaryExpression with ExpectsInputTypes
- final class SpecificRow extends $
- case class Factorial(child: Expression) extends UnaryExpression with ImplicitCastInputTypes
- case class Hex(child: Expression) extends UnaryExpression with ImplicitCastInputTypes
- case class Unhex(child: Expression) extends UnaryExpression with ImplicitCastInputTypes
- case class Round(child: Expression, scale: Expression)
- case class Md5(child: Expression) extends UnaryExpression with ImplicitCastInputTypes
- case class Sha1(child: Expression) extends UnaryExpression with ImplicitCastInputTypes
- case class Crc32(child: Expression) extends UnaryExpression with ImplicitCastInputTypes
- case class Not(child: Expression)
- case class And(left: Expression, right: Expression) extends BinaryOperator with Predicate
- case class Or(left: Expression, right: Expression) extends BinaryOperator with Predicate
- trait StringRegexExpression extends ImplicitCastInputTypes
- trait String2StringExpression extends ImplicitCastInputTypes
- trait StringComparison extends ImplicitCastInputTypes
- case class StringSpace(child: Expression) extends UnaryExpression with ImplicitCastInputTypes
- case class StringLength(child: Expression) extends UnaryExpression with ImplicitCastInputTypes
- case class Ascii(child: Expression) extends UnaryExpression with ImplicitCastInputTypes
- case class Base64(child: Expression) extends UnaryExpression with ImplicitCastInputTypes
- case class UnBase64(child: Expression) extends UnaryExpression with ImplicitCastInputTypes
- case class Exchange(newPartitioning: Partitioning, child: SparkPlan) extends UnaryNode

marmbrus · 2015-07-16T19:49:36Z

I'm going to trigger a bunch of test runs. Let's see what happens...

SparkQA · 2015-07-16T21:14:08Z

Test build #1087 has finished for PR 7421 at commit 5599cc4.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2015-07-16T21:14:15Z

Test build #1083 has finished for PR 7421 at commit 5599cc4.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- abstract class StandaloneRecoveryModeFactory(conf: SparkConf, serializer: Serializer)
- case class Pmod(left: Expression, right: Expression) extends BinaryArithmetic
- final class SpecificRow extends $

SparkQA · 2015-07-16T21:15:20Z

Test build #1085 has finished for PR 7421 at commit 5599cc4.

This patch passes all tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- abstract class StandaloneRecoveryModeFactory(conf: SparkConf, serializer: Serializer)

SparkQA · 2015-07-16T21:15:43Z

Test build #1086 has finished for PR 7421 at commit 5599cc4.

This patch passes all tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- abstract class StandaloneRecoveryModeFactory(conf: SparkConf, serializer: Serializer)
- case class Pmod(left: Expression, right: Expression) extends BinaryArithmetic
- final class SpecificRow extends $

SparkQA · 2015-07-16T21:21:40Z

Test build #1084 has finished for PR 7421 at commit 5599cc4.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

piaozhexiu · 2015-07-16T22:16:09Z

4/5 passed... Do you think this is because of multithreading in unit tests? Otherwise, I have no explanation.

chutium · 2015-07-17T15:43:51Z

getPartitionsByFilter is really a great improvement. In a production Hive data warehouse there are normally tables with a huge number of partitions. Looking forward to seeing this included in the next release :)
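
To illustrate why filtering on the metastore side matters for tables with many partitions, here is a minimal toy sketch. `ToyMetastore`, `Partition`, and both method names are hypothetical stand-ins invented for this example; they are not the Hive or Spark API. The sketch only counts how many partition objects would be handed back to the client in each approach.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    ds: str    # hypothetical string partition key
    hour: int  # hypothetical integral partition key


class ToyMetastore:
    """Hypothetical in-memory stand-in for a metastore (not the Hive API)."""

    def __init__(self, partitions):
        self._partitions = list(partitions)
        self.returned = 0  # number of partition objects sent to the client

    def get_all_partitions(self):
        # No pushdown: every partition is returned and the client filters later.
        self.returned += len(self._partitions)
        return list(self._partitions)

    def get_partitions_by_filter(self, predicate):
        # Pushdown: only the matching partitions are returned.
        matched = [p for p in self._partitions if predicate(p)]
        self.returned += len(matched)
        return matched


def build_store():
    # 30 days x 24 hours = 720 partitions
    return ToyMetastore(
        Partition(f"2015-07-{day:02d}", hour)
        for day in range(1, 31)
        for hour in range(24)
    )


def wanted(p):
    return p.ds == "2015-07-15" and p.hour < 6


client_side = build_store()
pruned_late = [p for p in client_side.get_all_partitions() if wanted(p)]

server_side = build_store()
pruned_early = server_side.get_partitions_by_filter(wanted)

print(client_side.returned, server_side.returned)  # prints "720 6"
```

Both approaches yield the same six partitions; the difference is only in how much metadata crosses the wire first.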

piaozhexiu · 2015-07-17T18:10:15Z

I am repeatedly running the sql/hive unit tests after synchronizing getPartitionsByFilterMethod.invoke. I'll report back how it goes.

marmbrus · 2015-07-17T20:18:41Z

I am pretty confused about why this is failing, but only sometimes. I don't think it could be locking because getPartitionsByFilter is guarded by the locks in withHiveState, right?

Have you been able to reproduce the failure locally?

liancheng · 2015-07-18T10:26:40Z

Investigated the following 3 build failure samples:

First, this issue couldn't be reproduced reliably, and only showed up on Jenkins occasionally. An obvious guess is that it's a concurrency bug that only occurs in highly concurrent jobs. (Notice that TestHive is configured with 32 local executor threads, and the Jenkins server has 32 cores, while our laptops usually have only 8 or fewer.)

Second, all 3 build failures behaved extremely consistently: 18 ParquetDataSourceOffMetastoreSuite test cases involving partitioned Hive metastore Parquet tables all failed together. It seems that some internal Hive state got corrupted before this test suite was executed. However, this PR only updates the read path and doesn't introduce any extra state. So my guess is that this PR doesn't introduce the issue but somehow triggers an existing one. The root cause probably lies in some initialization phase, e.g. HiveContext initialization, or the creation of the partitioned test tables in ParquetDataSourceOffMetastoreSuite.beforeAll().

I also made another interesting finding while single-step debugging a failed test case. The following stacktrace snippet appears in all 3 build failures:

Caused by: MetaException(message:Filtering is supported only on partition keys of type string)
      .----
      | at org.apache.hadoop.hive.metastore.parser.ExpressionTree$FilterBuilder.setError(ExpressionTree.java:185)
      | at org.apache.hadoop.hive.metastore.parser.ExpressionTree$LeafNode.getJdoFilterPushdownParam(ExpressionTree.java:452)
      | at org.apache.hadoop.hive.metastore.parser.ExpressionTree$LeafNode.generateJDOFilterOverPartitions(ExpressionTree.java:357)
      | at org.apache.hadoop.hive.metastore.parser.ExpressionTree$LeafNode.generateJDOFilter(ExpressionTree.java:279)
      | at org.apache.hadoop.hive.metastore.parser.ExpressionTree.generateJDOFilterFragment(ExpressionTree.java:590)
      | at org.apache.hadoop.hive.metastore.ObjectStore.makeQueryFilterString(ObjectStore.java:2417)
      | at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsViaOrmFilter(ObjectStore.java:2029)
      | at org.apache.hadoop.hive.metastore.ObjectStore.access$500(ObjectStore.java:146)
      | at org.apache.hadoop.hive.metastore.ObjectStore$4.getJdoResult(ObjectStore.java:2332)
      | at org.apache.hadoop.hive.metastore.ObjectStore$4.getJdoResult(ObjectStore.java:2317)
      `----
        at org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2214)

The marked code path shown above is actually NEVER executed in normal cases. To be more specific, the getJdoResult() method in the anonymous GetListHelper object is never called in GetHelper<T>.run() in normal cases. Instead, only the getSqlResult() method is called. And we can see that this behavior is controlled by doUseDirectSql, which is partially decided by ObjectStore.directSql.isCompatibleDatastore. Since ObjectStore is initialized while initializing HiveContext, ObjectStore.directSql.isCompatibleDatastore is probably the corrupted Hive internal state.

I haven't got any clue yet how this state gets corrupted. My guess is that there is a race condition during HiveContext initialization. For example, maybe the underlying Derby database is not fully created while ObjectStore is being initialized.
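
The path selection described in this comment can be sketched as a toy model. This is not Hive's code: `ToyGetHelper`, its constructor parameters, and the `allow_sql` flag (a placeholder for whatever other conditions feed into doUseDirectSql) are assumptions made for illustration. It only shows how a single "compatible datastore" flag could silently route a call onto the rarely used ORM path.

```python
class ToyGetHelper:
    """Toy model of a helper that prefers direct SQL and otherwise uses ORM.

    All names here are hypothetical; only the general shape (a flag that
    decides between a SQL result and a JDO result) comes from the comment.
    """

    def __init__(self, is_compatible_datastore, allow_sql=True):
        self.is_compatible_datastore = is_compatible_datastore
        self.allow_sql = allow_sql  # stand-in for the other inputs to the decision

    def get_sql_result(self):
        return "direct-sql"

    def get_jdo_result(self):
        return "jdo-orm"

    def run(self):
        # The decision depends only partially on the datastore flag.
        do_use_direct_sql = self.allow_sql and self.is_compatible_datastore
        if do_use_direct_sql:
            return self.get_sql_result()
        return self.get_jdo_result()


normal = ToyGetHelper(is_compatible_datastore=True).run()
corrupted = ToyGetHelper(is_compatible_datastore=False).run()
print(normal, corrupted)  # prints "direct-sql jdo-orm"
```

In this model, a flag that ends up false after a racy initialization would be enough to explain why the ORM-only stack frames appear in the failing builds but not in normal runs.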

liancheng · 2015-07-18T10:35:21Z

It seems that Hive prefers to access the underlying metastore database via direct SQL, and uses JDO ORM as a fallback. The existing bug doesn't show up because usually either direct SQL or JDO ORM is capable of doing the work. But in the case of getPartitionsByFilter, the ORM path doesn't support predicates involving integral types by default, which leads to the build failures.

Before fixing the root cause, I guess we can work around this issue by setting hive.metastore.integral.jdo.pushdown to true so that the JDO ORM code path can handle integral partition columns.
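
As a rough illustration of what the workaround changes, the sketch below models the type check on the ORM path. It is a simplified assumption, not Hive's implementation: `check_orm_pushdown`, `ToyMetaException`, and the `INTEGRAL_TYPES` set are hypothetical names, and the error message is copied from the stack trace quoted earlier in this thread.

```python
# Hypothetical set of integral partition key types used by this toy model.
INTEGRAL_TYPES = {"tinyint", "smallint", "int", "bigint"}


class ToyMetaException(Exception):
    """Stand-in for the MetaException seen in the stack trace."""


def check_orm_pushdown(key_type, integral_jdo_pushdown=False):
    """Raise if the (toy) ORM path cannot filter on a partition key type.

    integral_jdo_pushdown models hive.metastore.integral.jdo.pushdown.
    """
    if key_type == "string":
        return
    if integral_jdo_pushdown and key_type in INTEGRAL_TYPES:
        return
    raise ToyMetaException(
        "Filtering is supported only on partition keys of type string"
    )


def is_supported(key_type, integral_jdo_pushdown=False):
    try:
        check_orm_pushdown(key_type, integral_jdo_pushdown)
        return True
    except ToyMetaException:
        return False


print(is_supported("string"))                          # prints "True"
print(is_supported("int"))                             # prints "False"
print(is_supported("int", integral_jdo_pushdown=True)) # prints "True"
```

With the flag left at its default, an integral partition key fails only when a call lands on the ORM path, which matches the intermittent nature of the failures.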

liancheng · 2015-07-18T10:55:37Z

Experimenting with the workaround mentioned above in PR #7492.


…or partition pruning

This PR forks PR #7421 authored by piaozhexiu and adds [a workaround] [1] for fixing the occasional test failures occurred in PR #7421. Please refer to these [two] [2] [comments] [3] for details.

[1]: liancheng@536ac41
[2]: #7421 (comment)
[3]: #7421 (comment)

Author: Cheolsoo Park <[email protected]>
Author: Cheng Lian <[email protected]>
Author: Michael Armbrust <[email protected]>

Closes #7492 from liancheng/pr-7421-workaround and squashes the following commits:

5599cc4 [Cheolsoo Park] Predicate pushdown to hive metastore
536ac41 [Cheng Lian] Sets hive.metastore.integral.jdo.pushdown to true to workaround test failures caused by in #7421

piaozhexiu · 2015-07-20T22:17:59Z

Closing as it is merged as part of #7492.

litao-buptsse · 2015-09-16T11:04:48Z

@liancheng @piaozhexiu Have you cherry-picked this PR to Spark branch-1.5?

piaozhexiu · 2015-09-16T13:44:01Z

@litao-buptsse Yes, this patch is committed in branch-1.5. To enable it, set spark.sql.hive.metastorePartitionPruning to true; it is false by default.
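
One way to pass that setting is through spark-submit's `--conf key=value` option. The small helper below just assembles such a command line; `conf_args` and the script name `my_job.py` are hypothetical and only serve the example.

```python
def conf_args(settings):
    """Turn a dict of Spark settings into spark-submit --conf arguments."""
    args = []
    for key, value in sorted(settings.items()):
        args += ["--conf", f"{key}={value}"]
    return args


cmd = (
    ["spark-submit"]
    + conf_args({"spark.sql.hive.metastorePartitionPruning": "true"})
    + ["my_job.py"]  # hypothetical application script
)

print(" ".join(cmd))
# prints "spark-submit --conf spark.sql.hive.metastorePartitionPruning=true my_job.py"
```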

litao-buptsse · 2015-09-17T03:44:13Z

@piaozhexiu OK, I got it, thank you very much!

piaozhexiu force-pushed the SPARK-6910-3 branch from 69eb136 to ada4b6a Compare July 15, 2015 17:44

Predicate pushdown to hive metastore

5599cc4

piaozhexiu force-pushed the SPARK-6910-3 branch from ada4b6a to 5599cc4 Compare July 15, 2015 22:37

tedyu mentioned this pull request Jul 17, 2015

Make MetastoreRelation#hiveQlPartitions lazy val #7466

Closed

liancheng mentioned this pull request Jul 18, 2015

[SPARK-6910] [SQL] Support for pushing predicates down to metastore for partition pruning #7492

Closed

liancheng added a commit to liancheng/spark that referenced this pull request Jul 18, 2015

Sets hive.metastore.integral.jdo.pushdown to true to workaround test …

536ac41

…failures caused by in apache#7421

piaozhexiu closed this Jul 20, 2015

viirya mentioned this pull request Jul 23, 2015

[SPARK-8838][SQL] Add config to enable/disable merging part-files when merging parquet schema #7238

Closed

gatorsmile mentioned this pull request Feb 28, 2017

[SPARK-19678][SQL] remove MetastoreRelation #17015

Closed
