Skip to content

Conversation

@davies
Copy link
Contributor

@davies davies commented Aug 11, 2016

What changes were proposed in this pull request?

This PR split the the single createPartitions() call into smaller batches, which could prevent Hive metastore from OOM (caused by millions of partitions).

It will also try to gather all the fast stats (number of files and total size of all files) in parallel to avoid the bottle neck of listing the files in metastore sequential, which is controlled by spark.sql.gatherFastStats (enabled by default).

How was this patch tested?

Tested locally with 10000 partitions and 100 files with embedded metastore, without gathering fast stats in parallel, adding partitions took 153 seconds, after enable that, gathering the fast stats took about 34 seconds, adding these partitions took 25 seconds (most of the time spent in object store), 59 seconds in total, 2.5X faster (with larger cluster, gathering will much faster).

CatalogTablePartition(spec, table.storage.copy(locationUri = Some(location.toUri.toString)))
// Hive metastore may not have enough memory to handle millions of partitions in single RPC,
// we should split them into smaller batches.
partitionSpecsAndLocs.iterator.grouped(1024).foreach { batch =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be great to add some logging so there is a way to indicate progress.

@SparkQA
Copy link

SparkQA commented Aug 11, 2016

Test build #63633 has finished for PR 14607 at commit f2f3150.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 11, 2016

Test build #63635 has finished for PR 14607 at commit 21a6e1e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 11, 2016

Test build #63636 has finished for PR 14607 at commit ec2d8da.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 12, 2016

Test build #63646 has finished for PR 14607 at commit c442b75.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@davies
Copy link
Contributor Author

davies commented Aug 12, 2016

@yhuai @sameeragarwal @rxin I had updated the MSCK REPAIR TABLE to list all the leaf files in parallel to avoid the listing in Hive metastore, hopefully this could speed up it a lot (not benchmarked yet).

@SparkQA
Copy link

SparkQA commented Aug 13, 2016

Test build #63707 has finished for PR 14607 at commit b3797c9.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 13, 2016

Test build #63708 has finished for PR 14607 at commit 05ab7fc.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 13, 2016

Test build #63709 has finished for PR 14607 at commit 1c490ef.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rxin
Copy link
Contributor

rxin commented Aug 13, 2016

@davies Can you create a new JIRA ticket for this change? It is a non-trivial follow-up.

@davies davies changed the title [SPARK-16905] SQL DDL: MSCK REPAIR TABLE (follow-up) [SPARK-17063] Improve performance of MSCK REPAIR TABLE with Hive metastore Aug 15, 2016
@davies davies changed the title [SPARK-17063] Improve performance of MSCK REPAIR TABLE with Hive metastore [SPARK-17063] [SQL] Improve performance of MSCK REPAIR TABLE with Hive metastore Aug 15, 2016
@SparkQA
Copy link

SparkQA commented Aug 15, 2016

Test build #63793 has finished for PR 14607 at commit e2ca4e5.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 15, 2016

Test build #63794 has finished for PR 14607 at commit f30e387.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 15, 2016

Test build #63800 has finished for PR 14607 at commit a4d07db.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


// These are two fast stats in Hive Metastore
// see https://github.com/apache/hive/blob/master/
// common/src/java/org/apache/hadoop/hive/common/StatsSetupConst.java#L88
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is actually not a stable identifier (i.e. line 88 will likely change quickly) -- maybe just point to the file.

@SparkQA
Copy link

SparkQA commented Aug 18, 2016

Test build #64017 has finished for PR 14607 at commit 8a18bf7.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 19, 2016

Test build #64024 has finished for PR 14607 at commit 0672c89.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

tableName: TableIdentifier,
cmd: String = "ALTER TABLE RECOVER PARTITIONS") extends RunnableCommand {

// These are two fast stats in Hive Metastore
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a little more context here so that this comment is self explanatory -- perhaps something along the lines of "These are list of statistics that can be collected quickly without requiring a scan of the data"

@sameeragarwal
Copy link
Member

LGTM pending jenkins

@SparkQA
Copy link

SparkQA commented Aug 23, 2016

Test build #64311 has finished for PR 14607 at commit 399e38e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • logWarning(s\"All labels belong to a single class and fitIntercept=false. It's a \" +
    • * Set thresholds in multiclass (or binary) classification to adjust the probability of
    • class MultinomialLogisticRegression @Since(\"2.1.0\") (
    • logWarning(s\"All labels belong to a single class and fitIntercept=false. It's\" +
    • /** Margin (rawPrediction) for each class label. */
    • /** Score (probability) for each class label. */
    • class MultinomialLogisticRegressionModelWriter(instance: MultinomialLogisticRegressionModel)
    • class ALSWrapperWriter(instance: ALSWrapper) extends MLWriter
    • class ALSWrapperReader extends MLReader[ALSWrapper]
    • class GaussianMixtureWrapperWriter(instance: GaussianMixtureWrapper) extends MLWriter
    • class GaussianMixtureWrapperReader extends MLReader[GaussianMixtureWrapper]
    • class IsotonicRegressionWrapperWriter(instance: IsotonicRegressionWrapper) extends MLWriter
    • class IsotonicRegressionWrapperReader extends MLReader[IsotonicRegressionWrapper]
    • class LDAWrapperWriter(instance: LDAWrapper) extends MLWriter
    • class LDAWrapperReader extends MLReader[LDAWrapper]
    • class JavaClassificationModel(JavaPredictionModel):
    • class LogisticRegressionModel(JavaModel, JavaClassificationModel, JavaMLWritable, JavaMLReadable):
    • class DecisionTreeClassificationModel(DecisionTreeModel, JavaClassificationModel, JavaMLWritable,
    • class RandomForestClassificationModel(TreeEnsembleModel, JavaClassificationModel, JavaMLWritable,
    • class GBTClassificationModel(TreeEnsembleModel, JavaPredictionModel, JavaMLWritable,
    • class NaiveBayesModel(JavaModel, JavaClassificationModel, JavaMLWritable, JavaMLReadable):
    • class MultilayerPerceptronClassificationModel(JavaModel, JavaPredictionModel, JavaMLWritable,
    • class LinearRegressionModel(JavaModel, JavaPredictionModel, JavaMLWritable, JavaMLReadable):
    • class DecisionTreeModel(JavaModel, JavaPredictionModel):
    • class TreeEnsembleModel(JavaModel):
    • class RandomForestRegressionModel(TreeEnsembleModel, JavaPredictionModel, JavaMLWritable,
    • class GBTRegressionModel(TreeEnsembleModel, JavaPredictionModel, JavaMLWritable, JavaMLReadable):
    • class GeneralizedLinearRegressionModel(JavaModel, JavaPredictionModel, JavaMLWritable,
    • class JavaPredictionModel():
    • class SubstituteUnresolvedOrdinals(conf: CatalystConf) extends Rule[LogicalPlan]
    • case class UnresolvedInlineTable(
    • case class UnresolvedTableValuedFunction(functionName: String, functionArgs: Seq[Expression])
    • case class UnresolvedOrdinal(ordinal: Int)
    • case class Literal (value: Any, dataType: DataType) extends LeafExpression with CodegenFallback
    • abstract class PlanExpression[T <: QueryPlan[_]] extends Expression
    • abstract class SubqueryExpression extends PlanExpression[LogicalPlan]
    • case class ListQuery(plan: LogicalPlan, exprId: ExprId = NamedExpression.newExprId)
    • case class Exists(plan: LogicalPlan, exprId: ExprId = NamedExpression.newExprId)
    • case class With(child: LogicalPlan, cteRelations: Seq[(String, SubqueryAlias)]) extends UnaryNode
    • case class SubqueryAlias(
    • class QuantileSummaries(
    • case class Stats(value: Double, g: Int, delta: Int)
    • class JDBCOptions(
    • class JacksonParser(
    • abstract class ExecSubqueryExpression extends PlanExpression[SubqueryExec]
    • class OrcFileFormat extends FileFormat with DataSourceRegister with Serializable

@SparkQA
Copy link

SparkQA commented Aug 23, 2016

Test build #64310 has finished for PR 14607 at commit 48ae071.

  • This patch passes all tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@davies
Copy link
Contributor Author

davies commented Aug 29, 2016

Merging into master and 2.0 branch.

@asfgit asfgit closed this in 48caec2 Aug 29, 2016
asfgit pushed a commit that referenced this pull request Aug 29, 2016
…e metastore

This PR split the the single `createPartitions()` call into smaller batches, which could prevent Hive metastore from OOM (caused by millions of partitions).

It will also try to gather all the fast stats (number of files and total size of all files) in parallel to avoid the bottle neck of listing the files in metastore sequential, which is controlled by spark.sql.gatherFastStats (enabled by default).

Tested locally with 10000 partitions and 100 files with embedded metastore, without gathering fast stats in parallel, adding partitions took 153 seconds, after enable that, gathering the fast stats took about 34 seconds, adding these partitions took 25 seconds (most of the time spent in object store), 59 seconds in total, 2.5X faster (with larger cluster, gathering will much faster).

Author: Davies Liu <[email protected]>

Closes #14607 from davies/repair_batch.

(cherry picked from commit 48caec2)
Signed-off-by: Davies Liu <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants