[SPARK-15911][SQL] Remove the additional Project to be consistent with SQL #13631

viirya · 2016-06-13T04:08:21Z

What changes were proposed in this pull request?

Currently, in DataFrameWriter's insertInto and in ResolveRelations of the Analyzer, we add an additional Project to adjust column ordering. However, this resolution should use column ordering, not column names. This is how Hive handles dynamic partitions.
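
To make the difference concrete, here is a minimal, Spark-free sketch of the two resolution strategies. The names `Column`, `InsertResolution`, `resolveByName`, and `resolveByPosition` are hypothetical and exist only for illustration; they are not Spark APIs and not the code in this PR.

```scala
// Hypothetical model: an input column is just a name plus a value.
final case class Column(name: String, value: Any)

object InsertResolution {
  // Name-based: look up each table column by name in the input,
  // which effectively reorders the input (what the extra Project did).
  def resolveByName(tableCols: Seq[String], input: Seq[Column]): Seq[(String, Any)] =
    tableCols.map { name =>
      val col = input.find(_.name == name).getOrElse(
        throw new IllegalArgumentException(s"Cannot find column $name"))
      name -> col.value
    }

  // Position-based: the i-th input column is written to the i-th table
  // column; input column names are ignored entirely.
  def resolveByPosition(tableCols: Seq[String], input: Seq[Column]): Seq[(String, Any)] = {
    require(tableCols.length == input.length, "column count mismatch")
    tableCols.zip(input.map(_.value))
  }
}
```

With table columns `Seq("a", "b", "part")` and an input whose columns arrive as `part, a, b`, the name-based variant moves the `part` value to the last slot, while the position-based variant writes the input as-is, so the first input value lands in `a`.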

How was this patch tested?

Existing tests.

SparkQA · 2016-06-13T05:20:29Z

Test build #60380 has finished for PR 13631 at commit 5f4455a.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

viirya · 2016-06-13T05:36:59Z

@cloud-fan There is a test ("Detect table partitioning with correct partition order") in InsertIntoHiveTableSuite which is dedicated to testing insertInto with this column re-ordering. What do you think we should do about it? Remove it?

SparkQA · 2016-06-13T14:54:38Z

Test build #60403 has finished for PR 13631 at commit c0500c2.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

cloud-fan · 2016-06-13T16:50:44Z

sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertIntoHiveTableSuite.scala

    }
  }

-  test("Detect table partitioning with correct partition order") {


Can you link to the PR that added this test?

It was added by PR #12239.

SparkQA · 2016-06-14T10:40:57Z

Test build #60481 has finished for PR 13631 at commit f8c9ccf.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

cloud-fan · 2016-06-14T16:45:41Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala

-                inputPartCols.find(_.name == name).getOrElse(
-                  throw new AnalysisException(s"Cannot find partition column $name"))
+              tablePartitionNames.filterNot { name =>
+                child.output.exists(_.name == name)


Do we really need this check?

Hmm, indeed. Since we use ordering, not names, this check is no longer needed. I will remove it.

SparkQA · 2016-06-15T04:48:59Z

Test build #60546 has finished for PR 13631 at commit 3030144.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

viirya · 2016-06-16T00:09:13Z

@cloud-fan any other thoughts?

cloud-fan · 2016-06-16T05:16:57Z

Hi @viirya, we are auditing the insertion behaviour of Spark SQL and will reach an agreement this week. How about we revisit this PR after that?

viirya · 2016-06-16T05:32:15Z

@cloud-fan no problem at all.

cloud-fan · 2016-06-18T19:17:59Z

Please track the progress of https://issues.apache.org/jira/browse/SPARK-16032 and update your PR, thanks!

viirya · 2016-06-19T04:28:08Z

@cloud-fan It looks like part of this PR has already been done by some PRs in that JIRA. I will update this PR.

viirya · 2016-06-19T04:55:57Z

@cloud-fan Updated. Please take a look. Thanks!

SparkQA · 2016-06-19T06:31:36Z

Test build #60795 has finished for PR 13631 at commit be60027.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

viirya · 2016-06-20T02:32:39Z

ping @cloud-fan

viirya · 2016-06-20T09:03:24Z

@cloud-fan It looks like the change in this PR has been done by #13766. Let me close this now.

Remove the additional Project to be consistent with SQL.

5f4455a

Remove test.

c0500c2

cloud-fan reviewed Jun 13, 2016
Address comment.

f8c9ccf

cloud-fan reviewed Jun 14, 2016
Address comment and add test case.

3030144

viirya added 2 commits June 19, 2016 12:48

Merge remote-tracking branch 'upstream/master' into inserttable

42b69f8

Conflicts: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala

Merge branch 'inserttable' of github.com:viirya/spark-1 into inserttable

be60027

Conflicts: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertIntoHiveTableSuite.scala

viirya closed this Jun 20, 2016

viirya deleted the inserttable branch December 27, 2023 18:33
