
Conversation

@viirya
Member

@viirya viirya commented May 14, 2015

@SparkQA

SparkQA commented May 14, 2015

Test build #32707 has finished for PR 6146 at commit 4dec469.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@marmbrus
Contributor

Please explain your change. Neither the JIRA nor the PR description describes why the original code is incorrect.

Member Author


When a non-partition-key attribute doesn't exist in the partition's schema (as determined by the partition's StructObjectInspector), we should produce a null field ref. Previously, we didn't check for this and directly called getStructFieldRef to look up the field ref, which causes the error reported in the JIRA.
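The idea of the fix can be sketched in a minimal, self-contained form. The class and method names below are hypothetical stand-ins for Hive's StructObjectInspector API, not the actual Spark/Hive code:

```scala
// Hypothetical sketch of the fix: probe the partition's field list before
// resolving a field ref, instead of letting the lookup throw.
case class FieldRef(name: String)

class PartitionInspector(fields: Seq[FieldRef]) {
  // Hive's getStructFieldRef throws when the field is absent.
  def getStructFieldRef(name: String): FieldRef =
    fields.find(_.name == name)
      .getOrElse(throw new IllegalArgumentException(s"cannot find field $name"))

  // Proposed behavior: yield None for a column that was added to the table
  // after this partition was written, so the reader can emit null.
  def fieldRefOrNull(name: String): Option[FieldRef] =
    fields.find(_.name == name)
}

object Demo {
  def main(args: Array[String]): Unit = {
    // Partition written before column "c" was added to the table schema.
    val partition = new PartitionInspector(Seq(FieldRef("a"), FieldRef("b")))
    println(partition.fieldRefOrNull("a")) // Some(FieldRef(a))
    println(partition.fieldRefOrNull("c")) // None, instead of an exception
  }
}
```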

Conflicts:
	sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala
@SparkQA

SparkQA commented Jun 19, 2015

Test build #35281 has finished for PR 6146 at commit 21e3c2c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 19, 2015

Test build #35292 has finished for PR 6146 at commit eed7c8b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class SerializableConfiguration(@transient var value: Configuration) extends Serializable
    • class SerializableJobConf(@transient var value: JobConf) extends Serializable

@marmbrus
Contributor

What about a test case?

@viirya
Member Author

viirya commented Jun 20, 2015

@marmbrus Although the code looks problematic, I mentioned on the JIRA that I can't reproduce the problem. After more testing and searching through the code, I found that your PR #5876 (https://github.com/apache/spark/pull/5876/files#diff-ee66e11b56c21364760a5ed2b783f863R620) changed how Hive partition objects are produced: a partition's schema is now populated from the table schema.

So any newly added columns in a Hive table will also appear in its partitions' schemas, which is why I can't reproduce this problem: every non-partition-key attribute always exists in the table's partitions. The reported bug was therefore already fixed by PR #5876, so I'll close this PR now.
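The schema-population behavior described above can be illustrated with a tiny sketch (the names are illustrative, not the actual TableReader code):

```scala
// Illustrative sketch: after PR #5876 a partition's column list is derived
// from the table schema, so a column added to the table later still appears
// in every partition's schema and no missing-field lookup can occur.
object PartitionSchemaDemo {
  def main(args: Array[String]): Unit = {
    val tableColumns = Seq("a", "b", "newCol") // "newCol" added after the partition was written
    val partitionColumns = tableColumns        // partition schema populated from table schema
    println(partitionColumns.contains("newCol")) // true
  }
}
```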

@viirya viirya closed this Jun 20, 2015
@marmbrus
Contributor

Thanks for following up!

@viirya viirya deleted the skip_new_column branch December 27, 2023 18:17