Skip to content

Conversation

@cloud-fan
Copy link
Contributor

What changes were proposed in this pull request?

Similar to TreeNode.transform, which only apply the rule on its children, QueryPlan.transformExpressions should also only apply the rule on its expressions(i.e. returned by Query.expressions). I think it's more intuitive to exclude the expressions that are not defined as this plan's expressions, for example, Generate.generatorOutput should not be touched by transformExpressions.

This PR also did some renaming in TreeNode to make the code more readable, and removed the analyzer trick for Generate.

How was this patch tested?

existing tests.

// A special case for Generate, because the output of Generate should not be resolved by
// ResolveReferences. Attributes in the output will be resolved by ResolveGenerate.
case g @ Generate(generator, join, outer, qualifier, output, child)
if child.resolved && !generator.resolved =>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This trick does not always work, #11497 (comment) is a good example.

However, instead of adding another trick in #11497, this PR tries to solve the problem fundamentally.

@SparkQA
Copy link

SparkQA commented Mar 4, 2016

Test build #52468 has finished for PR 11521 at commit 31c5344.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Mar 4, 2016

Test build #52469 has finished for PR 11521 at commit 28cc1fd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor Author

cc @marmbrus @davies

case (name, value: TreeNode[_]) if isOneOfChildren(value) =>
name -> JInt(children.indexOf(value))
case (name, value: Seq[BaseType]) if value.toSet.subsetOf(containsChild) =>
case (name, value: Seq[_]) if isSubsetOfChildren(value) =>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

value: Seq[BaseType] is misleading, type parameter is erased.

@cloud-fan
Copy link
Contributor Author

Or we can mark QueryPlan.expressions as final, so that it's always consistent with QueryPlan.transformExpressions. Generate is the only one that override expressions(), and it can override producedAttributes instead, which will be a small change.

@dilipbiswal
Copy link
Contributor

@cloud-fan I really like your second approach, which is much cleaner and simpler, but
I was wondering how it helps with our current problem ? In my understanding the problem
is that plan->transformExpression is considering the productIterator as input which
includes the genertor's output attributes and we end up resolving these attributes
from child's output.

Just trying learn from you and improve my understanding :-)

@marmbrus
Copy link
Contributor

marmbrus commented Mar 4, 2016

I'm not sure I like this solution. It seems pretty weird to me that expressions doesn't always have all the expressions in it. Why do we do that for generate?

@dilipbiswal
Copy link
Contributor

@marmbrus Just pasting the short description we have in the code.

// we don't want the gOutput to be taken as part of the expressions
// as that will cause exceptions like unresolved attributes etc.
override def expressions: Seq[Expression] = generator :: Nil

I was trying to comment out this and seeing if i can reproduce the unresolved attribute problem. So far haven't been successful :-)

@marmbrus
Copy link
Contributor

marmbrus commented Mar 4, 2016

Yeah I think we already solved the unresolved attribute problem in a more principled way (though the details escape me)

@dilipbiswal
Copy link
Contributor

@marmbrus Thanks. FYI - the JIRA 5817 introduced this change.

@cloud-fan
Copy link
Contributor Author

@dilipbiswal yea, the second approach doesn't fix the bug you found, just to make the concept more clear, it's orthogonal to your PR.

@cloud-fan
Copy link
Contributor Author

closing in favor of #11532 and #11497

@cloud-fan cloud-fan closed this Mar 5, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants