-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-24635][SQL] Remove Blocks class from JavaCode class hierarchy #21619
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -119,6 +119,7 @@ object JavaCode { | |
| * A trait representing a block of java code. | ||
| */ | ||
| trait Block extends JavaCode { | ||
| import Block._ | ||
|
|
||
| // The expressions to be evaluated inside this block. | ||
| def exprValues: Set[ExprValue] | ||
|
|
@@ -148,14 +149,17 @@ trait Block extends JavaCode { | |
| } | ||
|
|
||
| // Concatenates this block with other block. | ||
| def + (other: Block): Block | ||
| def + (other: Block): Block = other match { | ||
| case EmptyBlock => this | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. what if
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. An empty block + other empty block is an empty?
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. sorry, probably the wrong line for the comment. I mean the case:
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I had a idea to early expand |
||
| case _ => code"$this\n$other" | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. why do we need
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The concatenation of two blocks needs a newline between them like following: In embedding case, like code"$block1 ... $block2", no extra newlines are added.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I see, thanks for your explanation |
||
| } | ||
| } | ||
|
|
||
| object Block { | ||
|
|
||
| val CODE_BLOCK_BUFFER_LENGTH: Int = 512 | ||
|
|
||
| implicit def blocksToBlock(blocks: Seq[Block]): Block = Blocks(blocks) | ||
| implicit def blocksToBlock(blocks: Seq[Block]): Block = blocks.reduceLeft(_ + _) | ||
|
|
||
| implicit class BlockHelper(val sc: StringContext) extends AnyVal { | ||
| def code(args: Any*): Block = { | ||
|
|
@@ -190,26 +194,29 @@ object Block { | |
| while (strings.hasNext) { | ||
| val input = inputs.next | ||
| input match { | ||
| case _: ExprValue | _: Block => | ||
| case _: ExprValue | _: CodeBlock => | ||
| codeParts += buf.toString | ||
| buf.clear | ||
| blockInputs += input.asInstanceOf[JavaCode] | ||
| case EmptyBlock => | ||
| case _ => | ||
| buf.append(input) | ||
| } | ||
| buf.append(strings.next) | ||
| } | ||
| if (buf.nonEmpty) { | ||
| codeParts += buf.toString | ||
| } | ||
| codeParts += buf.toString | ||
|
|
||
| (codeParts.toSeq, blockInputs.toSeq) | ||
| } | ||
| } | ||
|
|
||
| /** | ||
| * A block of java code. Including a sequence of code parts and some inputs to this block. | ||
| * The actual java code is generated by embedding the inputs into the code parts. | ||
| * The actual java code is generated by embedding the inputs into the code parts. Here we keep | ||
| * inputs of `JavaCode` instead of simply folding them as a string of code, because we need to | ||
| * track expressions (`ExprValue`) in this code block. We need to be able to manipulate the | ||
| * expressions later without changing the behavior of this code block in some applications, e.g., | ||
| * method splitting. | ||
| */ | ||
| case class CodeBlock(codeParts: Seq[String], blockInputs: Seq[JavaCode]) extends Block { | ||
| override lazy val exprValues: Set[ExprValue] = { | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. not related to this PR, but we should think about it in the future. If we treat It's unclear how
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Currently it is a But I realized that we may not actually need Thus I agree with you that we can get rid of |
||
|
|
@@ -230,30 +237,11 @@ case class CodeBlock(codeParts: Seq[String], blockInputs: Seq[JavaCode]) extends | |
| } | ||
| buf.toString | ||
| } | ||
|
|
||
| override def + (other: Block): Block = other match { | ||
| case c: CodeBlock => Blocks(Seq(this, c)) | ||
| case b: Blocks => Blocks(Seq(this) ++ b.blocks) | ||
| case EmptyBlock => this | ||
| } | ||
| } | ||
|
|
||
| case class Blocks(blocks: Seq[Block]) extends Block { | ||
| override lazy val exprValues: Set[ExprValue] = blocks.flatMap(_.exprValues).toSet | ||
| override lazy val code: String = blocks.map(_.toString).mkString("\n") | ||
|
|
||
| override def + (other: Block): Block = other match { | ||
| case c: CodeBlock => Blocks(blocks :+ c) | ||
| case b: Blocks => Blocks(blocks ++ b.blocks) | ||
| case EmptyBlock => this | ||
| } | ||
| } | ||
|
|
||
| object EmptyBlock extends Block with Serializable { | ||
| override val code: String = "" | ||
| override val exprValues: Set[ExprValue] = Set.empty | ||
|
|
||
| override def + (other: Block): Block = other | ||
| } | ||
|
|
||
| /** | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A general question about
+.Previously we generate a giant string for an expression tree, which is hard to tune. To keep more information in the generated code, we introduce this
JavaCode/CodeBlockframework to keep a tree of strings instead of a giant string.For an expression
a op b, we should generate a tree of strings foraandb, thenopcreates a new tree node and keepsaandbas children. That means, if we refer to aCodeBlockinside acode"...", theCodeBlockshould become a child of the newCodeBlock. However,+usually happens within the same operator, I'm not sure if we should create a new level of tree node here.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the tree structure itself is an important information in the generated code, we should think carefully about what a tree node means. For example, when we want to split the code into methods, how shall we deal with the tree node? Shall we split an entire tree node into one method? What's our assumption to the java code inside one tree node?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ideal usage is that we all put a semantically integral java code into a
CodeBlock. If theCodeBlockis produced by an expression in codegen, in order to split it, we should split the entire tree of theCodeBlock.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's a good answer, let's make sure that when we call
+, the 2 blocks are 2 individual semantically integral java code.