@@ -119,6 +119,7 @@ object JavaCode {
* A trait representing a block of java code.
*/
trait Block extends JavaCode {
+ import Block._

// The expressions to be evaluated inside this block.
def exprValues: Set[ExprValue]
@@ -148,14 +149,17 @@ trait Block extends JavaCode {
}

// Concatenates this block with other block.
- def + (other: Block): Block
+ def + (other: Block): Block = other match {
Contributor

A general question about +.

Previously we generated one giant string for an expression tree, which is hard to tune. To keep more information in the generated code, we introduced this JavaCode/CodeBlock framework, which keeps a tree of strings instead of a giant string.

For an expression a op b, we should generate a tree of strings for a and b; then op creates a new tree node and keeps a and b as children. That means that if we refer to a CodeBlock inside a code"...", the CodeBlock should become a child of the new CodeBlock. However, + usually happens within the same operator, so I'm not sure whether we should create a new level of tree node here.

Contributor

I think the tree structure itself is important information in the generated code, so we should think carefully about what a tree node means. For example, when we want to split the code into methods, how shall we deal with a tree node? Shall we split an entire tree node into one method? What do we assume about the Java code inside one tree node?

Member Author

Ideally, each CodeBlock holds one semantically integral piece of Java code. If the CodeBlock is produced by an expression in codegen, then in order to split it we should split the entire tree of that CodeBlock.

Contributor

That's a good answer. Let's make sure that when we call +, the two blocks are each an individual, semantically integral piece of Java code.
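
To make the tree idea in this thread concrete, here is a minimal Scala sketch. It is not the PR's code: `CodeTree`, `Leaf`, and `Op` are hypothetical names, and the sketch only assumes that an operator node keeps its operands as children, so that a whole subtree can be rendered (or moved into a method) as one unit.

```scala
// Hypothetical, simplified model of "a tree of strings"; not Spark's JavaCode API.
sealed trait CodeTree {
  def render: String // the Java text for this subtree
  def size: Int      // number of nodes in this subtree
}

// A leaf holds a plain fragment of Java code.
final case class Leaf(text: String) extends CodeTree {
  def render: String = text
  def size: Int = 1
}

// `a op b`: the operator creates a new node and keeps a and b as children.
final case class Op(left: CodeTree, op: String, right: CodeTree) extends CodeTree {
  def render: String = s"(${left.render} $op ${right.render})"
  def size: Int = 1 + left.size + right.size
}
```

Under this model, splitting "an entire tree node" would mean taking one `Op` together with everything below it, which is only safe if that subtree is semantically integral on its own.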

+ case EmptyBlock => this
Contributor

What if this is an EmptyBlock? Shall we add a case for it as well?

Member Author

An empty block + another empty block is still an empty block, isn't it?

Contributor (mgaido91, Jun 23, 2018)

Sorry, I probably put the comment on the wrong line. I mean the case EmptyBlock + non-empty block. Shall we add a check and return other in that case? Or we could avoid removing the overridden + in EmptyBlock.

Member Author

I had an idea earlier to expand EmptyBlock early, and I have now committed it. With that change, in both the concatenation and embedding cases, EmptyBlock is no longer kept among the arguments to a code block.

+ case _ => code"$this\n$other"
Contributor

Why do we need \n here? IIUC, in many cases a single space, or even nothing, would work as well.

Member Author

Concatenating two blocks needs a newline between them, like the following:

[block1]
[block2]

In the embedding case, like code"$block1 ... $block2", no extra newlines are added.

Contributor

I see, thanks for your explanation.
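
The difference between concatenation and embedding described above can be sketched in Scala as follows. This is a simplified model under stated assumptions, not the PR's implementation: `Node` is a hypothetical stand-in for a code block with parts and children, and the sketch assumes `EmptyBlock` keeps its own `+` so that `EmptyBlock + b` returns `b` (one of the options raised in the review), whereas the PR instead drops `EmptyBlock` while interpolating.

```scala
// Simplified stand-ins for the types discussed here; not the actual Spark classes.
sealed trait Block {
  def code: String

  // Concatenation: drop an empty right-hand side, otherwise join with a newline.
  def +(other: Block): Block = other match {
    case EmptyBlock => this
    case _          => Node(Seq("", "\n", ""), Seq(this, other))
  }
}

// Assumption: the empty block keeps its own `+`, so EmptyBlock + b returns b unchanged.
case object EmptyBlock extends Block {
  val code: String = ""
  override def +(other: Block): Block = other
}

// Parts and children interleave: parts(0) + children(0) + parts(1) + ...
final case class Node(parts: Seq[String], children: Seq[Block]) extends Block {
  require(parts.length == children.length + 1, "need one more part than children")

  lazy val code: String = {
    val sb = new StringBuilder(parts.head)
    children.zip(parts.tail).foreach { case (child, part) =>
      sb.append(child.code).append(part)
    }
    sb.toString
  }
}
```

With two leaf nodes `a` and `b`, `a + b` renders the two pieces on separate lines, while a `Node` that embeds them between its own parts adds only whatever separators those parts contain.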

+ }
}

object Block {

val CODE_BLOCK_BUFFER_LENGTH: Int = 512

- implicit def blocksToBlock(blocks: Seq[Block]): Block = Blocks(blocks)
+ implicit def blocksToBlock(blocks: Seq[Block]): Block = blocks.reduceLeft(_ + _)

implicit class BlockHelper(val sc: StringContext) extends AnyVal {
def code(args: Any*): Block = {
@@ -190,26 +194,29 @@ object Block {
while (strings.hasNext) {
val input = inputs.next
input match {
- case _: ExprValue | _: Block =>
+ case _: ExprValue | _: CodeBlock =>
codeParts += buf.toString
buf.clear
blockInputs += input.asInstanceOf[JavaCode]
+ case EmptyBlock =>
case _ =>
buf.append(input)
}
buf.append(strings.next)
}
- if (buf.nonEmpty) {
- codeParts += buf.toString
- }
+ codeParts += buf.toString

(codeParts.toSeq, blockInputs.toSeq)
}
}

/**
* A block of java code. Including a sequence of code parts and some inputs to this block.
- * The actual java code is generated by embedding the inputs into the code parts.
+ * The actual java code is generated by embedding the inputs into the code parts. Here we keep
+ * inputs of `JavaCode` instead of simply folding them as a string of code, because we need to
+ * track expressions (`ExprValue`) in this code block. We need to be able to manipulate the
+ * expressions later without changing the behavior of this code block in some applications, e.g.,
+ * method splitting.
*/
case class CodeBlock(codeParts: Seq[String], blockInputs: Seq[JavaCode]) extends Block {
override lazy val exprValues: Set[ExprValue] = {
Contributor

Not related to this PR, but we should think about it in the future. If we treat CodeBlock as a tree of generated code, then this method doesn't make a lot of sense: it collects all references from its children and puts them into a set, which means that every time we transform a CodeBlock and create a new copy, we need to build this set again.

It's unclear how exprValues would be used, but I'd imagine we can provide a contains method which recursively checks the children.

Member Author (viirya, Jul 5, 2018)

Currently it is a lazy val, so it is only built when it is actually used. It was originally introduced for manipulating expressions in a code block.

But I realized that we may not actually need exprValues if we treat CodeBlock as a tree. In PR #21405, the manipulation API doesn't use exprValues when transforming a CodeBlock.

Thus I agree with you that we can get rid of exprValues in that PR. Then we may have a method that returns the ExprValue instances contained in a Block.
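
The recursive `contains` idea suggested in this thread might look roughly like the sketch below. It is only an illustration: the `JavaCode`, `ExprValue`, and `CodeBlock` types here are simplified stand-ins defined in the snippet itself, and the `contains` method is hypothetical (it is the reviewer's suggestion, not an existing method).

```scala
// Simplified stand-ins; the real ExprValue and CodeBlock in Spark carry more than this.
sealed trait JavaCode

final case class ExprValue(name: String) extends JavaCode

final case class CodeBlock(codeParts: Seq[String], blockInputs: Seq[JavaCode])
    extends JavaCode {

  // Walk the tree on demand instead of materializing a Set[ExprValue] for every copy.
  def contains(target: ExprValue): Boolean = blockInputs.exists {
    case v: ExprValue => v == target
    case b: CodeBlock => b.contains(target)
  }
}
```

The trade-off in this sketch is that every lookup walks the tree, but transforming or copying a block no longer requires rebuilding a set of references.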

@@ -230,30 +237,11 @@ case class CodeBlock(codeParts: Seq[String], blockInputs: Seq[JavaCode]) extends
}
buf.toString
}

- override def + (other: Block): Block = other match {
- case c: CodeBlock => Blocks(Seq(this, c))
- case b: Blocks => Blocks(Seq(this) ++ b.blocks)
- case EmptyBlock => this
- }
}

- case class Blocks(blocks: Seq[Block]) extends Block {
- override lazy val exprValues: Set[ExprValue] = blocks.flatMap(_.exprValues).toSet
- override lazy val code: String = blocks.map(_.toString).mkString("\n")
-
- override def + (other: Block): Block = other match {
- case c: CodeBlock => Blocks(blocks :+ c)
- case b: Blocks => Blocks(blocks ++ b.blocks)
- case EmptyBlock => this
- }
- }

object EmptyBlock extends Block with Serializable {
override val code: String = ""
override val exprValues: Set[ExprValue] = Set.empty

- override def + (other: Block): Block = other
}

/**