Line breaks not parsed correctly #1497

@mehaase

Hard line breaks (e.g. two spaces at the end of a line) are not parsed correctly. A line break should produce a br element node, but InlineParser emits the <br /> markup as a Text node instead. The supplied HtmlRenderer still renders this correctly, but any other NodeVisitor implementation has no way to recognize the break as an element.

For example, consider this node visitor, which should print an element tree:

class AstPrinter implements md.NodeVisitor {
    // Current indentation prefix; grows by two spaces per nesting level.
    String indent = '';

    void visitElementAfter(md.Element element) {
        print('${indent}leaving: ${element.tag}');
        indent = indent.substring(0, indent.length-2);
    }

    bool visitElementBefore(md.Element element) {
        indent += '  ';

        if (element.isEmpty) {
            print('${indent}void: ${element.tag}');
            indent = indent.substring(0, indent.length-2);
            return false;
        } else {
            print('${indent}entering: ${element.tag}');
            return true;
        }
    }

    void visitText(md.Text node) {
        indent += '  ';
        List<String> lines = node.text.replaceAll('\r\n', '\n').split('\n');
        for (String line in lines) {
            print('${indent}TEXT> ${line}');
        }
        indent = indent.substring(0, indent.length-2);
    }
}

The driver code looks like this:

List<String> lines = ['*hello* **world**!', '', '> this is a quote', 'good bye!'];
md.Document doc = new md.Document();
AstPrinter printer = new AstPrinter();

for (md.Node node in doc.parseLines(lines)) {
    node.accept(printer);
}

(I've imported markdown as md, i.e. import 'package:markdown/markdown.dart' as md;, so it doesn't conflict with dart:html.)

The output of this example is:

entering: p
  entering: em
    TEXT> hello
  leaving: em
  TEXT>  
  entering: strong
    TEXT> world
  leaving: strong
  TEXT> !
leaving: p
entering: blockquote
  entering: p
    TEXT> this is a quote
  leaving: p
leaving: blockquote
entering: p
  TEXT> good bye!
leaving: p

(Side note: since there are no unit tests for proper AST construction, this AstPrinter would be a handy class to include in the Markdown library for debugging purposes.)

What tree is printed when a line break is parsed? Here's a test:

List<String> lines = ['this is a test  ', 'this is only a test'];

The result:

entering: p
  TEXT> this is a test
  TEXT> <br />
  TEXT> this is only a test
leaving: p

The <br /> is emitted as a text node, not an element! So a NodeVisitor can't see <br /> elements, which makes AST analysis impractical and also makes rendering to other formats (e.g. PDF) buggy: you'll get a literal <br /> in your PDF instead of the line break that you expected.
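
Until the parser emits a real element, a visitor can work around this by treating a text node that contains only the <br /> markup as a line break. The following is a minimal sketch of that idea, not a fix for the parser. PlainTextVisitor and toPlainText are hypothetical names, and the tree is built by hand to mirror the output above, so the result does not depend on what a given parser version emits. Note the caveat that genuine text consisting solely of "<br />" would be misread as a break.

```dart
import 'package:markdown/markdown.dart' as md;

/// Hypothetical visitor that flattens a node tree to plain text, turning
/// the literal `<br />` text nodes described in this issue into newlines.
class PlainTextVisitor implements md.NodeVisitor {
  final StringBuffer buffer = StringBuffer();

  @override
  bool visitElementBefore(md.Element element) {
    // Also handle a real `br` element, in case the parser emits one.
    if (element.tag == 'br') {
      buffer.write('\n');
      return false; // Nothing to descend into.
    }
    return true;
  }

  @override
  void visitElementAfter(md.Element element) {}

  @override
  void visitText(md.Text node) {
    // Workaround: the break arrives as a text node that holds markup.
    // trim() tolerates a trailing newline after the tag.
    if (node.text.trim() == '<br />') {
      buffer.write('\n');
    } else {
      buffer.write(node.text);
    }
  }
}

/// Runs the visitor over [nodes] and returns the collected text.
String toPlainText(List<md.Node> nodes) {
  final visitor = PlainTextVisitor();
  for (final node in nodes) {
    node.accept(visitor);
  }
  return visitor.buffer.toString();
}

void main() {
  // Hand-built copy of the tree reported above.
  final tree = <md.Node>[
    md.Element('p', [
      md.Text('this is a test'),
      md.Text('<br />'),
      md.Text('this is only a test'),
    ]),
  ];
  // Prints the two sentences on separate lines.
  print(toPlainText(tree));
}
```

This only hides the symptom: every non-HTML visitor would need the same special case, which is why the break belongs in the tree as an element.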

Note: there are multiple unit tests covering line breaks, but these tests only compare output in HTML form, not in AST form. AFAICT, there are zero tests on the AST.
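
As a rough illustration of what a test on the AST could compare, the sketch below serializes a node tree to a compact string. The describe helper and its output format are my own invention, not part of the markdown package, and both trees are built by hand: one is the shape I'd expect, the other is the shape reported above.

```dart
import 'package:markdown/markdown.dart' as md;

/// Hypothetical helper: serializes a node tree to a compact string so a
/// test can compare AST shape rather than rendered HTML.
String describe(md.Node node) {
  if (node is md.Text) {
    return 'text(${node.text})';
  }
  if (node is md.Element) {
    // Empty elements such as `br` have no child list.
    final children = node.children ?? const <md.Node>[];
    return '${node.tag}[${children.map(describe).join(', ')}]';
  }
  return node.runtimeType.toString();
}

void main() {
  // The tree I'd expect for a hard line break.
  final expected = md.Element('p', [
    md.Text('this is a test'),
    md.Element.empty('br'),
    md.Text('this is only a test'),
  ]);

  // The tree reported in this issue.
  final reported = md.Element('p', [
    md.Text('this is a test'),
    md.Text('<br />'),
    md.Text('this is only a test'),
  ]);

  // p[text(this is a test), br[], text(this is only a test)]
  print(describe(expected));
  // p[text(this is a test), text(<br />), text(this is only a test)]
  print(describe(reported));
}
```

Both trees render to the same HTML, so only a comparison at this level can tell them apart. A real test would apply describe to the parser's output for the two-line input and compare it against the expected string.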
