
Discuss: How we handle the PHP templating language #2330

@joshgoebel


Related:

Technically, PHP is a templating language that lives inside HTML, i.e. a full PHP template snippet would look like:

<?php echo '<p>Hello World</p>'; ?> 

That said, echo '<p>Hello World</p>'; on its own is still valid PHP code, and it's possible to run it with php -r. So it's possible to have PHP code without the PHP tags.

So keep these two very different contexts in mind...


Our current handling of PHP is problematic in several ways. It's not very intuitive, and it gives people the wrong idea about how to handle templating languages in general. PHP is currently handled by two grammars, xml and php: php covers the actual PHP code itself, while the PHP templating functionality is buried inside xml.

That means a snippet such as <?php echo '<p>Hello World</p>'; ?> would be auto-detected as XML, not PHP. I believe that is one reason we score so poorly on some of the language detection stats. They are throwing PHP files at us and expecting us to return "php" but instead we return "xml".

This also isn't intuitive if you are trying to specify the highlighting language manually. If you have a FULL PHP template and specify "php", your highlighting will be pretty broken because the HTML won't be highlighted at all. You'd instead need to specify "xml" to get proper highlighting.

But if you have only PHP code, without the <?php ?> start/end tags, then php would work and xml would be badly broken.
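
To make the mismatch concrete, here is a minimal sketch of the choice a user currently has to make by hand. pickCurrentLanguage is a hypothetical helper, not part of highlight.js, and it assumes that the presence of an opening <? or <?php tag is enough to tell a template from bare PHP code:

```javascript
// Hypothetical helper (not a highlight.js API): returns the language name a
// user would have to pass today, given how PHP is split across two grammars.
function pickCurrentLanguage(source) {
  // Assumption: an opening <? or <?php tag means "PHP template inside HTML".
  const hasPhpTags = /<\?(php)?/.test(source);
  // Templates are only highlighted properly by "xml"; bare code needs "php".
  return hasPhpTags ? 'xml' : 'php';
}

console.log(pickCurrentLanguage("<?php echo '<p>Hello World</p>'; ?>")); // xml
console.log(pickCurrentLanguage("echo '<p>Hello World</p>';"));          // php
```

The same PHP statement ends up under two different names depending only on whether the tags are present, which is the counterintuitive part.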


Just including PHP in xml makes people start thinking about this wrongly (IMHO)... See issue #725, for example. They see PHP handled by XML and get the idea that that's perhaps where we should dump other templating languages.

Should XML also handle Mojolicious? ERB? Handlebars? etc... I think it's obvious this is not the correct path. The correct way to specify a grammar for PHP would look something like:

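export default function(hljs) {
  return {
    // Everything outside the PHP tags is treated as HTML/XML.
    subLanguage: 'xml',
    contains: [
      {
        // Regex literals: in a string, '\?' collapses to a bare '?'.
        begin: /<\?(php)?/, end: /\?>/,
        // Everything between the tags is handed to the PHP code grammar.
        subLanguage: 'php-code'
      }
    ]
  };
}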
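
As a rough illustration of what the grammar above is meant to express, the sketch below splits a template into the parts each sub-language would be responsible for. splitTemplate is a hypothetical stand-alone function, not how highlight.js works internally. It reuses the two tag patterns as regular-expression literals and keeps the php-code name from the proposal. It is deliberately naive: a ?> inside a PHP string would end the block early.

```javascript
// Hypothetical helper, not part of highlight.js: splits a PHP template into
// the segments each grammar would be responsible for.
const PHP_OPEN = /<\?(php)?/; // same opening pattern as the grammar sketch
const PHP_CLOSE = /\?>/;      // same closing pattern

function splitTemplate(source) {
  const segments = [];
  let rest = source;
  while (rest.length > 0) {
    const open = PHP_OPEN.exec(rest);
    if (!open) {
      // No more PHP tags: the remainder is plain HTML/XML.
      segments.push({ language: 'xml', text: rest });
      break;
    }
    if (open.index > 0) {
      segments.push({ language: 'xml', text: rest.slice(0, open.index) });
    }
    const afterOpen = rest.slice(open.index + open[0].length);
    const close = PHP_CLOSE.exec(afterOpen);
    // An unterminated block runs to the end of the input.
    const codeEnd = close ? close.index : afterOpen.length;
    segments.push({
      language: 'php-code',
      text: afterOpen.slice(0, codeEnd).trim()
    });
    rest = close ? afterOpen.slice(codeEnd + close[0].length) : '';
  }
  return segments;
}

console.log(splitTemplate("<h1>Hi</h1><?php echo 'x'; ?><p>Bye</p>"));
```

For that input the result has three segments: an xml segment, a php-code segment containing echo 'x';, and a trailing xml segment. Note that bare PHP code with no tags comes back as a single xml segment, which is exactly the case this grammar alone does not cover.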

If we just made this change today (moving the existing php to php-code) this should "just work" regarding auto-detect. But now we'd have two different languages that PHP could be detected as, depending on whether it was enclosed in <?php ?> tags or not... php or php-code. This seems suboptimal.

Also, this presents a new problem when someone wants to manually specify the language. Should they use php or php-code?

I'm not 100% sure what the best course of action is here, but I think we need to figure this out and slide it into version 10, as it's likely going to be another breaking change.
