Related:
- Templating language tags in javascript strings break syntax highlighting #725
- https://github.com/andreasjansson/language-detection.el
Technically, PHP is a templating language that lives inside HTML, i.e. a full PHP template snippet would look like:
<?php echo '<p>Hello World</p>'; ?>
However, echo '<p>Hello World</p>'; on its own is still valid PHP code, and it can be run with php -r. So it's possible to have PHP code without the PHP tags.
Keep these two very different contexts in mind.
Our current handling of PHP is problematic in several ways. It's not very intuitive, and it gives people the wrong idea about how to handle templating languages in general. PHP is currently handled by two grammars, xml and php: php covers the actual PHP code itself, while the PHP templating functionality is buried inside xml.
That means a snippet such as <?php echo '<p>Hello World</p>'; ?> would be auto-detected as XML, not PHP. I believe that is one reason we score so poorly on some of the language detection stats. They are throwing PHP files at us and expecting us to return "php" but instead we return "xml".
This also isn't intuitive if you are manually trying to specify the highlighting language. If you have a full PHP template and specify "php", your highlighting will be pretty broken because the HTML won't be highlighted at all; you'd instead need to specify "xml" to get proper highlighting.
But if you have only PHP code, without the <?php ?> start/end tags, then php works and xml is badly broken.
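To make the mismatch concrete, here is a minimal sketch of the choice a user has to make by hand today. The helper name pickLanguageName is hypothetical (it is not part of highlight.js), and it assumes the input is already known to be PHP in one of the two forms described above.

```javascript
// Hypothetical helper, not part of highlight.js: returns the language name
// to specify manually under the current xml/php split.
// Assumption: the snippet is known to be PHP, either a full template or bare code.
function pickLanguageName(snippet) {
  // A full template contains a PHP open tag ("<?php" or the short "<?").
  const hasPhpOpenTag = snippet.includes('<?');
  // Templates need "xml" so the surrounding HTML is highlighted;
  // bare code needs "php" because xml never enters PHP mode without a tag.
  return hasPhpOpenTag ? 'xml' : 'php';
}

console.log(pickLanguageName("<?php echo '<p>Hello World</p>'; ?>")); // xml
console.log(pickLanguageName("echo '<p>Hello World</p>';"));          // php
```

The counterintuitive part is the first result: the snippet that looks most like PHP is the one that has to be labelled "xml".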
Just including PHP in xml makes people start thinking about this wrongly (IMHO)... See issue #725, for example. They see PHP handled by XML and get the idea that that's perhaps where we should dump other templating languages.
Should XML also handle Mojolicious? ERB? Handlebars? etc... I think it's obvious this is not the correct path. The correct way to specify a grammar for PHP would look something like:
export default function(hljs) {
  return {
    subLanguage: 'xml',
    contains: [
      {
        // Regex literals: inside a quoted string, '\?' would collapse to a bare '?'
        begin: /<\?(php)?/, end: /\?>/,
        subLanguage: 'php-code'
      }
    ]
  };
}
If we made just this change today (moving the existing php to php-code), auto-detect should "just work". But now there would be two different languages that PHP could be detected as, php or php-code, depending on whether it was enclosed in <?php ?> tags. This seems suboptimal.
Also, this presents a new problem when someone wants to manually specify the language. Should they use php or php-code?
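As a rough illustration of the delegation the grammar sketch above describes, the following standalone function splits a template into the parts that would go to xml and the parts that would go to php-code. It is only a sketch: splitPhpTemplate is a made-up name, it does not use highlight.js at all, and it ignores edge cases such as ?> appearing inside a PHP string or an <?xml declaration.

```javascript
// Illustrative splitter, independent of highlight.js: shows which parts of a
// template a wrapper grammar would hand to 'xml' and which to 'php-code'.
function splitPhpTemplate(source) {
  const segments = [];
  // Matches "<?php" or "<?", then lazily everything up to "?>".
  // A block left unclosed at the end of the input is also accepted.
  const phpBlock = /<\?(?:php)?([\s\S]*?)(?:\?>|$)/g;
  let cursor = 0;
  for (const match of source.matchAll(phpBlock)) {
    if (match.index > cursor) {
      // Text before the open tag belongs to the surrounding markup.
      segments.push({ language: 'xml', text: source.slice(cursor, match.index) });
    }
    segments.push({ language: 'php-code', text: match[1].trim() });
    cursor = match.index + match[0].length;
  }
  if (cursor < source.length) {
    // Trailing markup after the last PHP block.
    segments.push({ language: 'xml', text: source.slice(cursor) });
  }
  return segments;
}

console.log(splitPhpTemplate("<h1><?php echo $title; ?></h1>"));
```

Note that input with no open tag at all, such as echo $title;, comes back as a single xml segment. That is the same gap discussed above: a wrapper grammar only covers templates, so bare PHP code still needs a name of its own.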
I'm not 100% sure what the best course of action is here, but I think we need to figure this out and slide it into version 10, as it's likely going to be another breaking change.