Add baseLanguage to language grammars #2275

@joshgoebel

From my tweaked language checker:

┌────────────────────┬────────────────────┬──────────┬────────────────────┬──────────┬────────────────────┐
│ expected           │ actual             │ score    │ 2nd best           │ score    │ info               │
├────────────────────┼────────────────────┼──────────┼────────────────────┼──────────┼────────────────────┤
│ cpp                │ cpp                │ 16       │ arduino            │ 16       │ Relevance match.   │
├────────────────────┼────────────────────┼──────────┼────────────────────┼──────────┼────────────────────┤
│ cpp                │ cpp                │ 25       │ arduino            │ 25       │ Relevance match.   │
├────────────────────┼────────────────────┼──────────┼────────────────────┼──────────┼────────────────────┤
│ xml                │ xml                │ 46       │ django             │ 46       │ Relevance match.   │
├────────────────────┼────────────────────┼──────────┼────────────────────┼──────────┼────────────────────┤
│ xml                │ xml                │ 7        │ django             │ 7        │ Relevance match.   │
├────────────────────┼────────────────────┼──────────┼────────────────────┼──────────┼────────────────────┤
│ xml                │ xml                │ 3        │ django             │ 3        │ Relevance match.   │
└────────────────────┴────────────────────┴──────────┴────────────────────┴──────────┴────────────────────┘

These results aren't surprising, but they also aren't very helpful. They would be reported as errors were it not for the load order enforced by the requires, and the fact that we magically prefer the "first match" even when the scores are the same.

I'd like to add an explicit key to explain these relationships to the parser.

Arduino would then declare superset: 'cpp' and Django would declare superset: 'xml'. I'm open to better naming. If two languages are detected to be related in this way, auto-detect would simply ignore the superset in cases of EQUAL matches and prefer it only if it had a HIGHER match (which is what you'd expect if the code really was the superset and not the original).
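To make the tie-break concrete, here is a minimal sketch. It assumes a hypothetical `superset` key on each grammar (the name is still open, as noted) and a simplified list of `{ language, relevance }` results. The `grammars` table, `isSupersetOf`, and `pickBest` are illustrative names only, not the parser's actual internals.

```javascript
// Hypothetical grammar metadata: `superset` names the base language that a
// grammar extends. The key name is an assumption, not an existing API.
const grammars = {
  cpp: {},
  arduino: { superset: "cpp" },
  xml: {},
  django: { superset: "xml" },
};

// True when `candidate` declares `base` as the language it extends.
function isSupersetOf(candidate, base) {
  return grammars[candidate]?.superset === base;
}

// Pick the best result from a non-empty list. A superset grammar beats its
// base only with a strictly higher relevance; on a tie the base is kept,
// regardless of the order in which the results arrive.
function pickBest(results) {
  return results.reduce((best, current) => {
    if (current.relevance > best.relevance) return current;
    if (
      current.relevance === best.relevance &&
      isSupersetOf(best.language, current.language)
    ) {
      return current; // tie between base and superset: prefer the base
    }
    return best;
  });
}
```

With this rule the outcome no longer depends on load order: a tie between cpp and arduino resolves to cpp whichever was registered first, while arduino still wins when it scores strictly higher.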

Also, if you're looking to correct an incorrect match, it's not helpful to know that the second match to cpp was Arduino; they are essentially the same (not going to color differently, etc.). It would be useful to know that the second match (OTHER than Arduino) was, say, Pascal.
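One way the "second best" report could skip related languages is sketched below. The `baseOf` table, `related`, and `secondBest` are hypothetical names for illustration, and the result shape `{ language, relevance }` is an assumption rather than the checker's real data structure.

```javascript
// Hypothetical relationship table: each superset grammar maps to its base.
const baseOf = { arduino: "cpp", django: "xml" };

// Two languages are related when one is declared a superset of the other.
function related(a, b) {
  return baseOf[a] === b || baseOf[b] === a;
}

// Return the highest-scoring result that is neither the winner nor a
// relative of the winner, or undefined when no such result exists.
// filter() creates a new array, so the caller's list is not reordered.
function secondBest(results, winner) {
  return results
    .filter((r) => r.language !== winner && !related(r.language, winner))
    .sort((a, b) => b.relevance - a.relevance)[0];
}
```

For a cpp sample where arduino ties and pascal trails, this would report pascal as the runner-up, which is the more useful hint when tuning relevance.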

That's the change I'm suggesting here, if no one is opposed. It should only take a few lines of code, plus adding the appropriate attribute to a few grammars.

Labels: auto-detect, discuss/propose, enhancement, parser
