khwilliamson
Contributor

Prior to this, the non-first characters in a capture group name could be any \w character, even though group names are supposed to follow perl identifier syntax, and identifiers exclude a few \w characters. This PR tightens what is allowed.

#23775 gave a list of the excluded characters, but I forgot a couple of details when generating that list, so it wasn't quite right.

The complete corrected list is:

GREEK YPOGEGRAMMENI
COMBINING CYRILLIC HUNDRED THOUSANDS SIGN
COMBINING CYRILLIC MILLIONS SIGN
COMBINING PARENTHESES OVERLAY
COMBINING ENCLOSING CIRCLE
COMBINING ENCLOSING SQUARE
COMBINING ENCLOSING DIAMOND
COMBINING ENCLOSING CIRCLE BACKSLASH
COMBINING ENCLOSING SCREEN
COMBINING ENCLOSING KEYCAP
COMBINING ENCLOSING UPWARD POINTING TRIANGLE
CIRCLED LATIN CAPITAL LETTER A - Z
CIRCLED LATIN SMALL LETTER A - Z
VERTICAL TILDE
COMBINING CYRILLIC TEN MILLIONS SIGN
COMBINING CYRILLIC HUNDRED MILLIONS SIGN
COMBINING CYRILLIC THOUSAND MILLIONS SIGN
ARABIC LIGATURE SHADDA WITH DAMMATAN ISOLATED FORM
ARABIC LIGATURE SHADDA WITH KASRATAN ISOLATED FORM
ARABIC LIGATURE SHADDA WITH FATHA ISOLATED FORM
ARABIC LIGATURE SHADDA WITH DAMMA ISOLATED FORM
ARABIC LIGATURE SHADDA WITH KASRA ISOLATED FORM
ARABIC LIGATURE SHADDA WITH SUPERSCRIPT ALEF ISOLATED FORM
ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM
ARABIC LIGATURE JALLAJALALOUHOU
ARABIC FATHATAN ISOLATED FORM
ARABIC DAMMATAN ISOLATED FORM
ARABIC KASRATAN ISOLATED FORM
ARABIC FATHA ISOLATED FORM
ARABIC DAMMA ISOLATED FORM
ARABIC KASRA ISOLATED FORM
ARABIC SHADDA ISOLATED FORM
ARABIC SUKUN ISOLATED FORM
SQUARED LATIN CAPITAL LETTER A - Z
NEGATIVE CIRCLED LATIN CAPITAL LETTER A - Z
NEGATIVE SQUARED LATIN CAPITAL LETTER A - Z
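
To see why characters like these fall outside identifier syntax, here is a minimal sketch in Python rather than Perl. It assumes Python's `str.isidentifier()`, which checks XID_Start for the first character and XID_Continue for the rest, is a reasonable stand-in for the Unicode identifier-continue rule; it is not how the Perl regex compiler performs the check. `SAMPLES`, `is_identifier_continue`, and `report` are hypothetical names made up for this illustration. Three of the samples come from the list above; COMBINING ACUTE ACCENT is added only for contrast.

```python
import unicodedata

# Hypothetical sample set: three characters from the list above, plus one
# ordinary combining mark (not in the list) for contrast.
SAMPLES = [
    "\u20DD",  # COMBINING ENCLOSING CIRCLE
    "\u20E3",  # COMBINING ENCLOSING KEYCAP
    "\u24B6",  # CIRCLED LATIN CAPITAL LETTER A
    "\u0301",  # COMBINING ACUTE ACCENT (contrast case)
]


def is_identifier_continue(ch: str) -> bool:
    """Return True if ch may follow an ASCII letter in a Python identifier.

    str.isidentifier() requires XID_Start for the first character and
    XID_Continue for the remaining ones, so prefixing "a" isolates the
    XID_Continue test for ch.
    """
    return ("a" + ch).isidentifier()


def report(chars):
    """Return (code point, name, general category, continue-ok) per character."""
    return [
        (
            f"U+{ord(ch):04X}",
            unicodedata.name(ch),
            unicodedata.category(ch),
            is_identifier_continue(ch),
        )
        for ch in chars
    ]


if __name__ == "__main__":
    for row in report(SAMPLES):
        print(*row, sep="\t")
```

In this sketch the enclosing marks (general category Me) and the circled letter (So) are rejected as identifier-continue characters, while the nonspacing mark (Mn) is accepted.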

  • This set of changes requires a perldelta entry

Prior to this commit the non-first characters could be any \w character.
But an identifier excludes a few \w characters from appearing in them.
This commit tightens what is allowed.

Commit xd1e2a852fbc901b45fba20906a8f42ca227ae462 gave a list of them,
but I forgot a couple details in generating that list, so it wasn't
quite right.

 (F) Group names must follow the rules for perl identifiers, meaning
-they must start with a non-digit word character. A common cause of
-this error is using (?&0) instead of (?0). See L<perlre>.
+that ASCII-range ones must start with a non-digit word character. A

It's unclear what "ones" means in that sentence. s/ones/identifiers/ ?

-I<name> must not begin with a number, nor contain hyphens.
+I<name> must follow the rules for perl identifiers
+(L<perldata/Identifier parsing>) which means, for example, that they
+can't begin with a number, nor contain hyphens.

Better written as: "can't begin with a number or contain hyphens."
