Regex used to parse Grok patterns prints warning with latest Joni #47861

@droberts195

Following the Joni upgrade in #47374 the following warning is printed on Elasticsearch startup:

regular expression has redundant nested repeat operator + /%\{(?<name>(?<pattern>[A-z0-9]+)(?::(?<subname>[[:alnum:]@\[\]_:.-]+))?)(?:=(?<definition>(?:(?:[^{}]+|\.+)+)+))?\}/

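I guess it means there is a non-fatal inefficiency in this regex, presumably in the definition group, where the already-repeated group `(?:[^{}]+|\.+)+` is wrapped in a second `+`: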

    private static final String GROK_PATTERN =
        "%\\{" +
        "(?<name>" +
            "(?<pattern>[A-z0-9]+)" +
            "(?::(?<subname>[[:alnum:]@\\[\\]_:.-]+))?" +
        ")" +
        "(?:=(?<definition>" +
            "(?:" +
                "(?:[^{}]+|\\.+)+" +
            ")+" +
        ")" +
        ")?" + "\\}";
    private static final Regex GROK_PATTERN_REGEX = new Regex(GROK_PATTERN.getBytes(StandardCharsets.UTF_8), 0,
        GROK_PATTERN.getBytes(StandardCharsets.UTF_8).length, Option.NONE, UTF8Encoding.INSTANCE, Syntax.DEFAULT);

That regular expression has not changed for a long time, so fixing it is probably not necessary for correctness, but fixing it before the release of 7.5 would avoid questions in forums, issues and support cases.
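
One possible way to remove the redundancy is to drop the outer `+` around the definition fragment. The sketch below is only an illustration of that idea, not a confirmed fix: it uses `java.util.regex` from the JDK rather than Joni (so it does not reproduce the warning itself), and the class name `GrokDefinitionSketch` and the sample inputs are made up for the example. It checks that the nested and flattened fragments accept and reject the same strings.

```java
import java.util.List;
import java.util.regex.Pattern;

public class GrokDefinitionSketch {
    // Definition fragment as written in the issue: a repeated group inside another repeated group.
    public static final Pattern NESTED = Pattern.compile("(?:(?:[^{}]+|\\.+)+)+");

    // Hypothetical flattened variant: the same inner group without the outer repeat.
    public static final Pattern FLAT = Pattern.compile("(?:[^{}]+|\\.+)+");

    // True when both variants agree on whether the whole input matches.
    public static boolean sameResult(String input) {
        return NESTED.matcher(input).matches() == FLAT.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Short, made-up sample definitions: some valid, some containing braces.
        List<String> samples = List.of("\\d+", "a.b..c", "", "a{b", "}", "[0-9]+\\.[0-9]+");
        for (String sample : samples) {
            System.out.println("'" + sample + "' flat=" + FLAT.matcher(sample).matches()
                + " agrees=" + sameResult(sample));
        }
    }
}
```

Agreement on a handful of inputs is not a proof of equivalence, and any change to `GROK_PATTERN` would still need to be checked against Joni and the existing Grok tests.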
