Escaped characters U+0000..001F in JSON not dealt with correctly #33

@mbert

On Kubernetes, container logs are formatted as JSON and written to log files in /var/log/containers. In the process, the control characters U+0000..U+001F are escaped, so a tab character becomes the literal six-character string "\u0009" rather than a real tab. This breaks recognition of stack traces whose lines are indented with tabs.

For Java, the following change to the existing code works:

diff --git a/lib/fluent/plugin/exception_detector.rb b/lib/fluent/plugin/exception_detector.rb
index c83fcab..ed23c49 100644
--- a/lib/fluent/plugin/exception_detector.rb
+++ b/lib/fluent/plugin/exception_detector.rb
@@ -53,9 +53,9 @@ module Fluent
       rule(:start_state,
            /(?:Exception|Error|Throwable|V8 errors stack trace)[:\r\n]/,
            :java),
-      rule(:java, /^[\t ]+(?:eval )?at /, :java),
-      rule(:java, /^[\t ]*(?:Caused by|Suppressed):/, :java),
-      rule(:java, /^[\t ]*... \d+\ more/, :java)
+      rule(:java, /^(\\u0009|[\t ])+(?:eval )?at /, :java),
+      rule(:java, /^(\\u0009|[\t ])*(?:Caused by|Suppressed):/, :java),
+      rule(:java, /^(\\u0009|[\t ])*... \d+\ more/, :java)
     ].freeze
 
     PYTHON_RULES = [
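
To illustrate the effect of the change, here is a minimal standalone sketch that runs the old and new versions of the first `:java` rule against an escaped and an unescaped stack-trace line. The constant names and the sample stack frame are made up for illustration; only the two regular expressions are taken from the diff above.

```ruby
# Patterns copied from the diff above; the constant names are hypothetical.
OLD_AT = /^[\t ]+(?:eval )?at /
NEW_AT = /^(\\u0009|[\t ])+(?:eval )?at /

# Hypothetical stack-trace lines. In a single-quoted Ruby string, \u is not
# interpreted, so escaped_line starts with the six literal characters
# backslash, u, 0, 0, 0, 9, which is how the line looks after escaping.
escaped_line  = '\u0009at com.example.Foo.bar(Foo.java:42)'
# In a double-quoted string, \t is a real tab character.
real_tab_line = "\tat com.example.Foo.bar(Foo.java:42)"

puts OLD_AT.match?(escaped_line)   # => false (the old rule misses the escaped tab)
puts NEW_AT.match?(escaped_line)   # => true
puts OLD_AT.match?(real_tab_line)  # => true
puts NEW_AT.match?(real_tab_line)  # => true (real tabs are still recognized)
```

The sketch only covers the `at` rule; the `Caused by|Suppressed` and `... N more` rules use the same prefix and should behave analogously.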
