Description
Component(s)
processor/redaction
Is your feature request related to a problem? Please describe.
I want to redact my HTTP access logs.
We log full request data, including POST data, cookies, etc.
Many fields contain tokens that I want to hide. Their keys share common patterns such as 'token' or 'apiKey'.
Collecting every variation of these keys and their value formats would be complicated.
I also don't want to remove these keys from the log attributes, because it is important to see whether a field exists.
In addition, a hashing option would be useful: hashing the value instead of replacing it with a mask would keep the ability to correlate logs that share the same hash for a key, without exposing the actual value.
Describe the solution you'd like
Masking option
processors:
  redaction/nginx_access_redact_secrets:
    allow_all_keys: true
    blocked_keys_patterns:
      - ".*token.*"
      - ".*api_key.*"
      - ".*apiKey.*"
      - ".*password.*"
    mask_string: "<redacted>"
As a result, attributes would look like request_args.secret_client_token: <redacted>
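To make the intended masking semantics concrete, here is a minimal Python sketch. It is not the redaction processor's implementation: the helper name `mask_blocked_keys` is hypothetical, and it assumes that a pattern matching anywhere in the key is enough to mask the value. The key is always kept, and only the value is replaced.

```python
import re


def mask_blocked_keys(attributes, blocked_keys_patterns, mask_string="<redacted>"):
    """Return a copy of `attributes` where values of matching keys are masked.

    Hypothetical helper for illustration only; keys are never removed.
    """
    compiled = [re.compile(pattern) for pattern in blocked_keys_patterns]
    return {
        key: mask_string if any(rx.search(key) for rx in compiled) else value
        for key, value in attributes.items()
    }


patterns = [".*token.*", ".*api_key.*", ".*apiKey.*", ".*password.*"]
attrs = {
    "request_args.secret_client_token": "abc123",
    "request_args.page": "2",
}
masked = mask_blocked_keys(attrs, patterns)
print(masked)
```

The matching key is still present in the output, so it remains visible that the field existed in the request.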
Or hashing option
processors:
  redaction/nginx_access_hash_secrets:
    allow_all_keys: true
    blocked_keys_patterns:
      - ".*token.*"
      - ".*api_key.*"
      - ".*apiKey.*"
      - ".*password.*"
    hashing: sha1 # defaults to 'none'
As a result, attributes would look like request_args.secret_client_token: <sha1 sum>
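The hashing variant could behave roughly as in the Python sketch below. Again, this only illustrates the requested behavior: the helper name `hash_blocked_keys` is hypothetical, and hashing the UTF-8 string form of the value with unsalted SHA-1 is an assumption, not something the processor is known to do.

```python
import hashlib
import re


def hash_blocked_keys(attributes, blocked_keys_patterns):
    """Return a copy of `attributes` where values of matching keys are
    replaced by their SHA-1 hex digest.

    Hypothetical helper for illustration only.
    """
    compiled = [re.compile(pattern) for pattern in blocked_keys_patterns]
    hashed = {}
    for key, value in attributes.items():
        if any(rx.search(key) for rx in compiled):
            # Assumption: hash the UTF-8 encoding of the stringified value.
            hashed[key] = hashlib.sha1(str(value).encode("utf-8")).hexdigest()
        else:
            hashed[key] = value
    return hashed


patterns = [".*token.*", ".*api_key.*", ".*apiKey.*", ".*password.*"]
first = hash_blocked_keys({"request_args.secret_client_token": "abc", "path": "/a"}, patterns)
second = hash_blocked_keys({"request_args.secret_client_token": "abc", "path": "/b"}, patterns)

# Two log records with the same token produce the same hash and can be correlated.
print(first["request_args.secret_client_token"] == second["request_args.secret_client_token"])
```

Because the same input always yields the same digest, records can be grouped by the hashed attribute without the original token appearing in the logs. Note that an unsalted hash of a short or guessable value can still be recovered by brute force.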
Describe alternatives you've considered
Using the transform processor.
However, this is more complicated, and it only works because of undocumented behavior (a bug or a feature) in the replace_all_patterns OTTL function, for example:
...
statements:
  # According to the docs this should not work (it should replace keys, not values), but it does in practice (v0.103)
  - replace_all_patterns(attributes["request_args"]["query"], "key", ".*token.*", "redacted", SHA1, "redacted %s") where IsMap(attributes["request_args"]["query"])
  - replace_all_patterns(attributes["request_args"]["query"], "value", "^redacted.*", "<redacted>") where IsMap(attributes["request_args"]["query"])
  # Yes, this is a nested map, but it is merged into the attributes root later
...
I also believe this would run faster as dedicated functionality in the redaction processor than as a pipeline of statements in the transform processor.
Additional context
No response