[pkg/ottl] Normalize replace_all_patterns function behavior #32896

Closed
@krokwen

Component(s)

pkg/ottl

Is your feature request related to a problem? Please describe.

I'm implementing a log-sanitizing processor based on transformprocessor.

One of the goals is to replace the value of every 'token' field with the string 'redacted'.

The first obvious solution is replace_all_patterns(attributes, "key", ".*token.*", "redacted"), but that only renames the matched keys.
Looking at the function source, the behavior changes when a converter function is passed: replace_all_patterns(attributes, "key", ".*token.*", "redacted", SHA1). Now the keys are left as is and the values are changed, with one catch: the new value is not the SHA1 hash of the original value but the hash of the string 'redacted', which doesn't make sense.

Describe the solution you'd like

Split it into two functions, 'replace_all_keys' and 'replace_all_values'. Each would still accept a 'key' or 'value' mode that selects what the pattern is matched against, while the function itself determines whether keys or values are rewritten. E.g.:

- replace_all_keys(attributes, "key", ".*token.*", "redacted") # replaces the key with the replacement if the key matches the pattern
- replace_all_keys(attributes, "value", ".*token.*", "redacted") # replaces the key with the replacement if the value matches the pattern
- replace_all_values(attributes, "key", ".*token.*", "redacted") # replaces the value with the replacement if the key matches the pattern
- replace_all_values(attributes, "value", ".*token.*", "redacted") # replaces the value with the replacement if the value matches the pattern
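
To make the four proposed combinations concrete, here is a minimal Python sketch of the intended semantics. It is not OTTL code and not an existing API: the function names come from the proposal above, while the dict-based attribute map, the whole-key/whole-value replacement, and the handling of colliding keys are assumptions made for illustration.

```python
import re


def _probe(mode, key, value):
    # "key" mode matches the pattern against the key, "value" mode against the value.
    if mode == "key":
        return key
    if mode == "value":
        return str(value)
    raise ValueError(f"unknown mode: {mode!r}")


def replace_all_keys(attrs, mode, pattern, replacement):
    """Hypothetical: rewrite keys only; values are never touched."""
    rx = re.compile(pattern)
    out = {}
    for key, value in attrs.items():
        new_key = replacement if rx.search(_probe(mode, key, value)) else key
        # Assumption: if several entries map to the same new key, the last one wins.
        out[new_key] = value
    return out


def replace_all_values(attrs, mode, pattern, replacement):
    """Hypothetical: rewrite values only; keys are never touched."""
    rx = re.compile(pattern)
    return {
        key: replacement if rx.search(_probe(mode, key, value)) else value
        for key, value in attrs.items()
    }


attrs = {"auth.token.id": "abc123", "user": "alice"}
sanitized = replace_all_values(attrs, "key", r".*token.*", "redacted")
renamed = replace_all_keys(attrs, "value", r"^abc", "redacted")
```

With this split, the sanitizing use case is the single call replace_all_values(attributes, "key", ...): the keys survive and only the matching values are overwritten.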

This approach would allow any converter function to be used without changing the behavior.

Describe alternatives you've considered

My final solution is to apply the SHA1 function with a replacement format, and then replace the result once more with the string 'redacted', because it doesn't make sense to store a hash of the replacement string.

- replace_all_patterns(attributes, "key", ".*token.*", "redacted", SHA1, "redacted %s")
- replace_all_patterns(attributes, "value", "^redacted.*", "redacted")
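
The effect of these two statements can be traced with a small Python model. This is only a sketch of the behavior as reported in this issue, not the pkg/ottl implementation: the helper names and the dict-based attribute map are hypothetical, and the first step assumes that key mode with a converter keeps the key and stores the formatted hash of the replacement string as the value.

```python
import hashlib
import re


def key_mode_with_sha1(attrs, pattern, replacement, fmt):
    # Models statement 1 as reported: for matching keys the key is kept and the
    # value becomes fmt % SHA1(replacement); the original value is discarded.
    rx = re.compile(pattern)
    hashed = hashlib.sha1(replacement.encode("utf-8")).hexdigest()
    return {
        key: (fmt % hashed) if rx.search(key) else value
        for key, value in attrs.items()
    }


def value_mode(attrs, pattern, replacement):
    # Models statement 2: replace the matched part of each string value.
    rx = re.compile(pattern)
    return {
        key: rx.sub(replacement, value) if isinstance(value, str) else value
        for key, value in attrs.items()
    }


attrs = {"auth.token": "s3cret", "user": "alice"}
step1 = key_mode_with_sha1(attrs, r".*token.*", "redacted", "redacted %s")
step2 = value_mode(step1, r"^redacted.*", "redacted")
```

After the first step the token value is 'redacted ' followed by a 40-character hex digest; the second step collapses it to the plain string 'redacted'. Note that the second statement would also rewrite any unrelated value that happens to start with 'redacted'.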

Additional context

No response
