Logs priority sampling behavior is incorrect #38468

Closed
@jmacd

Component(s)

processor/probabilisticsampler

What happened?

Description

When logs priority sampling is enabled by setting sampling_priority in the configuration for a logs pipeline, the named attribute has to be set on each log record, or that record will not be sampled. This is counter-intuitive and differs from the traces sampler.

Worse, a conditional in the related logic means that the logs sampling priority can only raise a record's priority, never lower it. This is incorrect.
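To make the two problems concrete, here is a minimal Python model of the behavior described above next to the behavior this issue asks for. It is a sketch under stated assumptions, not the processor's Go code: the function names `reported_percentage` and `expected_percentage` are hypothetical, and "can only raise" is modeled as taking the maximum of the configured percentage and the attribute value.

```python
from typing import Optional


def reported_percentage(configured: float, priority: Optional[float]) -> float:
    """Effective sampling percentage as this issue describes the current behavior."""
    if priority is None:
        # Bug 1 (as reported): a record without the priority attribute is never sampled.
        return 0.0
    # Bug 2 (as reported): the attribute can only raise the percentage, never lower it.
    # Modeling this as max() is an assumption made for illustration.
    return max(configured, priority)


def expected_percentage(configured: float, priority: Optional[float]) -> float:
    """Effective sampling percentage under the behavior requested here (assumed reading)."""
    if priority is None:
        # No attribute on the record: fall back to the configured sampling_percentage.
        return configured
    # Attribute present: it overrides the configured value in either direction,
    # clamped to the valid 0..100 range.
    return min(max(priority, 0.0), 100.0)
```

Under this model, with sampling_percentage: 100, a record without the attribute gets 0 instead of 100, and a record with a priority of 10 gets 100 instead of 10.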

Steps to Reproduce

  probabilistic_sampler:
    sampling_percentage: 100
    sampling_priority: sampling_priority
    attribute_source: record
    from_attribute: sampling_uuid

In this configuration, log records without sampling_uuid are dropped. Additionally, a transform stage like

  transform:
    log_statements:
    - context: log
      statements:
        - set(attributes["sampling_uuid"], UUID()) 
        - set(attributes["sampling_priority"], 10)
          where IsMatch(log.body, "noisy")

will not have the intended effect, because of the bug.

Expected Result

In the second example above, noisy logs should be sampled at 10% when the transform stage is followed by the probabilistic sampler configuration above it.
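As a rough illustration of what "sample at 10%" means for records keyed by a per-record UUID, the following Python sketch makes a deterministic keep/drop decision from the UUID string. SHA-256 is a stand-in chosen for this example and is not claimed to be the hash the processor uses; `keep` and `kept_fraction` are hypothetical helper names.

```python
import hashlib
import uuid


def keep(sampling_uuid: str, percentage: float) -> bool:
    """Deterministically keep a record with roughly `percentage` percent probability."""
    digest = hashlib.sha256(sampling_uuid.encode("utf-8")).digest()
    # Map the first 8 bytes of the digest to a fraction in [0, 1).
    fraction = int.from_bytes(digest[:8], "big") / 2**64
    return fraction < percentage / 100.0


def kept_fraction(percentage: float, n: int = 10_000) -> float:
    """Fraction of n synthetic records (with distinct UUIDs) kept at the given percentage."""
    ids = (str(uuid.UUID(int=i)) for i in range(n))
    return sum(keep(record_id, percentage) for record_id in ids) / n
```

Calling kept_fraction(10) should return a value close to 0.10, while kept_fraction(100) keeps every record. With the expected behavior, noisy records would follow the first case and all other records the second.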

Actual Result

The logs do not pass. These are two related bugs.

Collector version

v0.120.0

Environment information

No response

OpenTelemetry Collector configuration

Log output

Additional context

No response
