implement cardinality limit for spanmetrics #38990

Closed
@povilasv

Component(s)

connector/spanmetrics

Is your feature request related to a problem? Please describe.

It's very easy for instrumentations to accidentally put UUIDs or unique URLs into span names, which causes spanmetrics to create high-cardinality metrics.

I would like to have cardinality limit protections similar to those available in the OTel SDKs (https://opentelemetry.io/docs/specs/otel/metrics/sdk/#cardinality-limits).

Describe the solution you'd like

Ideally, an aggregation_cardinality_limit field, disabled by default or behind a feature flag, that limits metric cardinality.
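A minimal sketch of what such an opt-in field could look like. The `Config` struct, its `Validate` and `LimitEnabled` methods, and the "zero means disabled" convention are assumptions made for illustration; only the field name aggregation_cardinality_limit comes from this request.

```go
package main

import (
	"errors"
	"fmt"
)

// Config is a hypothetical slice of the connector configuration,
// not the connector's actual config type.
type Config struct {
	// AggregationCardinalityLimit caps the number of distinct dimension sets
	// kept per resource and per metric. Zero (the default) disables the limit.
	AggregationCardinalityLimit int `mapstructure:"aggregation_cardinality_limit"`
}

// Validate rejects negative limits; zero means "no limit".
func (c Config) Validate() error {
	if c.AggregationCardinalityLimit < 0 {
		return errors.New("aggregation_cardinality_limit must be >= 0")
	}
	return nil
}

// LimitEnabled reports whether the limit is switched on.
func (c Config) LimitEnabled() bool {
	return c.AggregationCardinalityLimit > 0
}

func main() {
	var def Config // zero value: the limit stays disabled by default
	fmt.Println(def.Validate() == nil, def.LimitEnabled())

	on := Config{AggregationCardinalityLimit: 1000}
	fmt.Println(on.Validate() == nil, on.LimitEnabled())
}
```

Treating the zero value as "disabled" keeps existing configurations unchanged, which is one way to satisfy the "disabled by default" wish without a separate feature flag.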

This limit should be applied per unique resource.

I.e., if I have two applications that send spans, and only one of them sends spans with UUIDs, only that application's metrics should be limited.

Additionally, to match the OTel Metrics SDK cardinality limit, each metric should get its own limit, i.e. calls_total has its own limit and duration_bucket_ms has its own limit.

So the limit is per resource, per metric.

Metrics that hit the cardinality limit should keep all of their resource attributes, but their dimensions (span.name, span.status_code, etc.) should be dropped. Instead of the dimensions, you would get an otel.metric.overflow="true" attribute.
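The behaviour described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the connector's implementation: `Limiter`, `seriesKey`, `Admit`, and the string encodings of resources and dimension sets are hypothetical names, and reserving one slot of the limit for the overflow series is assumed from the Go SDK output shown in this issue.

```go
package main

import "fmt"

// overflowKey stands in for the attribute set {otel.metric.overflow="true"}.
const overflowKey = `otel.metric.overflow="true"`

// seriesKey scopes the limit: one budget per resource and per metric.
type seriesKey struct {
	resource string // hypothetical identity of the resource attributes
	metric   string // e.g. "calls_total"
}

// Limiter tracks which dimension sets were admitted for each seriesKey.
type Limiter struct {
	limit int
	seen  map[seriesKey]map[string]struct{}
}

func NewLimiter(limit int) *Limiter {
	return &Limiter{limit: limit, seen: make(map[seriesKey]map[string]struct{})}
}

// Admit returns the dimension key to aggregate under: dims itself if it is
// already tracked or there is room, otherwise overflowKey.
func (l *Limiter) Admit(resource, metric, dims string) string {
	if l.limit <= 0 {
		return dims // limit disabled
	}
	k := seriesKey{resource: resource, metric: metric}
	set, ok := l.seen[k]
	if !ok {
		set = make(map[string]struct{})
		l.seen[k] = set
	}
	if _, known := set[dims]; known {
		return dims
	}
	// One slot is reserved for the overflow series, so at most
	// limit series exist per resource and metric.
	if len(set) >= l.limit-1 {
		return overflowKey
	}
	set[dims] = struct{}{}
	return dims
}

func main() {
	l := NewLimiter(3)
	for _, name := range []string{"GET /a", "GET /b", "GET /c", "GET /d"} {
		fmt.Println("noisy-app:", l.Admit("noisy-app", "calls_total", name))
	}
	// A different resource has its own budget and is not affected.
	fmt.Println("quiet-app:", l.Admit("quiet-app", "calls_total", "GET /a"))
}
```

Because the map key combines resource and metric, a noisy application exhausts only its own budget for a given metric, and resource attributes never enter the dimension key, so they are kept on the overflow series.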

Example of how the Go SDK works. With the limit set to 5, the program below exports four distinct series and folds every further attribute set into a single overflow series:

requests_total{otel_scope_name="example-meter",otel_scope_version="",request_id="43026296-fa6e-4bff-86c6-47490764389f"} 1
requests_total{otel_scope_name="example-meter",otel_scope_version="",request_id="60f17106-56d7-4aa7-85f2-57004c03682b"} 1
requests_total{otel_scope_name="example-meter",otel_scope_version="",request_id="73c8fb59-59f8-486b-b733-c3d4af7fab7a"} 1
requests_total{otel_scope_name="example-meter",otel_scope_version="",request_id="f12cdbdb-edc0-4b73-bc37-f140559c389e"} 1
requests_total{otel_metric_overflow="true",otel_scope_name="example-meter",otel_scope_version=""} 5

The limit is set through an environment variable before starting the program:

export OTEL_GO_X_CARDINALITY_LIMIT=5

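package main

import (
	"context"
	"fmt"
	"log"
	"net/http"
	"time"

	"github.com/google/uuid"
	"github.com/prometheus/client_golang/prometheus/promhttp"
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/exporters/prometheus"
	"go.opentelemetry.io/otel/exporters/stdout/stdoutmetric"
	"go.opentelemetry.io/otel/metric"
	sdkmetric "go.opentelemetry.io/otel/sdk/metric"
)

func main() {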
	// Create stdout exporter
	stdoutExporter, err := stdoutmetric.New()
	if err != nil {
		log.Fatalf("failed to create stdout exporter: %v", err)
	}

	// Create Prometheus exporter
	promExporter, err := prometheus.New()
	if err != nil {
		log.Fatalf("failed to create prometheus exporter: %v", err)
	}

	// Create a meter provider with both exporters
	provider := sdkmetric.NewMeterProvider(
		sdkmetric.WithReader(sdkmetric.NewPeriodicReader(stdoutExporter)),
		sdkmetric.WithReader(promExporter),
	)
	defer func() {
		if err := provider.Shutdown(context.Background()); err != nil {
			log.Fatalf("failed to shutdown meter provider: %v", err)
		}
	}()

	otel.SetMeterProvider(provider)
	meter := provider.Meter("example-meter")

	// Create counters
	requestCounter, err := meter.Int64Counter("requests_total")
	if err != nil {
		log.Fatalf("failed to create request counter: %v", err)
	}

	bytesCounter, err := meter.Int64Counter("bytes_processed_total")
	if err != nil {
		log.Fatalf("failed to create bytes counter: %v", err)
	}

	ctx := context.Background()

	// Start a goroutine to continuously record metrics with UUIDs
	go func() {
		for {
			// Generate a new UUID for each request
			requestID := uuid.New().String()
			// Create attributes with UUIDs
			attrs := attribute.NewSet(
				attribute.String("request_id", requestID),
			)

			// Record metrics with the UUID attributes
			requestCounter.Add(ctx, 1, metric.WithAttributes(attrs.ToSlice()...))
			// Simulate some bytes processed (clock-derived value from 1000 to 9999)
			bytesCounter.Add(ctx, 1000+time.Now().UnixNano()%9000, metric.WithAttributes(attrs.ToSlice()...))

			time.Sleep(1 * time.Second)
		}
	}()

	// Create HTTP server to expose Prometheus metrics
	http.Handle("/metrics", promhttp.Handler())
	server := &http.Server{
		Addr:    ":8080",
		Handler: nil,
	}

	fmt.Println("Starting server on :8080")
	if err := server.ListenAndServe(); err != nil {
		log.Fatalf("failed to start server: %v", err)
	}
}

Describe alternatives you've considered

No response

Additional context

No response
