-
-
Notifications
You must be signed in to change notification settings - Fork 4.4k
feat(minimetrics): Add internal recursion protection #57791
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing any of the two fixes, will now make the tests fail reliably.
@wraps(real_add) | ||
def tracked_add(self, ty, *args, **kwargs): | ||
real_add(self, ty, *args, **kwargs) | ||
@metrics_noop |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the actual fix #1
@wraps(real_emit) | ||
@metrics_noop |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the actual fix #2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note here: i want to add a change to the python SDK which protects the manual flush
call. Once that lands, the real_emit
must not be marked as metrics_noop
or it stops being called. The unittests should catch this though.
This reverts commit 80d54d3.
@wraps(real_add) | ||
def tracked_add(self, ty, *args, **kwargs): | ||
real_add(self, ty, *args, **kwargs) | ||
@metrics_noop |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would honestly fix it in a different way, since this code works under the assumption that the SDK sets internally the thread local. I would honestly implement it differently and more reliably by just adding the @enter_minimetrics
decorator to the monkeypatched functions and protecting the call to the SDK inside of the MinimetricsMetricsBackend
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to #57782 but with the finally
implementation since it doesn't clear up the state in case of an exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since this code works under the assumption that the SDK sets internally the thread local
I don't believe the code makes such an assumption as metrics_noop
sets the thread local.
The risk of a second thread local is this one:
I temporarily considered moving metrics_noop to sentry itself, but the protection still makes sense in the main SDK. Reason being that the transport is in parts (via client) invoked from the flush logic and this can create metrics cycles even without the sentry specific metrics system.
I believe if they do not use the same thread local we have paths where only minimetrics protects itself, but not the outer code path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah ok so it sets also the thread local, I thought it was only checking it and not triggering the function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After my doubts were answered, LGTM
This fixes an issue where metrics can feed back into themselves. It was assumed that this has been correctly prevented as the tests did not surface the internal metrics. However the tests were incorrect as the internal recursion did not go to the test's backend, but the default metrics backend.
For the regular minimetrics backend the fixture was thus changed to patch the global backend:
I temporarily considered moving
metrics_noop
to sentry itself, but the protection still makes sense in the main SDK. Reason being that the transport is in parts (via client) invoked from the flush logic and this can create metrics cycles even without the sentry specific metrics system.Refs INC-528