This repository was archived by the owner on Aug 14, 2024. It is now read-only.

Commit 62d15ce

Document tracing issues related to scope propagation
1 parent c666793 commit 62d15ce

1 file changed: +120 -0 lines changed

src/docs/sdk/research/performance/index.mdx

Lines changed: 120 additions & 0 deletions
@@ -24,4 +24,124 @@ In the next section, we’ll discuss some of the shortcomings with the current m

## Identified Issues

While the reuse of the [Unified SDK architecture](https://develop.sentry.dev/sdk/unified-api/) (hubs, clients, scopes) and the transaction ingestion model have merits, experience revealed some issues that we categorize into two groups.

The first group has to do with scope propagation, in essence the ability to determine what the “current scope” is. This operation is required both for manual instrumentation in user code and for automatic instrumentation in SDK integrations.

The second group is for issues related to the wire format used to send transaction data from SDKs to Sentry.

## Scope Propagation Problems

_This issue is tracked by [getsentry/sentry-javascript#3751](https://github.com/getsentry/sentry-javascript/issues/3751)._

The [Unified SDK architecture](https://develop.sentry.dev/sdk/unified-api/) is fundamentally based on the existence of a `hub` per unit of concurrency, each `hub` having a stack of pairs of `client` and `scope`. A `client` holds configuration and is responsible for sending data to Sentry by means of a `transport`, while a `scope` holds contextual data that gets appended to outgoing events, such as tags and breadcrumbs.
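
To make the relationship between `hub`, `client`, and `scope` concrete, here is a minimal sketch of a hub that keeps a stack of client/scope pairs. It is a simplified model written for this document, not the SDK's actual classes: the `Hub` and `Scope` classes, the `send` method on the client, and the `sent` array are all assumptions made for illustration.

```js
// Hypothetical, simplified model; not the actual Sentry SDK classes.
class Scope {
  constructor(tags = {}) {
    this.tags = { ...tags };
  }
  setTag(key, value) {
    this.tags[key] = value;
  }
  clone() {
    return new Scope(this.tags);
  }
}

class Hub {
  constructor(client) {
    // The stack holds { client, scope } pairs; the last entry is the top.
    this.stack = [{ client, scope: new Scope() }];
  }
  top() {
    return this.stack[this.stack.length - 1];
  }
  getScope() {
    // The current scope is always the one on top of the stack.
    return this.top().scope;
  }
  pushScope() {
    const { client, scope } = this.top();
    const child = scope.clone();
    this.stack.push({ client, scope: child });
    return child;
  }
  popScope() {
    if (this.stack.length > 1) this.stack.pop();
  }
  captureMessage(message) {
    const { client, scope } = this.top();
    // Contextual data from the current scope is attached to the event.
    client.send({ message, tags: { ...scope.tags } });
  }
}

// A stand-in client that records events instead of using a transport.
const sent = [];
const client = { send: (event) => sent.push(event) };

const hub = new Hub(client);
hub.getScope().setTag('app', 'demo');
hub.pushScope().setTag('request', '42');
hub.captureMessage('inside'); // carries both tags
hub.popScope();
hub.captureMessage('outside'); // carries only the 'app' tag
```

The sketch only works as intended if each unit of concurrency has its own `Hub` instance, which is the hard part discussed next.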

Every `hub` knows what the current scope is: it is always the scope on top of the stack. The difficult part is having a `hub` “per unit of concurrency”.

JavaScript, for example, is single-threaded with an event loop and async code execution. There is no standard way to carry contextual data that works across async calls. So for JavaScript browser applications, there is only one global `hub` shared by sync and async code.

A similar situation appears in the mobile SDKs. Users expect contextual data stored on the `scope`, such as tags, the current user, and breadcrumbs, to be available and settable from any thread. Therefore, in those SDKs there is only one global `hub`.

In both cases, everything was relatively fine while the SDK only had to report errors. With the added responsibility of tracking transactions and spans, the `scope` became a poor fit for storing the current `span`, because it limits the existence of concurrent spans.

For JavaScript, a possible solution is the use of [Zone.js](https://github.com/angular/angular/blob/master/packages/zone.js/README.md), part of the Angular framework. The main challenge is that it increases bundle size and may inadvertently impact end user apps, as it monkey-patches key parts of the JavaScript runtime engine.

The scope propagation problem became especially apparent when we tried to create a simpler API for manual instrumentation. The idea was to expose a `Sentry.trace` function that would implicitly propagate tracing and scope data, and support deep nesting with sync and async code.

As an example, let’s say someone wanted to measure how long searching through a DOM tree took. Tracing this operation would look something like this:

```js
await Sentry.trace(
  {
    op: 'dom',
    description: 'Walk DOM Tree',
  },
  async () => await walkDomTree()
);
```

Users wouldn’t have to worry about keeping the reference to the correct transaction or span when adding timing data. Users are free to create child spans within the `walkDomTree` function and spans will be ordered in the correct hierarchy.

The implementation of the actual `trace` function is relatively simple (see [a PR which has an example implementation](https://github.com/getsentry/sentry-javascript/pull/3697/files#diff-f5bf6e0cdf7709e5675fcdc3b4ff254dd68f3c9d1a399c8751e0fa1846fa85dbR158)). However, knowing what the current span is in async code and global integrations is a challenge yet to be overcome.
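
To illustrate why the function itself is the easy part, here is a minimal sketch of a `trace`-like helper. It is not the implementation from the linked PR and does not use the SDK API: the `currentSpan` variable, the `startSpan` helper, and the span object shape are assumptions made for this example, and passing the span to the callback is done only so the result can be inspected.

```js
// Hypothetical stand-ins for illustration; this is not the Sentry SDK API.
let currentSpan = null; // one global slot, like a single scope on a global hub

function startSpan(context, parent) {
  const span = { ...context, children: [], finished: false };
  if (parent) parent.children.push(span);
  return span;
}

async function trace(context, callback) {
  const parent = currentSpan;
  const span = startSpan(context, parent);
  currentSpan = span; // the callback and anything it calls see this span
  try {
    return await callback(span);
  } finally {
    span.finished = true;
    currentSpan = parent; // restore the previous span
  }
}

// Sequential nesting: the inner span becomes a child of the outer one.
const done = trace({ op: 'dom', description: 'Walk DOM Tree' }, async (root) => {
  await trace({ op: 'dom.node', description: 'Visit node' }, async () => {});
  return root;
});
```

The sketch produces the right hierarchy as long as traced calls nest one after another. Because the current span lives in a single shared variable, two `trace` calls that interleave across `await` points would overwrite each other's value, which is the open problem described here.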

Here are two examples that summarize the problems above:

### 1. Cannot Determine Current Span

Consider some auto-instrumentation code that needs to get a reference to the current `span`, a case in which manual scope propagation is not available.

```js
// SDK code
function fetchWrapper(/* ... */) {
  /*
    ... some code omitted for simplicity ...
  */
  const parent = getCurrentHub().getScope().getSpan(); // <1>
  const span = parent.startChild({
    data: { type: 'fetch' },
    description: `${method} ${url}`,
    op: 'http.client',
  });
  try {
    // ...
    // return fetch(...);
  } finally {
    span.finish();
  }
}
window.fetch = fetchWrapper;

// User code
async function f1() {
  const hub = getCurrentHub();
  let t = hub.startTransaction({ name: 't1' });
  hub.getScope().setSpan(t);
  try {
    await fetch('https://example.com/f1');
  } finally {
    t.finish();
  }
}
async function f2() {
  const hub = getCurrentHub();
  let t = hub.startTransaction({ name: 't2' });
  hub.getScope().setSpan(t);
  try {
    await fetch('https://example.com/f2');
  } finally {
    t.finish();
  }
}
Promise.all([f1(), f2()]); // run f1 and f2 concurrently
```
117+
118+
In the example above, several concurrent `fetch` requests trigger the execution of the `fetchWrapper` helper. Line `<1>` must be able to observe a different span depending on the current flow of execution, leading to two span trees as below:
119+
120+
```
121+
t1
122+
\
123+
|- http.client GET https://example.com/f1
124+
t2
125+
\
126+
|- http.client GET https://example.com/f2
127+
```

That means that, when `f1` is running, `parent` must refer to `t1` and, when `f2` is running, `parent` must be `t2`. Unfortunately, all code above is racing to update and read from a single `hub` instance, and thus the observed span trees are not deterministic. For example, the result could incorrectly be:

```
t1
t2
 \
  |- http.client GET https://example.com/f1
  |- http.client GET https://example.com/f2
```
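
The incorrect tree above can be reproduced with a small self-contained simulation. The sketch below does not use the SDK: `scope`, `startSpan`, `tracedFetch`, and `run` are hypothetical stand-ins, and it assumes that some asynchronous work happens between setting the span on the scope and calling the instrumented `fetch`, which is what gives the other task a chance to overwrite the shared value.

```js
// Hypothetical stand-ins for the hub, scope and spans; not the Sentry SDK.
const scope = { span: null }; // one global scope shared by all code

function startSpan(name, parent = null) {
  const span = { name, children: [] };
  if (parent) parent.children.push(span);
  return span;
}

// Stand-in for the instrumented fetch: reads the "current" span when called.
async function tracedFetch(url) {
  const parent = scope.span; // same role as line <1> in the example above
  startSpan(`GET ${url}`, parent);
  await Promise.resolve(); // simulate waiting for the network
}

async function run(name, url) {
  const t = startSpan(name);
  scope.span = t; // overwrites whatever another task stored there
  await Promise.resolve(); // assumed async work; lets the other task run
  await tracedFetch(url);
  return t;
}

const result = Promise.all([
  run('t1', 'https://example.com/f1'),
  run('t2', 'https://example.com/f2'),
]).then(([t1, t2]) => {
  console.log('t1 children:', t1.children.map((c) => c.name));
  console.log('t2 children:', t2.children.map((c) => c.name));
  return [t1, t2];
});
```

In this simulation both `GET` spans end up under `t2` and `t1` has no children, because `t2` was the last value written to the shared scope before either request read it.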

As a side effect of not being able to correctly determine the current span, the present implementation of the `fetch` integration (and others) in [the JavaScript Browser SDK chooses to create flat transactions](https://github.com/getsentry/sentry-javascript/blob/61eda62ed5df5654f93e34a4848fc9ae3fcac0f7/packages/tracing/src/browser/request.ts#L169-L178), where all child spans are direct children of the transaction (instead of having a proper multi-level tree structure).
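
The shape of a flat transaction can be sketched as follows. This is an illustration of the idea only, not the code from the linked file: `startTransaction` and `startFlatChild` are hypothetical helpers, and the span names are made up for the example.

```js
// Hypothetical stand-ins; not the Sentry SDK API.
function startTransaction(name) {
  return { name, children: [] };
}

// Every span is attached to the transaction itself, never to another span,
// so no lookup of a "current span" is needed.
function startFlatChild(transaction, op, description) {
  const span = { op, description, children: [] };
  transaction.children.push(span);
  return span;
}

const transaction = startTransaction('pageload');
startFlatChild(transaction, 'ui.render', 'Render results');
startFlatChild(transaction, 'http.client', 'GET https://example.com/f1');
startFlatChild(transaction, 'http.client', 'GET https://example.com/f2');
```

The trade-off is that the result is deterministic but loses the parent-child relationships between spans, so a request issued while rendering appears as a sibling of the render span rather than as its child.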

### 2. Data Propagation on Error

Coming soon.

## Span Ingestion Model Problems

Coming soon.
