|
2 | 2 |
|
3 | 3 | ### Goroutine leak profiles {#goroutineleak-profiles} |
4 | 4 |
|
5 | | -We introduce a new profile type for goroutine leaks. With the experimental flag set to `GOEXPERIMENT=goroutineleakprofile`, it becomes accessible through `pprof` under the name `"goroutineleak"`. |
| 5 | +We introduce a new profile type for goroutine leaks. With the experimental |
| 6 | +flag set to `GOEXPERIMENT=goroutineleakprofile`, it becomes accessible |
| 7 | +through `pprof` under the name `"goroutineleak"`. |
6 | 8 |
|
7 | | -The following snippet showcases a common anti-pattern that leads to goroutine leaks: |
| 9 | +The following snippet showcases a common but erroneous pattern |
| 10 | +that leads to goroutine leaks: |
8 | 11 | ```go |
9 | | -// AggregateResults concurrently processes each request and aggregates the results. |
10 | | -// If one of the requests returns an error, the function returns immediately with the error. |
11 | | -func (s *Server[T, R]) AggregateResults(reqs []T) ([]R, error) { |
12 | | - ch := make(chan wrap[R]) |
13 | | - for _, req := range reqs { |
14 | | - go func(req T) { |
15 | | - res, err := s.processRequest(req) |
16 | | - ch <- wrap[R]{ |
17 | | - res: res, |
18 | | - err: err, |
19 | | - } |
20 | | - }(req) |
| 12 | +type result struct { |
| 13 | + res workResult |
| 14 | + err error |
| 15 | +} |
| 16 | + |
| 17 | +func processWorkItems(ws []workItem) ([]workResult, error) { |
| 18 | + // Process work items in parallel, aggregating results in ch. |
| 19 | + ch := make(chan result) |
| 20 | + for _, w := range ws { |
| 21 | + go func() { |
| 22 | + res, err := processWorkItem(w) |
| 23 | + ch <- result{res, err} |
| 24 | + }() |
21 | 25 | } |
22 | 26 |
|
23 | | - var results []R |
24 | | - for range len(reqs) { |
25 | | - x := <-ch |
26 | | - if x.err != nil { |
27 | | - return nil, x.err |
| 27 | + // Collect the results from ch, or return an error if one is found. |
| 28 | + var results []workResult |
| 29 | + for range len(ws) { |
| 30 | + r := <-ch |
| 31 | + if r.err != nil { |
| 32 | + // This premature return may cause goroutine leaks. |
| 33 | + return nil, r.err |
28 | 34 | } |
29 | | - results = append(results, x.res) |
| 35 | + results = append(results, r.res) |
30 | 36 | } |
31 | 37 | return results, nil |
32 | 38 | } |
33 | 39 | ``` |
34 | | -Channel `ch` is used to synchronize when concurrently processing each request in the slice `reqs`. |
35 | | -The responses are aggregated in a slice if all requests succeed. |
36 | | -Conversely, if any request produces an error, `AggregateResults` is shortcircuited to |
37 | | -return the error. |
38 | | -However, because `ch` is unbuffered, all pending request goroutines beyond the first to produce |
39 | | -the error will leak. |
40 | | - |
41 | | -The key insight is that `ch` is inaccessible outside the scope of `AggregateResults`. |
42 | | -The Go runtime is now equipped to detect such patterns as they occur at execution time, |
43 | | -and record them in the goroutine leak profile. |
44 | | -For the case above, the goroutine leak profile would appear as: |
45 | | -``` |
46 | | -Samples: |
47 | | -goroutineleak/count |
48 | | - 6: 1 2 3 4 |
49 | | -Locations |
50 | | - 1: 0x104235daf M=1 runtime.gopark src/runtime/proc.go:464:0 s=447 |
51 | | - 2: 0x1041c1ce7 M=1 runtime.chansend src/runtime/chan.go:283:0 s=176 |
52 | | - 3: 0x1041c18f7 M=1 runtime.chansend1 src/runtime/chan.go:161:0 s=160 |
53 | | - 4: 0x10428dd6b M=1 app.(*Server[go.shape.int,go.shape.int]).AggregateResults.func1 app/server.go:37:0 s=35 |
54 | | -``` |
55 | | -The leaked goroutines' stack precisely pinpoints the leaking operation in the source code. |
| 40 | +Because `ch` is unbuffered, if `processWorkItems` returns early due to an error, |
| 41 | +all remaining work item goroutines will leak. |
| 42 | +Note also that, once `processWorkItems` has returned, `ch` is reachable only |
| 43 | +from the goroutines involved in the leak. |
| 44 | + |
| 45 | +To generalize, a goroutine is leaked if it is blocked on concurrency |
| 46 | +primitives (specifically channels, and `sync` primitives such as `sync.Mutex`) that |
| 47 | +are referenced only by the blocked goroutine itself or by other leaked goroutines. |
| 48 | +The Go runtime is now equipped to reveal leaked goroutines by recording their stacks in |
| 49 | +goroutine leak profiles. |
| 50 | +In the example above, the stacks of work item goroutines point to the culprit channel send |
| 51 | +operation. |
| 52 | + |
| 53 | +Note that, while goroutine leak profiles only include true positives, goroutine leaks may be |
| 54 | +missed when caused by concurrency primitives that are accessible globally, or referenced |
| 55 | +by runnable goroutines. |
56 | 56 |
|
57 | | -The main advantage of goroutine leak profiles is that they have **no false positives**, but, for theoretical reasons, they may nevertheless |
58 | | -miss some goroutine leaks, e.g., when caused by global channels. |
59 | | -The underlying theory is presented in detail in [this publication by Saioc et al.](https://dl.acm.org/doi/pdf/10.1145/3676641.3715990). |
| 57 | +Special thanks to Vlad Saioc at Uber for contributing this work. |
| 58 | +The underlying theory is presented in detail by Saioc et al. in [this publication](https://dl.acm.org/doi/pdf/10.1145/3676641.3715990). |
60 | 59 |
|
61 | | -More details about the implementation are presented in the [design document](https://github.com/golang/proposal/blob/master/design/74609-goroutine-leak-detection-gc.md). |
62 | | -We encourage users to experiment with the new feature in different environments (tests, CI, production), and welcome feedback on the [proposal issue](https://github.com/golang/go/issues/74609). |
| 60 | +<!-- More details about the implementation are presented in the [design document](https://github.com/golang/proposal/blob/master/design/74609-goroutine-leak-detection-gc.md). --> |
| 61 | +We encourage users to experiment with the new feature in different environments |
| 62 | +(tests, CI, production), and welcome feedback on the [proposal issue](https://github.com/golang/go/issues/74609). |