src/docs/product/accounts/quotas/manage-event-stream-guide.mdx
35 additions & 21 deletions
@@ -111,60 +111,74 @@ To review the error events dropped because of spike protection, go to the "Usage
111
111
112
112
Events will not be dropped during any minute in which you don't send more than the hourly limit that Sentry has calculated for you. After 24 hours without any dropped events, spike protection becomes "inactive" again. This means that it is no longer dropping events, but _it does not mean the system has stopped paying attention._ The next time events are dropped, spike protection will be "reactivated".
113
113
114
-
### New Heuristic Changes
114
+
### New Spike Protection Calculations
115
115
116
116
<Include name="limited-avail-note.mdx" />
117
117
118
-
Spike protection is enabled for every project by default, and when it's enabled, Sentry continually monitors for spikes. You can confirm that it's enabled in **Settings > Projects > _Select Project_ > General Settings**.
118
+
Limited availability spike protection is a project-level tool that helps prevent quota overconsumption. It's enabled for every project by default, and when it's enabled, Sentry continually monitors for spikes. You can confirm that it's enabled in **[Project] > Settings > General Settings**.
119
119
120
-
The way our spike protection algorithm essentially works is by using a weighted average of your events over the past 168 hours (past 7 days), applying a multiplier to that number, comparing this final number against a floor bound that is determined using your quota, and setting that as your spike limit.
120
+
Our spike protection algorithm does the following:
121
121
122
-
#### Spike Protection Inputs
122
+
- Uses a weighted average of your events over the past 168 hours (seven days)
123
+
- Applies a multiplier to that number
124
+
- Compares the result against a minimum number of events, which is determined using your quota
125
+
- Sets the higher of the two numbers as your spike limit
123
126
124
-
- Number of projects
125
-
- Quota (per event type)
126
-
- Events in the past 7 days
127
+
#### Setting the Spike Limit
127
128
128
-
#### Floor Bound Calculation
129
+
There are two ways that we can set your spike limit, or the number of events that trigger a spike:
129
130
130
-
To break it down even further, the first step of this algorithm identifies a floor bound that is calculated using your quota. This bound takes the max of either 500 events or (3 \* your quota)/(720 \* number of projects) - the latter number represents your project using up 3 times your overall quota in 30 days if events are continually ingested at this hourly rate, thus flagging for a potential spike.
131
+
- [Minimum Event Calculation](#minimum-event-calculation) - A calculation that determines a minimum number of events
132
+
- [Usage-Based Calculation](#usage-based-calculation) - A projection based on your past usage
131
133
132
-
#### Spike Limit Calculation
134
+
The spike limit for each hour is set using either the minimum event or usage-based calculation, whichever is higher. There are two reasons for this. First, the minimum event calculation protects smaller or new projects: new projects that don't have a week’s worth of data to calibrate spike limits can use this minimum number of events, an adaptation of the organization’s quota, to approximate appropriate limits. Second, the minimum reduces false positives in smaller or new projects so that spikes aren’t flagged incorrectly.
133
135
134
-
The next step uses hourly data from the past 7 days to calculate spike limit projections for the next 7 days. This data is used to calculate weighted averages, which takes into account weekly and hourly seasonality. For example, the weighted average calculated for Monday at 3 pm is more heavily influenced by data points on Monday or hours around 3 pm. This weighted average is then multiplied by a multiplier that is 5 times the overall standard deviation of the past week - this multiplier is bounded between 3 and 6.
136
+
Spike limits are recalculated in real time throughout the duration of the spike to adjust for the increasing volume of incoming events. This allows the limit to grow at a steady rate such that quota is protected from being quickly consumed. [An example](#example) of how this works during a spike is shown below.
135
137
136
-
#### Setting the Final Limit
138
+
##### Minimum Event Calculation
137
139
138
-
The final spike limit for each hour is set to the max of the floor bound or the calculated limit. This is done for a multitude of reasons - firstly, using the floor bound protects smaller or new projects. New projects that do not have a week’s worth of data to use to calibrate spike limits can use the floor, an adaptation of the organization’s quota, to approximate appropriate limits. Additionally, the floor can be used to minimize false positives in smaller/new projects such that spikes aren’t flagged incorrectly.
140
+
This calculation, which is the first step of our algorithm, identifies a minimum number of events, using your quota as a guide. This number is the larger of 500 events or the result of the following formula: `(3 * your quota)/(720 * number of projects)`, where 720 is the number of hours in 30 days. The formula gives the hourly rate at which your project would use up three times your overall quota in 30 days if events were continually ingested at that rate, which flags the project for a potential spike.
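As a rough illustration of the formula above, here is a minimal Python sketch. The function name `minimum_event_limit` and its parameters are hypothetical and are not taken from Sentry's code; the sketch only assumes that the quota is an event count for the billing period and that every project gets an equal share.

```python
def minimum_event_limit(quota: int, num_projects: int) -> float:
    """Return the hourly minimum event limit (hypothetical helper).

    Takes the larger of 500 events or (3 * quota) / (720 * num_projects),
    where 720 is the number of hours in 30 days.
    """
    if num_projects <= 0:
        raise ValueError("num_projects must be positive")
    quota_based = (3 * quota) / (720 * num_projects)
    return max(500, quota_based)


# A 500k quota with one project: the quota-based value (about 2083) wins.
single_project = minimum_event_limit(500_000, 1)

# A 100k quota spread over 10 projects: the 500-event floor wins.
many_projects = minimum_event_limit(100_000, 10)
```

With a small quota or many projects, the quota-based term falls below 500 and the fixed floor of 500 events applies instead.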
139
141
140
-
Additionally, at the onset of a spike, spike limits are recalculated in real time throughout the duration of the spike. While this is done to adjust for the increasing volume of incoming events, the limit grows at a steady rate such that quota is protected and not blown through. An example of how our heuristic works during a spike is shown below.
142
+
##### Usage-Based Calculation
141
143
142
-
#### Example Calculations
144
+
This calculation, which is the second step of our algorithm, uses hourly data from the past seven days to determine spike limit projections for the next seven days. This data is used to calculate weighted averages, which take into account weekly and hourly seasonality. For example, the weighted average calculated for Monday at 3 pm is more heavily influenced by data points on Monday or the hours around 3 pm. This weighted average is then multiplied by a multiplier that is `5` times the overall standard deviation of the past week; this multiplier is bounded between `3` and `6`.
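The multiplier step can be sketched as follows. This is only an illustration under stated assumptions: the function name `usage_based_limit` is hypothetical, the weighted average is taken as an already-computed input (the weighting scheme isn't specified here), and `std_dev` is assumed to be on a scale where `5 * std_dev` can meaningfully fall between `3` and `6`.

```python
def usage_based_limit(weighted_average: float, std_dev: float) -> float:
    """Return a usage-based spike limit for one hour (hypothetical helper).

    The multiplier is 5 times the past week's standard deviation,
    bounded between 3 and 6, and is applied to the weighted average.
    """
    multiplier = min(6.0, max(3.0, 5.0 * std_dev))
    return weighted_average * multiplier


# Multiplier inside the bounds: 5 * 1.0 = 5
typical = usage_based_limit(150, 1.0)

# Multiplier clamped up to the lower bound of 3
quiet = usage_based_limit(150, 0.1)

# Multiplier clamped down to the upper bound of 6
noisy = usage_based_limit(150, 2.0)
```

Because of the bounds, the usage-based limit always lands between three and six times the weighted average for that hour.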
143
145
146
+
#### Example
147
+
148
+
In this example, the project usually ingests 100-200 events per hour. There's been a spike that’s reached 50,000 events, as shown in the graph below:
In the following graph, we can see a zoomed-in view of the 12-hour period of the spike, along with a line showing the spike limit as it’s recalculated over the course of the spike:
146
152

147
153
148
-
**_During Spike_**
154
+
Throughout the spike, the recalculating limit has the following effect:
149
155
150
156
- 1st hour: 6k events ingested, limit is recalculated to 2083, 3917 events dropped
151
157
- 2nd hour: 34k events ingested, limit is recalculated to 2873, 31217 events dropped
152
158
- 3rd hour: 55k events ingested, limit is recalculated to 5452, ~49k events dropped
153
159
- 4th hour: 49k events ingested, limit is recalculated to 7628, ~41k events dropped
154
160
- 5th hour: 41k events ingested, limit is recalculated to 9371, ~31k events dropped
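The arithmetic behind each line above can be sketched in Python. The helper name `dropped_events` is hypothetical, and the sketch assumes that every event above the hour's recalculated limit is dropped. Because most of the ingested figures in the list are rounded to the nearest thousand, the results only approximate the listed drop counts.

```python
def dropped_events(ingested: int, limit: int) -> int:
    """Return how many events exceed the hour's spike limit (hypothetical helper)."""
    return max(0, ingested - limit)


# (ingested, recalculated limit) pairs taken from the list above;
# "k" figures are rounded, so the drops are approximate.
hours = [
    (6_000, 2_083),
    (34_000, 2_873),
    (55_000, 5_452),
    (49_000, 7_628),
    (41_000, 9_371),
]

for hour, (ingested, limit) in enumerate(hours, start=1):
    print(f"hour {hour}: limit {limit}, about {dropped_events(ingested, limit)} dropped")
```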
155
161
156
-
Limits are recalculated throughout the duration of the spike.
157
-
158
162
For this particular example:
159
163
160
-
- Org Quota: 500k
161
-
- Events Ingested: ~478k
162
-
- Events ~157k
164
+
- Org quota: 500k
165
+
- Events ingested during the spike: ~478k
166
+
- Events accepted overall: ~157k
163
167
164
168
Here's an example of spike limit projections for a week, taking into account seasonality:
165
169
166
170

167
171
172
+
These regular differences in event ingestion don't cause a spike to occur.
173
+
174
+
#### Bursty Projects
175
+
176
+
There may be instances where a project routinely accepts a high volume of events in a very short period of time by design, for example, projects that orchestrate cron/Airflow jobs or task runners. The screenshot below shows an example of this kind of behavior:
177
+
178
+

179
+
180
+
If this is expected behavior for a given project in your organization, you may want to consider turning off spike protection in the project settings to ensure necessary events aren't dropped.