Use env var to support blocking IP ranges #4806

justahero · 2022-05-11T13:11:27Z

This PR adds support for the new WEB_PAGE_OFFSET_CIDR_BLOCKLIST environment variable that holds a list of IPv4 based CIDR blocks. The use case of this env var is to block IPs coming from the same network or IP range. It should help to mitigate scenarios where a high number of distributed requests from multiple clients put a high load onto the application (e.g. bot network) which happen to come from the same IP range. The new block list is meant to complement the existing User Agent & IP block lists.

Note the functionality is similar to the WEB_PAGE_OFFSET_UA_BLOCKLIST env var which blocks requests by matching user agents. It also only applies to the same (expensive) API endpoints that fetch data using pagination. The common experience using the web site should not be affected. The list is specified as a comma separated list of IPv4 based CIDR blocks (e.g. "192.16.0.0/16").

Currently the CIDR block list only supports IPv4 based entries, each block requires at least a host prefix of 16 bits.

Turbo87

is there a technical reason why you only implemented it for IPv4?

Cargo.toml

This is to reject requests from IP ranges that overload the API endpoints by fetching information from multiple clients at the same time. A recent incident highlighted this issue where a number of `crates?page` requests hit the ">0.5 failed requests threshold" alert. These requests timed out due to a high number of expensive requests made. These calls came in from about 100 different IPs in parallel. Before this change individual IPs or specific User Agents could be blocked. These existing approaches have some drawbacks, blocking single IPs would not have done much here as this is quite ineffective for this behaviour, while blocking a specific (common) User Agent may prevent valid usage of the API. Therefore a list of CIDR blocks should be used to check if IPs belong to certain ranges that send too many requests to expensive API endpoints. **Please note** this has to be done very carefully, it's generally a tool that may block whole regions of the internet.

Turbo87 · 2022-05-17T13:35:38Z

@justahero just to make sure we're not waiting on each other: did you see the question above about IPv4-only? :)

justahero · 2022-05-17T14:12:51Z

Interesting, indeed I did not see your comment.

The reason for IPv4 based CIDR blocks was to get the block mechanism started while having a safeguard check on the given host prefix value. We could also remove this safeguard, easily add support for both IPv4 & IPv6 CIDR, then we need to be certain that the env var contains reasonable entries. I think that is fair, as this affects only the pagination endpoints.

Turbo87 · 2022-05-17T14:21:20Z

yeah, I think it would probably even simplify the code if we used generic CIDRs. unless there are any major downsides that I'm missing I think it'd be best to slightly tweak it to allow both :)

This refactors the `WEB_PAGE_OFFSET_CIDR_BLOCKLIST` block list to also support IPv6 based CIDR blocks.

Turbo87

LGTM, thank you! :)

justahero force-pushed the feature/use-cidr-block-list branch 4 times, most recently from 524d71a to cc88169 Compare May 11, 2022 13:37

Turbo87 reviewed May 11, 2022

View reviewed changes

Cargo.toml Outdated Show resolved Hide resolved

Turbo87 added C-internal 🔧 Category: Nonessential work that would make the codebase more consistent or clear A-backend ⚙️ labels May 11, 2022

justahero added 2 commits May 12, 2022 11:18

Replace cidr-utils crate with ipnetwork

dfe0c87

justahero force-pushed the feature/use-cidr-block-list branch from 8bee7b9 to dfe0c87 Compare May 12, 2022 09:19

justahero force-pushed the feature/use-cidr-block-list branch 2 times, most recently from 46d97ea to cbe1c4b Compare May 18, 2022 10:06

Expand CIDR block list to support IPv6 ranges

e0703b0

This refactors the `WEB_PAGE_OFFSET_CIDR_BLOCKLIST` block list to also support IPv6 based CIDR blocks.

justahero force-pushed the feature/use-cidr-block-list branch from cbe1c4b to e0703b0 Compare May 18, 2022 10:20

Turbo87 approved these changes May 19, 2022

View reviewed changes

Turbo87 merged commit 3e32561 into rust-lang:master May 19, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Use env var to support blocking IP ranges #4806

Use env var to support blocking IP ranges #4806

Uh oh!

justahero commented May 11, 2022

Uh oh!

Turbo87 left a comment

Uh oh!

Uh oh!

Turbo87 commented May 17, 2022

Uh oh!

justahero commented May 17, 2022

Uh oh!

Turbo87 commented May 17, 2022 •

edited

Loading

Uh oh!

Turbo87 left a comment

Uh oh!

Uh oh!

Use env var to support blocking IP ranges #4806

Use env var to support blocking IP ranges #4806

Uh oh!

Conversation

justahero commented May 11, 2022

Uh oh!

Turbo87 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Turbo87 commented May 17, 2022

Uh oh!

justahero commented May 17, 2022

Uh oh!

Turbo87 commented May 17, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Turbo87 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Turbo87 commented May 17, 2022 •

edited

Loading