-
Notifications
You must be signed in to change notification settings - Fork 667
Use env var to support blocking IP ranges #4806
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use env var to support blocking IP ranges #4806
Conversation
524d71a
to
cc88169
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a technical reason why you only implemented it for IPv4?
This is to reject requests from IP ranges that overload the API endpoints by fetching information from multiple clients at the same time. A recent incident highlighted this issue where a number of `crates?page` requests hit the ">0.5 failed requests threshold" alert. These requests timed out due to a high number of expensive requests made. These calls came in from about 100 different IPs in parallel. Before this change individual IPs or specific User Agents could be blocked. These existing approaches have some drawbacks, blocking single IPs would not have done much here as this is quite ineffective for this behaviour, while blocking a specific (common) User Agent may prevent valid usage of the API. Therefore a list of CIDR blocks should be used to check if IPs belong to certain ranges that send too many requests to expensive API endpoints. **Please note** this has to be done very carefully, it's generally a tool that may block whole regions of the internet.
8bee7b9
to
dfe0c87
Compare
@justahero just to make sure we're not waiting on each other: did you see the question above about IPv4-only? :) |
Interesting, indeed I did not see your comment. The reason for IPv4 based CIDR blocks was to get the block mechanism started while having a safeguard check on the given host prefix value. We could also remove this safeguard, easily add support for both IPv4 & IPv6 CIDR, then we need to be certain that the env var contains reasonable entries. I think that is fair, as this affects only the pagination endpoints. |
yeah, I think it would probably even simplify the code if we used generic CIDRs. unless there are any major downsides that I'm missing I think it'd be best to slightly tweak it to allow both :) |
46d97ea
to
cbe1c4b
Compare
This refactors the `WEB_PAGE_OFFSET_CIDR_BLOCKLIST` block list to also support IPv6 based CIDR blocks.
cbe1c4b
to
e0703b0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you! :)
This PR adds support for the new
WEB_PAGE_OFFSET_CIDR_BLOCKLIST
environment variable that holds a list of IPv4 based CIDR blocks. The use case of this env var is to block IPs coming from the same network or IP range. It should help to mitigate scenarios where a high number of distributed requests from multiple clients put a high load onto the application (e.g. bot network) which happen to come from the same IP range. The new block list is meant to complement the existing User Agent & IP block lists.Note the functionality is similar to the
WEB_PAGE_OFFSET_UA_BLOCKLIST
env var which blocks requests by matching user agents. It also only applies to the same (expensive) API endpoints that fetch data using pagination. The common experience using the web site should not be affected. The list is specified as a comma separated list of IPv4 based CIDR blocks (e.g."192.16.0.0/16"
).Currently the CIDR block list only supports IPv4 based entries, each block requires at least a host prefix of 16 bits.