[Bots] Web Bot Auth docs #23099
base: production
Howdy and thanks for contributing to our repo. The Cloudflare team reviews new, external PRs within two (2) weeks. If it's been two weeks or longer without any movement, please tag the PR Assignees in a comment. We review internal PRs within 1 week. If it's something urgent or has been sitting without a comment, start a thread in the Developer Docs space internally.
PR Change Summary
Introduced Web Bot Auth (WBA) documentation, enhancing bot verification methods.
src/content/docs/bots/concepts/bot/verified-bots/verification.mdx
src/content/docs/bots/concepts/bot/verified-bots/web-bot-auth.mdx
Co-authored-by: Patricia Santa Ana <[email protected]>
This PR changes current filenames or deletes current files. Make sure you have redirects set up to cover the following paths:
You need to host a key directory which creates a way for Cloudflare to authenticate your bot's requests.

<Steps>
1. Host a key directory at a well known message signatures directory. The key directory should serve a JSON Web Key Set (JWKS) including the public key derived from your signing key.
`well known`
This term here refers to being under `/.well-known`, and only certain paths are allowed, I believe.
The example below suggests you could put it under any other name, e.g. `/.well-known/http-message-signatures-directory/foo` or `/.well-known/foo`.
That's incorrect - users can only host their signature directory on `/.well-known/http-message-signatures-directory`. Our tooling will flag anything else as invalid.
I think we should simply say host it on `/.well-known/http-message-signatures-directory`.
Also note we require `Content-Type: application/http-message-signatures-directory+json` today in the response. We might be open to others in the future, but this is mandatory right now.
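A minimal sketch of the response shape the two comments above describe. The helper name `build_key_directory_response` and the example key value are assumptions for illustration; only the path and the `Content-Type` come from this thread.

```python
import json

# Path and media type quoted in the review comments above.
DIRECTORY_PATH = "/.well-known/http-message-signatures-directory"
DIRECTORY_CONTENT_TYPE = "application/http-message-signatures-directory+json"


def build_key_directory_response(public_jwks):
    """Return (path, headers, body) for a key directory response.

    `public_jwks` is a list of public JWK dicts. This helper is hypothetical
    and only illustrates the shape of the response.
    """
    for jwk in public_jwks:
        # A private Ed25519 JWK carries a "d" member; never publish it.
        if "d" in jwk:
            raise ValueError("refusing to publish a private key")
    # A JWKS is a JSON object with a "keys" array (RFC 7517).
    body = json.dumps({"keys": public_jwks}).encode("utf-8")
    headers = {
        "Content-Type": DIRECTORY_CONTENT_TYPE,
        "Content-Length": str(len(body)),
    }
    return DIRECTORY_PATH, headers, body


# Illustrative Ed25519 public key; the "x" value is only an example.
example_jwk = {
    "kty": "OKP",
    "crv": "Ed25519",
    "x": "11qYAYKxCrfVS_7TyWQHOg7hcvPapiMlrwIaaPcHURo",
}
path, headers, body = build_key_directory_response([example_jwk])
```

Any web server can return this body, as long as it is served at exactly that path with exactly that `Content-Type`.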
:::note[Use components with only ASCII values]
Cloudflare currently does not support `bs` or `sf` parameter designed to serialize non-ASCII values into ASCII equivalents.
:::
- Add a `Content-Digest` header if you wish to sign your [message content](https://www.rfc-editor.org/rfc/rfc9421#name-message-content), then specify `Content-Digest` as a component to sign.
We should note we don't actually validate their `Content-Digest` header 😢
What this means is: anyone can give us a random `Content-Digest` header and sign it. We don't actually check the `Content-Digest` represents the hash of the message body - we only check if the signature over that hash was valid. Anyone can slap a `Content-Digest` header on.
What this means is that there's no guarantee a `Content-Digest` came from the message it was signed on, and that makes it a security concern.
I think we should recommend people only use this option if there's no risk of a message being altered on the way to us - like if the message was proxied unencrypted to us.
Or we don't talk about `Content-Digest` at all. It's not something we have first class support for anyway.
I'll let you decide @Oxyjun, but I don't feel comfortable not calling out the caveats of our support if people want to do this.
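To make the caveat concrete, here is a minimal sketch of the check that the comment above says is not performed: recomputing the digest from the body and comparing it with the header. It assumes the SHA-256 form of `Content-Digest` from RFC 9530; the function names are hypothetical.

```python
import base64
import hashlib


def content_digest(body: bytes) -> str:
    """Return a Content-Digest header value (RFC 9530) using SHA-256."""
    digest = hashlib.sha256(body).digest()
    # Structured-field byte sequence: standard base64 wrapped in colons.
    return "sha-256=:" + base64.b64encode(digest).decode("ascii") + ":"


def digest_matches(body: bytes, header_value: str) -> bool:
    """Check that a received header value really is the hash of `body`."""
    return header_value.strip() == content_digest(body)
```

A signature over `Content-Digest` only binds the header value. Without a comparison like `digest_matches`, nothing ties that value to the body that was actually sent.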

Construct a [`Signature-Agent` header](https://www.ietf.org/archive/id/draft-meunier-http-message-signatures-directory-00.html#name-header-field-definition) that points to your key directory. Note that Cloudflare will fail to verify a message if:
- The message includes a `Signature-Agent` header that is not an `https://`.
- The message includes a valid URI but do not enclose it in double quotes.
Typo: does, not do

The following derived components are not supported, and we will fail to verify a message if they are included:

- `@query-params`: Cloudflare recommends signing the whole query instead of an individual parameter.
suggestion: signing the whole query using the `@query` component
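A small sketch of what signing the whole query looks like, assuming the `@query` definition in RFC 9421 (the query string with its leading `?`, or a lone `?` when there is no query). The function names and the example URL are hypothetical.

```python
from urllib.parse import urlsplit


def query_component(target_url: str) -> str:
    """Return the `@query` derived component value for a request URL.

    Per RFC 9421, the value is the query string with its leading "?",
    and a lone "?" when the URL has no query.
    """
    return "?" + urlsplit(target_url).query


def query_base_line(target_url: str) -> str:
    """Return the signature base line that covers the whole query."""
    return '"@query": ' + query_component(target_url)
```

For example, `query_base_line("https://example.com/path?param=value&foo=bar")` covers both parameters in one line, so no per-parameter `@query-params` entries are needed.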

### How do I know my JSON Web Key set directory will be accepted?

Cloudflare uses [`http-signature-directory` tool](https://crates.io/crates/http-signature-directory) to validate your directory. Please your this works before submitting a verification request.
typo + suggestion: Please ensure this works against your directory before registering with us.
(submitting a verification request is ambiguous - does it refer to registration or to sending us a signed request?)

### My message is failing validation. What could be the cause?

- Ensure you have a [`Signature-Agent` header](/bots/concepts/bot/verified-bots/web-bot-auth/#signature-agent-header), and that its value in double-quotes.
typo: value is in
Cloudflare accepts all valid Ed25519 keys found in your key directory. In the event a key already exists in Cloudflare's registered database, Cloudflare will work with you to supply a new key, or rotate your existing key.

:::note[Estimated review time]
The estimated review time is approximately one week.
We should let them know how to track their verification process. Unfortunately, the only way to do so is to ask support for the status and have them escalate to us. Should we mention this?

### What key algorithms does Cloudflare support?

Cloudflare does not support key algorithms other than Ed25519.
Suggested change:
- Cloudflare does not support key algorithms other than Ed25519.
+ Cloudflare supports Ed25519 key algorithm.
be in the affirmative, not negative

---

### What `web-bot-auth` features from the spec are not supported?
Suggested change:
- ### What `web-bot-auth` features from the spec are not supported?
+ ### What `web-bot-auth` features from the IETF draft are not supported?
## 1. Generate a valid signing key

You need to generate a signing key which will be used to authenticate your bot's requests.

{/* prettier-ignore */}
<Steps>
1. Generate a unique [Ed25519](https://ed25519.cr.yp.to/) private key to sign your requests. This example uses the [OpenSSL](https://openssl-library.org/) `genpkey` command:

```sh
openssl genpkey -algorithm ed25519 -out private-key.pem
```
2. Extract your public key.

```sh
openssl pkey -in private-key.pem -pubout -out public-key.pem
```
3. Convert the public key to JSON Web Key (JWK) using a tool of your choice. This example uses [`jwker`](https://github.com/jphastings/jwker) command line application.
```sh
go install github.com/jphastings/jwker/cmd/jwker@latest
jwker public-key.pem public-key.jwk
```
</Steps>

By following these steps, you have generated a private key and a public key, then converted the public key to a JWK.
We could also point to JavaScript key generation using the WebCrypto API: https://developer.mozilla.org/en-US/docs/Web/API/SubtleCrypto/generateKey
This would be directly in the right JWK format.
Most of the existing JWK libraries or providers should be able to do that as well: https://jwt.io/libraries
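As one more illustration of "a tool of your choice" for the conversion step, here is a minimal standard-library sketch that turns the `public-key.pem` produced by OpenSSL into a public JWK. It is not the tooling the PR documents; the function name is hypothetical, and it assumes the fixed 12-byte DER prefix that precedes the 32-byte key in an Ed25519 SubjectPublicKeyInfo.

```python
import base64

# DER prefix of an Ed25519 SubjectPublicKeyInfo; the 32-byte key follows it.
ED25519_SPKI_PREFIX = bytes.fromhex("302a300506032b6570032100")


def ed25519_pem_to_jwk(pem_text: str) -> dict:
    """Convert an Ed25519 public key in PEM (SPKI) form to a public JWK."""
    lines = [ln.strip() for ln in pem_text.strip().splitlines()]
    if lines[0] != "-----BEGIN PUBLIC KEY-----" or lines[-1] != "-----END PUBLIC KEY-----":
        raise ValueError("expected a PEM public key")
    der = base64.b64decode("".join(lines[1:-1]))
    if len(der) != 44 or not der.startswith(ED25519_SPKI_PREFIX):
        raise ValueError("not an Ed25519 public key")
    raw_key = der[len(ED25519_SPKI_PREFIX):]
    # JWK members use base64url without padding (RFC 7515, RFC 8037).
    x = base64.urlsafe_b64encode(raw_key).rstrip(b"=").decode("ascii")
    return {"kty": "OKP", "crv": "Ed25519", "x": x}
```

Usage would look like `ed25519_pem_to_jwk(open("public-key.pem").read())`. Only the public key is read, so the private key never leaves `private-key.pem`.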

import { GlossaryTooltip, Steps } from "~/components"

Web Bot Auth is an authentication method that leverages cryptographic signatures in HTTP messages to verify that a request comes from an automated bot.
Suggested change:
- Web Bot Auth is an authentication method that leverages cryptographic signatures in HTTP messages to verify that a request comes from an automated bot.
+ Web Bot Auth is an authentication method that leverages cryptographic signatures in HTTP messages to verify that a request comes from an automated bot.
+ It relies on two active IETF drafts: a [directory draft](https://datatracker.ietf.org/doc/html/draft-meunier-http-message-signatures-directory) allowing the crawler to share their public keys, and a [protocol draft](https://datatracker.ietf.org/doc/html/draft-meunier-web-bot-auth-architecture) defining how these keys should be used to attach crawler's identity to HTTP requests.
+ This documentation goes over specific integration within Cloudflare.

## 2. Host a key directory

You need to host a key directory which creates a way for Cloudflare to authenticate your bot's requests.
Changing the sentence to have bots as the actor rather than Cloudflare.
In addition, clearly reference the IETF draft, given this is where the format is defined.
Suggested change:
- You need to host a key directory which creates a way for Cloudflare to authenticate your bot's requests.
+ You need to host a key directory which creates a way for your bot to authenticate its requests to Cloudflare.
+ This directory should follow the definition from the active IETF draft [draft-meunier-http-message-signatures-directory-01](https://datatracker.ietf.org/doc/html/draft-meunier-http-message-signatures-directory-01).

## 4. (After verification) Sign your requests

After your bot has been successfully verified, you need to sign your bot's requests.
Slightly updating the wording.
Clearly reference the revision of the IETF draft we support. This is going to be valuable in the future as the draft evolves.
Suggested change:
- After your bot has been successfully verified, you need to sign your bot's requests.
+ After your bot has been successfully verified, your bot is ready to sign its requests. The signature protocol is defined in [draft-meunier-web-bot-auth-architecture-02](https://datatracker.ietf.org/doc/html/draft-meunier-web-bot-auth-architecture-02)
| Required component parameter | Requirement |
| ---------------------------- | ----------- |
| `tag` | This should be equal to `web-bot-auth`. |
| `alg` | This should be equal to `ed25519`. |
The draft suggests that using `alg` should be avoided. Not prohibited, but I don't think we should have it in our docs @AkshatM
| `alg` | This should be equal to `ed25519`. |
It will need to be changed after LDW. Today, we require this in the implementation, and things will not verify otherwise, so it needs to be in the docs right now. I'll raise a ticket.
This requires some pretty intense changes to the whole `keyring` concept in the `web-bot-auth` crate, hence the slowness on my end eradicating the need for `alg`.
I updated the `web-bot-auth` crate to not need `alg` anymore in v0.3.0.
However, this documentation should not be changed - we still need to update upstream dependencies to use `web-bot-auth` v0.3.0 and that's unlikely to happen until after LDW. Until then, customers will be forced to send `alg`.
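A rough sketch of how the two parameters from the table might appear in a serialized `Signature-Input` value, following the RFC 9421 parameter syntax. Everything except `tag` and `alg` is an assumption for illustration: the covered components, the `sig1` label, the `created`/`expires` timestamps, and the placeholder `keyid` do not come from this thread.

```python
def build_signature_params(components, created, expires, keyid):
    """Serialize signature parameters in the RFC 9421 structured-field style."""
    covered = " ".join('"' + name + '"' for name in components)
    return (
        "(" + covered + ")"
        + ";created=" + str(created)
        + ";expires=" + str(expires)
        + ';keyid="' + keyid + '"'
        + ';alg="ed25519"'       # required today, per the thread above
        + ';tag="web-bot-auth"'  # required, per the table above
    )


# Hypothetical values; a real keyid would be the key's JWK thumbprint.
params = build_signature_params(
    ["@authority", "signature-agent"],
    created=1735689600,
    expires=1735693200,
    keyid="hypothetical-thumbprint",
)
signature_input_header = "sig1=" + params
```

The same parameter string also has to appear in the `@signature-params` line of the signature base, so building it once and reusing it avoids mismatches.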

### 4.2. Calculate the JWK thumbprint

[Calculate the base64 URL-encoded JWK thumbprint](https://www.rfc-editor.org/rfc/rfc8037.html#appendix-A.3) associated with your Ed25519 public key registered with Cloudflare.
Suggested change:
- [Calculate the base64 URL-encoded JWK thumbprint](https://www.rfc-editor.org/rfc/rfc8037.html#appendix-A.3) associated with your Ed25519 public key registered with Cloudflare.
+ [Calculate the base64 URL-encoded JWK thumbprint](https://www.rfc-editor.org/rfc/rfc8037.html#appendix-A.3) from the public key you registered with Cloudflare.
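For reference, a minimal sketch of the thumbprint calculation linked above, assuming the RFC 7638 procedure as applied to OKP keys in RFC 8037: hash the canonical JSON of the required members (`crv`, `kty`, `x`) with SHA-256, then base64url-encode without padding. The function name is hypothetical.

```python
import base64
import hashlib
import json


def jwk_thumbprint_okp(jwk: dict) -> str:
    """Base64url SHA-256 JWK thumbprint (RFC 7638) for an OKP key such as Ed25519."""
    # Only the required members, in lexicographic order, with no whitespace.
    required = {"crv": jwk["crv"], "kty": jwk["kty"], "x": jwk["x"]}
    canonical = json.dumps(required, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

Extra members such as `kid` or `use` do not change the result, because only the three required members are hashed.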

#### `Signature-Agent` header

Construct a [`Signature-Agent` header](https://www.ietf.org/archive/id/draft-meunier-http-message-signatures-directory-00.html#name-header-field-definition) that points to your key directory. Note that Cloudflare will fail to verify a message if:
Suggested change:
- Construct a [`Signature-Agent` header](https://www.ietf.org/archive/id/draft-meunier-http-message-signatures-directory-00.html#name-header-field-definition) that points to your key directory. Note that Cloudflare will fail to verify a message if:
+ Construct a [`Signature-Agent` header](https://www.ietf.org/archive/id/draft-meunier-http-message-signatures-directory-01.html#name-header-field-definition) that points to your key directory. Note that Cloudflare will fail to verify a message if:

Construct a [`Signature-Agent` header](https://www.ietf.org/archive/id/draft-meunier-http-message-signatures-directory-00.html#name-header-field-definition) that points to your key directory. Note that Cloudflare will fail to verify a message if:
- The message includes a `Signature-Agent` header that is not an `https://`.
- The message includes a valid URI but do not enclose it in double quotes.
Suggested change:
- - The message includes a valid URI but do not enclose it in double quotes.
+ - The message includes a valid URI but does not enclose it in double quotes. This is due to Signature-Agent being a structured field.
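The two failure conditions above can be sketched as a client-side pre-flight check. This is only an illustration of the rules as stated in this thread, not Cloudflare's validation logic; the function name, the extra host check, and the example hostname are assumptions.

```python
from urllib.parse import urlsplit


def check_signature_agent(header_value: str) -> list:
    """Return a list of problems with a Signature-Agent header value."""
    value = header_value.strip()
    # Structured-field strings must be wrapped in double quotes.
    if len(value) < 2 or not (value.startswith('"') and value.endswith('"')):
        return ["value must be enclosed in double quotes (structured field string)"]
    parts = urlsplit(value[1:-1])
    problems = []
    if parts.scheme != "https":
        problems.append("URI must use the https:// scheme")
    if not parts.netloc:
        # Assumption: a directory URI without a host cannot be fetched.
        problems.append("URI must include a host")
    return problems
```

A value such as `"https://bot.example.com"`, including the quotes, passes both checks; the same URI without quotes, or with `http://`, does not.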
Summary
We're now introducing "Web Bot Auth" (WBA), which is a more secure authentication method for verifying bots. Previously, verifying bots was only possible through two flavours of IP validation, `public IP list` and `reverse DNS`.
Web Bot Auth may become the new IETF standard in the near future, and paves the way for better bot detection across the Internet.
This PR restructures and adds information for WBA. Specifically:
- [...] `Verified bots requirements`, which explains what's involved for a bot to be verified.
- [...] `Verified bots policy` chapter to only talk about the policy
- [...] `Verification methods`, which:
  - [...] `Policy` chapter)
  - [...] `Policy` chapter)
Things that we need to improve:
- Consider absorbing `Categories` chapter into `Verification methods` (or somewhere else). Decided against after discussion.
- `Reference` > `FAQs` > "WBA FAQs"?
Screenshots (optional)
Documentation checklist