feat: Use Global Cross-region inference for default model id #804
I like this idea. We will have to check whether we consider this a breaking change. If so we should conditionally check this based on what region you're in (in the US, keep the current behavior; elsewhere, use global).
@zastrowm, I believe this would be breaking, and likely more dangerous because it may still "succeed" for some customers if we switched from US to Global. But I like the idea of "If so we should conditionally check this based on what region you're in" as a good solution to the default issue.
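The region-conditional default discussed above could be sketched roughly as follows. This is a minimal illustration, not code from the PR: `BASE_MODEL_ID` and `default_model_id` are hypothetical names, and the model id string is only an assumed example of a Bedrock model id.

```python
# Hypothetical sketch: keep the US inference profile for US regions and
# fall back to the global profile everywhere else.

# Assumed example model id; not necessarily the SDK's actual default.
BASE_MODEL_ID = "anthropic.claude-sonnet-4-20250514-v1:0"


def default_model_id(region: str) -> str:
    """Return an inference-profile id for the given AWS region name."""
    # US regions keep the existing behavior, so traffic stays in the US geography.
    prefix = "us" if region.startswith("us-") else "global"
    return f"{prefix}.{BASE_MODEL_ID}"


print(default_model_id("us-west-2"))
print(default_model_id("eu-central-1"))
```

With this shape, customers in US regions see no change, which is the property the comments above are trying to preserve.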
@zastrowm @dbschmigelski Thank you for the feedback! For example, when I set the profile region to eu-central-1 as shown below, the following error occurred. (I don't have access permission for Claude 4 in eu-central-1.)
The change will be in the 3rd row of the table. This appears to be natural behavior where the Agent can operate with the default model when model access is properly configured in your own AWS region settings.
Hey, will talk this over with @zastrowm, but I'm still leaning towards this being breaking. In your first row you stated
But this is my concern. Even if a customer granted global cross-region inference, the SDK previously was not exercising that feature. So a customer who cannot allow traffic to exit the US may have enabled global inference in an AWS account, and if they audited that usage they would see "oh, Strands only ever uses the US geography." If we were to change the default, their request could succeed, as you call out, but in my opinion this would violate expected behavior: a customer who previously could be confident that data stayed in the US no longer could be. Another reason to be cautious is that we can always add this US edge case now and remove it later if customers feel it is unnecessary. The same cannot be said if we begin with the global approach, where some customers, although likely very few if any, could be impacted.
@dbschmigelski Thank you for the review! One question I have is: should customers who want to keep all communication within the US continue using the default model? Also, I noticed that changes to this documentation will be necessary.
Hi, we recently discovered something that I think warrants taking a step back. If we look at https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-support.html we can see that the global inference profile only covers the regions
This means that the global inference profile is essentially the US inference profile + EU + APAC. We had previously thought that using global would basically "unlock" access to regions NOT already in an inference profile. This is not the case. So if we were to use the global inference profile we would not be providing practical benefits over just doing the following. In fact, I'd wager most customers would prefer that by default, EU data stays in EU and APAC data stays in APAC.
The one benefit of using global is that as new inference profiles are added if we did
it would be future-proofed. But for now we would rather take on the future burden of updating this list than commit ourselves to potentially requiring someone in
Disclaimer: I have no knowledge of future inference profiles; I am simply using sa-east-1 as an example.
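For illustration only, the geography-scoped alternative described in this comment (US data stays in US, EU in EU, APAC in APAC, with a list maintained by hand) might look like the sketch below. It is not the code that was posted in the thread: `GEO_PREFIXES`, `geo_model_id`, and the model id are assumptions, and the mapping covers only the three geographies named in the discussion.

```python
# Hypothetical sketch: map the AWS region's leading segment to a
# geography-scoped inference profile prefix.

# Assumed example model id; not necessarily the SDK's actual default.
BASE_MODEL_ID = "anthropic.claude-sonnet-4-20250514-v1:0"

# Hand-maintained list; it would need updating if new geographies appear.
GEO_PREFIXES = {"us": "us", "eu": "eu", "ap": "apac"}


def geo_model_id(region: str) -> str:
    """Return a geography-scoped profile id, or raise for unmapped regions."""
    geo = region.split("-", 1)[0]  # "eu-central-1" -> "eu"
    prefix = GEO_PREFIXES.get(geo)
    if prefix is None:
        # For example sa-east-1, which none of the three geographies covers.
        raise ValueError(f"no default inference profile for region {region!r}")
    return f"{prefix}.{BASE_MODEL_ID}"


print(geo_model_id("eu-central-1"))
print(geo_model_id("ap-northeast-1"))
```

The trade-off matches the one stated above: the mapping must be updated by hand, but data stays within the caller's own geography by default.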
Thank you! Yes, my understanding of the list of supported regions and the Global CRI specifications is the same as yours. I agree that it is a real problem that global.anthropic doesn't work for users whose AWS_REGION is outside of US, APAC, and EU. Also, as you mentioned, the fact that inference is not confined to your configured region is a disadvantage, though it is the flip side of the Global CRI benefits. Having discussed this far, I'd also like to step back and think about it. Thank you for always commenting so thoughtfully!
Description
By using Global Cross-region inference, users whose AWS_REGION is not set to a US region can now use the default Agent.
Error that occurred before the fix
Code that causes the error
An alternative solution is proposed in #770. The pros and cons are as follows:
bedrock:ListInferenceProfiles permission would also be required.
Related Issues
Documentation PR
N/A
Type of Change
Other (please describe): Change the default model id
Testing
How have you tested the change? Verify that the changes do not break functionality or introduce warnings in consuming repositories: agents-docs, agents-tools, agents-cli
hatch run prepare
Checklist
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.