Conversation

@tbrand tbrand commented Sep 5, 2025

Description

By using Global Cross-region inference, users whose AWS_REGION is not set to a US region can now use the default Agent.

Error that occurred before the fix

botocore.errorfactory.ValidationException: An error occurred (ValidationException) when calling the ConverseStream operation: The provided model identifier is invalid.
└ Bedrock region: ap-northeast-1
└ Model id: us.anthropic.claude-sonnet-4-20250514-v1:0

Code that causes the error

# AWS_REGION environment variable or default profile region is not set to a US region
from strands import Agent

agent = Agent()
agent('Hello')

An alternative solution is proposed in #770. The pros and cons are as follows:

  • Pros
  • Cons
    • Only models that support Global Cross-region inference can be selected as the default model
    • May unexpectedly call models outside of AWS_REGION (however, it's unlikely that users using the default model are aware of this)

Related Issues

Documentation PR

N/A

Type of Change

Other (please describe): Change the default model id

Testing

How have you tested the change? Verify that the changes do not break functionality or introduce warnings in consuming repositories: agents-docs, agents-tools, agents-cli

  • I ran hatch run prepare

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@zastrowm
Member

zastrowm commented Sep 5, 2025

I like this idea. Will have to check to see if we consider this a breaking change. If so we should conditionally check this based on what region you're in (in the US keep current behavior, elsewhere use global)

@dbschmigelski
Member

> I like this idea. Will have to check to see if we consider this a breaking change. If so we should conditionally check this based on what region you're in (in the US keep current behavior, elsewhere use global)

@zastrowm, I believe this would be breaking, and likely more dangerous, because requests may still "succeed" for some customers if we switched from US to Global. But I like the idea of "If so we should conditionally check this based on what region you're in" as a good solution to the default issue.

@tbrand
Author

tbrand commented Sep 8, 2025

@zastrowm @dbschmigelski Thank you for the feedback!
If you're referring to "the model being able to execute without being granted model permission" as a breaking change, that won't happen. According to the Global cross-region inference specification, if there's no model access permission in the configured region that serves as the entry point, an error will occur.

For example, when I set the profile region to eu-central-1 as shown below, the following error occurred. (I don't have access permission for Claude 4 in eu-central-1.)

botocore.errorfactory.ValidationException: An error occurred (ValidationException) when calling the ConverseStream operation: The provided model identifier is invalid.
└ Bedrock region: eu-central-1
└ Model id: global.anthropic.claude-sonnet-4-20250514-v1:0

| AWS Profile/Environment Variable | Model Access Permission | Default Model (us.anthropic) | Default Model (global.anthropic) |
| --- | --- | --- | --- |
| AWS_REGION set to "us" | Access granted in configured region | ✅ Success | ✅ Success |
| AWS_REGION set to "us" | No access in configured region | ❌ Failure | ❌ Failure |
| AWS_REGION not set to "us" | Access granted in configured region | ❌ Failure | ✅ Success |
| AWS_REGION not set to "us" | No access in configured region | ❌ Failure | ❌ Failure |

The only change is in the third row of the table. This seems like natural behavior: the Agent can operate with the default model whenever model access is properly configured in your own AWS region.
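As a rough illustration, the outcome table can be written as a small predicate. This is only a model of the rows as stated in this comment, not a call to Bedrock; the function name `default_agent_succeeds` and its parameters are hypothetical, and "set to us" is assumed to mean a region name starting with `us-`.

```python
def default_agent_succeeds(region: str, has_model_access: bool, profile: str) -> bool:
    """Model the outcome table: does the default Agent call succeed?

    `profile` is the inference-profile prefix of the default model id,
    either "us" (current default) or "global" (proposed default).
    """
    if profile not in ("us", "global"):
        raise ValueError(f"unknown profile prefix: {profile!r}")
    # Without model access in the configured (entry-point) region,
    # both defaults fail.
    if not has_model_access:
        return False
    # The "us." profile only resolves when the configured region is a US region.
    if profile == "us":
        return region.startswith("us-")
    # Per the table, the "global." profile succeeds whenever access is granted.
    return True


# The only row that differs between the two defaults is the third one.
for region, access in [("us-west-2", True), ("us-west-2", False),
                       ("ap-northeast-1", True), ("ap-northeast-1", False)]:
    print(region, access,
          default_agent_succeeds(region, access, "us"),
          default_agent_succeeds(region, access, "global"))
```

The sketch deliberately ignores which regions the global profile actually covers; it reflects only the four rows above.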

@dbschmigelski
Member

Hey, will talk this over with @zastrowm, but I'm still leaning towards this being breaking.

In your first row you stated

| AWS Profile/Environment Variable | Model Access Permission | Default Model (us.anthropic) | Default Model (global.anthropic) |
| --- | --- | --- | --- |
| AWS_REGION set to "us" | Access granted in configured region | ✅ Success | ✅ Success |

But this is my concern. Even if a customer had granted access to global cross-region inference, the SDK previously was not exercising that feature. So a customer who cannot allow traffic to exit the US may have enabled global inference in an AWS account, but if they audited that usage they would see "oh, Strands only ever uses the US geography." If we were to change the default, their request could succeed, as you call out, but in my opinion this would violate expectations: a customer who previously could be confident that data stayed in the US no longer could be.

Another reason to be cautious is that we can always add this US edge case now and remove it later if customers feel it is unnecessary. The same cannot be said if we begin with the global approach, where some customers, although likely very few if any, could be impacted.

@tbrand
Author

tbrand commented Sep 9, 2025

@dbschmigelski Thank you for the review!
Yes, I think your point is correct! For users who are using the default model and want to keep inference traffic within the US, unintended global traffic could occur.

One question I have is: should customers who want to keep all traffic within the US continue using the default model?
My personal opinion is that the default model should ideally be configured so that a wide range of users can use it without friction, but I'll leave that decision to the Strands Agents team.
Currently, for users who primarily use regions outside the US, the us.anthropic default is quite frustrating.
The first execution fails, and the user then has to investigate the cause and manually reconfigure the BedrockModel.

Also, I noticed that changes to this documentation will be necessary.
If you can proceed with reviewing this pull request, I will also create a pull request for the documentation changes.
https://github.com/strands-agents/docs/blob/main/docs/README.md

@dbschmigelski
Member

dbschmigelski commented Sep 9, 2025

Hi,

We recently discovered something which I think warrants taking a step back.

If we look at https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-support.html we can see that the global inference profile only covers the regions shown below.
[image: regions covered by the global inference profile]

This means that the global inference profile is essentially the US inference profile + EU + APAC. What we had previously thought is that by using global you basically "unlock" access to regions NOT already in an inference profile. This is not the case.

So if we were to use the global inference profile we would not be providing practical benefits over just doing the following. In fact, I'd wager most customers would prefer that by default, EU data stays in EU and APAC data stays in APAC.

if in_us:
    use us inference profile
elif in_eu:
    use eu inference profile
elif in_apac:
    use apac inference profile
else:
    raise exception early

The one benefit of using global is that, as new inference profiles are added, if we did

else:
    use global inference profile

it would be future-proofed. But for now we would rather take on the future burden of updating this list than commit ourselves to potentially requiring someone in sa-east-1 to use global when a few-line change could scope them to sa.

Disclaimer: I have no knowledge of future inference profiles, I am simply using sa-east-1 as an example
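For illustration only, the pseudocode above could be sketched in Python roughly as follows. The helper name `default_model_id`, the prefix mapping, and the `fallback_to_global` flag are assumptions made for this sketch, not the SDK's implementation; the `eu.` and `apac.` ids are assumed to follow the same naming pattern as the `us.` and `global.` ids quoted earlier in the thread.

```python
# Base model id taken from the error output earlier in this thread.
BASE_MODEL_ID = "anthropic.claude-sonnet-4-20250514-v1:0"

# Assumption: map the leading part of an AWS region name to the
# geography prefix of an inference profile.
GEO_BY_REGION_PREFIX = {"us": "us", "eu": "eu", "ap": "apac"}


def default_model_id(region: str, fallback_to_global: bool = False) -> str:
    """Pick a geography-scoped inference profile id for `region` (hypothetical helper)."""
    # GovCloud regions also start with "us" but are out of scope for this sketch.
    if region.startswith("us-gov-"):
        raise ValueError("GovCloud regions are not handled by this sketch")
    geo = GEO_BY_REGION_PREFIX.get(region.split("-", 1)[0])
    if geo is not None:
        return f"{geo}.{BASE_MODEL_ID}"
    # The future-proof variant: fall back to the global profile.
    if fallback_to_global:
        return f"global.{BASE_MODEL_ID}"
    # The stricter variant: fail early with a clear message.
    raise ValueError(f"No default inference profile for region {region!r}")
```

With this sketch, `default_model_id("ap-northeast-1")` returns `apac.anthropic.claude-sonnet-4-20250514-v1:0`, while `default_model_id("sa-east-1")` raises unless `fallback_to_global=True` is passed.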

@tbrand
Author

tbrand commented Sep 10, 2025

Thank you! Yes, my understanding of the list of supported regions and Global CRI specifications is the same as yours!

I agree that it's indeed a challenge that global.anthropic doesn't work for users whose AWS_REGION is outside of US, APAC, and EU.

Also, as you mentioned, the fact that inference doesn't stay solely within your configured region is indeed a disadvantage, but it is the other side of the coin from the benefits of Global CRI.
Global CRI was developed to route requests to appropriate regions so that they keep working normally even when service quotas are hit or traffic is high during peak hours.
In that regard, the hardcoding approach you demonstrated with pseudocode doesn't solve the issues that Global CRI addresses.
To me, it appears to be a tradeoff between security/governance and UX.
Since users who rely on a default model, which may change, are unlikely to be sensitive about where their data goes, I personally think prioritizing UX would be better, but that's just a personal opinion from an external contributor.

Having discussed it this far, I'd also like to step back and think about it.
As things stand, will the default model remain as is? Or are you planning to change to the hardcoding approach demonstrated in your pseudocode?

Thank you for always commenting so thoughtfully!

@tbrand tbrand closed this Sep 11, 2025
@tbrand tbrand deleted the use-gcri-for-default-model branch September 11, 2025 06:24