Skip to content

xds: xdsClient caches transient error for new watchers #12262

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

shivaspeaks
Copy link
Member

@shivaspeaks shivaspeaks commented Aug 7, 2025

Fixes #11672

@shivaspeaks shivaspeaks marked this pull request as ready for review August 7, 2025 13:50
Copy link
Contributor

@kannanjgithub kannanjgithub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.

@@ -894,6 +903,9 @@ void onError(Status error, @Nullable ProcessingTracker tracker) {
Status errorAugmented = Status.fromCode(error.getCode())
.withDescription(description + "nodeID: " + bootstrapInfo.node().getId())
.withCause(error.getCause());
this.lastError = errorAugmented;
this.data = null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cached data should continue to get used when available, until we implement A88 onResourceChanged

@@ -894,6 +903,9 @@ void onError(Status error, @Nullable ProcessingTracker tracker) {
Status errorAugmented = Status.fromCode(error.getCode())
.withDescription(description + "nodeID: " + bootstrapInfo.node().getId())
.withCause(error.getCause());
this.lastError = errorAugmented;
this.data = null;
this.absent = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not change this, only a onData arrival should reset this to false as there is reading logic using this value on onAbsent.

Copy link
Member

@ejona86 ejona86 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this change, I really want to kill errorDescription, as it is just lastError = Status.INVALID_ARGUMENT. However, some of the fallback code is using errorDescription in really weird ways, so we'd have to clean that up first.

@@ -676,6 +676,8 @@ private final class ResourceSubscriber<T extends ResourceUpdate> {
private ResourceMetadata metadata;
@Nullable
private String errorDescription;
@Nullable
private Status lastError;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: Change data to StatusOr<T> data would have kept this a bit more clear. You still have data vs absent, but at least you don't have yet another thing to deal with.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concerned that it's a more invasive refactoring change since it would require touching every method that uses the data field. I can try to do it in a follow up PR and keep this PR's scope limited to bug fix?

I'll agree that we are moving to more usage of StatusOr<T> in our codebase everywhere. We should consider here as well for consistency.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's only a few other places that access data. It does look like it'd make all of them more complicated. Most are of the form data != null, and would likely need to become data != null && data.hasValue().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

XdsClient does not cache onError for new watchers
3 participants