-
Notifications
You must be signed in to change notification settings - Fork 332
Fix a race condition in sendNotification where concurrent parent-namespace creation causes failures #2693
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix a race condition in sendNotification where concurrent parent-namespace creation causes failures #2693
Conversation
…space creation causes failures The semantics of the createNonExistingNamespaces method used during sendNotification were supposed to be "create if needed". However, the behavior ended up surfacing an AlreadyExistsException if multiple concurrent sendNotification attempts were made for a brand-new namespace (where the notifications may be different tables). This would cause a table sync to fail if a sibling table was being synced at the same time, even though the new table should successfully get created under the shared namespace.
HonahX
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! I have one comment about an improvement we could make to prevent potential NPE in future development.
| createNamespaceInternal(nsLevel, Collections.emptyMap(), resolvedParent); | ||
| try { | ||
| createNamespaceInternal(nsLevel, Collections.emptyMap(), resolvedParent); | ||
| } catch (AlreadyExistsException aee) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently we throw AlreadyExistsException if the result is not success without checking whether the return status is ENTITY_ALREADY_EXIST. Despite ENTITY_ALREADY_EXIST is the only error status thrown in existing metastore manager impl, furture development and other implementation may include other error status like UNEXPECTED_ERROR, where we do not want to ignore and log here.
I think we could add an additional check
polaris/runtime/service/src/main/java/org/apache/polaris/service/catalog/iceberg/IcebergCatalog.java
Lines 518 to 521 in 19742cc
| if (returnedEntity == null) { | |
| throw new AlreadyExistsException( | |
| "Cannot create namespace %s. Namespace already exists", namespace); | |
| } |
to only throw AlreadyExistsException when return status is that, otherwise throw a RuntimeException/IllegalState/...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, updated to throw ServiceFailureException if returnStatus isn't ENTITY_ALREADY_EXISTS.
dimas-b
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with some non-blocking comments.
| LOGGER | ||
| .atInfo() | ||
| .setCause(aee) | ||
| .addKeyValue("namespace", namespace) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it supposed to be referenced in the message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I actually normally default to the traditional format-string with curly-braces, but I think others have been pushing for the more modern structured logging convention so I'm been forcing myself to use it :)
Structured logging is indeed more beneficial for cases where logs sinks are good at having structured-log elements pulled out into separate semi-structured fields for efficient aggregations, etc.
The convention as I understand it is to make the main log message avoid any direct mention of the structured params: https://www.slf4j.org/manual.html
// using classical API
logger.debug("oldT={} newT={} Temperature changed.", oldT, newT);
// using fluent API
logger.atDebug().setMessage("Temperature changed.").addKeyValue("oldT", oldT).addKeyValue("newT", newT).log();
As opposed to addArgument which is for the traditional inline string-replacement into the human-friendly string, the addKeyvalue is standalone, and usually log outputters still show the addKeyValue fields conveniently in the log message, but for machine readers they are independent from the natural-language message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also added a comment above the log statement better elucidating the considerations
| createNamespaceInternal(nsLevel, Collections.emptyMap(), resolvedParent); | ||
| } catch (AlreadyExistsException aee) { | ||
| LOGGER | ||
| .atInfo() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why INFO? If we expect it to be a normal and expected call path, "informing" log readers every time it happens seems to be an overkill from the operations perspective. I propose debug (non-blocking)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I was on the fence. It's one of those cases where it's "normal" as something that will happen time-to-time, but usually isn't deterministically triggered since here the attempt to create the namespace is only made if the immediate preceding call thinks the namespace is missing. So it would be fairly infrequent, only correlating with periods of high parent-creation contention (not just normal periods of high concurrency on an already-existing namespace).
I had even started atWarn but figured that's too alarming, so downgraded to info.
My thought is that if there are unforeseen bugs relating to the state of these "create if needed" namespaces they're more likely to be related to these collisions than to normal operation, so even if someone sets their log-viewer to exclude DEBUG level it'd be easy to eyeball a correlation between this log statement and whatever other buggy behavior comes up.
I also don't feel too strongly though so I'm happy to downgrade it if there's a preference for DEBUG despite this rationale .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough. No need to change :)
| .setCause(aee) | ||
| .addKeyValue("namespace", namespace) | ||
| .log( | ||
| "Namespace already exists in createNonExistingNamespace; possible race condition"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
possible race condition sounds like it is a problem to address, but the intention of this PR seems to be to treat this case as a "normal" situation 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah good point, removed the second clause to make it less scary.
checking for ENTITY_ALREADY_EXISTS, per review suggestion. Log a less scary message since it's not an error scenario type of race condition, per review suggestion
This PR contains no functional and no user-facing change. It is merely a refactor to better organize auth code. Summary of changes: - Moved all internal authentication components to the `org.apache.polaris.service.auth.internal` package and subpackages - Reduced visibility of utility classes - Renamed `TokenBroker` class hierarchy to stick to the naming standard: `<Algorithm>JWTBroker` - Introduced `@PolarisImmutable` whenever appropriate - Removed unused `NoneTokenBrokerFactory` (we already have `DisabledOAuth2ApiService`) - Removed unused `TokenBrokerFactoryConfig` Enhancement : adding support for Aurora postgres AWS IAM authentication (apache#2650) Add support for postgres AWS IAM authentication using the `apache-client` lib. Remove unused `name` arg from findCatalogByName in PolarisAdminService (apache#2691) * remove unused name param * Rename for better readability Fix a race condition in sendNotification where concurrent parent-namespace creation causes failures (apache#2693) * Fix a race condition in sendNotification where concurrent parent-namespace creation causes failures The semantics of the createNonExistingNamespaces method used during sendNotification were supposed to be "create if needed". However, the behavior ended up surfacing an AlreadyExistsException if multiple concurrent sendNotification attempts were made for a brand-new namespace (where the notifications may be different tables). This would cause a table sync to fail if a sibling table was being synced at the same time, even though the new table should successfully get created under the shared namespace. * Also better future-proof the createNamespaceInternal logic by explicitly checking for ENTITY_ALREADY_EXISTS, per review suggestion. Log a less scary message since it's not an error scenario type of race condition, per review suggestion Client: add credential reset option (apache#2698) * Client: add credential reset option * Client: add credential reset option * Client: add credential reset option * Add integration testing * Fix lint fix(deps): update dependency software.amazon.awssdk:bom to v2.34.5 (apache#2702) fix(deps): update dependency com.gradleup.shadow:shadow-gradle-plugin to v9.2.2 (apache#2661) added Aurora postgres to metastore documentation
The semantics of the createNonExistingNamespaces method used during sendNotification were supposed to be "create if needed". However, the behavior ended up surfacing an AlreadyExistsException if multiple concurrent sendNotification attempts were made for a brand-new namespace (where the notifications may be different tables). This would cause a table sync to fail if a sibling table was being synced at the same time, even though the new table should successfully get created under the shared namespace.