-
-
Notifications
You must be signed in to change notification settings - Fork 4.5k
chore(hybrid-cloud): Break queries that depend on cross silo FKs #45095
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| ) | ||
|
|
||
| result[alert_rules[rule_activity.alert_rule.id]].update({"created_by": user}) | ||
| rule_activities_by_user_id = {r.user_id: r for r in rule_activities} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there only one activity per user? Or could a user have multiple activities?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, and I'll adjust
| serialized_users = { | ||
| u["id"]: u | ||
| for u in user_service.serialize_many( | ||
| filter=dict(user_ids=[item.user_id for item in item_list]) | ||
| ) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We seem to do this a bunch. Should the user_service expose a serialize_many_map method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In, general, I think the serialize_many should be returning a dictionary, so I agree. I had been thinking about making a larger refactor for this purpose.
I'd still prefer to keep the service interfaces thing (one way of doing things, not 3), but in this case returning by dictionary is actually preferable to the list, so I may replace it that way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll make some follow up work to consider refactoring all serialize_many call sites into this convention.
| all_team_ids[g.team_id] = g.group_id | ||
| if g.user_id: | ||
| all_user_ids[g.user_id] = g.group_id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do teams only have a single group? If item_list is a collection of groups, couldn't a team be assigned to multiple groups?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think you may be right. I'll take a closer look to see if this can be simplified. In general, I wish we had a much more robust hybrid cloud actor interface.
| except IntegrityError: | ||
| pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was the integrity error pre-existing and now you're just handling it?
| redis_rule_status.set_value("failed") | ||
| return | ||
|
|
||
| user = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@markstory
Previously, this check helped inform the RuleActivity check below. Using IntegrityError captures the same meaning but without the lookup. In the future, when this fk is broken for real, it will become a service call and no other enforcement. This was the intermediate until the FK is broken because technically an integrity error can be raised so long as it remains.
markstory
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks good to me. I found a few more incorrect passages and the CI failures seem relevant as well.
| use_by_user_id: MutableMapping[int, RpcUser] = { | ||
| user.id: user | ||
| for user in user_service.get_many( | ||
| filter=dict(user_ids=[r.user_id for r in rule_activities]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we get around to building RPC calls we should make sure we only send unique userids in the request body.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, ideally our service interfaces use "wide" types like Sequence so that I don't have to force a list. This is on my mind, too.
| all_team_ids: MutableMapping[int, Set[int]] = {} | ||
| all_user_ids: MutableMapping[int, Set[int]] = {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: You could make these defaultdict() to save the conditions on 193 and 199
| else: | ||
| all_team_ids[g.team_id].add(g.group_id) | ||
| if g.user_id: | ||
| if g.team_id not in all_team_ids: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| if g.team_id not in all_team_ids: | |
| if g.user_id not in all_user_ids: |
| if data_export.user.email: | ||
| user["email"] = data_export.user.email | ||
| if data_export.user_id: | ||
| user = dict(id=data_export.user_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trimming down to only user_id seems reasonable as getting the user will use another RPC call.
| assert match_link(url) == expected | ||
|
|
||
|
|
||
| # @region_silo_test(stable=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| # @region_silo_test(stable=True) |
|
PR reverted: da6cf99 |
…FKs (#45095)" This reverts commit 2d87621. Co-authored-by: corps <[email protected]>
I need to remerge #45095 after reverting prematurely.
Separating out the query changes from https://github.com/getsentry/sentry/pull/44595/files
Hybrid Cloud needs to break many foreign key relationships that depend on cross silo models. To support this, we have a new column which acts merely as a big int referencing identifiers, but uses a eventually consistent system to cascade or set null when deletions happen across silos.
This PR first changes all known api and test usages of several cross silo foreign keys in preparation for the migration that actually breaks those foreign keys. Several more to come.