Conversation

Contributor

@corps corps commented Feb 24, 2023

Separating out the query changes from https://github.com/getsentry/sentry/pull/44595/files

Hybrid Cloud needs to break many foreign key relationships that depend on cross-silo models. To support this, we have a new column type that acts merely as a big int holding the referenced identifier, but uses an eventually consistent system to cascade or set null when deletions happen across silos.

This PR first changes all known API and test usages of several cross-silo foreign keys in preparation for the migration that actually breaks those foreign keys. Several more to come.
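
To make the description above concrete, here is a toy sketch of a plain integer reference whose deletions are applied later rather than enforced by the database. It is not the implementation in this PR: `Row`, `RegionStore`, `apply_user_deletion`, and `process_pending_deletions` are all hypothetical names, and the "pending deletions" list only stands in for whatever mechanism carries deletions between silos.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Row:
    id: int
    user_id: Optional[int]  # plain integer reference, no database-level FK


@dataclass
class RegionStore:
    """Hypothetical stand-in for a table that lives in a different silo than users."""

    rows: Dict[int, Row] = field(default_factory=dict)

    def apply_user_deletion(self, deleted_user_id: int, on_delete: str) -> None:
        # Emulate ON DELETE CASCADE / SET NULL by hand for one deleted user.
        for row_id, row in list(self.rows.items()):
            if row.user_id != deleted_user_id:
                continue
            if on_delete == "cascade":
                del self.rows[row_id]
            elif on_delete == "set_null":
                row.user_id = None
            else:
                raise ValueError(f"unknown on_delete mode: {on_delete}")


def process_pending_deletions(
    store: RegionStore, pending: List[int], on_delete: str
) -> None:
    # Drain the queue of user ids deleted in the other silo (assumed mechanism).
    while pending:
        store.apply_user_deletion(pending.pop(0), on_delete)


store = RegionStore({1: Row(1, user_id=10), 2: Row(2, user_id=11)})
pending_deletions = [10]  # user 10 was deleted in the other silo

# Until the queue is drained, row 1 still points at a user that no longer exists.
process_pending_deletions(store, pending_deletions, on_delete="set_null")
```

The window between the deletion and the drain is what "eventually consistent" means here: readers have to tolerate an id that no longer resolves to a user.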

@corps corps requested a review from a team February 24, 2023 20:39
@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Feb 24, 2023
@corps corps marked this pull request as ready for review February 24, 2023 22:30
@corps corps requested review from a team February 24, 2023 22:30
@corps corps requested a review from a team as a code owner February 24, 2023 22:30
)

result[alert_rules[rule_activity.alert_rule.id]].update({"created_by": user})
rule_activities_by_user_id = {r.user_id: r for r in rule_activities}
Member

Is there only one activity per user? Or could a user have multiple activities?

Contributor Author

Good point, I'll adjust.
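
For reference, a minimal sketch of the concern raised in this thread, assuming a user can have more than one activity. `RuleActivity` here is a hypothetical stand-in with only the fields needed for the illustration, not the real model.

```python
from collections import defaultdict
from typing import DefaultDict, List, NamedTuple


class RuleActivity(NamedTuple):
    # Hypothetical stand-in for the real model; only the fields used here.
    id: int
    user_id: int


rule_activities = [RuleActivity(1, 7), RuleActivity(2, 7), RuleActivity(3, 9)]

# A keyed comprehension keeps one entry per user: later activities
# silently overwrite earlier ones with the same user_id.
last_by_user = {r.user_id: r for r in rule_activities}

# Grouping into lists keeps every activity for each user.
activities_by_user_id: DefaultDict[int, List[RuleActivity]] = defaultdict(list)
for r in rule_activities:
    activities_by_user_id[r.user_id].append(r)
```

With the grouped form, the serializer can loop over all activities for a user instead of only the last one seen.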

Comment on lines +11 to +16
serialized_users = {
    u["id"]: u
    for u in user_service.serialize_many(
        filter=dict(user_ids=[item.user_id for item in item_list])
    )
}
Member

We seem to do this a bunch. Should the user_service expose a serialize_many_map method?

Contributor Author

In general, I think serialize_many should be returning a dictionary, so I agree. I had been thinking about making a larger refactor for this purpose.

I'd still prefer to keep the service interfaces thin (one way of doing things, not three), but in this case returning a dictionary is actually preferable to the list, so I may replace it that way.

Contributor Author

I'll create some follow-up work to consider refactoring all serialize_many call sites to this convention.
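
A rough sketch of what the helper suggested in this thread could look like. Both functions below are hypothetical: `serialize_many` is a local fake that only mimics the shape used at the call site (a list of dicts with an `"id"` key), and `serialize_many_map` does not exist on the real `user_service` as far as this conversation shows.

```python
from typing import Any, Dict, Iterable, List, Mapping

SerializedUser = Mapping[str, Any]


def serialize_many(user_ids: Iterable[int]) -> List[SerializedUser]:
    """Hypothetical fake of the list-returning call; builds placeholder payloads."""
    return [{"id": uid, "name": f"user-{uid}"} for uid in user_ids]


def serialize_many_map(user_ids: Iterable[int]) -> Dict[Any, SerializedUser]:
    """Hypothetical wrapper: same payloads, keyed by their "id" field."""
    return {u["id"]: u for u in serialize_many(user_ids)}


serialized_users = serialize_many_map([3, 5])
```

Call sites would then look up `serialized_users.get(item.user_id)` directly instead of rebuilding the same comprehension each time.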

Comment on lines 193 to 195
all_team_ids[g.team_id] = g.group_id
if g.user_id:
    all_user_ids[g.user_id] = g.group_id
Member

Do teams only have a single group? If item_list is a collection of groups, couldn't a team be assigned to multiple groups?

Contributor Author

Yes, I think you may be right. I'll take a closer look to see if this can be simplified. In general, I wish we had a much more robust hybrid cloud actor interface.

Comment on lines +101 to +102
except IntegrityError:
    pass
Member

Was the integrity error pre-existing and now you're just handling it?

redis_rule_status.set_value("failed")
return

user = None
Contributor Author

@markstory
Previously, this check helped inform the RuleActivity check below. Catching IntegrityError captures the same meaning without the extra lookup. Once this FK is actually broken, this will become a service call with no other enforcement. This is the intermediate step until then, because an integrity error can technically still be raised as long as the FK remains.
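
The pattern described in this reply (attempt the write and catch the integrity error, rather than looking the user up first) can be illustrated in isolation. The sketch below swaps in the standard library's `sqlite3` and its `sqlite3.IntegrityError` instead of the ORM used in the PR, and the table names and `record_activity` helper are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when enabled
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE rule_activity ("
    " id INTEGER PRIMARY KEY,"
    " user_id INTEGER REFERENCES users(id))"
)
conn.execute("INSERT INTO users (id) VALUES (1)")


def record_activity(user_id: int) -> bool:
    """Try the insert directly; return False if the foreign key rejects it."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "INSERT INTO rule_activity (user_id) VALUES (?)", (user_id,)
            )
        return True
    except sqlite3.IntegrityError:
        # The referenced user does not exist; no separate lookup was needed.
        return False


created = record_activity(1)
rejected = record_activity(999)
```

Once the foreign key is removed, nothing raises here any more, which is why the reply describes this as an intermediate step.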

Member

@markstory markstory left a comment

Generally looks good to me. I found a few more incorrect passages and the CI failures seem relevant as well.

use_by_user_id: MutableMapping[int, RpcUser] = {
    user.id: user
    for user in user_service.get_many(
        filter=dict(user_ids=[r.user_id for r in rule_activities])
Member

When we get around to building RPC calls, we should make sure we only send unique user ids in the request body.

Contributor Author

Yeah, ideally our service interfaces use "wide" types like Sequence so that I don't have to force a list. This is on my mind, too.
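
A small sketch of both points in this thread: accepting a "wide" input type and sending each id only once. `unique_user_ids` is a hypothetical helper, not something from the codebase, and dropping `None` is an added assumption for rows that have no user.

```python
from typing import Iterable, List, Optional


def unique_user_ids(user_ids: Iterable[Optional[int]]) -> List[int]:
    """Drop None and repeated ids while keeping first-seen order.

    Accepting Iterable lets callers pass a list, tuple, set, or generator
    without converting first.
    """
    # dict preserves insertion order, so fromkeys removes repeats stably.
    return [uid for uid in dict.fromkeys(user_ids) if uid is not None]


ids_from_list = unique_user_ids([3, 1, 3, None, 1])
ids_from_generator = unique_user_ids(n % 2 for n in range(5))
```

The result can be passed as `user_ids=` in the filter so the request body carries each id once.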

Comment on lines +189 to +190
all_team_ids: MutableMapping[int, Set[int]] = {}
all_user_ids: MutableMapping[int, Set[int]] = {}
Member

nit: You could make these defaultdict() to save the conditions on 193 and 199
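
A minimal sketch of the `defaultdict` variant suggested in this nit, assuming each assignee row carries a `group_id` plus an optional `team_id` or `user_id`. `Assignee` is a hypothetical stand-in for the real rows.

```python
from collections import defaultdict
from typing import DefaultDict, NamedTuple, Optional, Set


class Assignee(NamedTuple):
    # Hypothetical stand-in; only the fields the loop touches.
    group_id: int
    team_id: Optional[int]
    user_id: Optional[int]


item_list = [
    Assignee(group_id=100, team_id=5, user_id=None),
    Assignee(group_id=101, team_id=5, user_id=None),
    Assignee(group_id=102, team_id=None, user_id=8),
]

# defaultdict(set) creates the empty set on first access, so the
# "if key not in mapping" branches are no longer needed.
all_team_ids: DefaultDict[int, Set[int]] = defaultdict(set)
all_user_ids: DefaultDict[int, Set[int]] = defaultdict(set)

for g in item_list:
    if g.team_id:
        all_team_ids[g.team_id].add(g.group_id)
    if g.user_id:
        all_user_ids[g.user_id].add(g.group_id)
```

This also avoids the kind of copy-paste slip where the membership test checks one mapping and the insert goes to the other.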

else:
all_team_ids[g.team_id].add(g.group_id)
if g.user_id:
if g.team_id not in all_team_ids:
Member

Suggested change
- if g.team_id not in all_team_ids:
+ if g.user_id not in all_user_ids:

if data_export.user.email:
user["email"] = data_export.user.email
if data_export.user_id:
user = dict(id=data_export.user_id)
Member

Trimming down to only user_id seems reasonable, as getting the full user would require another RPC call.

assert match_link(url) == expected


# @region_silo_test(stable=True)
Member

Suggested change
# @region_silo_test(stable=True)

@corps corps merged commit 2d87621 into master Feb 28, 2023
@corps corps deleted the zc/break-fks-query-part branch February 28, 2023 19:29
@corps corps added the Trigger: Revert Add to a merged PR to revert it (skips CI) label Feb 28, 2023
@corps corps restored the zc/break-fks-query-part branch February 28, 2023 19:30
@getsentry-bot
Contributor

PR reverted: da6cf99

getsentry-bot added a commit that referenced this pull request Feb 28, 2023
corps added a commit that referenced this pull request Feb 28, 2023
I need to remerge #45095 after
reverting prematurely.
@github-actions github-actions bot locked and limited conversation to collaborators Mar 16, 2023

Labels

Scope: Backend Automatically applied to PRs that change backend components Trigger: Revert Add to a merged PR to revert it (skips CI)

4 participants