Conversation

@markstory
Member

Eagerly purge events when projects are removed. I've also modified the Project delete task so that it uses the existing group deletion task instead of doing half the work as Project deletion task was missing many relations of Group.

Comment on lines 86 to 87
eventstream_state = eventstream.start_delete_groups(self.project_id, [self.group_id])
eventstream.end_delete_groups(eventstream_state)
Member

I'm not sure if there's a problem with issuing a lot of separate replacements like this, instead of batching them.

Member Author

I'm a bit concerned about the volume of replacements this could generate as well. Projects could have thousands of issues.

Only send a replacement to eventstream after all the node data has been removed. Removing the event data early results in reports and events being left behind.
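
For illustration, the ordering this commit message describes might look roughly like the following (a minimal sketch, reusing the start/end calls from the snippet earlier in the thread; `purge_node_data` is a hypothetical stand-in for the nodestore cleanup, and the `eventstream` import is assumed):

```python
from sentry import eventstream  # assumed import path

def delete_group_events(project_id, group_ids):
    # Open the deletion on the eventstream side.
    state = eventstream.start_delete_groups(project_id, group_ids)
    # Remove event payloads BEFORE finalizing, so reports and events
    # aren't left behind (the bug this commit fixes).
    purge_node_data(group_ids)  # hypothetical helper
    # Only now is the replacement actually sent.
    eventstream.end_delete_groups(state)
```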
@volokluev
Member

This has a big risk of plugging up the replacement queue when a project is deleted if we send them one by one. How hard is it to batch these things? We should not merge this as-is.

@markstory
Member Author

> How hard is it to batch these things?

It isn't simple, as ProjectDeletionTask uses the GroupDeletionTask to remove groups one at a time, and thus each group removal task only knows about the current group. Another complication is that the BulkModelDeletionTask implementation doesn't support deleting child relations.

We would have to replace the GroupDeletionTask with a more specialized implementation that quacks like a bulk deletion but uses a mix of instance and bulk deletion for its relations.
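
A rough sketch of what such a specialized task could look like (the class shape, `chunk` method, and `delete_instance_relations` helper are all hypothetical, not the actual Sentry deletions API; the model names come from elsewhere in this PR):

```python
from sentry.models import Group, GroupMeta, GroupResolution, GroupSnooze

class SpecializedGroupDeletionTask:
    """Quacks like a bulk deletion, but mixes bulk and instance deletion."""

    def __init__(self, project_id, group_ids):
        self.project_id = project_id
        self.group_ids = group_ids

    def chunk(self):
        # Simple child relations can be bulk-deleted across every group at once.
        for model in (GroupMeta, GroupResolution, GroupSnooze):
            model.objects.filter(group_id__in=self.group_ids).delete()
        # Relations that need per-row cleanup still go one instance at a time.
        for group in Group.objects.filter(id__in=self.group_ids):
            delete_instance_relations(group)  # hypothetical helper
```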

@wedamija
Member

wedamija commented Dec 5, 2022

The top projects have 1mm+ issues. Is removing that many issues from snuba via replacements even feasible for projects that large?

@markstory
Member Author

> Is removing that many issues from snuba via replacements even feasible for projects that large?

That could be a reason it has not historically been part of project deletion.

@wedamija
Member

wedamija commented Dec 5, 2022

> > Is removing that many issues from snuba via replacements even feasible for projects that large?
>
> That could be a reason it has not historically been part of project deletion.

Yeah... if we batched them in groups of 1k, then we'd still issue 1k replacements. @volokluev I don't really know the full impact of replacements. How expensive is a replacement with 1 group vs a replacement with, say, 1k-10k groups?

@volokluev
Member

Yes, removing that many issues is feasible via replacements. In terms of the impact of batching vs. not batching replacements, it's huge.

In order to get the rows to delete, a SQL query is run:

https://github.com/getsentry/snuba/blob/master/snuba/datasets/errors_replacer.py#L810-L815

That query will be batched (thus reducing query load).

Then, in order to write the "tombstone rows" that mark them for deletion, an insert call will be made:

https://github.com/getsentry/snuba/blob/master/snuba/datasets/errors_replacer.py#L637-L642

That insert call will happen once per Kafka message on the replacements topic. If you were to do these one by one, it would result in as many insert calls as there are groups. Every insert call is a call to ZooKeeper and a newly created file. We don't want that.
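
Back-of-the-envelope arithmetic makes the stakes concrete: with one insert per replacement message, the batch size directly divides the number of inserts. A quick illustration:

```python
import math

# Tombstone inserts needed for 1M groups at different batch sizes,
# given one ClickHouse insert per replacement message.
groups = 1_000_000
for batch_size in (1, 100, 1_000, 10_000):
    inserts = math.ceil(groups / batch_size)
    print(f"batch size {batch_size:>6}: {inserts:>9,} inserts")
```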

@wedamija
Member

wedamija commented Dec 6, 2022

> Yes, removing that many issues is feasible via replacements. In terms of the impact of batching vs. not batching replacements, it's huge.
>
> In order to get the rows to delete, a SQL query is run:
>
> https://github.com/getsentry/snuba/blob/master/snuba/datasets/errors_replacer.py#L810-L815
>
> That query will be batched (thus reducing query load).
>
> Then, in order to write the "tombstone rows" that mark them for deletion, an insert call will be made:
>
> https://github.com/getsentry/snuba/blob/master/snuba/datasets/errors_replacer.py#L637-L642
>
> That insert call will happen once per Kafka message on the replacements topic. If you were to do these one by one, it would result in as many insert calls as there are groups. Every insert call is a call to ZooKeeper and a newly created file. We don't want that.

Is there a recommended max size for a replacement? I assume we can't shove 100k group ids into a single one.

@markstory
Member Author

I was thinking more about this last night, and I have an idea on how to batch these deletions for more than a single group at a time. I'll make the batch size for groups and events something we can easily change so we can tune deletions if batches are too big or too small.
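
One possible shape for those tunable batch sizes (a sketch under assumed names; the constants and `chunked` helper are illustrations, not the PR's actual code):

```python
# Hypothetical tuning knobs for deletion batch sizes.
GROUP_CHUNK_SIZE = 100
EVENT_CHUNK_SIZE = 10_000

def chunked(ids, size):
    """Yield consecutive slices of at most `size` ids."""
    for i in range(0, len(ids), size):
        yield ids[i : i + size]
```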

By bulk deleting relations with `in` operations, we can delete from multiple groups in a blended batch.
@markstory
Member Author

@wedamija I've revised group deletion so that it operates in batches of 100 groups. Each batch emits a single replacement message.
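
Combined with the hypothetical `chunked` helper sketched earlier, one replacement per batch might look like this (again a sketch, with `purge_group_data` standing in for the relation and nodestore cleanup):

```python
# One replacement message per batch of 100 groups (sketch).
for batch in chunked(group_ids, GROUP_CHUNK_SIZE):
    state = eventstream.start_delete_groups(project_id, batch)
    purge_group_data(batch)  # hypothetical: delete relations + node data
    eventstream.end_delete_groups(state)  # one replacement for the whole batch
```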

@markstory
Member Author

/gcbrun

@wedamija
Member

wedamija commented Dec 8, 2022

> @wedamija I've revised group deletion so that it operates in batches of 100 groups. Each batch emits a single replacement message.

I'm going to try to review this today or tomorrow, but I have a few things going on at the moment. In the meantime, @volokluev, what's a reasonable max number of replacements to generate here?

https://redash.getsentry.net/queries/3373/source: Here's a breakdown of the top projects by active group count. Worst case is 5mm, but there's a long tail of 100k+ orgs. So if we batch in groups of 100, we might issue 50k replacements in the worst case. For other large orgs, we might issue 1k. Wondering if we can bump this up to 1k/10k groups per batch just to avoid these edge cases?

Member

@wedamija left a comment


This looks good to me, but I would like to get clarification from SNS on what a reasonable chunk size is here, given the info in my previous comment. Just don't want to have an incident come up because some project with 1mm+ groups gets deleted.


model_list = (models.GroupMeta, models.GroupResolution, models.GroupSnooze)
Member

Why do we remove these? Are they just already handled by the direct group deletion?

Member Author

Yes. Project now has Group as a child_relation, and Group has these models as child_relations.

This was likely a workaround to be able to bulk delete across multiple groups. That isn't necessary anymore, as I've changed bulk deletions to support `__in` lookups as well.
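
For illustration, this is the kind of single bulk DELETE that `__in` support enables (standard Django ORM usage; `group_id_batch` is an assumed variable holding one batch of group ids):

```python
from sentry.models import GroupMeta

# One DELETE spanning a whole batch of groups instead of one per group.
GroupMeta.objects.filter(group_id__in=group_id_batch).delete()
```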

@getsentry-bot
Contributor

PR reverted: bff380d

getsentry-bot added a commit that referenced this pull request Jan 6, 2023
markstory added a commit that referenced this pull request Jan 6, 2023
… deleted (#41937)""

This reverts commit bff380d, which broke deploys because it used a dev dependency in production code.
markstory added a commit that referenced this pull request Jan 9, 2023
Mulligan on #41937 that doesn't use itertools_more, as that is a dev-only dependency.
github-actions bot locked and limited conversation to collaborators Jan 22, 2023