skip pruned TUF repos when creating artifact config #9109
Conversation
I've got some minor suggestions but this looks good!
I think I'd probably rebase this one on the other one rather than combining them, at least for the purposes of review and approval. I don't feel strongly between those two options. For what it's worth, I already made a branch that starts from this one and merges in the branch from the other one. There was a tiny conflict to resolve. I could push that here if you want (and then you'd want to actually change the base branch in GitHub). (My branch is called dap/testing/pruning; see my next comment.)
nexus/test-utils/src/background.rs (outdated)
    );
};

dbg!(&last_result_completed);
not sure if you meant to leave these here
Debatable! I'd love some nicer error handling here, because when there's an {"error":"blah blah blah"} it panics without showing the message at all. This is halfway to that, and is only printed to the console if a test calls these replication-related helpers and the background tasks fail.
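As a rough sketch of what nicer handling could look like (assuming, hypothetically, that the completed-task status carries a serde_json::Value details field with an "error" key on failure; this is not the actual helper code):

// Sketch: surface the background task's error message instead of letting a
// later step panic without it. The `details` field and its "error" key are
// assumptions about the status payload's shape.
if let Some(error) = last_result_completed
    .details
    .get("error")
    .and_then(|v| v.as_str())
{
    panic!("background task failed: {error}");
}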
    opctx: &OpContext,
    repo_id: TufRepoUuid,
) -> ListResultVec<TufArtifact> {
    opctx.authorize(authz::Action::Read, &authz::FLEET).await?;
I'd do two things to try to avoid people accidentally using this in API endpoints (since we're making multiple queries here):
- add a call to opctx.check_complex_operations_allowed()?;
- add _batched() to the name to convey that (we do this in a few other places in the datastore)
Really, I'd be tempted to apply this to artifacts_for_repo, but that may break some callers that might currently be using it from the API. Those should probably be made paginated, but we can do that when the dust settles on these APIs.
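For concreteness, a minimal sketch of the suggested shape, assuming a hypothetical tuf_list_artifacts_for_repo_batched name (not the final code):

// Sketch: a batched datastore method guarded against accidental use from
// API endpoints, which shouldn't issue multiple queries per call.
pub async fn tuf_list_artifacts_for_repo_batched(
    &self,
    opctx: &OpContext,
    repo_id: TufRepoUuid,
) -> ListResultVec<TufArtifact> {
    opctx.authorize(authz::Action::Read, &authz::FLEET).await?;
    // Reject latency-sensitive callers, since this method issues
    // multiple queries while paginating through batches.
    opctx.check_complex_operations_allowed()?;
    // ... paginate through the repo's artifacts in batches ...
    unimplemented!()
}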
After #9106 lands we may be able to refactor things a little and apply this to artifacts_for_repo, since it removes the list of artifacts from the public APIs.
.inner_join(tuf_artifact_dsl::tuf_artifact.on(
    tuf_artifact_dsl::id.eq(tuf_repo_artifact_dsl::tuf_artifact_id),
))
For our use case, I think we don't need this join. We only need the artifact ids. It might be nice to have a version of this that just does that.
I think I'm going to revert artifacts_for_repo and wait to possibly paginate it during a refactor, particularly if we add check_complex_operations_allowed there. I'll write the query that simply returns the artifact IDs for a repo in the function we add instead.
No wait, we do need the join here, because we ultimately use the artifact sha256sums in the replication task.
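For reference, the ids-only variant discussed above might have looked roughly like this (a sketch; the tuf_repo_id column name and the connection plumbing are assumptions), before it became clear that the sha256 sums are needed:

// Ids-only sketch: skips the join with tuf_artifact, so the sha256 sums
// needed by the replication task would not be available.
let artifact_ids: Vec<Uuid> = tuf_repo_artifact_dsl::tuf_repo_artifact
    .filter(tuf_repo_artifact_dsl::tuf_repo_id.eq(repo_id.into_untyped_uuid()))
    .select(tuf_repo_artifact_dsl::tuf_artifact_id)
    .load_async(&*conn)
    .await?;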
Force-pushed from 9f685b4 to fdbed8b.
{
    let generation_now =
        self.datastore.tuf_get_generation(opctx).await?;
    ensure!(
        generation == generation_now,
        "generation changed from {generation} \
        to {generation_now}, bailing"
    );
}
Making sure I understand: we have to do this check because if the generation changed, the config we'd build from repos would be inconsistent with other Nexuses building a config at generation_now, right? This seems pretty critical and maybe easy to miss - maybe worth a comment? (My very first reaction reading this was "why do we need to check this? if a new repo has been pruned or added that's fine and we'll just pick it up the next time we run")
Alternatively: should tuf_list_repos_unpruned_batched() take a generation argument and fail if it changes at any point during the listing?
Initially I had this implemented so that we read the generation and then paginated through the repositories in a single transaction, but I replaced it with the datastore methods from #9107. The comment for tuf_list_repos_unpruned_batched() reads:
/// Since this involves pagination, this is not a consistent snapshot.
/// Consider using `tuf_get_generation()` before calling this function and
/// then making any subsequent queries conditional on the generation not
/// having changed.
So the second check is my interpretation of making subsequent queries conditional on the generation not having changed. I will add a comment to this effect.
Hah, yeah, one conditional check after does seem better than reasserting the condition as we page through a table. Thanks.
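Putting the pieces together, the pattern under discussion looks roughly like this (a sketch using the names from this PR, with error handling and signatures simplified):

// Snapshot the generation, do the batched (non-snapshot) listing, then
// bail if the generation moved underneath us, per the doc comment above.
let generation = datastore.tuf_get_generation(opctx).await?;
let repos = datastore.tuf_list_repos_unpruned_batched(opctx).await?;
let generation_now = datastore.tuf_get_generation(opctx).await?;
ensure!(
    generation == generation_now,
    "generation changed from {generation} to {generation_now}, bailing"
);
// The artifact config can now be built from `repos` at `generation`,
// consistent with any other Nexus that observed the same generation.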
sled-agent/src/sim/artifact_store.rs (outdated)
let mut watcher = self.storage.delete_done_rx.clone();
watcher.mark_unchanged();
watcher
Nit - I think we could return self.storage.delete_done_tx.subscribe() instead? Then we wouldn't need the mark_unchanged(), and maybe could also drop the delete_done_rx field entirely?
Oh, yes, we could do that. I forgot about the subscribe() method.
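For the record, tokio's watch channel makes the two approaches equivalent: a receiver created via Sender::subscribe() starts with the current value already marked as seen. A standalone demo (hypothetical names, not the artifact-store code):

use tokio::sync::watch;

// Demonstrates the suggestion above: `subscribe()` yields a receiver that
// has already seen the current value, so no `mark_unchanged()` is needed
// and no receiver field has to be stored for cloning.
#[tokio::main]
async fn main() {
    let (tx, _keepalive_rx) = watch::channel(0u32);

    // Equivalent to cloning a stored receiver and calling mark_unchanged():
    let mut watcher = tx.subscribe();

    tx.send(1).unwrap();
    // Only changes made after subscribing wake this receiver.
    watcher.changed().await.unwrap();
    assert_eq!(*watcher.borrow(), 1);
}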
Part of #7135. Related to (and should probably be rebased on or merged into) #9107.
No explicit tests yet, but it seems to work as expected with the existing tests.
Also modifies the artifacts_for_repo function to be paginated, as the number of artifacts per repo has grown and is expected to continue to grow.