Collaborator

@dbkegley dbkegley commented Dec 1, 2023

Intent

Interrupting a long-running rsconnect content build run command with ^C
will now update the local state file before attempting graceful cleanup. This
should help prevent users from getting stuck in a "build already running" state.
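
A minimal sketch of the ordering described above, not the actual rsconnect implementation. `BuildState` and `run_build` are hypothetical names standing in for the local state file and the build command; the only point is that the "build running" flag is cleared before the worker thread is asked to shut down.

```python
import threading


class BuildState:
    """Hypothetical stand-in for the local state file."""

    def __init__(self):
        self.build_running = False


def run_build(state, work, stop_event, join_timeout=5.0):
    """Run work(stop_event) in a background thread and track global state."""
    state.build_running = True
    worker = threading.Thread(target=work, args=(stop_event,), daemon=True)
    worker.start()
    try:
        # Poll with a short timeout so ^C is noticed promptly.
        while worker.is_alive():
            worker.join(timeout=0.1)
    except KeyboardInterrupt:
        # Update the state first, so it is correct even if the
        # graceful shutdown below is itself interrupted.
        state.build_running = False
        stop_event.set()
        worker.join(timeout=join_timeout)
        raise
    finally:
        # Always mark the build as no longer running.
        state.build_running = False
```

Because the flag is cleared before `worker.join`, a second ^C during cleanup can no longer leave the state claiming that a build is running.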

Fixes #467

Type of Change

  • Bug Fix
  • New Feature
  • Breaking Change

Approach

  • Set global build state before attempting graceful background thread shutdown. This may cause individual build status to still say "RUNNING"
  • RUNNING builds can now be re-started with content build run --running
  • Added a convenience flag --retry that will start a build for all content marked as RUNNING, ABORTED, NEEDS_BUILD or ERROR
    • Marks the --running, --aborted, and --error flags as hidden. Most people will probably want to use --retry anyway but the more specific flags can still be used if needed.
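
As a rough illustration of how the flags above could combine, here is a self-contained sketch. The `Status` enum, `RETRY_STATUSES`, and `select_for_build` are hypothetical names for this example only, and it assumes that content marked NEEDS_BUILD is always selected; the real logic lives in rsconnect/actions_content.py and may differ.

```python
from enum import Enum


class Status(str, Enum):
    """Hypothetical stand-in for the build status values."""

    NEEDS_BUILD = "NEEDS_BUILD"
    RUNNING = "RUNNING"
    ABORTED = "ABORTED"
    ERROR = "ERROR"
    COMPLETE = "COMPLETE"


# --retry covers every status that did not finish successfully.
RETRY_STATUSES = frozenset(
    {Status.RUNNING, Status.ABORTED, Status.NEEDS_BUILD, Status.ERROR}
)


def select_for_build(items, *, running=False, aborted=False, error=False, retry=False):
    """Return the GUIDs whose status matches the requested flags.

    items maps content GUID -> Status.
    """
    wanted = {Status.NEEDS_BUILD}  # assumption: always built by default
    if retry:
        wanted |= RETRY_STATUSES
    if running:
        wanted.add(Status.RUNNING)
    if aborted:
        wanted.add(Status.ABORTED)
    if error:
        wanted.add(Status.ERROR)
    return [guid for guid, status in items.items() if status in wanted]
```

In this sketch `retry=True` is equivalent to passing all three specific flags, which is why the specific flags can be hidden without losing functionality.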

Automated Tests

Added a new test for --retry in test_main_content.py

Directions for Reviewers

alternatively we can merge #529 first and then these tests will run in CI

The tests in test_main_content.py require mock_connect
I would recommend updating scripts/runtests to say: pytest ${PYTEST_ARGS} --mypy ./tests/test_main_content.py
so that you can run only the content tests locally since they are skipped in CI.

# start mock_connect in a new terminal
cd mock_connect && make up

# run tests
CONNECT_CONTENT_BUILD_DIR="rsconnect-build-test" \
CONNECT_SERVER=http://localhost:3939 \
CONNECT_API_KEY="0123456789abcdef0123456789abcdef" \
make test

I've also created a follow-up to deprecate mock_connect. We should move to httpretty for mocking instead, since it is already used by test_main.py. As part of this follow-up we can start running these tests in CI again without mock_connect.

Checklist

  • I have updated CHANGELOG.md to cover notable changes.
  • I have updated all related GitHub issues to reflect their current state.

@dbkegley dbkegley force-pushed the kegs-build-running-state branch from 1b62fec to f33f14b Compare December 1, 2023 19:06

github-actions bot commented Dec 1, 2023

☂️ Python Coverage

current status: ✅

Overall Coverage

Lines | Covered | Coverage | Threshold | Status
4426 | 3146 | 71% | 0% | 🟢

New Files

No new covered files...

Modified Files

File | Coverage | Status
rsconnect/actions_content.py | 66% | 🟢
rsconnect/main.py | 62% | 🟢
TOTAL | 64% | 🟢

updated for commit: 7e31eff by action🐍

@dbkegley dbkegley force-pushed the kegs-build-running-state branch from bc50472 to f33f14b Compare December 4, 2023 14:22
Makefile Outdated
CONNECT_CONTENT_BUILD_DIR="rsconnect-build-test" \
CONNECT_SERVER="http://$(HOSTNAME):3939" \
- CONNECT_API_KEY="0123456789abcdef0123456789abcdef" \
+ CONNECT_API_KEY="21232f297a57a5a743894a0e4a801fc3" \
Contributor

did we rotate credentials at some point? Just making sure I'm following why this change is needed

Collaborator Author

That was a static api key that mock_connect was expecting. I changed it to the insecure admin api key (echo -n admin | md5sum) since that's what we use in dev, but this is removed in my next PR anyway
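
For reference, the same value that `echo -n admin | md5sum` produces can be computed in Python. This is only a sketch of the derivation; `insecure_admin_key` is a hypothetical helper name and is not part of rsconnect.

```python
import hashlib


def insecure_admin_key(username: str = "admin") -> str:
    """Return the hex MD5 digest of the username (dev/test use only)."""
    return hashlib.md5(username.encode("utf-8")).hexdigest()


print(insecure_admin_key())  # 21232f297a57a5a743894a0e4a801fc3
```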

# make sure that we always mark the build as complete but note
# there's no guarantee that the content_executor or build_monitor
# were allowed to shut down gracefully, they may have been interrupted.
_content_build_store.set_build_running(False)
Contributor

with this setup, it's still possible to ^C into a weird state, right? Just significantly less likely if we set the state of the build first thing.

Collaborator Author

@dbkegley dbkegley Dec 5, 2023

Yep, that's right. I'm hoping this covers most cases. It could still happen on slow filesystems but this makes it less likely in most scenarios. If the problem persists after this change then we can look at removing the safety check.

The reason it exists in the first place is to stop users from modifying the state file while there's a build running. The 3 commands that modify the local state are content build [run, add, rm].

edit:
probably a better alternative than removing the safety check is to provide an "are you sure?" dialog if they attempt to start another build when the local state says there's a build already running. This will allow them to get unstuck without manually updating the state file.
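
A rough sketch of what such a prompt could look like, purely as an illustration of the idea. `may_start_build` is a hypothetical helper and nothing like it exists in this PR; the `confirm` parameter defaults to the built-in `input` and is injectable so the logic can be tested without a terminal.

```python
def may_start_build(build_running, confirm=input):
    """Decide whether a new build may start.

    build_running: what the local state file currently claims.
    confirm: callable that shows a prompt and returns the user's answer.
    """
    if not build_running:
        return True
    answer = confirm(
        "The local state says a build is already running. Start anyway? [y/N] "
    )
    # Default to "no" unless the user explicitly agrees.
    return answer.strip().lower() in ("y", "yes")
```

Defaulting to "no" keeps the existing safety check in place for anyone who just presses Enter.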

Contributor

yeah, I completely agree with the approach here - just making sure I am following what's going on.

store = ContentBuildStore(RSConnectServer(connect_server, api_key))
store.set_content_item_build_status(_content_guids[0], BuildStatus.RUNNING)

# run the build
Contributor

I don't think it's a blocker or anything, but you could also add some tests for the other hidden flags that are considered when a retry is executed if you wanted to.

Collaborator Author

I'll add this in my followup PR before this gets merged to main to avoid a merge conflict

@zackverham
Contributor

do we need to document the new flag somewhere?

@dbkegley dbkegley merged commit 71b5204 into master Dec 5, 2023
@dbkegley dbkegley deleted the kegs-build-running-state branch December 5, 2023 19:20
Development

Successfully merging this pull request may close these issues.

Client has incorrect knowledge of server state when long-running build process is interrupted

3 participants