Skip to content

Conversation

@michi88
Copy link
Contributor

@michi88 michi88 commented Jul 1, 2020

Description

Fixes #2224

It was a bit harder as I expected because we didn't have a clean shutdown from the spawned java processes. It seems something with gRPC is not working well because the java processes already start stopping on the 'Ctrl-c' SIGINT before we actually try to stop then in the graceful shutdown. Added detached: true for that and they all seem to shutdown correctly now. Maybe fixable on the Java side of things as well, this issue might be related: grpc/grpc-java#6512

I also did make this command work:

$ firebase emulators:start --import ./.dataDirThatDoesNotExistYet --export-on-exit
// when the --import [dir] does not exist this would be equivalent of running
$ firebase emulators:start --export-on-exit ./.dataDirThatDoesNotExistYet

// on exit now that folder exists and this a second time running 
$ firebase emulators:start --import ./.dataDirThatDoesNotExistYet --export-on-exit
// would be equivalent of 
$ firebase emulators:start --import ./.dataDirThatDoesNotExistYet --export-on-exit ./.dataDirThatDoesNotExistYet

The reason I did that is that it makes it a lot easier for first time repo users (only 1 command to start fresh / restart). Else one would need to start, export and start again. Which is just a pain in my opinion.

I've added the options to src/commands/ext-dev-emulators-start.ts and src/commands/ext-dev-emulators-exec.ts as well but I do not understand what they do.

Scenarios Tested

image

See tests as well

Sample Commands

$ firebase emulators:start --import ./.data --export-on-exit

$ firebase emulators:start --import ./.data --export-on-exit ./data

$ firebase emulators:start --import ./.other --export-on-exit ./data

// fail
$ firebase emulators:start --export-on-exit

// exec works as well
$ firebase emulators:exec --import ./.data --export-on-exit

@googlebot
Copy link

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

@michi88
Copy link
Contributor Author

michi88 commented Jul 1, 2020

@googlebot I signed it!

@googlebot
Copy link

CLAs look good, thanks!

ℹ️ Googlers: Go here for more info.

@googlebot googlebot added cla: yes Manual indication that this has passed CLA. and removed cla: no labels Jul 1, 2020
Copy link
Contributor

@samtstern samtstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this is looking great, thank you for doing it so quickly! Just a few questions.

*/
export function setExportOnExitOptions(options: any): any {
if (options.exportOnExit) {
if (typeof options.import === "string") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can assume everything in options (or at least all the flags) are strings when they're defined, no need to do all this typeof checking.

If you're just using typeof to check if it exists, we prefer checking the truthiness instead. So for example you have this:

options.exportOnExit =
        typeof options.exportOnExit === "string" ? options.exportOnExit : options.import;

I think you can just do:

options.exportOnExit = options.exportOnExit || options.import;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

options.exportOnExit can be a boolean when used without a [dir]. Like --import ./data --export-on-exit

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes good point! Want to just add a comment about this? Also seems like typeof options.import checks are still not necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree it is not the prettiest, added some test cases and a note.

let receivedSIGINTS = 0;
process.on("SIGINT", async () => {
try {
receivedSIGINTS = receivedSIGINTS + 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I send SIGINT twice won't this logic run await onExit(options) and await controller.cleanShutdown() more than once?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

onExit will run only once, cleanShutdown() twice and then gives up. Current behaviour is that cleanShutdown() is always called every time you do SIGINT. Not ok when cleanShutdown is hanging on some stop and never finishes or take ages. You need to be able to bail out there.

I was considering changing it to process.once("SIGINT", ... and only do everything once but then I was thinking of the case that an onExit takes a long time and the user want to just quit, in that case we still want to do the cleanShutdown().

Let me know what you think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok so after fighting my Windows PC for a few hours I finally got everything running. The new SIGINT handling seems fine however:

  1. When I do a single Ctrl+C I now get warnings like Firestore Emulator has exited upon receiving signal: SIGINT which make it seem like something went wrong but I think that's normal?
  2. The log message Received SIGINT 1 is confusing because it feels like 1 is the signal code but it's the count. Suggest Received SIGINT (count=1)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmmm actually while playing with this I quickly got into a state where the Database emulator is running free and holding port 9000, so I have to shut it down with pkill or similar. That's not good.

@samtstern samtstern self-assigned this Jul 1, 2020
@samtstern
Copy link
Contributor

Oh and @michi88 please add an entry to CHANGELOG.md

@michi88
Copy link
Contributor Author

michi88 commented Jul 1, 2020

@samtstern added it to the changelog. Out of curiosity. How do you guys handle the release versioning / changelog / documentation?

@samtstern
Copy link
Contributor

@michi88 our release process pretty much works like this:

  • We use master as the development branch for the next release and it's always kept green. Every PR goes to master and includes a CHANGELOG entry if it should be included in release notes.
  • Once in a while a human on the team says "hey should we do a release?" at which point we read over the changelog and decide on the SemVer implications. Then we kick off a Cloud Build job that does the release.
  • The CHANGELOG is directly converted into the GitHub release notes and then we clear it out and start over.

@samtstern
Copy link
Contributor

@michi88 I did a bunch of testing and all the import/export stuff works great! My remaining concerns are about the SIGINT handling and the newly detached processes. I don't think that's working quite right.

@michi88
Copy link
Contributor Author

michi88 commented Jul 1, 2020

Yeah, it would be best if we can handle the SIGINT in the java children correctly (wait for exit). I think the RPC process goes down to quickly although the jvm is still running. Best would really be to have them attached. I did try all kinds of combinations for the stdio settings, to no avail. In the end I think they should be attached but the SIGINT signal should not propagate to the child.

It is very probable this is fixable in the java side of things. This has been merged end of last year, do you guys have that in?
grpc/grpc-java#6512

@samtstern
Copy link
Contributor

@michi88 looking at the source code of the emulator we do call server.awaitTerminated() so I don't actually think that gRPC issue is the root of the problem? I am a bit stumped here to be honest,

@samtstern
Copy link
Contributor

@michi88 I want to add that the bugs this has revealed are not your fault, but they are blockers at the moment. So I'm gonna see if I can get some more of my teammates to look into this and offer ideas.

@yuchenshi
Copy link
Member

My two cents on SIGINT:

When you Ctrl-C, a SIGINT is sent to ALL processes in the foreground group. In our case, that includes the Emulator Java processes. There is no way you can "catch" that signal in the CLI process to prevent it from reaching the child process.

If we want to shield the java process from receiving SIGINT, we need to modify the CLI to launch the java process in the background. In Node.js, I believe it can be achieved via spawn(..., ..., {detached: true}).

Disclaimer: Not an expert on this domain and haven't tried that.

@michi88
Copy link
Contributor Author

michi88 commented Jul 2, 2020

Here some code to play around with this: https://gist.github.com/michi88/48e1b3b8530163b45ce1db75aa35451e

In the end I think it is best to not run in detached mode but ignore the SIGINT signal in the emulators (children). One could argue we only want to ignore the first SIGINT and do our clean shutdown but on a second do go into a quick shutdown.

@yuchenshi
Copy link
Member

@michi88 Thanks for the suggestion! However, modifying the Firestore emulator to ignore the first SIGINT is really complicated on our side, since the Firestore emulator is also included the Google Cloud CLI (gcloud) and we need to keep the backwards compatibility there. What issue do you see with detached mode? Let's see if we can address that instead.

@samtstern
Copy link
Contributor

@yuchenshi while manually testing this I quickly got into a situation where I had exited the parent process but the detached emulator processes were still running and holding on to their ports. I'm not 100% sure how I got there, I was doing a lot of manual testing with Ctrl+C at different times.

So I think we all agree that detached: true is the way to go but we need to make absolutely sure we never exit without killing all the children.

@michi88
Copy link
Contributor Author

michi88 commented Jul 7, 2020

EDIT: didn't see your comment above...

@samtstern What were the issue you were observing with detached mode?

One tradeoff we have is:

  • Do we always wait for a clean shutdown of the emulators no matter what, this makes sure we never leave orphan emulator processes running
    OR
  • Do we skip the clean shutdown after a third SIGINT for example (like we do now) ensuring that we do give back the shell to the user when they really want to (but potentially leve emulators running).

Also, we now only listen for SIGINT, we probably want to change that to also listen for SIGTERM, SIGHUP and SIGQUIT.

I would propose we do allow for a non-clean exit on repeated exit signals (with the warning we're not cleanly shutting down) and we also start listening for the other 3 exit signals.

@michi88
Copy link
Contributor Author

michi88 commented Jul 7, 2020

@samtstern I could see if we can make it such that we always wait for a clean shutdown if you'd like.

@samtstern
Copy link
Contributor

@michi88 I do agree that as a developer you want to get control of your terminal back when you slam Ctrl+C multiple times. But it can definitely be hard to find the right zombie processes to kill after.

I think we know all of the PIDs of the child processes. Before doing a non-clean exit could we print a warning like:

Warning: you have forced the Emulator Suite to exit without waiting for {n} subprocesses to finish. These processes may still be running on your machine.  To force them to exit run:

  - kill {pid 1}
  - kill {pid 2}

@michi88
Copy link
Contributor Author

michi88 commented Jul 7, 2020

Works for me as well, we do have the pid. Let me know your decision.

@samtstern
Copy link
Contributor

@michi88 let's go with the warning message, thanks for being so flexible! We'll need to make sure we show a platform-dependent warning; kill for Mac/Linux and TaskKill for Windows.

michi88 added 2 commits July 10, 2020 19:25
- Listen for more stop signals
- Stop running on fatal start of emulator
- Stop emulators on exit that are 'starting'
- Faster shutdown (parallel emulator stops)
@michi88
Copy link
Contributor Author

michi88 commented Jul 10, 2020

Ok, so changed quite some things here to make it all work together:

  • Listen for more stop signals
  • Stop running on fatal start of emulator, for example try: JAVA_HOME=/does/not/exist firebase emulators:start
  • Stop emulators on exit that are 'starting'
  • Faster shutdown (parallel emulator stops)
  • Throw an error when non existing emulator is used, for example: firebase emulators:start --only doesnotexist

image

This fixes a lot of cases that emulators keep running and shutdown is a bit faster as well.

Looking forward to your feedback.

Copy link
Contributor

@samtstern samtstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow this is a really great PR that cleans up a lot of edge cases! Thank you for sticking with it. I have one final comment, but approved otherwise.

fs
.readFileSync(
path.join(path.resolve(options.import), HubExport.METADATA_FILE_NAME),
"utf8"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean to handle this else case or log something? Having it empty seems strange.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I indeed wanted to add a warning there. Added now!

@samtstern
Copy link
Contributor

Oh and don't worry about the CLA bot, I checked internally and you definitely signed it so I will force-merge this when ready.

@michi88
Copy link
Contributor Author

michi88 commented Jul 11, 2020

Oh and don't worry about the CLA bot, I checked internally and you definitely signed it so I will force-merge this when ready.

Not sure what you're seeing but it has been looking ok from my side for a while already:

image

Wow this is a really great PR that cleans up a lot of edge cases! Thank you for sticking with it. I have one final comment, but approved otherwise.

Awesome! I did have some fun diving into this project!

@samtstern samtstern merged commit 053be81 into firebase:master Jul 11, 2020
@samtstern
Copy link
Contributor

@michi88 merged!

@quantuminformation
Copy link

Will it be in the next release?

Nice work, and not even a Google staff.

@samtstern
Copy link
Contributor

@quantuminformation yep, whenever a change is merged into master it gets included with the next release.

@anantanandgupta
Copy link

i am getting my emulator stopping before it can export the database. this happens on precc of ctrl+c on macos with [email protected]

^C
i  emulators: Received SIGINT (Ctrl-C) for the first time. Starting a clean shutdown.
i  emulators: Please wait for a clean shutdown or send the SIGINT (Ctrl-C) signal again to stop right now.
i  Automatically exporting data using --export-on-exit "./rtdb" please wait for the export to finish...

⚠  emulators: Received SIGINT (Ctrl-C) 2 times. You have forced the Emulator Suite to exit without waiting for 1 subprocess to finish. These processes may still be running on your machine:

┌───────────────────┬────────────────┬───────┐
│ Emulator          │ Host:Port      │ PID   │
├───────────────────┼────────────────┼───────┤
│ Database Emulator │ localhost:9000 │ 20288 │
└───────────────────┴────────────────┴───────┘

To force them to exit run:

kill 20288

below is the command i used to start the process

firebase emulators:start --only functions,hosting,database --inspect-functions --import ./rtdb --export-on-exit

can anyone please suggest a way to gracefully stop the process and wait for the database to be exported.

@samtstern
Copy link
Contributor

@anantanandgupta this looks like #2507 and I have a PR coming to fix it very shortly.

@anantanandgupta
Copy link

thanks @samtstern ... will wait for this. one more point to make over here, which no one has pointed out is that emulator ui is also not stopped. it keeps on getting new port even after we kill that pid shown in the message. on a fresh start emulator ui gets port 4000. but after this issue and killing the pid (or even sending sigint) it keeps emulator ui running, thus keeping the port. next execution it gets 4001 ... 4002 and onwards. i have to kill the nodejs process manually to free that port.

@alexandermckay
Copy link

@samtstern works perfectly. Thank you very much!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla: yes Manual indication that this has passed CLA.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Allow emulators to export on exit

7 participants