michael-kerscher
Collaborator

Make the mdbook build process parallel. This is a simple approach: clone the entire repository into new directories and run the builds in parallel. This is a demonstration for #2767.

The process appears to be mostly CPU bound and not memory or disk heavy, so the number of cores on the CI runner matters.
mdbook build does not seem to use more than 1 core, and a typical GitHub CI machine has 4 cores, so the expected speedup of the build would be 4x.

Publishing currently takes 30 minutes, 25 of which are the build process. With a 4x faster build, this would be reduced to 5 + 6.25 = 11.25 minutes, more than halving the total.
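
As a back-of-the-envelope check of these numbers, here is a minimal sketch. The function name is made up for illustration, and it assumes an ideal split of the build across cores with no per-job overhead (cloning, setup, artifact upload), so the real run will likely be slower:

```python
def estimate_publish_minutes(total: float, build: float, cores: int) -> float:
    """Estimate publish time if only the build part is spread across `cores`."""
    serial = total - build          # part that does not get faster
    return serial + build / cores   # ideal, overhead-free parallel build


# Numbers from the description above: 30 min publish, 25 min of it build, 4 cores.
estimate = estimate_publish_minutes(total=30, build=25, cores=4)
print(estimate)  # 11.25
```

The 5 minutes outside the build stay fixed, which is why the total improves by less than 4x even in the ideal case.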

@michael-kerscher
Collaborator Author

Down from 25 minutes to 11 minutes. A smaller speedup than expected, but still less than half the previous run time.

@djmitche
Collaborator

djmitche commented Jun 4, 2025

Nice!!

@michael-kerscher
Collaborator Author

This run has now finished in 21 minutes, even without Rust caches: https://github.com/google/comprehensive-rust/actions/runs/15612132409 With caching, this should come down considerably.

@mgeisler, can you check whether the comprehensive-rust-all artifact has the correct structure? That would confirm that uploading the intermediate results and downloading the artifacts with merge works as expected.

The workflow still needs some refactoring, as there is quite a bit of duplication now, but for now this only serves as a POC.

@michael-kerscher
Collaborator Author

@djmitche @mgeisler any thoughts on the latest approach?

Collaborator

@djmitche djmitche left a comment

I like this approach, but I admit I don't know much about GitHub Actions, so if there are any subtle gotchas I would not be the person to detect them.

Comment on lines 90 to 102
- name: Update Rust
  run: rustup update

- name: Setup Rust cache
  uses: ./.github/workflows/setup-rust-cache

- name: Install Gettext
  run: |
    sudo apt update
    sudo apt install gettext

- name: Install mdbook
  uses: ./.github/workflows/install-mdbook
Collaborator

Small question: do we need to install Rust, mdbook, and the Gettext tools here? I would guess not since everything is built already, no? We're just putting the pieces together and uploading it all?

Collaborator Author

No... but we do ;)

We actually only need i18n-report, which is installed in the install-mdbook step (cd922c4).

Collaborator Author

The code has changed a bit over time, but essentially cargo xtask install-tools installs the necessary tool at the moment. To keep the POC simple, everything is just installed as is. This needs to be refactored.

Collaborator Author

I don't know exactly if i18n-report requires gettext.

I could upload the .cargo/bin/i18n-report binary in the previous job, then download it in all jobs and use that. This has the potential to become a bit more obscure in the future, but it is probably already pretty obscure where the tool comes from at this point.

Collaborator Author

While doing this, and since it already makes things a bit more obscure, I could create a tool-building phase that uploads the entire .cargo/bin/ directory with all Rust tools, so that every translation job can just use it.

Comment on lines -80 to -82
- name: Deploy to GitHub Pages
  id: deployment
  uses: actions/deploy-pages@v4
Collaborator

I'm surprised this went away? Is this not the most important part, so to speak? 😄

Collaborator Author

Yes, it is! This was removed during development and testing to avoid accidentally deploying anything.

@mgeisler
Collaborator

mgeisler commented Sep 1, 2025

> I like this approach, but I admit I don't know much about GitHub Actions, so if there are any subtle gotchas I would not be the person to detect them.

I think this is a great approach, well done, @michael-kerscher! Originally, we only had a simple mdbook build, which is very fast. But now, I guess the PDF generation is what slows things down dramatically? So it makes sense to do this in parallel.

As for the caching, if building mdbook (and other binaries) takes a long time, we could build them just once, cache them, and reuse them in each job going forward. I forget if we're able to run 20+ jobs in parallel, or if they run, say, 4 or 8 at a time. If it's the latter, then building mdbook once could help here.
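
To illustrate why the concurrency limit matters for this, here is a rough sketch of the wall-clock time under both setups. Everything in it is an assumption made for illustration: the function name, the job count of 20, and the 5 and 2 minute durations are hypothetical and not measured from this workflow, and it treats every job as taking the same time:

```python
import math


def wall_clock_minutes(jobs, max_parallel, tool_build, book_build, shared_tools):
    """Rough wall-clock time when jobs run in batches of `max_parallel`."""
    batches = math.ceil(jobs / max_parallel)
    if shared_tools:
        # Tools are built once up front, then every job only builds its book.
        return tool_build + batches * book_build
    # Every job rebuilds the tools before building its book.
    return batches * (tool_build + book_build)


# Hypothetical numbers: 20 language jobs, 5 min to build tools, 2 min per book.
print(wall_clock_minutes(20, 4, 5, 2, shared_tools=False))   # 35
print(wall_clock_minutes(20, 4, 5, 2, shared_tools=True))    # 15
print(wall_clock_minutes(20, 20, 5, 2, shared_tools=False))  # 7
print(wall_clock_minutes(20, 20, 5, 2, shared_tools=True))   # 7
```

Under these assumptions, building the tools once only shortens the wall-clock time when jobs run in batches. With all 20 jobs running at once the elapsed time is the same, although the total runner minutes would still go down.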

…steps.

This should be a massive performance gain in the publish step, as we no longer build several Rust tools from scratch for each language.