Add zstd support for faster tarball creation or extraction in eessi_container.sh #994
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -408,6 +408,19 @@ else | |
| echo "Using ${EESSI_HOST_STORAGE} as tmp directory (to resume session add '--resume ${EESSI_HOST_STORAGE}')." | ||
| fi | ||
|
|
||
| # if ${RESUME} is a file, unpack it into ${EESSI_HOST_STORAGE} | ||
| if [[ ! -z ${RESUME} && -f ${RESUME} ]]; then | ||
| if [[ "${RESUME}" == *.tgz ]]; then | ||
| tar xf ${RESUME} -C ${EESSI_HOST_STORAGE} | ||
| # Add support for resuming from zstd-compressed tarballs | ||
| elif [[ "${RESUME}" == *.zst && -x "$(command -v zstd)" ]]; then | ||
| zstd -dc ${RESUME} | tar -xf - -C ${EESSI_HOST_STORAGE} | ||
| elif [[ "${RESUME}" == *.zst && ! -x "$(command -v zstd)" ]]; then | ||
| fatal_error "Trying to resume from tarball ${RESUME} which was compressed using zstd, but zstd command not found" | ||
| fi | ||
| echo "Resuming from previous run using temporary storage ${RESUME} unpacked into ${EESSI_HOST_STORAGE}" | ||
| fi | ||
|
|
||
| # if ${RESUME} is a file (assume a tgz), unpack it into ${EESSI_HOST_STORAGE} | ||
| if [[ ! -z ${RESUME} && -f ${RESUME} ]]; then | ||
| tar xf ${RESUME} -C ${EESSI_HOST_STORAGE} | ||
|
|
@@ -865,17 +878,30 @@ if [[ ! -z ${SAVE} ]]; then | |
| # ARCH which might have been used internally, eg, when software packages | ||
| # were built ... we rather keep the script here "stupid" and leave the handling | ||
| # of these aspects to where the script is used | ||
| # Compression with zlib may be quite slow. On some systems, the pipeline takes ~20 mins for a 2 min build because of this. | ||
| # Check if zstd is present for faster compression and decompression | ||
| if [[ -d ${SAVE} ]]; then | ||
| # assume SAVE is name of a directory to which tarball shall be written to | ||
| # name format: tmp_storage-{TIMESTAMP}.tgz | ||
| ts=$(date +%s) | ||
| TGZ=${SAVE}/tmp_storage-${ts}.tgz | ||
| if [[ -x "$(command -v zstd)" ]]; then | ||
| TARBALL=${SAVE}/tmp_storage-${ts}.zst | ||
|
Contributor
Is
Collaborator
Author
Yes, it is. https://en.wikipedia.org/wiki/Zstd I believe if your tar is built with zstd support, you can use -xaf to extract and it will infer from the extension that it needs to use zstd. But I guess you could have zstd on your system and not have your tar compiled with zstd support, so my current implementation is more portable.
||
| tar -cf - -C ${EESSI_TMPDIR} . | zstd -T0 > ${TARBALL} | ||
| else | ||
| TARBALL=${SAVE}/tmp_storage-${ts}.tgz | ||
| tar czf ${TARBALL} -C ${EESSI_TMPDIR} . | ||
| fi | ||
| else | ||
| # assume SAVE is the full path to a tarball's name | ||
| TGZ=${SAVE} | ||
| TARBALL=${SAVE} | ||
| # if zstd is present and a .zst extension is asked for, use it | ||
| if [[ "${SAVE}" == *.zst && -x "$(command -v zstd)" ]]; then | ||
| tar -cf - -C ${EESSI_TMPDIR} . | zstd -T0 > ${TARBALL} | ||
| else | ||
| tar czf ${TARBALL} -C ${EESSI_TMPDIR} . | |
| fi | ||
| fi | ||
| tar czf ${TGZ} -C ${EESSI_TMPDIR} . | ||
| echo "Saved contents of tmp directory '${EESSI_TMPDIR}' to tarball '${TGZ}' (to resume session add '--resume ${TGZ}')" | ||
| echo "Saved contents of tmp directory '${EESSI_TMPDIR}' to tarball '${TARBALL}' (to resume session add '--resume ${TARBALL}')" | ||
| fi | ||
|
|
||
| # TODO clean up tmp by default? only retain if another option provided (--retain-tmp) | ||
|
|
||
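To try the save and resume logic from the two hunks outside of `eessi_container.sh`, the following is a minimal sketch under stated assumptions. The function names `save_tmp_tarball` and `unpack_resume_tarball` are hypothetical and do not appear in the script; the `tar` and `zstd` invocations mirror the ones in the diff. The final `else` branch in `unpack_resume_tarball`, which rejects unrecognized extensions, is an addition for illustration and is not part of the diff.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical; the commands mirror the diff.

# Write the contents of directory $1 to tarball $2.
# zstd is used only when the name ends in .zst AND the zstd command exists.
# In every other case gzip is used via tar's -z flag, as in the diff's else
# branch (so a .zst name without zstd installed would contain gzip data).
save_tmp_tarball() {
    local src_dir="$1" tarball="$2"
    if [[ "${tarball}" == *.zst ]] && command -v zstd >/dev/null 2>&1; then
        # -T0 lets zstd use all available cores
        tar -cf - -C "${src_dir}" . | zstd -T0 > "${tarball}"
    else
        tar -czf "${tarball}" -C "${src_dir}" .
    fi
}

# Unpack tarball $1 into the existing directory $2.
# Returns non-zero instead of calling the script's fatal_error helper.
unpack_resume_tarball() {
    local tarball="$1" dest_dir="$2"
    if [[ "${tarball}" == *.tgz ]]; then
        tar -xf "${tarball}" -C "${dest_dir}"
    elif [[ "${tarball}" == *.zst ]]; then
        if command -v zstd >/dev/null 2>&1; then
            # decompress to stdout and let tar read the plain archive from stdin
            zstd -dc "${tarball}" | tar -xf - -C "${dest_dir}"
        else
            echo "ERROR: ${tarball} was compressed using zstd, but zstd command not found" >&2
            return 1
        fi
    else
        # Assumption: this branch is NOT in the diff; it only makes the
        # "neither .tgz nor .zst" case visible instead of silently skipping it.
        echo "ERROR: unrecognized tarball extension: ${tarball}" >&2
        return 1
    fi
}
```

A round trip would then look like `save_tmp_tarball /path/to/tmpdir /path/to/out.zst` followed by `unpack_resume_tarball /path/to/out.zst /path/to/new_storage`, where both paths are placeholders. Piping through the `zstd` command rather than relying on tar's own zstd support matches the portability argument made in the review thread.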
Why not check once if `zstd` is supported, define a constant, and use that everywhere:
Well, you can, but then you have to check if `[[ "${USE_ZSTD}" == 'true' ]]` (remember, this is bash, not Python; there is no boolean in bash). Since you have to do a test anyway, the current approach is explicit, equally short, and the execution time of `command -v` is negligible. There is no point in storing the result in an environment variable and then having to remember which value to check for (true or True or whatever).
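For readers comparing the two positions in this thread, here is a minimal sketch of the check-once variant. The snippet originally attached to the suggestion is not available on this page, so this is an assumption about what it could look like, not a reconstruction. `USE_ZSTD` is the variable name used in the reply; storing the literal strings `'true'` and `'false'` is one possible convention among several.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the "check once" suggestion; not code from the PR.

# Probe for the zstd command a single time. bash has no boolean type, so the
# result is stored as the literal string 'true' or 'false'.
if command -v zstd >/dev/null 2>&1; then
    USE_ZSTD='true'
else
    USE_ZSTD='false'
fi

# Every later decision is a string comparison against the stored value,
# which is the test the reply points out you still have to write.
if [[ "${USE_ZSTD}" == 'true' ]]; then
    echo "zstd found: tarballs can be written as .zst"
else
    echo "zstd not found: falling back to gzip (.tgz)"
fi
```

The diff instead repeats `-x "$(command -v zstd)"` at each decision point. Both forms need one test per decision; the stored variable avoids re-running the lookup, while the inline form avoids having to agree on a sentinel value.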