-
Notifications
You must be signed in to change notification settings - Fork 13.4k
devops: add s390x & ppc64le CI #15925
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
But it's not a cross build is it? Should be in |
Nope, native build. I am testing around, hence draft haha. Thanks for the heads up! |
I think you will need to resolve the warnings and revert before undrafting, the build is done with This looks like an actual bug (fixed in #15928): This might be a bug, as it suggests these loops are no-op: llama.cpp/ggml/src/ggml-cpu/vec.h Lines 677 to 684 in 6ab397e
llama.cpp/ggml/src/ggml-cpu/vec.h Lines 740 to 747 in 6ab397e
|
Yep, will take note of that before marking as ready. I wanted to progress past the compile stage to fix the endianness issues with the test models first because that is the problematic part. Also, the problem about GCC on s390x and ppc64le is that the compiler is a little more strict. Things that would have been okay on x86 or ARM are reported as warnings on s390x and ppc64le and, in this case, marked as fatal. I suppose we can either have the codeowner work to fix the warning on the affected lines, or (not ideal) we disable warnings as fatal for s390x and ppc64le. |
Maybe just convert and overwrite the vocab GGUFs for BE tests? |
You need the |
It's 3 AM on my side now, will continue this tomorrow 🙂 |
Looks like there's an endian-issue in |
If you add vocab GGUF conversion instead of duplicating files, |
You mean use the I don't think storage is an issue here also right? But if this is a concern, do let me know and we can go the conversion route during CI run.
I didn't quite get this part. Can you explain further? |
Yes, shouldn't add much to runtime as they are only small vocab files.
Generally we don't want to bloat git with more binaries than necessary, that's why new vocab files are being stored on HF.
This test downloads extra vocab files from HF, it's preferable that you also run this test and not skip it, but these files are little-endian only, so must be converted. |
Don't try to install all the repo requirements just for |
It's just |
I was trying to target the relative path so it installs from local. Should I prefer installing via pypi instead? |
Ah, no, that's fine, it'll install the right dependencies then. |
Here's a nice trick to check if system is big-endian, that can be added to echo -n I | od -to2 | head -n1 | cut -f2 -d" " | cut -c1 |
Taking a quick glance at the tokeniser tests, its clear that Little-Endian and Big-Endian models process these a little differently (i.e., space vs. no space before text) - 6: src: 'Hello, y'all! How are you 😁 ?我想在apple工作1314151天~'
+ 6: res: 'Hello, y'all! How are you 😁?我想在apple工作1314151天~'
6: tok: 15496 11 331 6 439 0 1374 389 345 30325 223 5633 22755 239 46349 111 28839 101 18040 32432 98 43291 1485 1415 24309 25465 171 121 252 For emojis, I don't seem to know why it can't produce the correct emojis even though I've had no issues with them for my Z & LinuxONE demos. I suppose it's okay for me to edit the |
I suppose this is to separate the Little-Endian and Big-Endian Edit: Doesn't work :( [taronaeo@aqlinux2 ~]$ echo -n I | od -to2 | head -n1 | cut -f2 -d" " | cut -c1
0 |
For converting the downloaded files.
Dang, I was making a guess there to invert the result, guess I was wrong, change |
No, this looks like an endian-bug in the tokenizer, the output must be equal. Edit: It would be interesting to know what |
Hmm I've added this check though. I think its enough to prevent the endianness conversion script from running on LE systems. llama.cpp/.github/workflows/build.yml Line 228 in ec1993d
I was thinking your check can be added for the |
Yeah, but
They should not be different, this is a bug. |
The emoji issue seems to be specific for the |
Got it. I was looking at the test-tokenizer-0 results and I didn't scroll down to check that portion. With regards to the tokeniser, I'm a little stumped now. If the generated |
Someone will have to debug the tokenizers in question in |
Great, but that was just |
Ah, I have a flight tomorrow and will on vacation from 14 to 21 September. I'll come back to this the week after, or I'll check if @AlekseiNikiforovIBM is able to continue this whilst I'm away :) |
Signed-off-by: Aaron Teo <[email protected]>
Signed-off-by: Aaron Teo <[email protected]>
Signed-off-by: Aaron Teo <[email protected]>
Array q8bytes had only 4 elements allocated, but 8 elements accessed. This lead to write out of bounds and later read of overwritten values out of bounds and incorrect result. Signed-off-by: Aaron Teo <[email protected]>
Signed-off-by: Aaron Teo <[email protected]>
Signed-off-by: Aaron Teo <[email protected]>
for some reason it keeps failing test-thread-safety tests and I do not have a machine that is able to replicate the tests. Signed-off-by: Aaron Teo <[email protected]>
Signed-off-by: Aaron Teo <[email protected]>
Signed-off-by: Aaron Teo <[email protected]>
Ensure it works even if both XDG_CACHE_HOME and HOME are unset. This might happen in containers. Signed-off-by: Aaron Teo <[email protected]>
Signed-off-by: Aaron Teo <[email protected]>
Only memcpy data from sections argument if it's non-NULL. Signed-off-by: Aaron Teo <[email protected]>
ecf6aaa
to
e33d9ee
Compare
Co-authored-by: Sigbjørn Skjæret <[email protected]>
Yes, it's uploaded now. |
1fd003c
to
8ab0491
Compare
@AlekseiNikiforovIBM Tests passing, wait for @taronaeo to merge or shall I merge when CI is done? |
Feel free to merge this, I didnt have the time to check the CI results :) |
|
It's "OK", it's a corrupt ccache. |
Gotcha. Will proceed to merge then. |
* devops: move s390x and ppc64le ci build we have access to ubuntu-24.04-s390x and ppc64le images now Signed-off-by: Aaron Teo <[email protected]> * devops: disable ppc64le for now since they have compiler errors Signed-off-by: Aaron Teo <[email protected]> * devops: stop warnings as errors Signed-off-by: Aaron Teo <[email protected]> * devops: switch to non-macro flag Signed-off-by: Aaron Teo <[email protected]> * devops: going the llama macro route Signed-off-by: Aaron Teo <[email protected]> * devops: add big-endian gguf test models Signed-off-by: Aaron Teo <[email protected]> * devops: disable ppc64le to test s390x, check test build Signed-off-by: Aaron Teo <[email protected]> * devops: dup .gguf.inp files for big-endian tests Signed-off-by: Aaron Teo <[email protected]> * devops: dup .gguf.out files for big-endian too Signed-off-by: Aaron Teo <[email protected]> * devops: add python setup and endian byteswap Signed-off-by: Aaron Teo <[email protected]> * devops: pooring thing does not have s390x python3 Signed-off-by: Aaron Teo <[email protected]> * devops: add missing rust compiler for s390x Signed-off-by: Aaron Teo <[email protected]> * devops: try rust actions runner Signed-off-by: Aaron Teo <[email protected]> * Revert "devops: try rust actions runner" This reverts commit 3f8db04. Signed-off-by: Aaron Teo <[email protected]> * devops: try a different path for rust Signed-off-by: Aaron Teo <[email protected]> * devops: dump home directory and user info Signed-off-by: Aaron Teo <[email protected]> * devops: install gguf-py only Signed-off-by: Aaron Teo <[email protected]> * devops: missed relative path Signed-off-by: Aaron Teo <[email protected]> * devops: remove big-endian files since local swapping is working Signed-off-by: Aaron Teo <[email protected]> * devops: revert test-tokenizer-0 cmakelists Signed-off-by: Aaron Teo <[email protected]> * Fix unicode flags conversion from and to uint16_t Bitfields are allocated in different order on s390x Signed-off-by: Aaron Teo <[email protected]> * Simplify byteswap command Signed-off-by: Aaron Teo <[email protected]> * Add byteswapping and git-lfs for test-tokenizers-ggml-vocabs Signed-off-by: Aaron Teo <[email protected]> * Fix endianness detection in vocab loader Signed-off-by: Aaron Teo <[email protected]> * Disable test-thread-safety on s390x In this test a model is downloaded, then immediately loaded to check if more downloads are needed, and then used for test. There is no clean way to separate all those steps to add byteswapping between them, so just skip this test. Signed-off-by: Aaron Teo <[email protected]> * Fix q8_0 test in test-quantize-fns vec_signed uses unexpected rounding mode. Explicitly use different rounding function. Signed-off-by: Aaron Teo <[email protected]> * devops: add big-endian stories260K Signed-off-by: Aaron Teo <[email protected]> * devops: add s390x test-eval-callback Signed-off-by: Aaron Teo <[email protected]> * devops: fix test does not exist Signed-off-by: Aaron Teo <[email protected]> * devops: fix model not found llama-eval-callback Signed-off-by: Aaron Teo <[email protected]> * Fix q3_K dot product error in test-quantize-fns on s390x Array q8bytes had only 4 elements allocated, but 8 elements accessed. This lead to write out of bounds and later read of overwritten values out of bounds and incorrect result. Signed-off-by: Aaron Teo <[email protected]> * devops: re-enable ppc64le for testing Signed-off-by: Aaron Teo <[email protected]> * devops: activate test-thread-safety for s390x Signed-off-by: Aaron Teo <[email protected]> * devops: disable ppc64le tests for some reason it keeps failing test-thread-safety tests and I do not have a machine that is able to replicate the tests. Signed-off-by: Aaron Teo <[email protected]> * devops: LLAMA_FATAL_WARNINGS=ON Signed-off-by: Aaron Teo <[email protected]> * Correct repository URL for s390x for test-thread-safety model Signed-off-by: Aaron Teo <[email protected]> * Fix fs_get_cache_directory Ensure it works even if both XDG_CACHE_HOME and HOME are unset. This might happen in containers. Signed-off-by: Aaron Teo <[email protected]> * Re-enable CI for ppc64le Signed-off-by: Aaron Teo <[email protected]> * Fortify ggml_rope_impl Only memcpy data from sections argument if it's non-NULL. Signed-off-by: Aaron Teo <[email protected]> * Add TODO in struct unicode_cpt_flags to reimplement it in endian-independent way * Update URL for big-endian model * Update .github/workflows/build.yml Co-authored-by: Sigbjørn Skjæret <[email protected]> * Update remaining mentions of BE models to ggml-org/models repo --------- Signed-off-by: Aaron Teo <[email protected]> Co-authored-by: Aleksei Nikiforov <[email protected]> Co-authored-by: Aleksei Nikiforov <[email protected]> Co-authored-by: Sigbjørn Skjæret <[email protected]>
* devops: move s390x and ppc64le ci build we have access to ubuntu-24.04-s390x and ppc64le images now Signed-off-by: Aaron Teo <[email protected]> * devops: disable ppc64le for now since they have compiler errors Signed-off-by: Aaron Teo <[email protected]> * devops: stop warnings as errors Signed-off-by: Aaron Teo <[email protected]> * devops: switch to non-macro flag Signed-off-by: Aaron Teo <[email protected]> * devops: going the llama macro route Signed-off-by: Aaron Teo <[email protected]> * devops: add big-endian gguf test models Signed-off-by: Aaron Teo <[email protected]> * devops: disable ppc64le to test s390x, check test build Signed-off-by: Aaron Teo <[email protected]> * devops: dup .gguf.inp files for big-endian tests Signed-off-by: Aaron Teo <[email protected]> * devops: dup .gguf.out files for big-endian too Signed-off-by: Aaron Teo <[email protected]> * devops: add python setup and endian byteswap Signed-off-by: Aaron Teo <[email protected]> * devops: pooring thing does not have s390x python3 Signed-off-by: Aaron Teo <[email protected]> * devops: add missing rust compiler for s390x Signed-off-by: Aaron Teo <[email protected]> * devops: try rust actions runner Signed-off-by: Aaron Teo <[email protected]> * Revert "devops: try rust actions runner" This reverts commit 3f8db04. Signed-off-by: Aaron Teo <[email protected]> * devops: try a different path for rust Signed-off-by: Aaron Teo <[email protected]> * devops: dump home directory and user info Signed-off-by: Aaron Teo <[email protected]> * devops: install gguf-py only Signed-off-by: Aaron Teo <[email protected]> * devops: missed relative path Signed-off-by: Aaron Teo <[email protected]> * devops: remove big-endian files since local swapping is working Signed-off-by: Aaron Teo <[email protected]> * devops: revert test-tokenizer-0 cmakelists Signed-off-by: Aaron Teo <[email protected]> * Fix unicode flags conversion from and to uint16_t Bitfields are allocated in different order on s390x Signed-off-by: Aaron Teo <[email protected]> * Simplify byteswap command Signed-off-by: Aaron Teo <[email protected]> * Add byteswapping and git-lfs for test-tokenizers-ggml-vocabs Signed-off-by: Aaron Teo <[email protected]> * Fix endianness detection in vocab loader Signed-off-by: Aaron Teo <[email protected]> * Disable test-thread-safety on s390x In this test a model is downloaded, then immediately loaded to check if more downloads are needed, and then used for test. There is no clean way to separate all those steps to add byteswapping between them, so just skip this test. Signed-off-by: Aaron Teo <[email protected]> * Fix q8_0 test in test-quantize-fns vec_signed uses unexpected rounding mode. Explicitly use different rounding function. Signed-off-by: Aaron Teo <[email protected]> * devops: add big-endian stories260K Signed-off-by: Aaron Teo <[email protected]> * devops: add s390x test-eval-callback Signed-off-by: Aaron Teo <[email protected]> * devops: fix test does not exist Signed-off-by: Aaron Teo <[email protected]> * devops: fix model not found llama-eval-callback Signed-off-by: Aaron Teo <[email protected]> * Fix q3_K dot product error in test-quantize-fns on s390x Array q8bytes had only 4 elements allocated, but 8 elements accessed. This lead to write out of bounds and later read of overwritten values out of bounds and incorrect result. Signed-off-by: Aaron Teo <[email protected]> * devops: re-enable ppc64le for testing Signed-off-by: Aaron Teo <[email protected]> * devops: activate test-thread-safety for s390x Signed-off-by: Aaron Teo <[email protected]> * devops: disable ppc64le tests for some reason it keeps failing test-thread-safety tests and I do not have a machine that is able to replicate the tests. Signed-off-by: Aaron Teo <[email protected]> * devops: LLAMA_FATAL_WARNINGS=ON Signed-off-by: Aaron Teo <[email protected]> * Correct repository URL for s390x for test-thread-safety model Signed-off-by: Aaron Teo <[email protected]> * Fix fs_get_cache_directory Ensure it works even if both XDG_CACHE_HOME and HOME are unset. This might happen in containers. Signed-off-by: Aaron Teo <[email protected]> * Re-enable CI for ppc64le Signed-off-by: Aaron Teo <[email protected]> * Fortify ggml_rope_impl Only memcpy data from sections argument if it's non-NULL. Signed-off-by: Aaron Teo <[email protected]> * Add TODO in struct unicode_cpt_flags to reimplement it in endian-independent way * Update URL for big-endian model * Update .github/workflows/build.yml Co-authored-by: Sigbjørn Skjæret <[email protected]> * Update remaining mentions of BE models to ggml-org/models repo --------- Signed-off-by: Aaron Teo <[email protected]> Co-authored-by: Aleksei Nikiforov <[email protected]> Co-authored-by: Aleksei Nikiforov <[email protected]> Co-authored-by: Sigbjørn Skjæret <[email protected]>
closes #13243
Introduce s390x & ppc64le CI using IBM Actions on POWER and Z Runner images.
TODO:
Edit: I have to use this PR to test the GitHub Actions as only Llama.cpp is authorised to use the s390x and ppc64le images. Unfortunately I do not have an alternative to develop this elsewhere and submit the PR once I'm ready.