
Conversation

@slaren slaren (Member) commented Sep 16, 2024

Moves the CPU backend declarations from ggml-impl.h into a new header, ggml-cpu-impl.h. This avoids unnecessary declarations in other backends that need this header, and works around a conflict between nvcc and immintrin.h.

Fixes #9473

@github-actions github-actions bot added the ggml changes relating to the ggml tensor library for machine learning label Sep 16, 2024
@slaren slaren merged commit 23e0d70 into master Sep 16, 2024
58 checks passed
@slaren slaren deleted the sl/fix-cuda-ggml-impl branch September 16, 2024 14:22
dsx1986 pushed a commit to dsx1986/llama.cpp that referenced this pull request Oct 29, 2024
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 18, 2024
@Pleune Pleune commented Feb 16, 2025

I am still running into this issue when compiling with gcc's -march, I believe due to

#include <immintrin.h>

Probably the sanest thing to do is not to compile with -march... but I still wanted to note this.

Edit: yes, patching in a hack to turn off the __F16C__ flag (replacing __F16C__ with 0), and then adding #include <immintrin.h> to ggml.c, has me compiling fine with the broken cuda/gcc combo (I can't easily change the system over to a non-broken set).

@Pleune Pleune mentioned this pull request Feb 19, 2025
13 tasks
Successfully merging this pull request may close these issues.

Bug: Build failure in master on Ubuntu 24.04 with CUDA enabled