Introducing experimental OpenCL backend with support for Qualcomm Adreno GPUs #10693
Merged · 28 commits · +9,014 −1
f56fb69 [cl][adreno] Add Adreno GPU support (lhez)
3571bb6 [cl][ci] Add workflow for CL (lhez)
c1af4b7 [cl][adreno] Fix memory leak for non SMALL_ALLOC path (lhez)
8ad0bb3 opencl: integrate backend dyn.load interface and fix compiler and for… (max-krasnyansky)
671c7af opencl: remove small-alloc support and fix build errors for non-openc… (max-krasnyansky)
d24b360 opencl: fixed merge conflict (MUSA added twice in cmake) (max-krasnyansky)
9b6540b opencl-ci: use RUNNER_TEMP instead of github.workspace (max-krasnyansky)
4bca601 opencl: fix embed tool invocation with python3 (max-krasnyansky)
969a00a opencl: CI workflow fixes (max-krasnyansky)
66d4330 opencl: Clean up small-alloc in CMake files (lhez)
0451edd opencl: cleanup ggml-opencl2 header file (max-krasnyansky)
31f305e opencl: use ulong for offsets and strides in ADD kernel (max-krasnyansky)
c21fc8c opencl: use cl_ulong for all offsets (max-krasnyansky)
9a9d92b opencl: use cl_ulong for sizes and strides (max-krasnyansky)
e9a9738 opencl: use `GGML_LOG_xxx` instead of `fprintf(stderr, ...)` (lhez)
34f2fc1 opencl: rename backend `opencl2` -> `opencl` (lhez)
97a1270 opencl: rename kernel files `ggml-opencl2` -> `ggml-opencl` (lhez)
22411ab opencl: make OpenCL required, remove redundant lib and inc directories (lhez)
e447dbc opencl: rename backend - funcs, structs, etc `opencl2` -> `opencl` (lhez)
c64ef0f opencl: remove copyright marker since main license already covers (lhez)
70063c6 opencl: replace some more OPENCL2 leftovers (max-krasnyansky)
74a9baf opencl: remove limits on `tensor_extra` (lhez)
3bc085b opencl: use pools for `tensor_extra` (lhez)
c971a18 opencl: fix compiler warnings with GCC and Clang (max-krasnyansky)
b25a4ca opencl: fail gracefully if opencl devices are not available (max-krasnyansky)
b41b6e6 opencl: fix MSVC builds (string length error) (max-krasnyansky)
dbaa360 opencl: check for various requirements, allow deprecated API (lhez)
9697d07 opencl: update log message for unsupported GPUs (max-krasnyansky)
@@ -0,0 +1,26 @@
#ifndef GGML_OPENCL_H
#define GGML_OPENCL_H

#include "ggml.h"
#include "ggml-backend.h"

#ifdef __cplusplus
extern "C" {
#endif

//
// backend API
//
GGML_BACKEND_API ggml_backend_t ggml_backend_opencl_init(void);
GGML_BACKEND_API bool ggml_backend_is_opencl(ggml_backend_t backend);

GGML_BACKEND_API ggml_backend_buffer_type_t ggml_backend_opencl_buffer_type(void);
GGML_BACKEND_API ggml_backend_buffer_type_t ggml_backend_opencl_host_buffer_type(void);

GGML_BACKEND_API ggml_backend_reg_t ggml_backend_opencl_reg(void);

#ifdef __cplusplus
}
#endif

#endif // GGML_OPENCL_H
@@ -0,0 +1,147 @@
find_package(OpenCL REQUIRED)
find_package(Python3 REQUIRED)

set(TARGET_NAME ggml-opencl)

ggml_add_backend_library(${TARGET_NAME}
                         ggml-opencl.cpp
                         ../../include/ggml-opencl.h)
target_link_libraries(${TARGET_NAME} PRIVATE ${OpenCL_LIBRARIES})
target_include_directories(${TARGET_NAME} PRIVATE ${OpenCL_INCLUDE_DIRS})

if (GGML_OPENCL_PROFILING)
    message(STATUS "OpenCL profiling enabled (increases CPU overhead)")
    add_compile_definitions(GGML_OPENCL_PROFILING)
endif ()

add_compile_definitions(GGML_OPENCL_SOA_Q)

if (GGML_OPENCL_USE_ADRENO_KERNELS)
    message(STATUS "OpenCL will use matmul kernels optimized for Adreno")
    add_compile_definitions(GGML_OPENCL_USE_ADRENO_KERNELS)
endif ()

if (GGML_OPENCL_EMBED_KERNELS)
    add_compile_definitions(GGML_OPENCL_EMBED_KERNELS)

    set(OPENCL_CL_SOURCE_EMBED "${CMAKE_BINARY_DIR}/autogenerated/ggml-opencl.cl.h")
    set(OPENCL_MM_CL_SOURCE_EMBED "${CMAKE_BINARY_DIR}/autogenerated/ggml-opencl_mm.cl.h")
    set(OPENCL_CVT_CL_SOURCE_EMBED "${CMAKE_BINARY_DIR}/autogenerated/ggml-opencl_cvt.cl.h")

    set(OPENCL_GEMV_NOSHUFFLE_SOURCE_EMBED "${CMAKE_BINARY_DIR}/autogenerated/ggml-opencl_gemv_noshuffle.cl.h")
    set(OPENCL_GEMV_NOSHUFFLE_GENERAL_SOURCE_EMBED "${CMAKE_BINARY_DIR}/autogenerated/ggml-opencl_gemv_noshuffle_general.cl.h")
    set(OPENCL_MUL_MAT_Ab_Bi_8x4_SOURCE_EMBED "${CMAKE_BINARY_DIR}/autogenerated/ggml-opencl_mul_mat_Ab_Bi_8x4.cl.h")
    set(OPENCL_TRANSPOSE_16_SOURCE_EMBED "${CMAKE_BINARY_DIR}/autogenerated/ggml-opencl_transpose_16.cl.h")
    set(OPENCL_TRANSPOSE_32_SOURCE_EMBED "${CMAKE_BINARY_DIR}/autogenerated/ggml-opencl_transpose_32.cl.h")
    set(OPENCL_TRANSPOSE_32_16_SOURCE_EMBED "${CMAKE_BINARY_DIR}/autogenerated/ggml-opencl_transpose_32_16.cl.h")

    set(EMBED_KERNEL_SCRIPT "${CMAKE_CURRENT_SOURCE_DIR}/kernels/embed_kernel.py")
    file(MAKE_DIRECTORY "${CMAKE_BINARY_DIR}/autogenerated")

    include_directories("${CMAKE_BINARY_DIR}/autogenerated")

    # Python must be accessible from command line
    add_custom_command(
        OUTPUT ${OPENCL_CL_SOURCE_EMBED}
        COMMAND ${Python3_EXECUTABLE} ${EMBED_KERNEL_SCRIPT}
            ${CMAKE_CURRENT_SOURCE_DIR}/kernels/ggml-opencl.cl
            ${OPENCL_CL_SOURCE_EMBED}
        DEPENDS kernels/ggml-opencl.cl ${EMBED_KERNEL_SCRIPT}
        COMMENT "Generate ggml-opencl.cl.h"
    )

    add_custom_command(
        OUTPUT ${OPENCL_MM_CL_SOURCE_EMBED}
        COMMAND ${Python3_EXECUTABLE} ${EMBED_KERNEL_SCRIPT}
            ${CMAKE_CURRENT_SOURCE_DIR}/kernels/ggml-opencl_mm.cl
            ${OPENCL_MM_CL_SOURCE_EMBED}
        DEPENDS kernels/ggml-opencl_mm.cl ${EMBED_KERNEL_SCRIPT}
        COMMENT "Generate ggml-opencl_mm.cl.h"
    )

    add_custom_command(
        OUTPUT ${OPENCL_CVT_CL_SOURCE_EMBED}
        COMMAND ${Python3_EXECUTABLE} ${EMBED_KERNEL_SCRIPT}
            ${CMAKE_CURRENT_SOURCE_DIR}/kernels/ggml-opencl_cvt.cl
            ${OPENCL_CVT_CL_SOURCE_EMBED}
        DEPENDS kernels/ggml-opencl_cvt.cl ${EMBED_KERNEL_SCRIPT}
        COMMENT "Generate ggml-opencl_cvt.cl.h"
    )

    add_custom_command(
        OUTPUT ${OPENCL_GEMV_NOSHUFFLE_SOURCE_EMBED}
        COMMAND ${Python3_EXECUTABLE} ${EMBED_KERNEL_SCRIPT}
            ${CMAKE_CURRENT_SOURCE_DIR}/kernels/ggml-opencl_gemv_noshuffle.cl
            ${OPENCL_GEMV_NOSHUFFLE_SOURCE_EMBED}
        DEPENDS kernels/ggml-opencl_gemv_noshuffle.cl ${EMBED_KERNEL_SCRIPT}
        COMMENT "Generate ggml-opencl_gemv_noshuffle.cl.h"
    )

    add_custom_command(
        OUTPUT ${OPENCL_GEMV_NOSHUFFLE_GENERAL_SOURCE_EMBED}
        COMMAND ${Python3_EXECUTABLE} ${EMBED_KERNEL_SCRIPT}
            ${CMAKE_CURRENT_SOURCE_DIR}/kernels/ggml-opencl_gemv_noshuffle_general.cl
            ${OPENCL_GEMV_NOSHUFFLE_GENERAL_SOURCE_EMBED}
        DEPENDS kernels/ggml-opencl_gemv_noshuffle_general.cl ${EMBED_KERNEL_SCRIPT}
        COMMENT "Generate ggml-opencl_gemv_noshuffle_general.cl.h"
    )

    add_custom_command(
        OUTPUT ${OPENCL_MUL_MAT_Ab_Bi_8x4_SOURCE_EMBED}
        COMMAND ${Python3_EXECUTABLE} ${EMBED_KERNEL_SCRIPT}
            ${CMAKE_CURRENT_SOURCE_DIR}/kernels/ggml-opencl_mul_mat_Ab_Bi_8x4.cl
            ${OPENCL_MUL_MAT_Ab_Bi_8x4_SOURCE_EMBED}
        DEPENDS kernels/ggml-opencl_mul_mat_Ab_Bi_8x4.cl ${EMBED_KERNEL_SCRIPT}
        COMMENT "Generate ggml-opencl_mul_mat_Ab_Bi_8x4.cl.h"
    )

    add_custom_command(
        OUTPUT ${OPENCL_TRANSPOSE_16_SOURCE_EMBED}
        COMMAND ${Python3_EXECUTABLE} ${EMBED_KERNEL_SCRIPT}
            ${CMAKE_CURRENT_SOURCE_DIR}/kernels/ggml-opencl_transpose_16.cl
            ${OPENCL_TRANSPOSE_16_SOURCE_EMBED}
        DEPENDS kernels/ggml-opencl_transpose_16.cl ${EMBED_KERNEL_SCRIPT}
        COMMENT "Generate ggml-opencl_transpose_16.cl.h"
    )

    add_custom_command(
        OUTPUT ${OPENCL_TRANSPOSE_32_SOURCE_EMBED}
        COMMAND ${Python3_EXECUTABLE} ${EMBED_KERNEL_SCRIPT}
            ${CMAKE_CURRENT_SOURCE_DIR}/kernels/ggml-opencl_transpose_32.cl
            ${OPENCL_TRANSPOSE_32_SOURCE_EMBED}
        DEPENDS kernels/ggml-opencl_transpose_32.cl ${EMBED_KERNEL_SCRIPT}
        COMMENT "Generate ggml-opencl_transpose_32.cl.h"
    )

    add_custom_command(
        OUTPUT ${OPENCL_TRANSPOSE_32_16_SOURCE_EMBED}
        COMMAND ${Python3_EXECUTABLE} ${EMBED_KERNEL_SCRIPT}
            ${CMAKE_CURRENT_SOURCE_DIR}/kernels/ggml-opencl_transpose_32_16.cl
            ${OPENCL_TRANSPOSE_32_16_SOURCE_EMBED}
        DEPENDS kernels/ggml-opencl_transpose_32_16.cl ${EMBED_KERNEL_SCRIPT}
        COMMENT "Generate ggml-opencl_transpose_32_16.cl.h"
    )

    target_sources(${TARGET_NAME} PRIVATE
        ${OPENCL_CL_SOURCE_EMBED}
        ${OPENCL_MM_CL_SOURCE_EMBED}
        ${OPENCL_CVT_CL_SOURCE_EMBED}
        ${OPENCL_GEMV_NOSHUFFLE_SOURCE_EMBED}
        ${OPENCL_GEMV_NOSHUFFLE_GENERAL_SOURCE_EMBED}
        ${OPENCL_MUL_MAT_Ab_Bi_8x4_SOURCE_EMBED}
        ${OPENCL_TRANSPOSE_16_SOURCE_EMBED}
        ${OPENCL_TRANSPOSE_32_SOURCE_EMBED}
        ${OPENCL_TRANSPOSE_32_16_SOURCE_EMBED})
else ()
    # copy the kernel sources to the bin directory
    configure_file(kernels/ggml-opencl.cl ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/ggml-opencl.cl COPYONLY)
    configure_file(kernels/ggml-opencl_mm.cl ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/ggml-opencl_mm.cl COPYONLY)
    configure_file(kernels/ggml-opencl_cvt.cl ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/ggml-opencl_cvt.cl COPYONLY)

    configure_file(kernels/ggml-opencl_gemv_noshuffle.cl ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/ggml-opencl_gemv_noshuffle.cl COPYONLY)
    configure_file(kernels/ggml-opencl_gemv_noshuffle_general.cl ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/ggml-opencl_gemv_noshuffle_general.cl COPYONLY)
    configure_file(kernels/ggml-opencl_mul_mat_Ab_Bi_8x4.cl ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/ggml-opencl_mul_mat_Ab_Bi_8x4.cl COPYONLY)
    configure_file(kernels/ggml-opencl_transpose_16.cl ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/ggml-opencl_transpose_16.cl COPYONLY)
    configure_file(kernels/ggml-opencl_transpose_32.cl ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/ggml-opencl_transpose_32.cl COPYONLY)
    configure_file(kernels/ggml-opencl_transpose_32_16.cl ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/ggml-opencl_transpose_32_16.cl COPYONLY)
endif ()
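
Note on the embed step: when `GGML_OPENCL_EMBED_KERNELS` is set, each custom command in the CMake rules runs `kernels/embed_kernel.py` with an input `.cl` path and an output `.cl.h` path. The script itself is not part of this excerpt. The following is only a minimal sketch of what such a step could look like, assuming it wraps each kernel source line in its own C++ raw string literal so the generated header can be `#include`d inside a string initializer. The function name `embed_kernel` and the `CLK` delimiter are hypothetical, chosen for illustration.

```python
from pathlib import Path


def embed_kernel(src_path: str, dst_path: str) -> None:
    """Write dst_path as a run of C++ raw string literals, one per source line.

    Hypothetical sketch: the real embed_kernel.py may differ.
    """
    lines = Path(src_path).read_text(encoding="utf-8").splitlines()
    with open(dst_path, "w", encoding="utf-8") as out:
        for line in lines:
            # Adjacent string literals are concatenated by the C++ compiler.
            # The newline inside each literal preserves the kernel's line
            # structure; the CLK delimiter (an assumption) avoids clashes
            # with a bare )" sequence in the kernel source.
            out.write('R"CLK(' + line + '\n)CLK"\n')
```

Calling `embed_kernel("ggml-opencl.cl", "ggml-opencl.cl.h")` would mirror the two positional arguments the custom commands pass. Emitting one literal per line keeps every individual literal short, which may matter for compilers that cap string literal length.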