⚡️FFPA: extends FlashAttention-2 with Split-D to achieve ~O(1) SRAM complexity for large headdim, running 1.8x–3x faster than SDPA (a minimal online-softmax sketch follows the list below).
⚡️Write HGEMM from scratch using Tensor Cores with the WMMA, MMA, and CuTe APIs to achieve peak performance (a minimal WMMA sketch also follows the list).
General Matrix Multiplication using NVIDIA Tensor Cores
Vulkan & GLSL implementation of FlashAttention-2
A benchmarking framework for correlators of FX telescope arrays
Neural Network C is an advanced neural network implementation in pure C, optimized for high performance on CPUs and NVIDIA GPUs.
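The FFPA and FlashAttention-2 entries above rest on the online-softmax recurrence, which lets attention process scores in tiles without ever materializing the full score matrix in SRAM. The sketch below is illustrative only (not FFPA's kernel or API): it shows the rescaling recurrence for a single query row, with headdim reduced to 1 so the running output is a scalar; the function name and signature are hypothetical.

```cuda
#include <math.h>

// Illustrative online-softmax recurrence (the core trick behind
// FlashAttention-style kernels), for one query row. Scores s[j] and
// values v[j] arrive one at a time; we keep a running max m, a running
// normalizer l, and an unnormalized output acc, rescaling old state
// whenever a new maximum appears. headdim is reduced to 1 for clarity.
void online_softmax_row(const float* s, const float* v, int n, float* out) {
    float m = -INFINITY;  // running max of scores seen so far
    float l = 0.0f;       // running softmax normalizer
    float acc = 0.0f;     // running weighted sum of values
    for (int j = 0; j < n; ++j) {
        float m_new = fmaxf(m, s[j]);
        float scale = expf(m - m_new);     // rescale old state to the new max
        float p     = expf(s[j] - m_new);  // weight of the new score
        l   = l * scale + p;
        acc = acc * scale + p * v[j];
        m   = m_new;
    }
    *out = acc / l;  // normalized attention output for this row
}
```

Because each step only updates (m, l, acc), the working set per query row is constant regardless of sequence length; split-D variants apply the same idea across chunks of the head dimension so per-stage shared-memory use stays bounded as headdim grows.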
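For the HGEMM-from-scratch entry, the following is a minimal sketch of the CUDA WMMA API it builds on: one warp per block computes a single 16x16 tile of C = A x B in half precision. Kernel name and launch configuration are assumptions for illustration; the repo's actual kernels add tiling, shared-memory staging, and pipelining on top of this primitive.

```cuda
#include <mma.h>
#include <cuda_fp16.h>
using namespace nvcuda;

// Minimal WMMA sketch: each block holds one warp, and that warp computes
// one 16x16 tile of C = A * B. A, B, C are row-major half-precision
// matrices; M, N, K are assumed to be multiples of 16.
// Hypothetical launch: wmma_hgemm_tile<<<dim3(M / 16, N / 16), 32>>>(...)
__global__ void wmma_hgemm_tile(const half* A, const half* B, half* C,
                                int M, int N, int K) {
    int warpM = blockIdx.x;  // tile row index in C
    int warpN = blockIdx.y;  // tile column index in C

    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::row_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, half> c_frag;
    wmma::fill_fragment(c_frag, __float2half(0.0f));

    // March along K in 16-wide steps, accumulating into c_frag
    // via Tensor Core matrix-multiply-accumulate instructions.
    for (int k = 0; k < K; k += 16) {
        wmma::load_matrix_sync(a_frag, A + warpM * 16 * K + k, K);
        wmma::load_matrix_sync(b_frag, B + k * N + warpN * 16, N);
        wmma::mma_sync(c_frag, a_frag, b_frag, c_frag);
    }
    wmma::store_matrix_sync(C + warpM * 16 * N + warpN * 16, c_frag, N,
                            wmma::mem_row_major);
}
```

The fragment types encode the 16x16x16 tile shape and layout at compile time, which is why the loads, the multiply-accumulate, and the store need no per-thread index arithmetic: the warp cooperatively owns the whole tile.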