DeepSeek announces open source DeepGEMM
On February 26, the third day of its Open Source Week, DeepSeek open-sourced DeepGEMM. DeepGEMM is a library for simple and efficient FP8 general matrix multiplication (GEMM) with the fine-grained scaling proposed in DeepSeek-V3. It supports both normal (dense) GEMMs and Mixture-of-Experts (MoE) grouped GEMMs. The library is written in CUDA and needs no compilation at install time: a lightweight just-in-time (JIT) module compiles all kernels at runtime. This FP8 GEMM library supports V3/R1 training and inference.
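To make "fine-grained scaling" more concrete, the following is a minimal pure-Python sketch of the idea, not DeepGEMM's actual CUDA kernels or API. Every function name here (`round_to_e4m3`, `quantize_blocks`, `scaled_dot`, `fp8_gemm`) is hypothetical, and so is the tiny block size of 4. It also assumes one scale per block along the reduction dimension for both operands, which is a simplification, and it emulates FP8 E4M3 rounding only crudely (subnormals are ignored). The point it illustrates is that each small block gets its own scale factor, so a few large values do not wipe out the precision of small values elsewhere in the same row.

```python
import math

FP8_E4M3_MAX = 448.0  # largest finite value in the FP8 E4M3 format


def round_to_e4m3(x):
    """Crude FP8 E4M3 emulation: clamp to range, keep 4 significant bits."""
    if x == 0.0:
        return 0.0
    x = max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, x))
    m, e = math.frexp(x)        # x = m * 2**e with 0.5 <= |m| < 1
    m = round(m * 16) / 16      # 1 implicit + 3 explicit mantissa bits
    return math.ldexp(m, e)


def quantize_blocks(vec, block):
    """Quantize `vec` in chunks of `block` elements, one scale per chunk."""
    q, scales = [], []
    for start in range(0, len(vec), block):
        chunk = vec[start:start + block]
        amax = max(abs(v) for v in chunk)
        # Map the largest magnitude in the chunk onto the FP8 maximum.
        scale = amax / FP8_E4M3_MAX if amax > 0 else 1.0
        scales.append(scale)
        q.extend(round_to_e4m3(v / scale) for v in chunk)
    return q, scales


def scaled_dot(qa, sa, qb, sb, block):
    """Accumulate each block in full precision, then apply both scales."""
    total = 0.0
    for i, (s_a, s_b) in enumerate(zip(sa, sb)):
        lo, hi = i * block, (i + 1) * block
        partial = sum(x * y for x, y in zip(qa[lo:hi], qb[lo:hi]))
        total += partial * s_a * s_b
    return total


def fp8_gemm(a, b, block=4):
    """C = A @ B using block-wise scaled, emulated-FP8 operands."""
    b_cols = [list(col) for col in zip(*b)]  # columns of B
    qa = [quantize_blocks(row, block) for row in a]
    qb = [quantize_blocks(col, block) for col in b_cols]
    return [[scaled_dot(ra, sa, cb, sb, block) for cb, sb in qb]
            for ra, sa in qa]


if __name__ == "__main__":
    # One row mixing very small and very large values.
    a = [[0.01, 0.02, 0.03, 0.04, 100.0, 200.0, 300.0, 50.0]]
    b = [[1.0, 2.0] for _ in range(8)]
    print(fp8_gemm(a, b, block=4))
```

With a single scale for the whole row, the four small entries of `a` would be divided by a scale set by the value 300.0 and would land far below the range where FP8 retains any precision. With one scale per block of four, they are quantized relative to 0.04 instead, which is the motivation for scaling at a finer granularity.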