Releases: intel/xFasterTransformer
IntrinsicGemm
xDNN v1.5.6
- Fix BF16 pack issue when transpose=true.
xDNN v1.5.5
- Add bf16fp8bf16 gemv.
xDNN v1.5.4
- Optimize BF16 pack performance
- Add bf16bf16bf16 gemm
xDNN v1.5.3
- Add beta parameter to AMX gemm
xDNN v1.5.2
- Add AMX_FP16 support for small gemm.
- Built with GCC 13.2.1
xDNN v1.5.1
- Add xdnn_hgemm_f32f16f32_packb_block.
xDNN v1.5.0
- Add hgemm w/ fp32 bias.
xDNN v1.4.6
- Add alpha and beta param to small_sgemm_f32f16bf16.
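For readers unfamiliar with the alpha and beta parameters, the sketch below shows the conventional BLAS-style meaning, C = alpha * A * B + beta * C, as a plain reference loop. It assumes xDNN follows that convention, which these notes do not state. `reference_gemm` is a hypothetical helper for illustration, not an xDNN function, and it uses plain `float` and dense row-major storage instead of the mixed f32/f16/bf16 types and packed layouts of the real kernels.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference (unoptimized) GEMM: C = alpha * A * B + beta * C.
// A is M x K, B is K x N, C is M x N, all row-major and densely packed.
// Hypothetical helper for illustration only; not part of the xDNN API.
void reference_gemm(std::size_t M, std::size_t N, std::size_t K,
                    float alpha, const std::vector<float>& A,
                    const std::vector<float>& B,
                    float beta, std::vector<float>& C) {
    assert(A.size() == M * K && B.size() == K * N && C.size() == M * N);
    for (std::size_t i = 0; i < M; ++i) {
        for (std::size_t j = 0; j < N; ++j) {
            float acc = 0.0f;
            for (std::size_t k = 0; k < K; ++k) {
                acc += A[i * K + k] * B[k * N + j];
            }
            // With beta == 0 the old contents of C are ignored entirely,
            // so uninitialized or NaN values in C cannot leak into the result.
            C[i * N + j] = (beta == 0.0f)
                               ? alpha * acc
                               : alpha * acc + beta * C[i * N + j];
        }
    }
}
```

With alpha = 1 and beta = 1 the product is accumulated into the existing contents of C; with beta = 0 the output is simply overwritten.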
xDNN v1.4.5
- Add post op gelu activation.
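As a rough illustration of what a GELU post-op computes, the sketch below applies the exact (erf-based) GELU, 0.5 * x * (1 + erf(x / sqrt(2))), elementwise to a GEMM output buffer. Whether xDNN uses this form or the tanh approximation is not stated in these notes, so treat the choice as an assumption. `gelu_exact` and `apply_gelu_post_op` are hypothetical names, not xDNN functions.

```cpp
#include <cmath>
#include <vector>

// Exact GELU: 0.5 * x * (1 + erf(x / sqrt(2))).
// Hypothetical helper; the variant xDNN implements is an assumption.
inline float gelu_exact(float x) {
    return 0.5f * x * (1.0f + std::erf(x / std::sqrt(2.0f)));
}

// Applies GELU in place to an already computed GEMM output buffer.
// A fused post-op would do this inside the kernel before storing C;
// this separate pass only shows the arithmetic.
void apply_gelu_post_op(std::vector<float>& c) {
    for (float& v : c) {
        v = gelu_exact(v);
    }
}
```

Fusing the activation into the GEMM kernel avoids a second pass over the output matrix, which is the usual motivation for post-ops.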
xDNN v1.4.4
- Fix AMX illegal instruction issue.
xDNN v1.4.3
- Add small_sgemm_f32bf16bf16.
- Add small_sgemm_f32f16bf16.
xDNN v1.4.2
- Support amx_gemm_bf16bf16bf16 kernel w/ any shapes.
xDNN v1.4.1
- Add sgemm_bf16bf16f32 and sgemm_f32bf16bf16 kernels.
- Add softmax kernels.
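For reference, a softmax kernel normalizes a vector so that its entries are positive and sum to 1. The sketch below is a minimal scalar version using the standard max-subtraction trick for numerical stability. It is not the xDNN implementation, whose signature and data types are not given in these notes, and `softmax_inplace` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Numerically stable softmax over one vector, computed in place.
// Hypothetical reference helper; not part of the xDNN API.
void softmax_inplace(std::vector<float>& x) {
    assert(!x.empty());
    // Subtracting the maximum keeps every exponent <= 0, so exp cannot overflow.
    const float max_val = *std::max_element(x.begin(), x.end());
    float sum = 0.0f;
    for (float& v : x) {
        v = std::exp(v - max_val);
        sum += v;
    }
    // sum >= 1 because the maximum element contributes exp(0) = 1.
    for (float& v : x) {
        v /= sum;
    }
}
```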
xDNN v1.4.0
- Add hgemm_f32u4f32 kernels.
- Add sgemm_f32nf4f32 kernels.
xDNN v1.3.1
- Fix sgemm_f32u4f32 kernels parallel bug.
xDNN v1.3.0
- Add sgemm_f32u4f32 kernels
xDNN v1.2.1
- Add xdnn_small_amx_sgemm_bf16bf16bf16_packb implementation with transposed weight.
xDNN v1.2
- Add xdnn_small_amx_sgemm_bf16bf16bf16_packb implementation.
- Add xdnn_small_amx_sgemm_bf16bf16bf16_compute implementation.
xDNN v1.1
- Add bgemm_f32bf16f32_packb weight format BA16a64b2a.
- Add intrinsic extension api.
xDNN v1.0
- Add sgemm kernels
- Add sgemm_f32f16f32 kernels
- Add sgemm_f32i8f32 kernels
- Add hgemm_f32f16f32 kernels
- Add hgemm_f16f16f32 kernels
- Add hgemm_f32i8f32 kernels
- Add bgemm_f32bf16f32 kernels