This project provides implementations of several quantization methods for the Qwen3 large language model. Quantization significantly reduces model size and inference latency while preserving model quality as much as possible. We support five quantization techniques, each with its own strengths and intended use cases.
Method | Description | Documentation |
---|---|---|
AWQ | Activation-aware Weight Quantization, an efficient quantization method for LLM compression and acceleration | AWQ Documentation |
GPTQ | Post-training quantization method that minimizes layer-wise quantization error using approximate second-order (Hessian) information | GPTQ Documentation |
RTN | Round-To-Nearest quantization, a simple baseline that rounds each weight directly to the nearest quantized value | RTN Documentation |
SmoothQuant | Migrates quantization difficulty from activations to weights via equivalent per-channel scaling, enabling efficient W8A8 quantization | SmoothQuant Documentation |
Bi-LLM | Post-training binarization method that preserves salient weights at higher effective precision | Bi-LLM Documentation |
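For intuition about what weight quantization does, the sketch below shows the simplest variant in the table: symmetric per-channel round-to-nearest (RTN) quantization. It is a minimal PyTorch illustration, not the implementation used in this repository; the function name, bit width, and tensor shape are placeholders.

```python
import torch

def rtn_quantize_per_channel(weight: torch.Tensor, n_bits: int = 4):
    """Symmetric round-to-nearest quantization with one scale per output channel."""
    qmax = 2 ** (n_bits - 1) - 1                   # e.g. 7 for 4-bit symmetric
    scale = weight.abs().amax(dim=1, keepdim=True) / qmax
    scale = scale.clamp(min=1e-8)                  # guard against all-zero rows
    q = torch.clamp(torch.round(weight / scale), -qmax - 1, qmax)
    return q * scale, scale                        # dequantized weights + scales

# Quantize a random 4096x4096 linear weight and report the error introduced.
w = torch.randn(4096, 4096)
w_deq, _ = rtn_quantize_per_channel(w, n_bits=4)
print(f"mean abs quantization error: {(w - w_deq).abs().mean().item():.5f}")
```

The other methods (AWQ, GPTQ, SmoothQuant, Bi-LLM) build on this basic rounding step with calibration data, scaling, or error compensation; see their respective documentation pages for details.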
Due to GitHub's file size limits, the following five files cannot be uploaded directly to this repository:
eval_my/mmlu/data/auxiliary_train/race.csv
eval_my/ppl_datasets/wikitext/wikitext-2-raw-v1/test
eval_my/ppl_datasets/wikitext/wikitext-2-raw-v1/train
eval_my/ppl_datasets/allenai/c4/allenai--c4/train
eval_my/ppl_datasets/allenai/c4/allenai--c4/validation
You can download these files from Hugging Face; we will also upload them to our project's Hugging Face repository in the future.
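As a starting point, the snippet below fetches the underlying datasets from the Hugging Face Hub with the `datasets` library. The hub IDs (`wikitext`, `allenai/c4`, `cais/mmlu`) are the standard public ones and are our assumption, not something fixed by this repository; make sure the on-disk layout of your copies matches the paths listed above.

```python
from datasets import load_dataset

# Perplexity evaluation sets: WikiText-2 (raw) and the English C4 subset.
wikitext_test = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
wikitext_train = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")
c4_val = load_dataset("allenai/c4", "en", split="validation", streaming=True)

# MMLU; its auxiliary_train split contains the RACE-derived training data.
mmlu_aux = load_dataset("cais/mmlu", "all", split="auxiliary_train")

# Save a split locally so the evaluation scripts can load it from eval_my/
# (adjust the target paths to match the directory layout listed above).
wikitext_test.save_to_disk("eval_my/ppl_datasets/wikitext/wikitext-2-raw-v1/test")
```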
Contributions to this project are welcome! If you have suggestions for improvements or find issues, please submit an issue or pull request.