
Integration of Fast-LLM #1


Draft · wants to merge 2 commits into main


Conversation

@bigximik (Collaborator) commented Apr 15, 2025

Fast-LLM Integration (Draft)

This is a basic integration of Fast-LLM and is a work in progress:

  • Based on the existing HFLM integration.
  • The HFLM constructor is incompatible with Fast-LLM, so all components are initialized explicitly in our constructor.
  • Tests are adapted from the sglang tests: since neither sglang nor our implementation supports "EleutherAI/pythia-70m", we use Qwen2 instead. Test coverage is not yet as complete as in the HF integration tests.
  • Quantized models will not be supported in this iteration.
  • Caching will not be supported in this iteration.
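The constructor situation described above (reuse the parent's methods while bypassing its incompatible constructor) can be sketched in isolation. All class, method, and attribute names below are hypothetical placeholders, not the actual HFLM or Fast-LLM API:

```python
# Sketch of the pattern only: a subclass skips the incompatible parent
# __init__ and sets up the attributes the inherited methods rely on.

class HFLMLike:
    """Placeholder standing in for the HFLM wrapper."""

    def __init__(self, pretrained, backend="causal", **kwargs):
        # Assume this signature cannot accept a Fast-LLM checkpoint.
        raise TypeError("constructor incompatible with Fast-LLM checkpoints")

    def loglikelihood(self, requests):
        # Shared evaluation logic we still want to inherit.
        return [len(r) for r in requests]


class FastLLMWrapper(HFLMLike):
    def __init__(self, model_path, batch_size=1):
        # Deliberately do NOT call HFLMLike.__init__; initialize the
        # components the inherited methods need explicitly instead.
        self.model_path = model_path
        self.batch_size = batch_size
        self.tokenizer = None  # would be loaded from the checkpoint


lm = FastLLMWrapper("qwen2-checkpoint")
print(lm.loglikelihood(["ab", "abcd"]))  # prints [2, 4]
```

The inherited `loglikelihood` still works because the subclass provides the state it depends on, without ever running the parent constructor.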

Still to be done:

  • Integration of distributed inference.
  • A test case for passing an instantiated Fast-LLM model?
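The open test-case item could look roughly like the following minimal sketch; `FakeFastLLMModel` and `FastLLMWrapper` are placeholder names for illustration, not the real Fast-LLM or lm-eval API:

```python
# Hypothetical test: the wrapper accepts an already instantiated model
# object instead of a checkpoint path.

class FakeFastLLMModel:
    def generate(self, prompt):
        return prompt + "!"


class FastLLMWrapper:
    def __init__(self, pretrained):
        # Accept either a path string or an instantiated model object.
        if isinstance(pretrained, str):
            raise NotImplementedError("loading from a path not shown here")
        self.model = pretrained


def test_pass_instantiated_model():
    model = FakeFastLLMModel()
    lm = FastLLMWrapper(model)
    assert lm.model is model
    assert lm.model.generate("hi") == "hi!"


test_pass_instantiated_model()
```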

Notes:

@bigximik changed the title from "base integrarion of Fast-LLM" to "Integration of Fast-LLM" on Apr 15, 2025