
Integration of Fast-LLM #1


Draft · wants to merge 2 commits into main


Conversation

@bigximik (Collaborator) commented Apr 15, 2025

Fast-LLM Integration (Draft)

This is a basic integration of Fast-LLM and is a work in progress:

  • Based on the existing HFLM integration.
  • The HFLM constructor is incompatible with Fast-LLM, so all components are initialized explicitly in our constructor.
  • Tests are adapted from the sglang tests: since neither sglang nor our implementation supports "EleutherAI/pythia-70m", we use Qwen2 instead. Test coverage is not yet as complete as in the HF integration tests.
  • Quantized models will not be supported in this iteration.
  • Caching will not be supported in this iteration.
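The constructor situation described above (reuse the parent's methods while bypassing its incompatible constructor) can be sketched in isolation. All class, method, and attribute names below are hypothetical placeholders, not the actual HFLM or Fast-LLM API:

```python
# Sketch of the pattern only: a subclass skips the incompatible parent
# __init__ and sets up the attributes the inherited methods rely on.

class HFLMLike:
    """Placeholder standing in for the HFLM wrapper."""

    def __init__(self, pretrained, backend="causal", **kwargs):
        # Assume this signature cannot accept a Fast-LLM checkpoint.
        raise TypeError("constructor incompatible with Fast-LLM checkpoints")

    def loglikelihood(self, requests):
        # Shared evaluation logic we still want to inherit.
        return [len(r) for r in requests]


class FastLLMWrapper(HFLMLike):
    def __init__(self, model_path, batch_size=1):
        # Deliberately do NOT call HFLMLike.__init__; initialize the
        # components the inherited methods need explicitly instead.
        self.model_path = model_path
        self.batch_size = batch_size
        self.tokenizer = None  # would be loaded from the checkpoint


lm = FastLLMWrapper("qwen2-checkpoint")
print(lm.loglikelihood(["ab", "abcd"]))  # prints [2, 4]
```

The inherited `loglikelihood` still works because the subclass provides the state it depends on, without ever running the parent constructor.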

Still to be done:

  • Integration of distributed inference.
  • A test case for passing an instantiated Fast-LLM model?
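The open test-case item could look roughly like the following minimal sketch; `FakeFastLLMModel` and `FastLLMWrapper` are placeholder names for illustration, not the real Fast-LLM or lm-eval API:

```python
# Hypothetical test: the wrapper accepts an already instantiated model
# object instead of a checkpoint path.

class FakeFastLLMModel:
    def generate(self, prompt):
        return prompt + "!"


class FastLLMWrapper:
    def __init__(self, pretrained):
        # Accept either a path string or an instantiated model object.
        if isinstance(pretrained, str):
            raise NotImplementedError("loading from a path not shown here")
        self.model = pretrained


def test_pass_instantiated_model():
    model = FakeFastLLMModel()
    lm = FastLLMWrapper(model)
    assert lm.model is model
    assert lm.model.generate("hi") == "hi!"


test_pass_instantiated_model()
```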

Notes:

@bigximik changed the title from "base integrarion of Fast-LLM" to "Integration of Fast-LLM" on Apr 15, 2025