
Drop Server Side Indexer, Native Offline Chat, Old Migration Scripts #1212


Merged
merged 4 commits into master from drop-offline-chat-and-old-indexing-code
Aug 1, 2025

Conversation


@debanjum debanjum commented Jul 26, 2025

Overview

Make the server leaner to increase development speed.
Remove the old indexing code and the native offline chat module, which were hard to maintain.

  • The native offline chat module was written when the local AI model API ecosystem wasn't mature. Now it is, so reuse it.
  • Offline chat requires a GPU for usable speeds. Decoupling offline chat from the Khoj server is the recommended way to get practical inference speeds (e.g. Ollama on the host machine, Khoj in Docker).

Details

  • Drop old code to index files on the server filesystem. Clean up the CLI and init paths.
  • Drop native offline chat support via llama-cpp-python.
    Use established local AI APIs like the Llama.cpp server, Ollama, vLLM, etc.
  • Drop old pre-1.0 Khoj config migration scripts
  • Update the test setup to index test data now that the old indexing code is removed.

@debanjum debanjum added this to the Release Khoj 2.0 milestone Jul 26, 2025
@debanjum debanjum added the improve Upgrade or improve an existing feature or capability label Jul 26, 2025
@debanjum debanjum force-pushed the drop-offline-chat-and-old-indexing-code branch 2 times, most recently from 8ff93e2 to e8cac68 on August 1, 2025 01:01
debanjum added 4 commits July 31, 2025 18:25
These were used when Khoj was configured using the khoj.yml file
It is recommended to chat with open-source models by running an
open-source server like Ollama or Llama.cpp on your GPU-powered machine,
or by using a commercial provider of open-source models like DeepInfra
or OpenRouter.

These chat model serving options provide a mature OpenAI-compatible
API that already works with Khoj.
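To illustrate the shared surface these servers expose, here is a minimal sketch that builds an OpenAI-style `/v1/chat/completions` request against a local Ollama instance. The base URL (Ollama's default port 11434) and the model name `llama3` are assumptions for illustration, not details taken from this PR.

```python
import json
from urllib import request

# Assumption: a local Ollama server on its default port. Llama.cpp
# server and vLLM expose the same OpenAI-compatible endpoint shape.
base_url = "http://localhost:11434/v1"
payload = {
    "model": "llama3",  # hypothetical model pulled into Ollama
    "messages": [{"role": "user", "content": "Hello from Khoj"}],
}

req = request.Request(
    f"{base_url}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# request.urlopen(req) would return an OpenAI-shaped JSON response;
# left out so this sketch does not require a running server.
print(req.full_url)
```

Because commercial providers like DeepInfra or OpenRouter speak the same API, only `base_url` (and an API key header) would change.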

Directly using offline chat models only worked reasonably with a pip
install on a machine with a GPU. The Docker setup of Khoj had trouble
accessing the GPU, and without GPU access offline chat is too slow.

Deprecating support for an offline chat provider directly within
Khoj will reduce code complexity and increase development velocity.
Offline models are subsumed under the existing OpenAI-compatible
model provider.
This stale code was originally used by the server to directly index
files on the server filesystem. Files are now pushed to the server
for syncing via the API.

Server-side syncing of remote content like Github and Notion is still
supported, but the old, unused code for server-side sync of files on
the server filesystem is being cleaned out.

The new --log-file cli arg allows specifying where the khoj server
should store logs on the filesystem. It replaces the --config-file
cli arg, which was only being used as a proxy for deciding where to
store the log file.
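The flag wiring described above can be sketched with argparse and the standard logging module. This is a hypothetical illustration, not the actual khoj CLI code; the default path shown is made up.

```python
import argparse
import logging

# Hypothetical sketch of wiring a --log-file flag; the real khoj CLI
# wiring may differ. A default keeps behaviour sane when omitted.
parser = argparse.ArgumentParser(prog="khoj")
parser.add_argument(
    "--log-file",
    default="khoj.log",  # assumption: the actual default path differs
    help="Where the khoj server should store its log file",
)

# Simulate `khoj --log-file /tmp/khoj_server.log`
args = parser.parse_args(["--log-file", "/tmp/khoj_server.log"])

# Route all root-logger output to the chosen file.
logging.basicConfig(filename=args.log_file, level=logging.INFO)
logging.info("khoj server starting")
```

Decoupling the log location from the config file means the old --config-file arg no longer has to do double duty.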

- TODO
  - Tests are broken. They were relying on the server-side content
    syncing for test setup
- Delete tests exercising the deprecated server-side indexing flows
- Delete `Local(Plaintext|Org|Markdown|Pdf)Config` methods, files and
  references in tests
- Index test data via new helper method, `get_index_files`
  - It is modelled after the old `get_org_files` variants in the main app
  - It passes the test data in the required format to `configure_content`.
    This allows maintaining the more realistic tests from before while
    using the new indexing mechanism (rather than the deprecated
    server-side indexing mechanism)
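The PR names the `get_index_files` helper but not its signature. A hypothetical sketch of the idea, assuming it returns a {filename: content} map in the push-to-index format that `configure_content` expects (both the signature and the return shape are guesses for illustration):

```python
from pathlib import Path


def get_index_files(data_dir: str, pattern: str = "*.org") -> dict[str, str]:
    """Hypothetical sketch: read test data files from a directory and
    return them as a {filename: content} map, mimicking the format the
    push-to-index API receives from clients."""
    return {
        str(path): path.read_text()
        for path in sorted(Path(data_dir).glob(pattern))
    }
```

Feeding real files through the same code path the API uses keeps the tests realistic without resurrecting the deprecated server-side indexing flow.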
@debanjum debanjum force-pushed the drop-offline-chat-and-old-indexing-code branch from e8cac68 to 892d573 on August 1, 2025 01:25
@debanjum debanjum merged commit c6670e8 into master Aug 1, 2025
10 checks passed
@debanjum debanjum deleted the drop-offline-chat-and-old-indexing-code branch August 1, 2025 01:26