Add Habana Gaudi (HPU) Support & Performance Benchmarks for Khoj #1125
base: master
Conversation
Add Dockerfile for HPU runtime along with installation requirements.
Add HPUs (Intel® Gaudi®) support
fix: Device loading
Hi, thanks for creating a PR to add support for Habana HPU to Khoj. Apologies for the delayed review. Not sure if this change should be included in Khoj (yet). Some questions below:
- Are you using Khoj on Intel Gaudi machines? What's the use-case? Gaudi support seems more relevant for production scenarios. For such setups with Khoj, you should offload both the LLM-heavy components (embedding generation and chat model interactions) to appropriate LLM inference servers (like vLLM, SGLang, TensorRT, etc.); a minimal sketch of this pattern follows this list.
- You mention performance benchmarks for Khoj with HPU support. Can you clarify what kind of workloads you tested? Was it the RAG indexing, interacting with a local/offline chat model, or something else? More details on the perf benchmarks would be useful for context.
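(To illustrate the offloading pattern mentioned above: a minimal sketch of chat inference against an OpenAI-compatible endpoint such as one served by vLLM. The URL, API key, and model name are placeholders, not Khoj configuration.)

```python
from openai import OpenAI

# Minimal sketch of the offloading suggestion: send chat requests to an
# OpenAI-compatible inference server (e.g. vLLM) running on the Gaudi host,
# instead of loading the model in-process.
# base_url, api_key, and model below are placeholders, not Khoj config.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
response = client.chat.completions.create(
    model="my-served-model",  # whatever model the server exposes
    messages=[{"role": "user", "content": "Summarize my notes on Gaudi."}],
)
print(response.choices[0].message.content)
```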
```diff
@@ -309,15 +312,32 @@ def get_device_memory() -> int:
     return psutil.virtual_memory().total


-def get_device() -> torch.device:
+def get_device(preferred_device=None) -> torch.device:
     """Get device to run model on"""
```
The `preferred_device` arg seems unused. Should we remove it?
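(For context, a hedged sketch of what a `get_device` that actually honors `preferred_device` could look like. The HPU probe via `habana_frameworks` is an assumption based on Habana's PyTorch integration, not the PR's actual code.)

```python
import torch


def get_device(preferred_device: str | None = None) -> torch.device:
    """Return the preferred device if given, else the best available one."""

    def _hpu_available() -> bool:
        # Assumption: habana_frameworks registers the "hpu" device with
        # PyTorch when installed, per Habana's PyTorch integration docs.
        try:
            import habana_frameworks.torch.hpu as hthpu
            return hthpu.is_available()
        except ImportError:
            return False

    if preferred_device is not None:
        return torch.device(preferred_device)
    if torch.cuda.is_available():
        return torch.device("cuda")
    if _hpu_available():
        return torch.device("hpu")
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")
```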
- **`optimum-habana`**: Optimizes models for Habana Gaudi accelerators.
- **`torch-geometric`**: Enables deep learning on graph-based data structures.
- **`numba`**: Accelerates Python code by compiling it to machine code at runtime.
Seems like the only dependency explicitly added is `optimum-habana` (in `pyproject.toml`)?
### 🧠 Device Selection

The application now supports multiple device types, including **CUDA**, **HPU**, **MPS** (Apple Silicon), and **CPU**. You can specify your preferred device by passing the `preferred_device` argument to the `get_device()` function in `helpers.py`. For example:

```python
device = get_device(preferred_device="hpu")  # Use HPU if available
```

If no preferred device is specified, the application will automatically select the best available device.
This isn't used and isn't relevant to Khoj users or deployers, so it should be removed from the documentation.
```dockerfile
ENV OMPI_MCA_btl_vader_single_copy_mechanism=none
ENV PT_HPU_LAZY_ACC_PAR_MODE=0
ENV PT_HPU_ENABLE_LAZY_COLLECTIVES=1
```
These seem like optional runtime variables to configure Habana support, based on the Habana docs?
If these runtime env vars are the only change this `Dockerfile.hpu` adds, we can drop the `Dockerfile.hpu` file and just mention in our setup documentation that folks wanting to run Khoj on Habana HPU can set these (and other required) environment variables for their setup, referring to the Habana documentation, before starting Khoj?
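(A minimal sketch of that suggestion, assuming the `khoj` CLI entrypoint from the project's setup docs; the variable values mirror the ones in `Dockerfile.hpu`.)

```python
import os
import subprocess

# Sketch of the suggestion above: set the Habana runtime variables in the
# launch environment instead of baking them into a dedicated Dockerfile.
# Values mirror Dockerfile.hpu; the `khoj` entrypoint is an assumption.
hpu_env = {
    "HABANA_VISIBLE_DEVICES": "all",
    "OMPI_MCA_btl_vader_single_copy_mechanism": "none",
    "PT_HPU_LAZY_ACC_PAR_MODE": "0",
    "PT_HPU_ENABLE_LAZY_COLLECTIVES": "1",
}
subprocess.run(["khoj"], env={**os.environ, **hpu_env}, check=True)
```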
```dockerfile
ARG VERSION=0.0.0

# Set environment variables for Habana
ENV HABANA_VISIBLE_DEVICES=all
```
This seems to be a required runtime environment variable to enable Habana HPU? If so, it should just be mentioned in the Khoj setup docs under the HPU tab. See `/documentation/docs/get-started/setup.mdx` for reference.
Updates to this should instead be moved to a new HPU tab under the Khoj setup docs at `/documentation/docs/get-started/setup.mdx` (which maps to https://docs.khoj.dev/get-started/setup/).
1. **Build the HPU Docker Image**:
   Use the provided `Dockerfile.hpu` to build a Docker image optimized for HPU:
   ```bash
   docker build -t khoj-hpu -f Dockerfile.hpu .
   ```
This may not be required if the previous comments on `Dockerfile.hpu` are valid. Folks can just use the default Khoj Dockerfile or image.
This PR introduces support for Habana Gaudi accelerators (HPUs) to the project, enabling the application to run on HPU devices in addition to the existing support for CUDA, MPS, and CPU. The changes include:
🚀 Key Updates
- A new `Dockerfile.hpu` for building an HPU-ready image, including the Habana runtime environment variables.
- Device selection in `helpers.py` extended to cover HPU alongside CUDA, MPS, and CPU, via a `preferred_device` argument to `get_device()`.
- The `optimum-habana` dependency added to `pyproject.toml` to optimize models for Gaudi accelerators.
💎 Why This Matters:
- **HPU Support**: This PR enables the application to leverage Habana Gaudi accelerators, which can provide significant performance improvements for deep learning workloads.
- **Flexibility**: Users can now choose their preferred device (CUDA, HPU, MPS, or CPU) for running the application, making it more versatile across different hardware setups.
- **Optimization**: The addition of `optimum-habana` ensures that models are optimized for HPU and other hardware, improving efficiency and performance.
⚡ Performance Benchmarks
- HPU: ~0.2703 s average runtime (10 runs)
- CPU: ~76.3144 s average runtime (10 runs)
- Result: ~282× speedup using HPU compared to CPU (arithmetic check below).
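(The reported speedup checks out against the two averages:)

```python
# Sanity check on the reported figures: average CPU time / average HPU time.
cpu_avg_s, hpu_avg_s = 76.3144, 0.2703
print(f"speedup ~ {cpu_avg_s / hpu_avg_s:.0f}x")  # -> ~282x
```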
🛠 How to Test
1. Use the new `Dockerfile.hpu` to build and run the application on a system with Habana Gaudi accelerators.
2. Check logs to confirm that HPU is recognized and in use (a hedged verification snippet follows).
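(A hedged snippet for the device check; `habana_frameworks.torch.hpu` and its helpers are assumptions based on Habana's PyTorch docs, not part of this PR.)

```python
import torch
import habana_frameworks.torch.hpu as hthpu  # assumption: Habana's PyTorch plugin

# Quick check that the Gaudi device is visible to PyTorch inside the container.
print("HPU available:", hthpu.is_available())
print("HPU count:", hthpu.device_count())
print(torch.ones(2, 2, device="hpu").device)  # should print hpu:0
```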
📝 Notes
This PR is part of the effort to expand hardware support for the application, ensuring it can run efficiently on a wide range of devices.