
⚡ semcache

semcache is a semantic caching layer for your LLM applications.

Quick Start

Start the Semcache Docker image:

docker run -p 8080:8080 semcache/semcache:latest

Configure your application, e.g. with the OpenAI Python SDK:

from openai import OpenAI

# Point to your Semcache host instead of OpenAI
client = OpenAI(base_url="http://localhost:8080", api_key="your-key")

# Cache miss - continues to OpenAI
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is the capital of France?"}]
)

# Cache hit - returns instantly 
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Tell me France's capital city"}]
)

Node.js follows the same pattern; just change the base URL to point to your Semcache host:

const OpenAI = require('openai');

// Point to your Semcache host instead of OpenAI
const openai = new OpenAI({baseURL: 'http://localhost:8080', apiKey: 'your-key'});

Features

  • 🧠 Completely in-memory - Prompts, responses, and the vector database are all stored in memory
  • 🎯 Flexible by design - Can work with your custom or private LLM APIs
  • 🔌 Support for major LLM APIs - OpenAI, Anthropic, Gemini, and more
  • ⚡ HTTP proxy mode - Drop-in replacement that reduces costs and latency
  • 📈 Prometheus metrics - Full observability out of the box
  • 📊 Built-in dashboard - Monitor cache performance at /admin
  • 📤 Smart eviction - LRU cache eviction policy

Semcache is still in beta and being actively developed.

How it works

Semcache accelerates LLM applications by caching responses based on semantic similarity.

When you make a request, Semcache first searches for previously cached answers to similar prompts and delivers them immediately. This eliminates redundant API calls, reducing both latency and costs.
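
Conceptually, the lookup is a nearest-neighbour search over prompt embeddings. The sketch below is illustrative only (not Semcache's internals) and assumes an embed() function that maps text to a vector:

import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def lookup(prompt, cache, embed, threshold=0.85):
    """cache: list of (embedding, response) pairs held in memory."""
    query = embed(prompt)
    best_score, best_response = 0.0, None
    for emb, response in cache:
        score = cosine(query, emb)
        if score > best_score:
            best_score, best_response = score, response
    # Only treat it as a hit when the closest cached prompt is similar enough
    return best_response if best_score >= threshold else None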

Semcache also operates in a "cache-aside" mode, allowing you to load prompts and responses yourself.

Example Integrations

For comprehensive provider configuration and detailed code examples, visit our LLM Providers & Tools documentation.

HTTP Proxy

Point your existing SDK to Semcache instead of the provider's endpoint.

OpenAI

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080", api_key="your-key")

Anthropic

import anthropic

client = anthropic.Anthropic(
    base_url="http://localhost:8080",  # Semcache endpoint
    api_key="your-key"
)
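
Requests then go through Semcache unchanged. A minimal sketch of a Messages API call (the model name here is illustrative):

response = client.messages.create(
    model="claude-3-5-sonnet-latest",  # illustrative model name
    max_tokens=1024,
    messages=[{"role": "user", "content": "What is the capital of France?"}]
)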

LangChain

from langchain.llms import OpenAI

llm = OpenAI(
    openai_api_base="http://localhost:8080",
    openai_api_key="your-key"
)

LiteLLM

import litellm

litellm.api_base = "http://localhost:8080"
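
With the base URL set, calls go through Semcache. A minimal sketch using LiteLLM's standard completion call:

response = litellm.completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is the capital of France?"}]
)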

Cache-aside

Install with:

pip install semcache

Then use the client:

from semcache import Semcache

# Initialize the client
client = Semcache(base_url="http://localhost:8080")

# Store a key-data pair
client.put("What is the capital of France?", "Paris")

# Retrieve data by semantic similarity
response = client.get("Tell me France's capital city.")
print(response)  # "Paris"

Configuration

Configure via environment variables or config.yaml:

log_level: info
port: 8080

Environment variables (prefix with SEMCACHE_):

SEMCACHE_PORT=8080
SEMCACHE_LOG_LEVEL=debug
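
For example, the same settings can be passed to the Docker image with standard -e flags (values shown are illustrative):

docker run -p 8080:8080 \
  -e SEMCACHE_PORT=8080 \
  -e SEMCACHE_LOG_LEVEL=debug \
  semcache/semcache:latest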

Monitoring

Prometheus Metrics

Semcache emits comprehensive Prometheus metrics for production monitoring.

Check out our /monitoring directory for our custom Grafana dashboard.
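
For a quick look at the raw metrics, you can scrape the endpoint directly; the /metrics path is the Prometheus convention and assumed here:

import requests

# Assumes the conventional Prometheus exposition endpoint
metrics = requests.get("http://localhost:8080/metrics")
print(metrics.text)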

Built-in Dashboard

Access the admin dashboard at /admin to monitor cache performance.

Enterprise

Our managed version of Semcache provides you with semantic caching as a service.

Features we offer:

  • Custom text embedding models for your specific business
  • Persistent storage allowing you to build application memory over time
  • In-depth analysis of your LLM responses
  • SLA support and dedicated engineering resources

Contact us at [email protected]

Contributing

Contributions to Semcache are welcome! Feel free to open a PR.


Built with ❤️ in Rust • Documentation • GitHub Issues
