DeepSeek R1 AI Voice Agent

A real-time AI voice assistant powered by DeepSeek R1 that enables seamless voice conversations through speech-to-text transcription, AI response generation, and text-to-speech synthesis.

🌟 Overview

This project creates an interactive AI voice agent that:

Captures and transcribes speech in real-time using AssemblyAI
Generates intelligent responses using DeepSeek R1 (7B model) via Ollama
Converts AI responses back to natural speech using ElevenLabs
Streams audio responses for immediate playback

✨ Features

Real-time Speech Recognition: High-quality speech-to-text transcription with AssemblyAI
Advanced AI Responses: Powered by DeepSeek R1's reasoning capabilities
Natural Voice Synthesis: Professional text-to-speech with ElevenLabs
Streaming Audio Playback: Low-latency audio streaming for responsive conversations
Conversation Memory: Maintains context throughout the conversation
Cross-platform Support: Works on macOS, Linux, and Windows

🔧 Prerequisites

API Keys Required

AssemblyAI API Key: Get your free API key
ElevenLabs API Key: Sign up for ElevenLabs

System Dependencies

Install Ollama

Download and install Ollama from ollama.com

Install PortAudio

Ubuntu/Debian:

sudo apt update && sudo apt install portaudio19-dev

macOS:

brew install portaudio

Windows: PortAudio is typically included with the Python package installation.

Install MPV (macOS only)

brew install mpv

📦 Installation

1. Clone the Repository

git clone https://github.com/danieladdisonorg/DeepSeek-R1-Voice-Agent.git
cd DeepSeek-R1-Voice-Agent

2. Install Python Dependencies

pip install "assemblyai[extras]" ollama elevenlabs

3. Download DeepSeek R1 Model

ollama pull deepseek-r1:7b

4. Configure API Keys

Edit AIVoiceAgent.py and replace the placeholder API keys:

aai.settings.api_key = "YOUR_ASSEMBLYAI_API_KEY"
self.client = ElevenLabs(api_key="YOUR_ELEVENLABS_API_KEY")

🚀 Usage

Start the Voice Agent

python AIVoiceAgent.py

Interaction Flow

Speak: The agent listens for your voice input
Processing: Your speech is transcribed and sent to DeepSeek R1
Response: The AI generates a response (limited to 300 characters for quick interactions)
Playback: The response is converted to speech and played back
Continue: The conversation continues with maintained context

Stopping the Agent

Press Ctrl+C to stop the voice agent.

⚙️ Configuration

Model Settings

AI Model: DeepSeek R1 7B (configurable in the code)
Voice Model: ElevenLabs Turbo v2 (configurable)
Response Length: Limited to 300 characters (adjustable in system prompt)
Sample Rate: 16kHz for optimal quality

Customization Options

Modify the system prompt in AIVoiceAgent.py to change AI behavior
Adjust response length limits
Change voice models in ElevenLabs configuration
Modify audio streaming parameters

🔍 Troubleshooting

Common Issues

"No module named 'assemblyai'"

pip install "assemblyai[extras]"

"Ollama connection error"

Ensure Ollama is running: ollama serve
Verify the model is downloaded: ollama list

"Audio device not found"

Check microphone permissions
Verify PortAudio installation
Test microphone with other applications

"ElevenLabs API error"

Verify API key is correct
Check API quota/usage limits
Ensure stable internet connection

Performance Tips

Use a quality microphone for better transcription accuracy
Ensure stable internet connection for API calls
Close unnecessary applications to free up system resources

🏗️ Architecture

┌─────────────────┐    ┌──────────────┐    ┌─────────────────┐
│   Microphone    │───▶│  AssemblyAI  │───▶│   DeepSeek R1   │
│   (Audio Input) │    │ (Speech-to-  │    │ (AI Response    │
└─────────────────┘    │  Text)       │    │  Generation)    │
                       └──────────────┘    └─────────────────┘
                                                      │
┌─────────────────┐    ┌──────────────┐              │
│   Speakers      │◀───│  ElevenLabs  │◀─────────────┘
│ (Audio Output)  │    │ (Text-to-    │
└─────────────────┘    │  Speech)     │
                       └──────────────┘

📄 License

This project is open source. Please check the repository for license details.

🤝 Contributing

Contributions are welcome! Please feel free to submit issues, feature requests, or pull requests.

📞 Support

For issues and questions:

Open an issue on GitHub
Check the troubleshooting section above
Review API documentation for AssemblyAI, Ollama, and ElevenLabs

Note: This project requires active internet connection for API services and sufficient system resources to run the DeepSeek R1 model locally via Ollama.

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
AIVoiceAgent.py		AIVoiceAgent.py
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

DeepSeek R1 AI Voice Agent

🌟 Overview

✨ Features

🔧 Prerequisites

API Keys Required

System Dependencies

Install Ollama

Install PortAudio

Install MPV (macOS only)

📦 Installation

1. Clone the Repository

2. Install Python Dependencies

3. Download DeepSeek R1 Model

4. Configure API Keys

🚀 Usage

Start the Voice Agent

Interaction Flow

Stopping the Agent

⚙️ Configuration

Model Settings

Customization Options

🔍 Troubleshooting

Common Issues

Performance Tips

🏗️ Architecture

📄 License

🤝 Contributing

📞 Support

About

Uh oh!

Releases

Packages

Languages

License

danieladdisonorg/DeepSeek-R1-Voice-Agent

Folders and files

Latest commit

History

Repository files navigation

DeepSeek R1 AI Voice Agent

🌟 Overview

✨ Features

🔧 Prerequisites

API Keys Required

System Dependencies

Install Ollama

Install PortAudio

Install MPV (macOS only)

📦 Installation

1. Clone the Repository

2. Install Python Dependencies

3. Download DeepSeek R1 Model

4. Configure API Keys

🚀 Usage

Start the Voice Agent

Interaction Flow

Stopping the Agent

⚙️ Configuration

Model Settings

Customization Options

🔍 Troubleshooting

Common Issues

Performance Tips

🏗️ Architecture

📄 License

🤝 Contributing

📞 Support

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages