neural-maze/ava-whatsapp-agent-course

📱 Ava 📱

Turning the Turing Test into a WhatsApp Agent

Course Overview

What happens when two ML Engineers with a love for sci-fi movies team up? 🤔

You get Ava, a WhatsApp agent that can engage with users in a "realistic" way, inspired by the great film Ex Machina. OK, you won't find a fully sentient robot here, but you will have some pretty interesting WhatsApp conversations.

By the end of this course, you'll have built your own Ava too, capable of:

  • Receiving and sending WhatsApp messages 📲
  • Understanding your voice 🗣️
  • Recognizing your images 🖼️
  • Sending voice notes back 🎤
  • Sharing updates about its "daily activities" 🚣
  • Sending you images of its current activities 🖼️

You can think of it as a modern reinterpretation of the Turing Test 🤣

Excited? Let's get started!

The Neural Maze Logo

📬 Stay Updated

Join The Neural Maze and learn to build AI Systems that actually work, from principles to production. Every Wednesday, directly to your inbox. Don't miss out!

Subscribe Now

Jesus Copado YouTube Channel

🎥 Watch More Content

Join Jesús Copado on YouTube to explore how to build real AI projects—from voice agents to creative tools. Weekly videos with code, demos, and ideas that push what's possible with AI. Don't miss the next drop!

Subscribe Now


Who is this course for?

This course is for Software Engineers, ML Engineers, and AI Engineers who want to level up by building complex end-to-end apps. It's not just a basic "Hello World" tutorial; it's a deep dive into building a production-ready WhatsApp agent.

What you'll get out of this course

  • Build a fully working WhatsApp agent you can chat with on your phone
  • Get a solid understanding of how to build LangGraph workflows
  • Set up a long-term memory system using Qdrant as a Vector Database
  • Use Groq models to power AI Agent responses
  • Implement speech-to-text (STT) systems using Whisper
  • Implement text-to-speech (TTS) systems using ElevenLabs
  • Generate high-quality images using diffusion models, like FLUX models
  • Process images using vision-language models (VLMs), like llama-3.2-vision
  • Create chat interfaces using Chainlit
  • Deploy agentic applications to Cloud Run
  • Connect agentic applications to the WhatsApp API
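
To give a feel for what a graph-style workflow means here, the following is a minimal sketch in plain Python, not LangGraph code and not the course's implementation. It assumes a hypothetical design where a router node picks a response type (text, audio, or image) and hands the state to the matching node. All names (`router_node`, `run_graph`, the `workflow` key) and the keyword rules are illustrative assumptions; a real agent would more likely let an LLM make the routing decision.

```python
from typing import Callable, Dict

# Hypothetical state shape: a plain dict passed from node to node.
State = Dict[str, str]


def router_node(state: State) -> State:
    """Pick a response type from the user's message using toy keyword rules."""
    text = state["message"].lower()
    if "photo" in text or "picture" in text:
        workflow = "image"
    elif "voice" in text:
        workflow = "audio"
    else:
        workflow = "conversation"
    return {**state, "workflow": workflow}


def conversation_node(state: State) -> State:
    """Placeholder for a plain text reply."""
    return {**state, "response": f"(text reply to: {state['message']})"}


def audio_node(state: State) -> State:
    """Placeholder for a reply that would be converted to a voice note."""
    return {**state, "response": "(voice note would be synthesized here)"}


def image_node(state: State) -> State:
    """Placeholder for a reply that would include a generated image."""
    return {**state, "response": "(image would be generated here)"}


# Conditional edges: the router's decision selects the next node.
NODES: Dict[str, Callable[[State], State]] = {
    "conversation": conversation_node,
    "audio": audio_node,
    "image": image_node,
}


def run_graph(message: str) -> State:
    """Run the two-step graph: router first, then the selected node."""
    state = router_node({"message": message})
    return NODES[state["workflow"]](state)


if __name__ == "__main__":
    print(run_graph("Can you send me a picture of your day?"))
```

The point of the sketch is the shape: each node is a function from state to state, and one node's output decides which node runs next. LangGraph provides this pattern as a framework, along with state persistence.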

Getting started

Before you begin the course, there are a few things you need to do.

That means creating a virtual environment, installing the dependencies, creating the .env file, and so on. I know, it's very boring, but it's a necessary evil! 😅

All of this is detailed in the following doc: GETTING STARTED.md.

Make sure you follow the instructions in the doc, as they're crucial for the course to work.
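
Once the setup is done, a quick check like the one below can catch a missing setting before anything else fails. This is a minimal sketch using only the standard library. The variable names in `REQUIRED_VARS` are hypothetical placeholders, not the project's actual settings; the real list is in GETTING STARTED.md. The sketch also assumes the values from your .env file have already been loaded into the process environment.

```python
import os
from typing import Iterable, List, Mapping

# Hypothetical names for illustration only; check GETTING STARTED.md for the real ones.
REQUIRED_VARS = (
    "GROQ_API_KEY",
    "ELEVENLABS_API_KEY",
    "QDRANT_URL",
    "TOGETHER_API_KEY",
)


def missing_vars(
    env: Mapping[str, str], required: Iterable[str] = REQUIRED_VARS
) -> List[str]:
    """Return the required names that are absent or empty in `env`."""
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars(os.environ)
    if missing:
        print("Missing settings:", ", ".join(missing))
    else:
        print("All required settings are present.")
```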

Course syllabus

| Lesson Number | Written Lesson | Video Lesson | Description |
|---|---|---|---|
| 1 | Project overview | Thumbnail 1 | Understand the project architecture and the tech stack. |
| 2 | Dissecting Ava's brain | Thumbnail 2 | Learn the basics of LangGraph and implement complex workflows using this framework. |
| 3 | Unlocking Ava's memories | Thumbnail 3 | Build a short-term memory system for graph state persistence and chat history. Also, implement a long-term memory system using Qdrant. |
| 4 | Giving Ava a Voice | Thumbnail 4 | Build a STT and a TTS pipeline to make Ava process input and output audio. |
| 5 | Ava learns to see | Thumbnail 5 | Understand how to process images using VLM models. Implement an image generation pipeline using FLUX models. |
| 6 | Ava installs Whatsapp | Thumbnail 6 | Connect Ava to WhatsApp. Learn how to deploy a LangGraph application to Google Cloud Run. |

And if you're feeling extra brave, there's also a 2+ hour video course where we walk through all the project details and the code, step by step.

Ava Full Course


How much is this going to cost me?

The awesome thing about this project is you can run it on your own computer for free!

The free tiers from Groq, ElevenLabs, Qdrant Cloud, and Together AI are more than enough to get you going.

If you want to try it out on Google Cloud Run, you can create a free account and get $300 in free credits. Even if you've already used up your free credits, Cloud Run is super cheap, so your experiments should cost just a buck or two.


The tech stack

| Technology | Description |
|---|---|
| Groq | Powering the project with Llama 3.3, Llama 3.2 Vision, and Whisper. Groq models are awesome (and fast!!) |
| Qdrant | Serving as the long-term database, enabling our agent to recall details you shared months ago. |
| Cloud Run | Deploying your containers easily to Google Cloud Platform |
| LangGraph | Learn how to build production-ready LangGraph workflows |
| ElevenLabs | Amazing TTS models |
| Together AI | Behind Ava's image generation process |
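
To illustrate the idea behind the long-term memory role listed for Qdrant, here is a toy sketch of similarity-based recall in plain Python. It does not use the Qdrant client and is not the course's implementation. The `ToyMemory` class and the word-count "embedding" are assumptions made purely for illustration; a real setup would use a learned embedding model and a vector database.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyMemory:
    """Hypothetical in-memory store: keeps texts with their vectors, recalls by similarity."""

    def __init__(self) -> None:
        self._items: List[Tuple[str, Counter]] = []

    def store(self, text: str) -> None:
        self._items.append((text, embed(text)))

    def recall(self, query: str, top_k: int = 1) -> List[str]:
        query_vec = embed(query)
        ranked = sorted(
            self._items,
            key=lambda item: cosine(query_vec, item[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:top_k]]


if __name__ == "__main__":
    memory = ToyMemory()
    memory.store("my dog is called Luna")
    memory.store("I work as a nurse in Madrid")
    print(memory.recall("what is my dog called"))
```

The recall step ranks stored facts by how close their vectors are to the query vector, which is the same basic operation a vector database performs at scale.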

Contributors

Miguel Otero Pedrido | Senior ML / AI Engineer
Founder of The Neural Maze. Rick and Morty fan.

LinkedIn
YouTube
The Neural Maze Newsletter

Jesús Copado | Senior ML / AI Engineer
Equal parts cinema fan and AI enthusiast.

YouTube
LinkedIn

License

This project is licensed under the MIT License - see the LICENSE file for details.

