Browser Agents - AI Browser Automation Platform

Browser Agents is a web platform that lets you create, deploy, and manage AI-powered browser automation agents. Build intelligent bots that can navigate websites, extract data, fill forms, and perform complex web tasks using natural language instructions.

🚀 Quick Start • 📖 Documentation • 🎯 Features • 🤝 Contributing • 💬 Community

✨ Key Features

🎨 No-Code Agent Builder

Visual drag-and-drop interface
Natural language task descriptions
Pre-built templates for common use cases
Smart configuration recommendations

🧠 Multi-Model AI Support

OpenAI GPT-4o, GPT-4 Turbo
Anthropic Claude 3.5 Sonnet, Claude 3 Opus
Google Gemini Pro
Custom model configurations

📊 Enterprise Dashboard

Real-time execution monitoring
Performance analytics & insights
Detailed execution logs with screenshots
Team collaboration & user management

🔌 Developer APIs

RESTful API with OpenAPI docs
Webhook integrations
API key management
SDK support (Python, JavaScript)

🚀 Advanced Capabilities

Visual Understanding: Agents can see and interact with web pages like humans
Context Memory: Maintains state across complex multi-step workflows
Error Recovery: Smart retry mechanisms and failure handling
Parallel Execution: Run multiple agents simultaneously
Data Export: CSV, JSON, Excel export formats
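
To illustrate the Error Recovery idea above, here is a minimal Python sketch of a retry loop with exponential backoff. It is not the platform's actual implementation: `run_with_retry` and its parameters are hypothetical names chosen for this example, and the `step` callable stands in for any single agent action.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retry(
    step: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call step(); after each failure wait base_delay * 2**attempt, then retry.

    Hypothetical helper for illustration only. The last failure is re-raised
    once max_attempts calls have been made.
    """
    for attempt in range(max_attempts):
        try:
            return step()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the helper easy to test, since a test can record the delays instead of actually waiting.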

🏗️ Architecture

graph TB
    A[Web Interface] --> B[FastAPI Backend]
    B --> C[Agent Executor]
    C --> D[Browser Engine]
    C --> E[LLM Provider]
    B --> F[PostgreSQL Database]
    B --> G[Task Queue]
    
    subgraph "AI Models"
        E --> H[OpenAI GPT-4o]
        E --> I[Anthropic Claude]
        E --> J[Google Gemini]
    end
    
    subgraph "Browser Automation"
        D --> K[Playwright]
        D --> L[browser-use]
    end
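
The hand-off from the backend to the Agent Executor through the Task Queue can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the project's code: `asyncio.Queue` stands in for the real task queue, the `executor` coroutine returns a dummy string instead of driving a browser and an LLM, and all names are hypothetical.

```python
import asyncio


async def executor(queue: asyncio.Queue, results: dict) -> None:
    """Hypothetical worker: pull queued executions and record a result."""
    while True:
        execution_id, prompt = await queue.get()
        # Stand-in for the real work (browser engine + LLM provider).
        results[execution_id] = f"done: {prompt}"
        queue.task_done()


async def main() -> dict:
    queue: asyncio.Queue = asyncio.Queue()
    results: dict = {}
    # Two workers run concurrently, mirroring parallel agent execution.
    workers = [asyncio.create_task(executor(queue, results)) for _ in range(2)]
    for execution_id, prompt in enumerate(["open page", "fill form", "export data"]):
        queue.put_nowait((execution_id, prompt))
    await queue.join()  # wait until every queued execution is processed
    for worker in workers:
        worker.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results
```

Running `asyncio.run(main())` returns a dictionary with one entry per queued execution. The queue decouples accepting a request from running it, which is why the API can respond with an execution ID before the agent has finished.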

🚀 Quick Start

Prerequisites

Docker & Docker Compose (recommended)
Python 3.11+ (for local development)
Node.js 18+ (for frontend development)
PostgreSQL 13+ (if not using Docker)

🐳 Docker Installation (Recommended)

Get up and running in under 5 minutes:

# 1. Clone the repository
git clone https://github.com/daviddaytw/browser-agents.git
cd browser-agents

# 2. Configure environment
cp .env.example .env
# Edit .env with your settings (see Configuration section below)

# 3. Start all services
docker compose up -d

# 4. Access the platform
open http://localhost:5173  # macOS; on Linux use xdg-open, or visit the URL in a browser

🛠️ Local Development Setup

For development or customization:

# Backend setup
cd backend
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install uv
uv sync
uv run alembic upgrade head
uv run uvicorn app.main:app --reload --port 8000

# Frontend setup (new terminal)
cd frontend
npm install
npm run generate-client
npm run dev

API Usage

Python

import requests

BASE_URL = "http://localhost:8000/api/v1"
HEADERS = {"Authorization": "Bearer your_api_key"}

# Create agent
response = requests.post(
    f"{BASE_URL}/agents",
    headers=HEADERS,
    json={
        "name": "Data Extractor",
        "task_prompt": "Extract all product names and prices from {url}",
        "llm_model": "claude-3-5-sonnet"
    },
    timeout=30,
)
response.raise_for_status()  # fail fast on a 4xx/5xx response

agent_id = response.json()["id"]

# Execute agent
execution = requests.post(
    f"{BASE_URL}/agents/{agent_id}/execute",
    headers=HEADERS,
    json={
        "parameters": {
            "url": "https://example-store.com/products"
        }
    },
    timeout=30,
)
execution.raise_for_status()

print(f"Execution ID: {execution.json()['id']}")
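
In the example above, the `{url}` placeholder in `task_prompt` lines up with the `url` key under `parameters`. The source does not say how the platform performs the substitution, so the sketch below only assumes it behaves like Python's `str.format`. `render_task_prompt` is a hypothetical helper you could use to check a prompt locally before sending it.

```python
import string


def render_task_prompt(task_prompt: str, parameters: dict[str, str]) -> str:
    """Fill {name} placeholders in task_prompt from parameters.

    Hypothetical local check, assuming str.format-style substitution.
    Raises KeyError if a placeholder has no matching parameter.
    """
    needed = {
        field_name
        for _, field_name, _, _ in string.Formatter().parse(task_prompt)
        if field_name
    }
    missing = needed - set(parameters)
    if missing:
        raise KeyError(f"missing parameters: {sorted(missing)}")
    return task_prompt.format(**parameters)
```

Checking for missing placeholders up front gives a clearer error than discovering halfway through a run that the agent was sent an unfilled prompt.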

JavaScript/Node.js

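const axios = require('axios');

const client = axios.create({
  baseURL: 'http://localhost:8000/api/v1',
  headers: { 'Authorization': 'Bearer your_api_key' }
});

// ID returned when the agent was created (placeholder value)
const agentId = 'your_agent_id';

// CommonJS modules do not support top-level await, so wrap the call in an async function
async function main() {
  // Execute agent
  const execution = await client.post(`/agents/${agentId}/execute`, {
    parameters: { url: 'https://example.com' }
  });

  console.log('Execution started:', execution.data.id);
}

main().catch((err) => console.error('Execution failed:', err.message));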

🛡️ Security & Compliance

🔐 Secure Authentication: JWT-based authentication with API key management
🔒 Data Encryption: All data encrypted in transit (TLS) and at rest
🛡️ Input Validation: Comprehensive input sanitization and validation
📊 Audit Logging: Complete audit trail of all actions and changes
🏢 Enterprise Ready: GDPR compliant with enterprise security features
🔑 Role-Based Access: Granular permissions and user management
🚫 Rate Limiting: API rate limiting and abuse prevention
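
Rate limiting is commonly built on a token bucket. The sketch below is a generic illustration of that technique, not the code Browser Agents uses: `TokenBucket`, `rate`, and `capacity` are hypothetical names, and a real deployment would typically keep one bucket per API key.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(
        self,
        rate: float,
        capacity: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.updated = clock()

    def allow(self) -> bool:
        """Return True and consume one token if the request may proceed."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A server using this approach would call `allow()` once per incoming request and answer with HTTP 429 when it returns `False`.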

🤝 Contributing

We welcome contributions from the community! Whether you're fixing bugs, adding features, or improving documentation, your help makes Browser Agents better for everyone.

🚀 Quick Contribution Guide

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

📋 Development Setup

# Run these from the repository root; each subshell leaves the working directory unchanged

# Install development dependencies
(cd backend && uv sync --dev)
(cd frontend && npm install)

# Run tests
(cd backend && uv run pytest)
(cd frontend && npm test)

# Code formatting and linting
(cd backend && uv run ruff format .)
(cd frontend && npm run lint)

🎯 Ways to Contribute

🐛 Report Bugs: Help us identify and fix issues
💡 Suggest Features: Share ideas for new functionality
📝 Improve Documentation: Help others understand the platform
🔧 Submit Code: Fix bugs or add new features
🎨 Design & UX: Improve the user interface and experience

👉 Read our detailed Contributing Guide in CONTRIBUTING.md

❓ FAQ

What makes Browser Agents different from other automation tools?

Browser Agents combines the power of large language models with browser automation, allowing you to describe tasks in natural language rather than writing complex scripts. It's designed for both technical and non-technical users.

Which AI models are supported?

We support OpenAI (GPT-4o, GPT-4 Turbo), Anthropic (Claude 3.5 Sonnet, Claude 3 Opus), Google (Gemini Pro), and custom model configurations. You can choose the best model for each specific task.

Is there a free tier available?

Yes! The open-source version is completely free. You only pay for the AI model API usage (OpenAI, Anthropic, etc.) and any cloud hosting costs if you choose to deploy it.

Can I run this on-premises?

Absolutely! Browser Agents is designed to run on-premises or in your private cloud. All data stays within your infrastructure.

How do I handle websites that require authentication?

Browser Agents supports various authentication methods including cookies, session storage, and form-based login. You can configure authentication steps as part of your agent's workflow.

📄 License

Browser Agents is open source software licensed under the Apache License 2.0.

🙏 Acknowledgments

Browser Agents stands on the shoulders of giants. Special thanks to:

browser-use - The core browser automation library that powers our agents
FastAPI - Modern, fast Python web framework for building APIs
React - A JavaScript library for building user interfaces

Made with ❤️ by the Browser Agents community

⭐ Star us on GitHub • 🐛 Report Issues
