Browser Agents is a web platform that lets you create, deploy, and manage AI-powered browser automation agents. Build intelligent bots that can navigate websites, extract data, fill forms, and perform complex web tasks using natural language instructions.
- Visual Understanding: Agents can see and interact with web pages like humans
- Context Memory: Maintains state across complex multi-step workflows
- Error Recovery: Smart retry mechanisms and failure handling
- Parallel Execution: Run multiple agents simultaneously
- Data Export: CSV, JSON, Excel export formats
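The error recovery and parallel execution features above can be pictured with a small, generic sketch. This is not the platform's actual implementation: `run_with_retry` and `run_parallel` are hypothetical helpers, and each task is a plain callable standing in for one agent run.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_with_retry(task, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call task() until it succeeds, backing off exponentially between attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))


def run_parallel(tasks, max_workers=4, **retry_kwargs):
    """Run independent tasks concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run_with_retry, t, **retry_kwargs) for t in tasks]
        return [f.result() for f in futures]


if __name__ == "__main__":
    # Hypothetical stand-ins for three agent runs.
    print(run_parallel([lambda: "a", lambda: "b", lambda: "c"]))
```

Threads suit this kind of work because each run spends most of its time waiting on the browser or the LLM provider rather than on the CPU.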
graph TB
A[Web Interface] --> B[FastAPI Backend]
B --> C[Agent Executor]
C --> D[Browser Engine]
C --> E[LLM Provider]
B --> F[PostgreSQL Database]
B --> G[Task Queue]
subgraph "AI Models"
E --> H[OpenAI GPT-4o]
E --> I[Anthropic Claude]
E --> J[Google Gemini]
end
subgraph "Browser Automation"
D --> K[Playwright]
D --> L[browser-use]
end
- Docker & Docker Compose (recommended)
- Python 3.11+ (for local development)
- Node.js 18+ (for frontend development)
- PostgreSQL 13+ (if not using Docker)
Get up and running in under 5 minutes:
# 1. Clone the repository
git clone https://github.com/daviddaytw/browser-agents.git
cd browser-agents
# 2. Configure environment
cp .env.example .env
# Edit .env with your settings (see Configuration section below)
# 3. Start all services
docker compose up -d
# 4. Access the platform (open is macOS-only; on Linux use xdg-open, or visit the URL in a browser)
open http://localhost:5173
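The containers can take a moment to start after `docker compose up -d`. If you script the quick start, a small readiness check like the sketch below may help. `wait_for_url` is a hypothetical helper, not part of the project, and it only assumes the frontend answers HTTP requests on the port shown above.

```python
import time
import urllib.error
import urllib.request


def wait_for_url(url, timeout=60.0, interval=2.0, probe=None,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll url until it answers or timeout seconds pass; return True on success."""
    if probe is None:
        def probe(target):
            try:
                with urllib.request.urlopen(target, timeout=5) as resp:
                    return resp.status < 500
            except urllib.error.HTTPError as exc:
                # A 4xx answer still means the server is up.
                return exc.code < 500
            except (urllib.error.URLError, OSError):
                return False  # connection refused, DNS failure, timeout, ...

    deadline = clock() + timeout
    while True:
        if probe(url):
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    ready = wait_for_url("http://localhost:5173")
    print("Platform is up" if ready else "Timed out waiting for the platform")
```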
For development or customization:
# Backend setup
cd backend
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
pip install uv
uv sync
uv run alembic upgrade head
uv run uvicorn app.main:app --reload --port 8000
# Frontend setup (new terminal)
cd frontend
npm install
npm run generate-client
npm run dev
import requests

API_URL = "http://localhost:8000/api/v1"
HEADERS = {"Authorization": "Bearer your_api_key"}  # replace with your real API key

# Create agent
response = requests.post(
    f"{API_URL}/agents",
    headers=HEADERS,
    json={
        "name": "Data Extractor",
        "task_prompt": "Extract all product names and prices from {url}",
        "llm_model": "claude-3-5-sonnet",
    },
)
response.raise_for_status()  # stop here if the agent was not created
agent_id = response.json()["id"]

# Execute agent
execution = requests.post(
    f"{API_URL}/agents/{agent_id}/execute",
    headers=HEADERS,
    json={"parameters": {"url": "https://example-store.com/products"}},
)
execution.raise_for_status()  # stop here if the execution was not accepted
print(f"Execution ID: {execution.json()['id']}")
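In the example above, the `task_prompt` contains a `{url}` placeholder and the execute call supplies a matching `url` parameter. How the backend performs the substitution is not documented here, so the sketch below only illustrates the idea with Python's built-in string formatting. `render_prompt` is a hypothetical helper, not a platform function.

```python
import string


def render_prompt(template, parameters):
    """Fill {name} placeholders in template; fail loudly if one is missing."""
    # Collect every placeholder name that appears in the template.
    fields = {name for _, name, _, _ in string.Formatter().parse(template) if name}
    missing = sorted(fields - set(parameters))
    if missing:
        raise ValueError(f"missing parameters: {', '.join(missing)}")
    return template.format(**parameters)


if __name__ == "__main__":
    print(render_prompt(
        "Extract all product names and prices from {url}",
        {"url": "https://example-store.com/products"},
    ))
```

Checking for missing parameters before formatting gives a clearer error than the bare `KeyError` that `str.format` would raise.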
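const axios = require('axios');

const client = axios.create({
  baseURL: 'http://localhost:8000/api/v1',
  headers: { Authorization: 'Bearer your_api_key' },
});

// Execute agent (agentId is the "id" returned when the agent was created)
async function executeAgent(agentId) {
  const execution = await client.post(`/agents/${agentId}/execute`, {
    parameters: { url: 'https://example.com' },
  });
  console.log('Execution started:', execution.data.id);
  return execution.data.id;
}

executeAgent('your_agent_id').catch((err) => console.error(err.message));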
- Secure Authentication: JWT-based authentication with API key management
- Data Encryption: All data encrypted in transit (TLS) and at rest
- Input Validation: Comprehensive input sanitization and validation
- Audit Logging: Complete audit trail of all actions and changes
- Enterprise Ready: GDPR compliant with enterprise security features
- Role-Based Access: Granular permissions and user management
- Rate Limiting: API rate limiting and abuse prevention
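For readers unfamiliar with rate limiting, one common technique is a token bucket. The sketch below is a generic illustration only; which algorithm Browser Agents actually uses is not stated here, and `TokenBucket` is a hypothetical class.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.updated = clock()

    def allow(self):
        """Return True and spend one token if the request may proceed."""
        now = self.clock()
        # Refill in proportion to the elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate=5, capacity=10)  # illustrative limits, not platform defaults
    print(sum(bucket.allow() for _ in range(20)), "of 20 requests allowed in one burst")
```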
We welcome contributions from the community! Whether you're fixing bugs, adding features, or improving documentation, your help makes Browser Agents better for everyone.
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
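# Run these from the repository root; each command uses a subshell so the working directory is restored
# Install development dependencies
(cd backend && uv sync --dev)
(cd frontend && npm install)
# Run tests
(cd backend && uv run pytest)
(cd frontend && npm test)
# Code formatting and linting
(cd backend && uv run ruff format .)
(cd frontend && npm run lint)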
- Report Bugs: Help us identify and fix issues
- Suggest Features: Share ideas for new functionality
- Improve Documentation: Help others understand the platform
- Submit Code: Fix bugs or add new features
- Design & UX: Improve the user interface and experience
Read our detailed Contributing Guide.
What makes Browser Agents different from other automation tools?
Browser Agents combines the power of large language models with browser automation, allowing you to describe tasks in natural language rather than writing complex scripts. It's designed for both technical and non-technical users.
Which AI models are supported?
We support OpenAI (GPT-4o, GPT-4 Turbo), Anthropic (Claude 3.5 Sonnet, Claude 3 Opus), Google (Gemini Pro), and custom model configurations. You can choose the best model for each specific task.
Is there a free tier available?
Yes! The open-source version is completely free. You only pay for the AI model API usage (OpenAI, Anthropic, etc.) and any cloud hosting costs if you choose to deploy it.
Can I run this on-premises?
Absolutely! Browser Agents is designed to run on-premises or in your private cloud. All data stays within your infrastructure.
How do I handle websites that require authentication?
Browser Agents supports various authentication methods including cookies, session storage, and form-based login. You can configure authentication steps as part of your agent's workflow.
Browser Agents is open source software licensed under the Apache License 2.0.
Browser Agents is built on the shoulders of giants. Special thanks to:
- browser-use - The core browser automation library that powers our agents
- FastAPI - Modern, fast Python web framework for building APIs
- React - A JavaScript library for building user interfaces
Made with ❤️ by the Browser Agents community