A modern, async Python API server for cyber threat intelligence, news, and scraping, powered by ReadyAPI, MongoDB, Elasticsearch, and Apify.
- RSS feed aggregation and cleaning
- Twitter and screenshot scraping via Apify
- Full-text and advanced search via Elasticsearch
- Stats endpoints from MongoDB
- Rate limiting, CORS, and production-ready Docker support
- Python 3.11+
- MongoDB instance
- Elasticsearch (Elastic Cloud recommended)
- Apify account (for scraping endpoints)
- Docker (optional, for containerized deployment)
- Clone the repository
- Install dependencies
pip install -r requirements.txt
- Configure environment variables
  - Copy `.env.example` to `.env` and fill in your credentials:
    cp .env.example .env  # Edit .env with your values
uvicorn app.main:app --reload --port 5555
docker build -t data_server_api_server_py .
docker run -d --env-file .env -p 5555:5555 data_server_api_server_py
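Once the server is started with either command above, a quick way to confirm that something is listening on port 5555 is a plain TCP connection check. The following is a minimal sketch using only the standard library; the helper name `is_port_open` and the default host `127.0.0.1` are assumptions for illustration and are not part of this project.

```python
import socket


def is_port_open(host: str = "127.0.0.1", port: int = 5555, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper: it only checks that the port accepts connections,
    not that the API itself is healthy.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print("server reachable:", is_port_open())
```

This does not replace a real health check; it only tells you whether the process bound the port.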
- `GET /rss-feed?source=hacker-news,graham-cluley`: Aggregated RSS feeds
- `GET /scrapper/screenshot?url=...`: Screenshot a web page via Apify
- `GET /scrapper/twitter?query=...&max_item=...`: Scrape tweets via Apify
- `POST /search/index`: Search Elasticsearch (see request body in code)
- `POST /search/index/v2`: Advanced search across monthly indices
- `POST /search/index/v3`: Custom index/query search
- `GET /stats/last-scrap-time`: Last scrape status from MongoDB
- `GET /stats/{id}`: Stats document by ID
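As a rough illustration of calling the GET endpoints above from Python, here is a small standard-library client sketch. The base URL `http://localhost:5555`, the helper names `build_url` and `get_json`, and the assumption that responses are JSON are all hypothetical and not confirmed by this README. The POST search endpoints are left out because their request bodies are defined in the code, not here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the server runs locally on the port used in the run commands.
BASE_URL = "http://localhost:5555"


def build_url(path, params=None, base=BASE_URL):
    """Join base, path, and query parameters into a request URL.

    Commas are kept unescaped so comma-separated values such as
    source=hacker-news,graham-cluley stay readable.
    """
    query = urlencode(params or {}, safe=",")
    return f"{base}{path}?{query}" if query else f"{base}{path}"


def get_json(path, params=None):
    """Send a GET request and decode the body as JSON (assumed response format)."""
    with urlopen(build_url(path, params), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Building the URLs needs no network access.
    print(build_url("/rss-feed", {"source": "hacker-news,graham-cluley"}))
    print(build_url("/scrapper/twitter", {"query": "ransomware", "max_item": 10}))
    # Uncomment with a running server:
    # feeds = get_json("/rss-feed", {"source": "hacker-news"})
```

The `query` value `ransomware` and `max_item` value `10` are placeholder inputs chosen for the example.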
See `.env.example` for all required variables:
- `MONGO_URI`, `MONGO_DB`
- `ELASTIC_CLOUD_ID`, `ELASTIC_CLOUD_USERNAME`, `ELASTIC_CLOUD_PASSWORD`
- `APIFY_TOKEN`
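To catch a missing credential before the server starts, a short pre-flight check along these lines may help. This is a sketch under assumptions: the variable names come from the list above, but the helper `missing_vars` is hypothetical, and the project may load and validate its configuration differently.

```python
import os

# Variable names as listed in this README; .env.example remains the source of truth.
REQUIRED_VARS = (
    "MONGO_URI",
    "MONGO_DB",
    "ELASTIC_CLOUD_ID",
    "ELASTIC_CLOUD_USERNAME",
    "ELASTIC_CLOUD_PASSWORD",
    "APIFY_TOKEN",
)


def missing_vars(env=None):
    """Return the required variable names that are unset or empty in `env`.

    `env` defaults to the current process environment (os.environ).
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars()
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
    print("All required environment variables are set.")
```

Note that this reads the process environment, so the values from `.env` must already be exported or loaded (for example via `--env-file` when using Docker).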
MIT