This project is an end-to-end AI pipeline that detects, tracks, searches, and fits jewelry and fashion items (specifically rings, earrings, and dresses) using computer vision, 3D modeling, and multimodal tools such as CLIP and SAM. It also includes 3D mesh fitting and AR-style visualization of rings placed on hands.
- 🖐️ Detect & track hands and rings with MediaPipe & YOLOv8
- 🧠 Smart visual similarity search using CLIP (image & text prompts)
- 📱 Unsupervised style clustering (CLIP + KMeans) to discover hidden design groupings
- 🌀 DeepSORT-based video tracking of jewelry
- 🧊 Segment jewelry (rings) with SAM (Segment Anything)
- 💍 Place and align 3D ring meshes on finger joints
- 🖼️ Overlay 3D ring on original hand image for AR-style try-on
- 📦 Export `.obj`/`.ply` files of ring placements for 3D tools (Blender, WebGL)
- ✅ Professional, modular, and clean codebase with full documentation
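
The ring placement step listed above can be illustrated with a small, dependency-light sketch. It is not the repository's implementation: it assumes the ring mesh is modeled with its hole along the z-axis, that two 3D finger joints are available (for example MediaPipe hand landmarks 13 and 14, the ring finger MCP and PIP joints), and that the ring sits at their midpoint. The function names `rotation_aligning` and `place_ring` are hypothetical.

```python
import numpy as np

def rotation_aligning(a, b):
    """Return the 3x3 rotation matrix that rotates direction a onto direction b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    v = np.cross(a, b)          # rotation axis scaled by sin(angle)
    c = float(np.dot(a, b))     # cos(angle)
    s = float(np.linalg.norm(v))
    if s < 1e-8:
        if c > 0:
            return np.eye(3)    # already aligned
        # Opposite directions: rotate 180 degrees about any axis perpendicular to a.
        helper = np.array([1.0, 0.0, 0.0]) if abs(a[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
        axis = np.cross(a, helper)
        axis = axis / np.linalg.norm(axis)
        return 2.0 * np.outer(axis, axis) - np.eye(3)
    # Rodrigues' formula written with the cross-product matrix of v.
    K = np.array([[0.0, -v[2], v[1]],
                  [v[2], 0.0, -v[0]],
                  [-v[1], v[0], 0.0]])
    return np.eye(3) + K + (K @ K) * ((1.0 - c) / s**2)

def place_ring(vertices, joint_a, joint_b, ring_axis=(0.0, 0.0, 1.0)):
    """Rotate ring vertices so ring_axis follows the finger, then move them to the joint midpoint."""
    joint_a = np.asarray(joint_a, dtype=float)
    joint_b = np.asarray(joint_b, dtype=float)
    R = rotation_aligning(ring_axis, joint_b - joint_a)
    center = (joint_a + joint_b) / 2.0
    return np.asarray(vertices, dtype=float) @ R.T + center
```

With a mesh library such as Open3D, the same rotation and translation could be applied to the mesh vertices before export or overlay; the plain vertex array here keeps the sketch independent of any particular mesh API.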
- Current implementation: jewelry detection, segmentation, tracking, and basic 3D mesh fitting.
- Limitations:
  - No comprehensive clustering
  - Novel view synthesis not implemented
  - No complete 3D rig/model of hand or body
📄 For a detailed breakdown of tools, methodology, failures, and future scope, please refer to the Design Document.
Feature | Status |
---|---|
Ring detection with YOLOv8 | ✅ Done |
Hand joint detection (MediaPipe) | ✅ Done |
Video tracking with DeepSORT | ✅ Done |
CLIP-powered search | ✅ Done |
Style clustering of visually similar designs (CLIP embeddings + KMeans) | ✅ Done |
Prompt-to-image retrieval | ✅ Done |
SAM-based ring segmentation | ✅ Done |
3D ring placement using Open3D | ✅ Done |
Ring orientation aligned to finger direction | ✅ Done |
Export ring mesh as .obj/.ply | ✅ Done |
Overlay 3D ring on original image | ✅ Done |
Documentation (README, Design Doc) | ✅ Done |
Code published to GitHub | ✅ Done |
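
The CLIP-powered search and prompt-to-image retrieval rows both come down to ranking stored embeddings by similarity to a query embedding. The sketch below shows that ranking step only, under stated assumptions: embeddings are already computed (by CLIP or any other encoder), cosine similarity is the metric, and `top_k_similar` is a hypothetical name rather than a function from this repository.

```python
import numpy as np

def top_k_similar(query, gallery, k=3):
    """Rank gallery embeddings by cosine similarity to the query embedding.

    query:   1D array of shape (d,)
    gallery: 2D array of shape (n, d), one embedding per catalog image
    Returns a list of (gallery_index, similarity), best match first.
    """
    query = np.asarray(query, dtype=float)
    gallery = np.asarray(gallery, dtype=float)
    # Normalize so that the dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    scores = g @ q
    order = np.argsort(-scores)[:k]  # indices sorted by descending score
    return [(int(i), float(scores[i])) for i in order]
```

Because CLIP maps images and text into a shared embedding space, the same function can serve image queries and text prompts; only the encoder that produces `query` changes.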
- 💍 Jewelry Detection Dataset (Roboflow)
- 👗 Dress Detection Dataset (Roboflow)
- Classes used: `ring`, `earring`, `dress`
- Format: YOLOv8
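
For readers new to the YOLO label format: each object is one text line holding a class index followed by the box center, width, and height, all normalized by the image size. The sketch below converts between pixel boxes and such lines. The class order in `CLASS_NAMES` is an assumption for illustration (the real order is defined by the dataset's own configuration), and both function names are hypothetical.

```python
# Assumed class order for illustration; check the dataset's data.yaml for the real one.
CLASS_NAMES = ["ring", "earring", "dress"]

def to_yolo_line(class_name, box, image_size):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line."""
    x_min, y_min, x_max, y_max = box
    width, height = image_size
    class_id = CLASS_NAMES.index(class_name)
    x_center = (x_min + x_max) / 2 / width
    y_center = (y_min + y_max) / 2 / height
    box_w = (x_max - x_min) / width
    box_h = (y_max - y_min) / height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {box_w:.6f} {box_h:.6f}"

def from_yolo_line(line, image_size):
    """Convert a YOLO label line back to (class_name, pixel box)."""
    parts = line.split()
    class_id = int(parts[0])
    x_center, y_center, box_w, box_h = (float(p) for p in parts[1:5])
    width, height = image_size
    x_min = (x_center - box_w / 2) * width
    y_min = (y_center - box_h / 2) * height
    x_max = (x_center + box_w / 2) * width
    y_max = (y_center + box_h / 2) * height
    return CLASS_NAMES[class_id], (x_min, y_min, x_max, y_max)
```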
pip install -r requirements.txt
python main.py # For image
python run_video.py # For video
python clip_search/embed_dataset.py
python clip_search/search_similar.py
python segmentation/segment_ring.py
python fitting/fit_ring_mesh.py
python visualization/overlay_ring_3d_on_image.py
python clip_search/cluster_embeddings.py
jewelry-tracking-ai/
├── segmentation/ ← SAM ring mask generation
├── fitting/ ← 3D mesh placement on finger joint
├── clip_search/ ← CLIP similarity search & clustering
├── visualization/ ← Overlay 3D ring on 2D hand
├── detectors/, trackers/ ← YOLOv8 + DeepSORT
├── output/ ← Rendered results, ring masks, meshes
├── main.py, run_video.py ← Entry points
├── README.md, design_doc.md ← Documentation
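
The meshes in `output/` are exported as `.obj`/`.ply` files. To show what an `.obj` file contains, the following dependency-free sketch builds a simple torus as a stand-in for a ring and serializes it to OBJ text. It is an illustration only: the project itself uses Open3D for 3D work, and `torus_mesh` and `to_obj` are hypothetical helpers, not part of the repository.

```python
import math

def torus_mesh(major_radius=1.0, minor_radius=0.1, n_major=24, n_minor=8):
    """Build a torus (a simple ring stand-in) as vertex and triangle lists."""
    vertices = []
    for i in range(n_major):
        u = 2 * math.pi * i / n_major          # angle around the ring
        for j in range(n_minor):
            v = 2 * math.pi * j / n_minor      # angle around the tube
            r = major_radius + minor_radius * math.cos(v)
            vertices.append((r * math.cos(u), r * math.sin(u), minor_radius * math.sin(v)))
    faces = []
    for i in range(n_major):
        i_next = (i + 1) % n_major
        for j in range(n_minor):
            j_next = (j + 1) % n_minor
            a = i * n_minor + j
            b = i_next * n_minor + j
            c = i_next * n_minor + j_next
            d = i * n_minor + j_next
            # Split each quad of the grid into two triangles.
            faces.append((a, b, c))
            faces.append((a, c, d))
    return vertices, faces

def to_obj(vertices, faces):
    """Serialize a triangle mesh to Wavefront OBJ text (face indices are 1-based)."""
    lines = [f"v {x:.6f} {y:.6f} {z:.6f}" for x, y, z in vertices]
    lines += [f"f {a + 1} {b + 1} {c + 1}" for a, b, c in faces]
    return "\n".join(lines) + "\n"
```

Writing the returned string to a file with an `.obj` extension gives a mesh that tools such as Blender can import.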
Explore JewelSense in action through the following demo recordings:
- 🔍 Detection & Tracking: Watch on Loom
- 💍 Hand Bug Removed: Watch on Loom
- 🧠 Search Products by Image or Text and Perform Clustering: Watch on Loom
- 🧩 3D Segmentation and 3D Modelling: Watch on Loom
- 🪞 Add earrings and dresses to 3D segmentation
- 📐 Fit earrings or necklaces in 3D space
- 🤖 Use GPT/LLaVA for style-based jewelry prompts
- 🌐 Web-based 3D viewer (Three.js / Streamlit)
- 📲 Full AR preview & product try-on simulation
MIT License: free to use, build on, and contribute to.
Author: Syed Abdul Rafey Ali

Status: 🎯 Production-ready with real-world & research use potential