You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
π§ Train your own DeepSeek-R1 style reasoning model on Mac! First MLX implementation of GRPO - the breakthrough technique behind R1's o1-matching performance. Build mathematical reasoning AI without expensive RLHF. Apple Silicon optimized. π
MLX-GRPO allows you to train your own DeepSeek-R1 models directly on your Mac. This implementation simplifies the process of building advanced reasoning AI, making it accessible for developers. ππ