ToRA is a series of Tool-integrated Reasoning LLM Agents designed to solve challenging mathematical reasoning problems by interacting with tools [ICLR'24].
Updated Feb 22, 2024 (Python)
Resources of deep learning for mathematical reasoning (DL4MATH).
[NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models"
🚀ReVisual-R1 is a 7B open-source multimodal language model that follows a three-stage curriculum—cold-start pre-training, multimodal reinforcement learning, and text-only reinforcement learning—to achieve faithful, concise, and self-reflective state-of-the-art performance in visual and textual reasoning.
Small and Efficient Mathematical Reasoning LLMs
[ACL'24] Code and data for the paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"
Discriminator-Guided Chain-of-Thought Reasoning
[NeurIPS 2024] Code and Data Repo for Paper "Embedding Trajectory for Out-of-Distribution Detection in Mathematical Reasoning"
[ACL 2024 Findings] The official repo for "ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large Language Models".
Evaluation Code Repo for Paper "PolyMath: Evaluating Mathematical Reasoning in Multilingual Contexts"
The lecture notes for my discrete mathematics classes.
MathPrompter Implementation: This repository hosts an implementation based on the 'MathPrompter: Mathematical Reasoning Using Large Language Models' paper by Microsoft Research. The code replicates the methods discussed in the paper.
[AAAI 2025] Assessing the Creativity of LLMs in Proposing Novel Solutions to Mathematical Problems
🧠 Train your own DeepSeek-R1 style reasoning model on Mac! First MLX implementation of GRPO - the breakthrough technique behind R1's o1-matching performance. Build mathematical reasoning AI without expensive RLHF. Apple Silicon optimized. 🚀
Fuzzy reasoning of Generalized Quantifiers (EMNLP 2024)
GSM8K-Consistency is a benchmark database for analyzing the consistency of Arithmetic Reasoning on GSM8K.
We implement and benchmark various prompting techniques for LLMs (e.g., PAL, CoT, PoT) on a specialized elementary-school-level math reasoning dataset.
This dataset aggregates carefully selected elementary-level math problems from various existing resources, providing an optimal mix for testing and enhancing math-solving chatbots for young learners.
This repository contains the code, data, and models of the paper titled "Math Word Problem Solving by Generating Linguistic Variants of Problem Statements" published in the Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop).