- data/train.json: 3958 questions
- data/HoF_test.json: 410 questions
- data/HeF_test.json: 383 questions
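A quick way to inspect the splits is to load each file and count its records; this sketch assumes each file is a flat JSON list of question objects (the exact field names may differ):

```python
import json

# Count the questions in each split; assumes each file is a JSON list
# of question records, matching the counts listed above.
for path in ["data/train.json", "data/HoF_test.json", "data/HeF_test.json"]:
    with open(path, encoding="utf-8") as f:
        questions = json.load(f)
    print(f"{path}: {len(questions)} questions")
```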
- pytorch 2.0
- transformers
- zhipuai
- openai 0.28.0
- dashscope
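Note that the pinned openai 0.28.0 uses the legacy client interface, which differs from the 1.x API. A quick version check after installation can save debugging time; this is only a convenience sketch:

```python
# Sanity-check the pinned dependencies; openai should be on the legacy
# 0.28.x line, which the GPT-series baseline presumably relies on.
import torch
import transformers
import openai

print("torch:", torch.__version__)           # expected >= 2.0
print("transformers:", transformers.__version__)
print("openai:", openai.__version__)         # expected 0.28.0
```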
Install the numbat tool from https://github.com/sharkdp/numbat.
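As a quick check that the tool is available, the numbat CLI can be invoked from Python; the calculator-augmented baselines are assumed to shell out to it in a similar way:

```python
import subprocess

# Evaluate a unit-aware expression with the numbat CLI (-e / --expression);
# prints the evaluated result (75 km converted to metres).
result = subprocess.run(
    ["numbat", "-e", "30 km/h * 2.5 h -> m"],
    capture_output=True, text=True, check=True,
)
print(result.stdout.strip())
```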
- GLM-4 series: baselines/LLMs/GLM/ChatGLM4_api.py
- GPT series: baselines/LLMs/GLM/ChatGPT_api.py (see the sketch below)
- Qwen series: baselines/LLMs/GLM/Qwen_api.py
- eval:
cd baselines/LLMs/ && python eval_results.py --hof_results {HoF_result_file} --hef_results {HeF_result_file}
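For reference, here is a minimal sketch of the kind of call the GPT-series script makes through the legacy openai 0.28.0 interface; the model name, prompt, and decoding parameters are placeholders, and the actual logic lives in baselines/LLMs/GLM/ChatGPT_api.py:

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

# Legacy (0.28.x) ChatCompletion call; the 1.x client uses a different API.
response = openai.ChatCompletion.create(
    model="gpt-4",  # placeholder model name
    messages=[{"role": "user",
               "content": "A car travels at 30 km/h for 2.5 h. How far does it go?"}],
    temperature=0.0,
)
print(response["choices"][0]["message"]["content"])
```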
We use MCTS to construct preference data, with the code provided in baselines/MCTS-PRM. The code for DPO can be found in baselines/DPO.
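For orientation, the DPO objective applied to such preference pairs is the standard one from Rafailov et al. (2023); the toy function below computes it from sequence log-probabilities and is only an illustration, not the code in baselines/DPO:

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """DPO loss from summed token log-probs of the chosen/rejected responses
    under the trained policy and the frozen reference model."""
    chosen_logratios = policy_chosen_logps - ref_chosen_logps
    rejected_logratios = policy_rejected_logps - ref_rejected_logps
    # Encourage the policy to rank the MCTS-preferred response above the
    # rejected one by a margin scaled with beta.
    return -F.logsigmoid(beta * (chosen_logratios - rejected_logratios)).mean()
```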
- with calculator:
cd baselines/small_models && bash run_qwen.sh
- without calculator:
cd baselines/small_models && bash run_qwen_wo_cal.sh
- train formula retriever:
cd baselines/RAG/ && bash run.sh
- eval formula retriever:
cd baselines/RAG/ && python eval.py --model_path outputs_retriever
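At inference time the trained retriever can be used as a bi-encoder over a formula bank. The sketch below assumes the checkpoint in outputs_retriever loads with sentence-transformers and uses a toy formula list, so it may differ from the actual eval.py:

```python
from sentence_transformers import SentenceTransformer, util

# Assumption: the trained retriever is a bi-encoder checkpoint loadable here.
model = SentenceTransformer("outputs_retriever")

formulas = [          # toy formula bank; the benchmark's own formulas
    "F = m * a",      # would be loaded from the dataset instead
    "E = m * c**2",
]
query = "What force is needed to accelerate a 2 kg mass at 3 m/s^2?"

formula_emb = model.encode(formulas, convert_to_tensor=True)
query_emb = model.encode(query, convert_to_tensor=True)

# Rank formulas by cosine similarity and keep the best match.
scores = util.cos_sim(query_emb, formula_emb)[0]
best = int(scores.argmax())
print(formulas[best], float(scores[best]))
```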