├── LICENSE
├── README.md
├── dataset
    ├── MR-GSM8K.json
    ├── chat_template.jinja
    ├── k-shot-demos.json
    └── synthesized_training_data.jsonl
├── eval_results
    ├── Claude_eval_results.json
    ├── GPT3_5_eval_results.json
    ├── GPT4_eval_results.json
    ├── gpt4-grading
    │   ├── Claude_eval_results.json
    │   ├── GPT3_5_eval_results.json
    │   ├── GPT4_eval_results.json
    │   ├── llama2_70b_eval_results.json
    │   ├── mammoth_70B_eval_results.json
    │   ├── metamath_70B_eval_results.json
    │   └── wizardmath_70B_eval_results.json
    ├── llama2_70b_eval_results.json
    ├── mammoth_70B_eval_results.json
    ├── metamath_70B_eval_results.json
    └── wizardmath_70B_eval_results.json
├── images
    └── illustration.png
└── scripts
    ├── eval_mr_gsm8k.py
    ├── run.sh
    └── train_math.py

/LICENSE:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dvlab-research/MR-GSM8K/HEAD/LICENSE
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dvlab-research/MR-GSM8K/HEAD/README.md
--------------------------------------------------------------------------------
/dataset/MR-GSM8K.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dvlab-research/MR-GSM8K/HEAD/dataset/MR-GSM8K.json
--------------------------------------------------------------------------------
/dataset/chat_template.jinja:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dvlab-research/MR-GSM8K/HEAD/dataset/chat_template.jinja
--------------------------------------------------------------------------------
/dataset/k-shot-demos.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dvlab-research/MR-GSM8K/HEAD/dataset/k-shot-demos.json
--------------------------------------------------------------------------------
/dataset/synthesized_training_data.jsonl:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dvlab-research/MR-GSM8K/HEAD/dataset/synthesized_training_data.jsonl
--------------------------------------------------------------------------------
/eval_results/Claude_eval_results.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dvlab-research/MR-GSM8K/HEAD/eval_results/Claude_eval_results.json
--------------------------------------------------------------------------------
/eval_results/GPT3_5_eval_results.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dvlab-research/MR-GSM8K/HEAD/eval_results/GPT3_5_eval_results.json
--------------------------------------------------------------------------------
/eval_results/GPT4_eval_results.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dvlab-research/MR-GSM8K/HEAD/eval_results/GPT4_eval_results.json
--------------------------------------------------------------------------------
/eval_results/gpt4-grading/Claude_eval_results.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dvlab-research/MR-GSM8K/HEAD/eval_results/gpt4-grading/Claude_eval_results.json
--------------------------------------------------------------------------------
/eval_results/gpt4-grading/GPT3_5_eval_results.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dvlab-research/MR-GSM8K/HEAD/eval_results/gpt4-grading/GPT3_5_eval_results.json
--------------------------------------------------------------------------------
/eval_results/gpt4-grading/GPT4_eval_results.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dvlab-research/MR-GSM8K/HEAD/eval_results/gpt4-grading/GPT4_eval_results.json
--------------------------------------------------------------------------------
/eval_results/gpt4-grading/llama2_70b_eval_results.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dvlab-research/MR-GSM8K/HEAD/eval_results/gpt4-grading/llama2_70b_eval_results.json
--------------------------------------------------------------------------------
/eval_results/gpt4-grading/mammoth_70B_eval_results.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dvlab-research/MR-GSM8K/HEAD/eval_results/gpt4-grading/mammoth_70B_eval_results.json
--------------------------------------------------------------------------------
/eval_results/gpt4-grading/metamath_70B_eval_results.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dvlab-research/MR-GSM8K/HEAD/eval_results/gpt4-grading/metamath_70B_eval_results.json
--------------------------------------------------------------------------------
/eval_results/gpt4-grading/wizardmath_70B_eval_results.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dvlab-research/MR-GSM8K/HEAD/eval_results/gpt4-grading/wizardmath_70B_eval_results.json
--------------------------------------------------------------------------------
/eval_results/llama2_70b_eval_results.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dvlab-research/MR-GSM8K/HEAD/eval_results/llama2_70b_eval_results.json
--------------------------------------------------------------------------------
/eval_results/mammoth_70B_eval_results.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dvlab-research/MR-GSM8K/HEAD/eval_results/mammoth_70B_eval_results.json
--------------------------------------------------------------------------------
/eval_results/metamath_70B_eval_results.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dvlab-research/MR-GSM8K/HEAD/eval_results/metamath_70B_eval_results.json
--------------------------------------------------------------------------------
/eval_results/wizardmath_70B_eval_results.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dvlab-research/MR-GSM8K/HEAD/eval_results/wizardmath_70B_eval_results.json
--------------------------------------------------------------------------------
/images/illustration.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dvlab-research/MR-GSM8K/HEAD/images/illustration.png
--------------------------------------------------------------------------------
/scripts/eval_mr_gsm8k.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dvlab-research/MR-GSM8K/HEAD/scripts/eval_mr_gsm8k.py
--------------------------------------------------------------------------------
/scripts/run.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dvlab-research/MR-GSM8K/HEAD/scripts/run.sh
--------------------------------------------------------------------------------
/scripts/train_math.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dvlab-research/MR-GSM8K/HEAD/scripts/train_math.py
--------------------------------------------------------------------------------