├── README.md
├── eval.py
├── grpo_qwen.py
├── gsm8k_eval_results_GRPO.json
├── gsm8k_eval_results_Original.json
├── gsm8k_eval_results_SFT.json
└── sft_qwen.py

/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mingyin0312/RL4LLM/HEAD/README.md
--------------------------------------------------------------------------------
/eval.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mingyin0312/RL4LLM/HEAD/eval.py
--------------------------------------------------------------------------------
/grpo_qwen.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mingyin0312/RL4LLM/HEAD/grpo_qwen.py
--------------------------------------------------------------------------------
/gsm8k_eval_results_GRPO.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mingyin0312/RL4LLM/HEAD/gsm8k_eval_results_GRPO.json
--------------------------------------------------------------------------------
/gsm8k_eval_results_Original.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mingyin0312/RL4LLM/HEAD/gsm8k_eval_results_Original.json
--------------------------------------------------------------------------------
/gsm8k_eval_results_SFT.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mingyin0312/RL4LLM/HEAD/gsm8k_eval_results_SFT.json
--------------------------------------------------------------------------------
/sft_qwen.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mingyin0312/RL4LLM/HEAD/sft_qwen.py
--------------------------------------------------------------------------------