├── .gitignore
├── LICENSE
├── Makefile
├── README.md
├── analysis
    ├── eval_self_reflect.py
    ├── math_grader.py
    └── script.sh
├── asset
    └── oat-zero-results.jpg
├── data
    ├── math_train_500.json
    └── math_train_500_r1.json
└── training
    ├── countdown.py
    ├── run_grpo.sh
    └── zero_countdown.py

/.gitignore:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sail-sg/oat-zero/HEAD/.gitignore
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sail-sg/oat-zero/HEAD/LICENSE
--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sail-sg/oat-zero/HEAD/Makefile
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sail-sg/oat-zero/HEAD/README.md
--------------------------------------------------------------------------------
/analysis/eval_self_reflect.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sail-sg/oat-zero/HEAD/analysis/eval_self_reflect.py
--------------------------------------------------------------------------------
/analysis/math_grader.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sail-sg/oat-zero/HEAD/analysis/math_grader.py
--------------------------------------------------------------------------------
/analysis/script.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sail-sg/oat-zero/HEAD/analysis/script.sh
--------------------------------------------------------------------------------
/asset/oat-zero-results.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sail-sg/oat-zero/HEAD/asset/oat-zero-results.jpg
--------------------------------------------------------------------------------
/data/math_train_500.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sail-sg/oat-zero/HEAD/data/math_train_500.json
--------------------------------------------------------------------------------
/data/math_train_500_r1.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sail-sg/oat-zero/HEAD/data/math_train_500_r1.json
--------------------------------------------------------------------------------
/training/countdown.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sail-sg/oat-zero/HEAD/training/countdown.py
--------------------------------------------------------------------------------
/training/run_grpo.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sail-sg/oat-zero/HEAD/training/run_grpo.sh
--------------------------------------------------------------------------------
/training/zero_countdown.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sail-sg/oat-zero/HEAD/training/zero_countdown.py
--------------------------------------------------------------------------------