├── .gitignore
├── README.md
├── Slides.pdf
├── gpt_sentiment.py
├── hugging-face-code-commented
│   └── trl
│       ├── core_commented.py
│       ├── core_original.py
│       └── trainer
│           ├── ppo_trainer_commented.py
│           └── ppo_trainer_original.py
└── requirements.txt

/.gitignore:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hkproj/rlhf-ppo/HEAD/.gitignore
--------------------------------------------------------------------------------

/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hkproj/rlhf-ppo/HEAD/README.md
--------------------------------------------------------------------------------

/Slides.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hkproj/rlhf-ppo/HEAD/Slides.pdf
--------------------------------------------------------------------------------

/gpt_sentiment.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hkproj/rlhf-ppo/HEAD/gpt_sentiment.py
--------------------------------------------------------------------------------

/hugging-face-code-commented/trl/core_commented.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hkproj/rlhf-ppo/HEAD/hugging-face-code-commented/trl/core_commented.py
--------------------------------------------------------------------------------

/hugging-face-code-commented/trl/core_original.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hkproj/rlhf-ppo/HEAD/hugging-face-code-commented/trl/core_original.py
--------------------------------------------------------------------------------

/hugging-face-code-commented/trl/trainer/ppo_trainer_commented.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hkproj/rlhf-ppo/HEAD/hugging-face-code-commented/trl/trainer/ppo_trainer_commented.py
--------------------------------------------------------------------------------

/hugging-face-code-commented/trl/trainer/ppo_trainer_original.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hkproj/rlhf-ppo/HEAD/hugging-face-code-commented/trl/trainer/ppo_trainer_original.py
--------------------------------------------------------------------------------

/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hkproj/rlhf-ppo/HEAD/requirements.txt
--------------------------------------------------------------------------------