├── LICENSE
├── README.md
├── chunk_and_tokenize_datasets.py
├── create_tokenizer.py
├── requirements.txt
├── t5_data_collator.py
└── train_model.py

/LICENSE:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huggingface/olm-training/HEAD/LICENSE
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huggingface/olm-training/HEAD/README.md
--------------------------------------------------------------------------------
/chunk_and_tokenize_datasets.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huggingface/olm-training/HEAD/chunk_and_tokenize_datasets.py
--------------------------------------------------------------------------------
/create_tokenizer.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huggingface/olm-training/HEAD/create_tokenizer.py
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huggingface/olm-training/HEAD/requirements.txt
--------------------------------------------------------------------------------
/t5_data_collator.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huggingface/olm-training/HEAD/t5_data_collator.py
--------------------------------------------------------------------------------
/train_model.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/huggingface/olm-training/HEAD/train_model.py
--------------------------------------------------------------------------------