├── .gitignore
├── LICENSE
├── README.md
├── chatglm_tokenizer
    ├── tokenization_chatglm.py
    ├── tokenizer.model
    └── tokenizer_config.json
├── data_process.py
├── dataset.py
├── dataset_sft.py
├── eval.py
├── eval_pretrain.py
├── loss_tokens-v1.png
├── loss_tokens-v3.png
├── loss_tokens.png
├── model.py
├── pretrain.py
├── requirements.txt
├── sft.py
└── sft_data_process.py

/.gitignore:
--------------------------------------------------------------------------------
data/

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YoYiL/llama2-chinese/HEAD/LICENSE

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YoYiL/llama2-chinese/HEAD/README.md

--------------------------------------------------------------------------------
/chatglm_tokenizer/tokenization_chatglm.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YoYiL/llama2-chinese/HEAD/chatglm_tokenizer/tokenization_chatglm.py

--------------------------------------------------------------------------------
/chatglm_tokenizer/tokenizer.model:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YoYiL/llama2-chinese/HEAD/chatglm_tokenizer/tokenizer.model

--------------------------------------------------------------------------------
/chatglm_tokenizer/tokenizer_config.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YoYiL/llama2-chinese/HEAD/chatglm_tokenizer/tokenizer_config.json

--------------------------------------------------------------------------------
/data_process.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YoYiL/llama2-chinese/HEAD/data_process.py

--------------------------------------------------------------------------------
/dataset.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YoYiL/llama2-chinese/HEAD/dataset.py

--------------------------------------------------------------------------------
/dataset_sft.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YoYiL/llama2-chinese/HEAD/dataset_sft.py

--------------------------------------------------------------------------------
/eval.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YoYiL/llama2-chinese/HEAD/eval.py

--------------------------------------------------------------------------------
/eval_pretrain.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YoYiL/llama2-chinese/HEAD/eval_pretrain.py

--------------------------------------------------------------------------------
/loss_tokens-v1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YoYiL/llama2-chinese/HEAD/loss_tokens-v1.png

--------------------------------------------------------------------------------
/loss_tokens-v3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YoYiL/llama2-chinese/HEAD/loss_tokens-v3.png

--------------------------------------------------------------------------------
/loss_tokens.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YoYiL/llama2-chinese/HEAD/loss_tokens.png

--------------------------------------------------------------------------------
/model.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YoYiL/llama2-chinese/HEAD/model.py

--------------------------------------------------------------------------------
/pretrain.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YoYiL/llama2-chinese/HEAD/pretrain.py

--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YoYiL/llama2-chinese/HEAD/requirements.txt

--------------------------------------------------------------------------------
/sft.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YoYiL/llama2-chinese/HEAD/sft.py

--------------------------------------------------------------------------------
/sft_data_process.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YoYiL/llama2-chinese/HEAD/sft_data_process.py

--------------------------------------------------------------------------------