├── .gitignore
├── 1. TinyStories
│   ├── README.md
│   ├── TinyStories.ipynb
│   └── requirements.txt
├── 2. Lora_from_scratch
│   ├── .~LoRA from scratch.md
│   ├── LoRA-from-scratch.ipynb
│   ├── LoRA-from-scratch.md
│   ├── README.md
│   ├── img
│   │   ├── llama_structure.png
│   │   ├── lora.png
│   │   └── transformer.png
│   └── requirements.txt
├── 3. Qwen2.5 Technical Report Analysis
│   ├── Qwen2.5-Coder 技术报告解读.md
│   ├── Qwen2.5-Math 技术报告解读.md
│   └── static
│       ├── coder
│       │   ├── data_mixture.png
│       │   ├── file_fim_format.png
│       │   ├── repo_fim_format.png
│       │   ├── special_tokens.png
│       │   ├── text_code_grounding_data_perf.png
│       │   └── training_pipeline.png
│       └── math
│           ├── mammoth2-qa.png
│           └── pipeline.png
├── 4. Async
│   ├── Async.md
│   ├── README.md
│   ├── gsm8k-async.py
│   ├── gsm8k-sync.py
│   ├── gsm8k-test.json
│   ├── requirements.txt
│   └── story.py
├── 5. Hybrid Reasoning of Qwen3
│   ├── Qwen3是如何实现混合推理(快慢思考)的.md
│   └── post-training.png
├── LICENSE
└── README.md

/.gitignore:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/.gitignore
--------------------------------------------------------------------------------

/1. TinyStories/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/1. TinyStories/README.md
--------------------------------------------------------------------------------

/1. TinyStories/TinyStories.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/1. TinyStories/TinyStories.ipynb
--------------------------------------------------------------------------------

/1. TinyStories/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/1. TinyStories/requirements.txt
--------------------------------------------------------------------------------

/2. Lora_from_scratch/.~LoRA from scratch.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/2. Lora_from_scratch/.~LoRA from scratch.md
--------------------------------------------------------------------------------

/2. Lora_from_scratch/LoRA-from-scratch.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/2. Lora_from_scratch/LoRA-from-scratch.ipynb
--------------------------------------------------------------------------------

/2. Lora_from_scratch/LoRA-from-scratch.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/2. Lora_from_scratch/LoRA-from-scratch.md
--------------------------------------------------------------------------------

/2. Lora_from_scratch/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/2. Lora_from_scratch/README.md
--------------------------------------------------------------------------------

/2. Lora_from_scratch/img/llama_structure.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/2. Lora_from_scratch/img/llama_structure.png
--------------------------------------------------------------------------------

/2. Lora_from_scratch/img/lora.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/2. Lora_from_scratch/img/lora.png
--------------------------------------------------------------------------------

/2. Lora_from_scratch/img/transformer.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/2. Lora_from_scratch/img/transformer.png
--------------------------------------------------------------------------------

/2. Lora_from_scratch/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/2. Lora_from_scratch/requirements.txt
--------------------------------------------------------------------------------

/3. Qwen2.5 Technical Report Analysis/Qwen2.5-Coder 技术报告解读.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/3. Qwen2.5 Technical Report Analysis/Qwen2.5-Coder 技术报告解读.md
--------------------------------------------------------------------------------

/3. Qwen2.5 Technical Report Analysis/Qwen2.5-Math 技术报告解读.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/3. Qwen2.5 Technical Report Analysis/Qwen2.5-Math 技术报告解读.md
--------------------------------------------------------------------------------

/3. Qwen2.5 Technical Report Analysis/static/coder/data_mixture.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/3. Qwen2.5 Technical Report Analysis/static/coder/data_mixture.png
--------------------------------------------------------------------------------

/3. Qwen2.5 Technical Report Analysis/static/coder/file_fim_format.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/3. Qwen2.5 Technical Report Analysis/static/coder/file_fim_format.png
--------------------------------------------------------------------------------

/3. Qwen2.5 Technical Report Analysis/static/coder/repo_fim_format.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/3. Qwen2.5 Technical Report Analysis/static/coder/repo_fim_format.png
--------------------------------------------------------------------------------

/3. Qwen2.5 Technical Report Analysis/static/coder/special_tokens.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/3. Qwen2.5 Technical Report Analysis/static/coder/special_tokens.png
--------------------------------------------------------------------------------

/3. Qwen2.5 Technical Report Analysis/static/coder/text_code_grounding_data_perf.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/3. Qwen2.5 Technical Report Analysis/static/coder/text_code_grounding_data_perf.png
--------------------------------------------------------------------------------

/3. Qwen2.5 Technical Report Analysis/static/coder/training_pipeline.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/3. Qwen2.5 Technical Report Analysis/static/coder/training_pipeline.png
--------------------------------------------------------------------------------

/3. Qwen2.5 Technical Report Analysis/static/math/mammoth2-qa.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/3. Qwen2.5 Technical Report Analysis/static/math/mammoth2-qa.png
--------------------------------------------------------------------------------

/3. Qwen2.5 Technical Report Analysis/static/math/pipeline.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/3. Qwen2.5 Technical Report Analysis/static/math/pipeline.png
--------------------------------------------------------------------------------

/4. Async/Async.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/4. Async/Async.md
--------------------------------------------------------------------------------

/4. Async/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/4. Async/README.md
--------------------------------------------------------------------------------

/4. Async/gsm8k-async.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/4. Async/gsm8k-async.py
--------------------------------------------------------------------------------

/4. Async/gsm8k-sync.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/4. Async/gsm8k-sync.py
--------------------------------------------------------------------------------

/4. Async/gsm8k-test.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/4. Async/gsm8k-test.json
--------------------------------------------------------------------------------

/4. Async/requirements.txt:
--------------------------------------------------------------------------------
openai>=1.20.0
loguru>=0.7.0
tqdm>=4.64.1
--------------------------------------------------------------------------------

/4. Async/story.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/4. Async/story.py
--------------------------------------------------------------------------------

/5. Hybrid Reasoning of Qwen3/Qwen3是如何实现混合推理(快慢思考)的.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/5. Hybrid Reasoning of Qwen3/Qwen3是如何实现混合推理(快慢思考)的.md
--------------------------------------------------------------------------------

/5. Hybrid Reasoning of Qwen3/post-training.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/5. Hybrid Reasoning of Qwen3/post-training.png
--------------------------------------------------------------------------------

/LICENSE:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/LICENSE
--------------------------------------------------------------------------------

/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Mxoder/LLM-from-scratch/HEAD/README.md
--------------------------------------------------------------------------------