├── LICENSE
├── README.md
├── ar
│   ├── README.md
│   └── cheatsheet-transformers-large-language-models.pdf
├── en
│   ├── README.md
│   └── cheatsheet-transformers-large-language-models.pdf
├── es
│   ├── README.md
│   └── cheatsheet-transformers-large-language-models.pdf
├── fa
│   ├── README.md
│   └── cheatsheet-transformers-large-language-models.pdf
├── fr
│   ├── README.md
│   └── cheatsheet-transformers-large-language-models.pdf
├── it
│   ├── README.md
│   └── cheatsheet-transformers-large-language-models.pdf
├── ja
│   ├── README.md
│   └── cheatsheet-transformers-large-language-models.pdf
├── ko
│   ├── README.md
│   └── cheatsheet-transformers-large-language-models.pdf
├── th
│   ├── README.md
│   └── cheatsheet-transformers-large-language-models.pdf
├── tr
│   ├── README.md
│   └── cheatsheet-transformers-large-language-models.pdf
└── zh
    ├── README.md
    └── cheatsheet-transformers-large-language-models.pdf

/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2025 Afshine Amidi and Shervine Amidi

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Transformers & LLMs cheatsheet for Stanford's CME 295
Available in [العربية](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ar) - [English](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/en) - [Español](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/es) - [فارسی](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fa) - [Français](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fr) - [Italiano](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/it) - [日本語](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ja) - [한국어](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ko) - [ไทย](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/th) - [Türkçe](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/tr) - [中文](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/zh)

## Goal
This repository sums up in one place all the important notions covered in Stanford's CME 295 Transformers & Large Language Models course. It includes:
- **Transformers**: self-attention (a minimal sketch is included below), architecture, variants, optimization techniques (sparse attention, low-rank attention, flash attention)
- **LLMs**: prompting, finetuning (SFT, LoRA), preference tuning, optimization techniques (mixture of experts, distillation, quantization)
- **Applications**: LLM-as-a-judge, RAG, agents, reasoning models (train-time and test-time scaling from DeepSeek-R1)

## Content
### VIP Cheatsheet
Illustration

## Class textbook
This VIP cheatsheet gives an overview of the "Super Study Guide: Transformers & Large Language Models" book, which contains ~600 illustrations over 250 pages and covers these concepts in depth. You can find more details at https://superstudy.guide.
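
## Self-attention at a glance
As a quick taste of the "Transformers" material listed above, here is a minimal NumPy sketch of scaled dot-product self-attention. It is illustrative only: the function names, shapes, and random inputs are our own choices, not course code.

```python
# Minimal scaled dot-product self-attention, for intuition only.
# Names, shapes, and inputs are illustrative, not course code.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """X: (seq_len, d_model); Wq, Wk, Wv: (d_model, d_k) projection matrices."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # (seq_len, seq_len) pairwise scores
    weights = softmax(scores, axis=-1)       # each row: a distribution over positions
    return weights @ V                       # (seq_len, d_k) weighted mix of values

rng = np.random.default_rng(0)
seq_len, d_model, d_k = 5, 16, 8
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_k)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)  # (5, 8)
```

Each output row is a weighted average of the value vectors, with weights given by how strongly that position's query matches every key.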

## Class website
[cme295.stanford.edu](https://cme295.stanford.edu/)

## Authors
[Afshine Amidi](https://www.linkedin.com/in/afshineamidi/) (Ecole Centrale Paris, MIT) and [Shervine Amidi](https://www.linkedin.com/in/shervineamidi/) (Ecole Centrale Paris, Stanford University)

--------------------------------------------------------------------------------
/ar/README.md:
--------------------------------------------------------------------------------
# وُرَيقات لمقرر ستانفورد ٢٩٥ CME للمحولات والنماذج اللغوية الضخمة
متوفِّر: **العربية** - [English](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/en) - [Español](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/es) - [فارسی](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fa) - [Français](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fr) - [Italiano](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/it) - [日本語](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ja) - [한국어](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ko) - [ไทย](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/th) - [Türkçe](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/tr) - [中文](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/zh)


## الهدف
هذا المخزن يهدف لتلخيص أهم المفاهيم من مقرر ٢٩٥ CME للمحولات والنماذج اللغوية الضخمة في مكان واحد، والذي يتضمن:
- **المحولات**: الانتباه الذاتي، الهيكلية، التشكلات، أساليب الاستمثال (الانتباه المتباعد، انتباه الرتبة الدنيا، الانتباه الومضي)
- **النماذج اللغوية الضخمة**: الأوامر والضبط الدقيق (المعيرة المهيكلة، التكايف الرتيب) والتنسيق التفضيلي، وأساليب الاستمثال (مخلط الخبراء، التقطير، التكمية)
- **التطبيقات**: النموذج اللغوي الضخم الحكم، الاسترجاع المعزز بالتوليد، الوُكلاء، النماذج المستدلة (بالقياس وقت التدريب والاختبار من دِيبِ سيك R1)

## المحتوى
### VIP Cheatsheet
Illustration

## كتاب المقرر
هذه الوُريقة الشريفة تُعطي صورة عامة عما في كتاب "Super Study Guide: Transformers & Large Language Models" الحاوي أكثر من ٦٠٠ رسمٍ بياني في ٢٥٠ صفحة. لمزيد من التفاصيل: https://superstudy.guide.

## موقع المقرر
[cme295.stanford.edu](https://cme295.stanford.edu/)

## المؤلفون
[أفشين أميدي](https://www.linkedin.com/in/afshineamidi/) (إيكول سنترال باريس، معهد ماساتشوستس للتقانة) و[شروين عميدي](https://www.linkedin.com/in/shervineamidi/) (إيكول سنترال باريس، جامعة ستانفورد)

## المترجم
سري السباعي (جامعة الأمير سلطان)
--------------------------------------------------------------------------------
/ar/cheatsheet-transformers-large-language-models.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/afshinea/stanford-cme-295-transformers-large-language-models/b5eef9a0168f4f460b0382ed1fbb506fe8beaf2d/ar/cheatsheet-transformers-large-language-models.pdf
--------------------------------------------------------------------------------
/en/README.md:
--------------------------------------------------------------------------------
# Transformers & LLMs cheatsheet for Stanford's CME 295
Available in [العربية](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ar) - **English** - [Español](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/es) - [فارسی](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fa) - [Français](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fr) - [Italiano](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/it) - [日本語](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ja) - [한국어](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ko) - [ไทย](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/th) - [Türkçe](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/tr) - [中文](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/zh)

## Goal
This repository sums up in one place all the important notions covered in Stanford's CME 295 Transformers & Large Language Models course. It includes:
- **Transformers**: self-attention, architecture, variants, optimization techniques (sparse attention, low-rank attention, flash attention)
- **LLMs**: prompting, finetuning (SFT, LoRA; a minimal LoRA sketch is included below), preference tuning, optimization techniques (mixture of experts, distillation, quantization)
- **Applications**: LLM-as-a-judge, RAG, agents, reasoning models (train-time and test-time scaling from DeepSeek-R1)

## Content
### VIP Cheatsheet
Illustration

## Class textbook
This VIP cheatsheet gives an overview of the "Super Study Guide: Transformers & Large Language Models" book, which contains ~600 illustrations over 250 pages. You can find more details at https://superstudy.guide.
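
## LoRA at a glance
The finetuning bullet above mentions LoRA (low-rank adaptation). Here is a minimal NumPy sketch of the idea, under our own illustrative names and shapes (not course code): the pretrained weight stays frozen, and only a low-rank correction `B @ A` is trained.

```python
# Minimal LoRA-style adapted layer, for intuition only.
# Names, shapes, and the scaling choice are illustrative, not course code.
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 32, 64, 4             # rank r is much smaller than d_out and d_in

W = rng.normal(size=(d_out, d_in))     # pretrained weight, kept frozen
A = rng.normal(size=(r, d_in)) * 0.01  # trainable, small random init
B = np.zeros((d_out, r))               # trainable, zero init => no change at start

def lora_forward(x, alpha=8.0):
    """Apply the adapted weight W + (alpha / r) * B @ A to input x."""
    return (W + (alpha / r) * B @ A) @ x

x = rng.normal(size=(d_in,))
print(np.allclose(lora_forward(x), W @ x))  # True: with B = 0 the model is unchanged
```

Training updates only A and B, i.e. r * (d_in + d_out) parameters instead of the full d_in * d_out, which is what makes LoRA finetuning cheap.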

## Class website
[cme295.stanford.edu](https://cme295.stanford.edu/)

## Authors
[Afshine Amidi](https://www.linkedin.com/in/afshineamidi/) (Ecole Centrale Paris, MIT) and [Shervine Amidi](https://www.linkedin.com/in/shervineamidi/) (Ecole Centrale Paris, Stanford University)
--------------------------------------------------------------------------------
/en/cheatsheet-transformers-large-language-models.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/afshinea/stanford-cme-295-transformers-large-language-models/b5eef9a0168f4f460b0382ed1fbb506fe8beaf2d/en/cheatsheet-transformers-large-language-models.pdf
--------------------------------------------------------------------------------
/es/README.md:
--------------------------------------------------------------------------------
# Guía sobre Transformers y Grandes Modelos de Lenguaje para el curso CME 295 de Stanford
Disponible en: [العربية](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ar) - [English](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/en) - **Español** - [فارسی](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fa) - [Français](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fr) - [Italiano](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/it) - [日本語](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ja) - [한국어](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ko) - [ไทย](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/th) - [Türkçe](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/tr) - [中文](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/zh)

## Objetivo
Este repositorio tiene como objetivo resumir en un solo lugar todas las nociones importantes que se cubren en el curso CME 295 Transformers & Large Language Models de Stanford. Incluye:
- **Transformers**: auto-atención, arquitectura, variantes, técnicas de optimización (atención dispersa, atención de bajo rango, atención flash)
- **LLMs**: prompting, ajuste fino (SFT, LoRA), ajuste de preferencias, técnicas de optimización (mezcla de expertos, destilación, cuantización)
- **Aplicaciones**: LLM como juez, RAG, agentes, modelos de razonamiento (escalado en tiempo de entrenamiento y en tiempo de prueba de DeepSeek-R1)

## Contenido
### VIP Cheatsheet
Illustration

## Libro de texto
Esta VIP cheatsheet ofrece un resumen del contenido del libro "Super Study Guide: Transformers & Large Language Models", que incluye ~600 ilustraciones a lo largo de 250 páginas. Puedes encontrar más detalles en https://superstudy.guide.

## Sitio web de la clase
[cme295.stanford.edu](https://cme295.stanford.edu/)

## Autores
[Afshine Amidi](https://www.linkedin.com/in/afshineamidi/) (Ecole Centrale Paris, MIT) y [Shervine Amidi](https://www.linkedin.com/in/shervineamidi/) (Ecole Centrale Paris, Stanford University)

## Traductores
Steven Van Vaerenbergh (Universidad de Cantabria) y Lara Lloret Iglesias (Instituto de Física de Cantabria - CSIC/UC)
--------------------------------------------------------------------------------
/es/cheatsheet-transformers-large-language-models.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/afshinea/stanford-cme-295-transformers-large-language-models/b5eef9a0168f4f460b0382ed1fbb506fe8beaf2d/es/cheatsheet-transformers-large-language-models.pdf
--------------------------------------------------------------------------------
/fa/README.md:
--------------------------------------------------------------------------------
# چیت‌شیت‌ ترانسفورمرها و مدل‌های زبانی بزرگ کلاس CME ۲۹۵ در دانشگاه استنفورد
قابل دسترسی به زبان‌های: [العربية](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ar) - [English](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/en) - [Español](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/es) - **فارسی** - [Français](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fr) - [Italiano](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/it) - [日本語](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ja) - [한국어](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ko) - [ไทย](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/th) - [Türkçe](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/tr) - [中文](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/zh)


## هدف
این مجموعه خلاصه‌ای متمرکز از مفاهیم کلیدی دورۀ CME ۲۹۵ دانشگاه استنفورد دربارۀ ترانسفورمرها و مدل‌های زبانی بزرگ ارائه می‌دهد، از جمله:
- **ترانسفورمرها**: توجه خودکار، معماری، انواع، و بهینه‌سازی (توجه پراکنده، کم‌رتبه، فلش)
- **مدل‌های زبانی بزرگ**: پرسش‌سازی، تنظیم دقیق (SFT, LoRA)، تنظیم ترجیحی، و تکنیک‌های بهینه‌سازی (ترکیب متخصصان، تقطیر، کوانتیزاسیون)
- **کاربردها**: مدل قاضی، RAG، عامل‌ها، و مدل‌های استدلالی (مقیاس‌پذیری در آموزش/آزمون از DeepSeek-R1)

## مطالب
### VIP Cheatsheet
Illustration

## کتاب کلاس
این چیت‌شیت ویژه، مروری کلی بر محتوای کتاب "Super Study Guide: Transformers & Large Language Models" ارائه می‌دهد — کتابی شامل حدود ۶۰۰ نگاره در ۲۵۰ صفحه که این مفاهیم را به‌صورت عمیق تحلیل می‌کند. اطلاعات بیشتر در https://superstudy.guide در دسترس است.

## وب‌سایت کلاس
[cme295.stanford.edu](https://cme295.stanford.edu/)

## نویسندگان
[افشین عمیدی](https://www.linkedin.com/in/afshineamidi/) (دانشکده مرکزی پاریس، مؤسسه فناوری ماساچوست) و [شروین عمیدی](https://www.linkedin.com/in/shervineamidi/) (دانشکده مرکزی پاریس، دانشگاه استنفورد)

## مترجم
امیر ضیائی (دانشگاه کالیفرنیا، برکلی)
--------------------------------------------------------------------------------
/fa/cheatsheet-transformers-large-language-models.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/afshinea/stanford-cme-295-transformers-large-language-models/b5eef9a0168f4f460b0382ed1fbb506fe8beaf2d/fa/cheatsheet-transformers-large-language-models.pdf
--------------------------------------------------------------------------------
/fr/README.md:
--------------------------------------------------------------------------------
# Pense-bête de Transformeurs et LLMs pour CME 295 de Stanford
Disponible en [العربية](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ar) - [English](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/en) - [Español](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/es) - [فارسی](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fa) - **Français** - [Italiano](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/it) - [日本語](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ja) - [한국어](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ko) - [ไทย](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/th) - [Türkçe](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/tr) - [中文](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/zh)

## But
Ce repo a pour but de résumer toutes les notions importantes du cours de Transformeurs et LLMs (CME 295) de Stanford. En particulier, il inclut :
- **Transformeurs** : auto-attention, architecture, variantes, optimisations (attention éparse, attention de bas rang, flash attention)
- **LLMs** : prompting, ajustement fin (SFT, LoRA), ajustement par préférence, optimisations (mélange d'experts, distillation, quantification)
- **Applications** : LLM comme juge, RAG, agents, modèles de raisonnement (extension à l'inférence et à l'entraînement de DeepSeek-R1)

## Contenu
### VIP Cheatsheet
Illustration

## Manuel scolaire
Cette VIP cheatsheet donne un aperçu du contenu du livre intitulé « Super Study Guide : Transformeurs et Grands Modèles de Langage », qui contient environ 600 illustrations réparties sur 250 pages. Pour plus d'information, veuillez visiter : https://superstudy.guide.

## Site du cours
[cme295.stanford.edu](https://cme295.stanford.edu/)

## Auteurs
[Afshine Amidi](https://www.linkedin.com/in/afshineamidi/) (École Centrale Paris, MIT) et [Shervine Amidi](https://www.linkedin.com/in/shervineamidi/) (École Centrale Paris, Université de Stanford)
--------------------------------------------------------------------------------
/fr/cheatsheet-transformers-large-language-models.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/afshinea/stanford-cme-295-transformers-large-language-models/b5eef9a0168f4f460b0382ed1fbb506fe8beaf2d/fr/cheatsheet-transformers-large-language-models.pdf
--------------------------------------------------------------------------------
/it/README.md:
--------------------------------------------------------------------------------
# Cheatsheet su Trasformatori e Modelli di linguaggio di grandi dimensioni per il corso CME 295 di Stanford
Disponibile in: [العربية](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ar) - [English](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/en) - [Español](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/es) - [فارسی](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fa) - [Français](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fr) - **Italiano** - [日本語](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ja) - [한국어](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ko) - [ไทย](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/th) - [Türkçe](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/tr) - [中文](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/zh)

## Obiettivo
Questo repository ha lo scopo di riassumere in un unico luogo tutte le nozioni importanti trattate nel corso CME 295 Transformers & Large Language Models di Stanford. Include:
- **Trasformatori**: auto-attenzione, architettura, varianti, tecniche di ottimizzazione (sparse attention, low-rank attention, flash attention)
- **LLMs**: prompting, finetuning (SFT, LoRA), preference tuning, tecniche di ottimizzazione (mixture of experts, distillation, quantization)
- **Applicazioni**: LLM come giudice, RAG, agenti, modelli di ragionamento (scaling train-time e test-time da DeepSeek-R1)

## Contenuto
### VIP Cheatsheet
Illustration

## Libro di testo del corso
Questo cheatsheet VIP fornisce una panoramica del contenuto del libro “Super Study Guide: Transformers & Large Language Models”, che contiene circa 600 illustrazioni in 250 pagine. Ulteriori dettagli sono disponibili sul sito https://superstudy.guide.

## Sito web del corso
[cme295.stanford.edu](https://cme295.stanford.edu/)

## Autori
[Afshine Amidi](https://www.linkedin.com/in/afshineamidi/) (Ecole Centrale Paris, MIT) e [Shervine Amidi](https://www.linkedin.com/in/shervineamidi/) (Ecole Centrale Paris, Università di Stanford)

## Traduttore
[Gianluca Guzzetta](https://www.linkedin.com/in/gianluca-guzzetta-1a778015b/) (PoliTO, UniSalento)
--------------------------------------------------------------------------------
/it/cheatsheet-transformers-large-language-models.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/afshinea/stanford-cme-295-transformers-large-language-models/b5eef9a0168f4f460b0382ed1fbb506fe8beaf2d/it/cheatsheet-transformers-large-language-models.pdf
--------------------------------------------------------------------------------
/ja/README.md:
--------------------------------------------------------------------------------
# スタンフォード大学 CME 295「Transformer と大規模言語モデル」のチートシート
対応言語:[العربية](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ar) - [English](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/en) - [Español](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/es) - [فارسی](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fa) - [Français](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fr) - [Italiano](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/it) - **日本語** - [한국어](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ko) - [ไทย](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/th) - [Türkçe](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/tr) - [中文](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/zh)

## 目的
このリポジトリは、スタンフォード大学 CME 295「Transformer と大規模言語モデル」で扱う重要概念を一箇所にまとめることを目的としています。内容は以下のとおりです。
- **Transformer**:Self-Attention、構造、派生モデル、最適化手法(Sparse Attention・低ランク Attention・Flash Attention)
- **大規模言語モデル**:プロンプト、ファインチューニング(SFT・LoRA)、プリファレンスチューニング、最適化手法(混合エキスパート・蒸留・量子化)
- **応用**:LLM-as-a-judge、RAG、エージェント、推論モデル(DeepSeek-R1 によるトレーニング時とテスト時のスケーリング)

## 内容
### VIP Cheatsheet
Illustration

## 講義テキスト
この VIP Cheatsheet は、『Super Study Guide: Transformer と大規模言語モデル』という書籍の概要です。この書籍には、250 ページにわたって約 600 点の図が含まれています。詳細は https://superstudy.guide をご覧ください。

## 講義ウェブサイト
[cme295.stanford.edu](https://cme295.stanford.edu/)

## 著者
[アフシン・アミディ](https://www.linkedin.com/in/afshineamidi/)(パリ中央工科大学、MIT)

[シェルビン・アミディ](https://www.linkedin.com/in/shervineamidi/)(パリ中央工科大学、スタンフォード大学)

## 訳者
中井 喜之
--------------------------------------------------------------------------------
/ja/cheatsheet-transformers-large-language-models.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/afshinea/stanford-cme-295-transformers-large-language-models/b5eef9a0168f4f460b0382ed1fbb506fe8beaf2d/ja/cheatsheet-transformers-large-language-models.pdf
--------------------------------------------------------------------------------
/ko/README.md:
--------------------------------------------------------------------------------
# 트랜스포머와 대형 언어 모델, 스탠포드 대학교, CME 295 핵심 요약본
지원 언어: [العربية](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ar) - [English](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/en) - [Español](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/es) - [فارسی](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fa) - [Français](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fr) - [Italiano](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/it) - [日本語](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ja) - **한국어** - [ไทย](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/th) - [Türkçe](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/tr) - [中文](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/zh)

## 목적
이 깃허브 리포지토리는 Stanford의 CME 295 트랜스포머와 대형 언어 모델 강의에서 배우는 중요한 개념들을 한 곳에 모아서 간편하게 살펴볼 수 있도록 정리한 것입니다. 포함 내용은 다음과 같습니다:
- **트랜스포머 (Transformers)**: 셀프 어텐션 (self-attention), 아키텍처, 변형, 최적화 기법 (스파스 어텐션, 로랭크 어텐션, 플래시 어텐션)
- **대형 언어 모델 (LLMs)**: 프롬프팅 (prompting), 파인 튜닝 (SFT, LoRA), 선호도 튜닝, 최적화 기법 (전문가 혼합 (mixture of experts), 지식 증류 (distillation), 양자화 (quantization))
- **응용**: LLM-as-a-Judge (LLM 으로 다른 모델을 평가하는 방식), RAG, 에이전트, 추론 모델 (DeepSeek-R1을 활용한 학습 단계 (train-time) 및 테스트 단계 (test-time) 에서의 스케일링)

## 내용
### VIP Cheatsheet
Illustration

## 수업 교재
이 VIP Cheatsheet 은 250페이지에 걸쳐 약 600장의 일러스트로 구성된 「Super Study Guide: Transformers & Large Language Models」 책에 담긴 내용을 간단히 요약해 보여줍니다. 더 자세한 내용은 https://superstudy.guide 에서 확인하실 수 있습니다.

## 수업 웹사이트
[cme295.stanford.edu](https://cme295.stanford.edu/)

## 저자
[압신 아미디](https://www.linkedin.com/in/afshineamidi/) (에콜 상트랄 파리, MIT) 및 [셰르빈 아미디](https://www.linkedin.com/in/shervineamidi/) (에콜 상트랄 파리, 스탠포드 대학교)

## 옮긴이
YJ (Yongjin) Kim 김용진
--------------------------------------------------------------------------------
/ko/cheatsheet-transformers-large-language-models.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/afshinea/stanford-cme-295-transformers-large-language-models/b5eef9a0168f4f460b0382ed1fbb506fe8beaf2d/ko/cheatsheet-transformers-large-language-models.pdf
--------------------------------------------------------------------------------
/th/README.md:
--------------------------------------------------------------------------------
# สรุปเนื้อหาเรื่องตัวแปลงและแบบจำลองภาษาขนาดใหญ่ สำหรับรายวิชา CME 295 ม.สแตนฟอร์ด
ฉบับภาษาต่างๆ: [العربية](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ar) - [English](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/en) - [Español](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/es) - [فارسی](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fa) - [Français](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fr) - [Italiano](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/it) - [日本語](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ja) - [한국어](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ko) - **ไทย** - [Türkçe](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/tr) - [中文](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/zh)

## เป้าหมาย
คลังนี้มุ่งรวบรวมหัวข้อสำคัญทั้งหมดที่กล่าวถึงในรายวิชา “CME 295 ตัวแปลงและแบบจำลองภาษาขนาดใหญ่” ของมหาวิทยาลัยสแตนฟอร์ดมาไว้ในที่เดียว มีเนื้อหาดังนี้:
- **ตัวแปลง (transformer)**: การเพ่งตน (self-attention), สถาปัตยกรรม (architecture), รูปแปรต่างๆ (variants), กลวิธีเพิ่มประสิทธิภาพ (เพ่งโหรง sparse attention, เพ่งแรงก์ต่ำ low-rank attention, เพ่งแฟลช flash attention)
- **LLM (แบบจำลองภาษาขนาดใหญ่)**: การบอก (prompting), การปรับละเอียด (finetuning; โดยใช้ SFT, LoRA), การปรับความชอบ (preference tuning), กลวิธีเพิ่มประสิทธิภาพ (เหล่าผู้เชี่ยวชาญ mixture of experts, การกลั่น distillation, การแจงหน่วย quantization)
- **การประยุกต์ใช้**: กรรมการ LLM (LLM-as-a-judge), RAG, ตัวแทน (agents), แบบจำลองให้เหตุผล (reasoning models), การย่อขยายตอนฝึกและตอนใช้งาน (train-time and test-time scaling) จาก DeepSeek-R1

## เนื้อหา
### VIP Cheatsheet
Illustration

## หนังสือเรียนประกอบรายวิชา
VIP Cheatsheet นี้ ให้ภาพรวมของเนื้อหาในหนังสือ "Super Study Guide: Transformers & Large Language Models" ซึ่งบรรจุภาพประกอบกว่า 600 ภาพใน 250 หน้า ผู้สนใจสามารถดูรายละเอียดเพิ่มเติมได้ที่ https://superstudy.guide

## เว็บไซต์รายวิชา
[cme295.stanford.edu](https://cme295.stanford.edu/)

## ผู้เขียน
[อัฟชิน อามิดี](https://www.linkedin.com/in/afshineamidi/) (Ecole Centrale Paris, MIT) และ [เชอร์วิน อามิดี](https://www.linkedin.com/in/shervineamidi/) (Ecole Centrale Paris, Stanford University)

## ผู้แปล
บดินทร์ พรวิลาวัณย์ (นักแปลอิสระ)
ชารินทร์ พลภาณุมาศ (นักแปลอิสระ)
--------------------------------------------------------------------------------
/th/cheatsheet-transformers-large-language-models.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/afshinea/stanford-cme-295-transformers-large-language-models/b5eef9a0168f4f460b0382ed1fbb506fe8beaf2d/th/cheatsheet-transformers-large-language-models.pdf
--------------------------------------------------------------------------------
/tr/README.md:
--------------------------------------------------------------------------------
# Stanford CME 295 için Dönüştürücüler & Büyük Dil Modelleri Özet Rehberi
Erişilebilir diller: [العربية](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ar) - [English](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/en) - [Español](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/es) - [فارسی](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fa) - [Français](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fr) - [Italiano](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/it) - [日本語](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ja) - [한국어](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ko) - [ไทย](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/th) - **Türkçe** - [中文](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/zh)

## Hedef
Bu repo, Stanford'un CME 295 Dönüştürücüler & Büyük Dil Modelleri (Transformers & Large Language Models) dersinde ele alınan tüm önemli kavramları aynı yerde toplamayı amaçlamaktadır. Şunları içerir:
- **Dönüştürücüler**: öz-dikkat, mimari, türler, optimizasyon teknikleri (seyrek dikkat, düşük dereceli dikkat, flash dikkat)
- **BDMler**: yönlendirme, ince ayarlama (SFT, LoRA), tercih ayarlaması, optimizasyon teknikleri (uzmanların karışımı, damıtma, kuantalama)
- **Uygulamalar**: Yargıç olarak BDM’ler, RAG, temsilciler, akıl yürütme modelleri (DeepSeek-R1'den eğitim-zamanı ve test-zamanı ölçekleme)

## İçerik
### VIP Cheatsheet
Illustration

## Ders kitabı
Bu VIP Özet Rehberi, 250 sayfadan oluşan ve yaklaşık 600 resim içeren "Super Study Guide: Dönüştürücüler & Büyük Dil Modelleri" kitabında bulunanlara genel bir bakış sunar. Daha fazla ayrıntıyı https://superstudy.guide adresinde bulabilirsiniz.

## Dersin internet sayfası
[cme295.stanford.edu](https://cme295.stanford.edu/)

## Yazarlar
[Afshine Amidi](https://www.linkedin.com/in/afshineamidi/) (Ecole Centrale Paris, MIT) ve [Shervine Amidi](https://www.linkedin.com/in/shervineamidi/) (Ecole Centrale Paris, Stanford Üniversitesi)

## Çevirmen
[Merve Ayyüce Kızrak](https://www.linkedin.com/in/merve-ayyuce-kizrak/) (Bahçeşehir Üniversitesi, YZ Mühendisliği Bölümü)
--------------------------------------------------------------------------------
/tr/cheatsheet-transformers-large-language-models.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/afshinea/stanford-cme-295-transformers-large-language-models/b5eef9a0168f4f460b0382ed1fbb506fe8beaf2d/tr/cheatsheet-transformers-large-language-models.pdf
--------------------------------------------------------------------------------
/zh/README.md:
--------------------------------------------------------------------------------
# 斯坦福大学 CME 295 课程:Transformer 与大语言模型速查表
可用语言:[العربية](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ar) - [English](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/en) - [Español](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/es) - [فارسی](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fa) - [Français](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fr) - [Italiano](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/it) - [日本語](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ja) - [한국어](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ko) - [ไทย](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/th) - [Türkçe](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/tr) - **中文**

## 目标
本仓库旨在集中汇总斯坦福大学 CME 295 “Transformer 与大语言模型”课程所涵盖的所有核心概念。其内容包括:
- **Transformer**:自注意力机制、架构、变体、优化技术(如稀疏注意力、低秩注意力、Flash Attention)
- **大语言模型 (LLM)**:提示 (prompting)、微调(SFT、LoRA)、偏好调优、优化技术(混合专家模型、知识蒸馏、量化)
- **应用**:LLM 作为评判者、检索增强生成 (RAG)、智能体、推理模型(来自 DeepSeek-R1 的训练时与测试时缩放技术)

## 内容
### VIP Cheatsheet
Illustration

## 课程教材
这份 VIP 速查表概述了《Super Study Guide: Transformer 与大语言模型》一书的核心内容。该书包含约 600 幅插图,全书超过 250 页。更多详情请访问:https://superstudy.guide

## 课程网站
[cme295.stanford.edu](https://cme295.stanford.edu/)

## 作者
[Afshine Amidi](https://www.linkedin.com/in/afshineamidi/) (巴黎中央理工学院, 麻省理工学院) 和 [Shervine Amidi](https://www.linkedin.com/in/shervineamidi/) (巴黎中央理工学院, 斯坦福大学)

## 译者
[Tao Yu(俞涛)](https://www.linkedin.com/in/taoyucmu/) 和 [Binbin Xiong(熊斌斌)](https://www.linkedin.com/in/binbin-xiong-51ab8a43/)
--------------------------------------------------------------------------------
/zh/cheatsheet-transformers-large-language-models.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/afshinea/stanford-cme-295-transformers-large-language-models/b5eef9a0168f4f460b0382ed1fbb506fe8beaf2d/zh/cheatsheet-transformers-large-language-models.pdf
--------------------------------------------------------------------------------