├── LICENSE
├── README.md
├── ar
│   ├── README.md
│   └── cheatsheet-transformers-large-language-models.pdf
├── en
│   ├── README.md
│   └── cheatsheet-transformers-large-language-models.pdf
├── es
│   ├── README.md
│   └── cheatsheet-transformers-large-language-models.pdf
├── fa
│   ├── README.md
│   └── cheatsheet-transformers-large-language-models.pdf
├── fr
│   ├── README.md
│   └── cheatsheet-transformers-large-language-models.pdf
├── it
│   ├── README.md
│   └── cheatsheet-transformers-large-language-models.pdf
├── ja
│   ├── README.md
│   └── cheatsheet-transformers-large-language-models.pdf
├── ko
│   ├── README.md
│   └── cheatsheet-transformers-large-language-models.pdf
├── th
│   ├── README.md
│   └── cheatsheet-transformers-large-language-models.pdf
├── tr
│   ├── README.md
│   └── cheatsheet-transformers-large-language-models.pdf
└── zh
    ├── README.md
    └── cheatsheet-transformers-large-language-models.pdf
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2025 Afshine Amidi and Shervine Amidi

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Transformers & LLMs cheatsheet for Stanford's CME 295
Available in [العربية](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ar) - [English](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/en) - [Español](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/es) - [فارسی](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fa) - [Français](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fr) - [Italiano](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/it) - [日本語](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ja) - [한국어](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ko) - [ไทย](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/th) - [Türkçe](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/tr) - [中文](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/zh)

## Goal
This repository aims to sum up in one place all the important notions covered in Stanford's CME 295 Transformers & Large Language Models course. It includes:
- **Transformers**: self-attention (see the sketch after this list), architecture, variants, optimization techniques (sparse attention, low-rank attention, flash attention)
- **LLMs**: prompting, finetuning (SFT, LoRA), preference tuning, optimization techniques (mixture of experts, distillation, quantization)
- **Applications**: LLM-as-a-judge, RAG, agents, reasoning models (train-time and test-time scaling from DeepSeek-R1)

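To give a concrete flavor of the first bullet above, here is a minimal NumPy sketch of single-head scaled dot-product self-attention. It is an illustrative sketch only; the function and variable names are ours, not taken from the course or the cheatsheet.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention: softmax(QK^T / sqrt(d)) V."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv                # project tokens to queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])         # scaled pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # attention-weighted mix of values

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                         # 4 tokens, embedding dimension 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)          # (4, 8)
```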
## Content
### VIP Cheatsheet


## Class textbook
This VIP cheatsheet gives an overview of what is in the "Super Study Guide: Transformers & Large Language Models" book, which contains ~600 illustrations over 250 pages and covers these concepts in depth. You can find more details at https://superstudy.guide.

## Class website
[cme295.stanford.edu](https://cme295.stanford.edu/)

## Authors
[Afshine Amidi](https://www.linkedin.com/in/afshineamidi/) (Ecole Centrale Paris, MIT) and [Shervine Amidi](https://www.linkedin.com/in/shervineamidi/) (Ecole Centrale Paris, Stanford University)
--------------------------------------------------------------------------------
/ar/README.md:
--------------------------------------------------------------------------------
# Transformers & Large Language Models cheatsheet for Stanford's CME 295
Available in: **العربية** - [English](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/en) - [Español](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/es) - [فارسی](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fa) - [Français](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fr) - [Italiano](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/it) - [日本語](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ja) - [한국어](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ko) - [ไทย](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/th) - [Türkçe](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/tr) - [中文](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/zh)

## Goal
This repository aims to sum up in one place the most important notions covered in the CME 295 Transformers & Large Language Models course. It includes:
- **Transformers**: self-attention, architecture, variants, optimization techniques (sparse attention, low-rank attention, flash attention)
- **LLMs**: prompting, finetuning (SFT, LoRA), preference tuning, optimization techniques (mixture of experts, distillation, quantization)
- **Applications**: LLM-as-a-judge, RAG, agents, reasoning models (train-time and test-time scaling from DeepSeek-R1)

## Content
### VIP Cheatsheet


## Class textbook
This VIP cheatsheet gives an overview of what is in the "Super Study Guide: Transformers & Large Language Models" book, which contains ~600 illustrations over 250 pages. You can find more details at https://superstudy.guide.

## Class website
[cme295.stanford.edu](https://cme295.stanford.edu/)

## Authors
[Afshine Amidi](https://www.linkedin.com/in/afshineamidi/) (Ecole Centrale Paris, MIT) and [Shervine Amidi](https://www.linkedin.com/in/shervineamidi/) (Ecole Centrale Paris, Stanford University)

## Translator
سري السباعي (Prince Sultan University)
--------------------------------------------------------------------------------
/ar/cheatsheet-transformers-large-language-models.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/afshinea/stanford-cme-295-transformers-large-language-models/b5eef9a0168f4f460b0382ed1fbb506fe8beaf2d/ar/cheatsheet-transformers-large-language-models.pdf
--------------------------------------------------------------------------------
/en/README.md:
--------------------------------------------------------------------------------
# Transformers & LLMs cheatsheet for Stanford's CME 295
Available in [العربية](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ar) - **English** - [Español](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/es) - [فارسی](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fa) - [Français](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fr) - [Italiano](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/it) - [日本語](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ja) - [한국어](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ko) - [ไทย](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/th) - [Türkçe](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/tr) - [中文](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/zh)

## Goal
This repository aims to sum up in one place all the important notions covered in Stanford's CME 295 Transformers & Large Language Models course. It includes:
- **Transformers**: self-attention, architecture, variants, optimization techniques (sparse attention, low-rank attention, flash attention)
- **LLMs**: prompting, finetuning (SFT, LoRA; see the sketch after this list), preference tuning, optimization techniques (mixture of experts, distillation, quantization)
- **Applications**: LLM-as-a-judge, RAG, agents, reasoning models (train-time and test-time scaling from DeepSeek-R1)

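As a concrete illustration of LoRA from the second bullet above, here is a minimal NumPy sketch: the pretrained weight W stays frozen and only the low-rank factors A and B are trained, so the adapted layer computes x W + (alpha / r) x A B with far fewer trainable parameters. This is an illustrative sketch only; the names, shapes, and initialization scheme are ours, not taken from the course materials.

```python
import numpy as np

d_in, d_out, r = 8, 8, 2                  # rank r much smaller than d_in, d_out

rng = np.random.default_rng(0)
W = rng.normal(size=(d_in, d_out))        # pretrained weight, kept frozen
A = rng.normal(size=(d_in, r)) * 0.01     # trainable down-projection
B = np.zeros((r, d_out))                  # zero init: the adapter starts as a no-op

def lora_layer(x, alpha=16.0):
    """Linear layer with a LoRA adapter: x W + (alpha / r) * x A B."""
    return x @ W + (alpha / r) * (x @ A @ B)

x = rng.normal(size=(4, d_in))            # batch of 4 inputs
print(lora_layer(x).shape)                # (4, 8); equals x @ W until A and B are trained
```

Training only A and B here means 2 x 8 x 2 = 32 parameters instead of the 64 in W, and the savings grow quickly with the layer dimensions.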
## Content
### VIP Cheatsheet


## Class textbook
This VIP cheatsheet gives an overview of what is in the "Super Study Guide: Transformers & Large Language Models" book, which contains ~600 illustrations over 250 pages. You can find more details at https://superstudy.guide.

## Class website
[cme295.stanford.edu](https://cme295.stanford.edu/)

## Authors
[Afshine Amidi](https://www.linkedin.com/in/afshineamidi/) (Ecole Centrale Paris, MIT) and [Shervine Amidi](https://www.linkedin.com/in/shervineamidi/) (Ecole Centrale Paris, Stanford University)
--------------------------------------------------------------------------------
/en/cheatsheet-transformers-large-language-models.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/afshinea/stanford-cme-295-transformers-large-language-models/b5eef9a0168f4f460b0382ed1fbb506fe8beaf2d/en/cheatsheet-transformers-large-language-models.pdf
--------------------------------------------------------------------------------
/es/README.md:
--------------------------------------------------------------------------------
# Transformers & Large Language Models cheatsheet for Stanford's CME 295
Available in: [العربية](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ar) - [English](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/en) - **Español** - [فارسی](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fa) - [Français](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fr) - [Italiano](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/it) - [日本語](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ja) - [한국어](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ko) - [ไทย](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/th) - [Türkçe](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/tr) - [中文](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/zh)

## Goal
This repository aims to sum up in one place all the important notions covered in Stanford's CME 295 Transformers & Large Language Models course. It includes:
- **Transformers**: self-attention, architecture, variants, optimization techniques (sparse attention, low-rank attention, flash attention)
- **LLMs**: prompting, finetuning (SFT, LoRA), preference tuning, optimization techniques (mixture of experts, distillation, quantization)
- **Applications**: LLM-as-a-judge, RAG, agents, reasoning models (train-time and test-time scaling from DeepSeek-R1)

## Content
### VIP Cheatsheet


## Class textbook
This VIP cheatsheet gives an overview of the content of the "Super Study Guide: Transformers & Large Language Models" book, which contains ~600 illustrations over 250 pages. You can find more details at https://superstudy.guide.

## Class website
[cme295.stanford.edu](https://cme295.stanford.edu/)

## Authors
[Afshine Amidi](https://www.linkedin.com/in/afshineamidi/) (Ecole Centrale Paris, MIT) and [Shervine Amidi](https://www.linkedin.com/in/shervineamidi/) (Ecole Centrale Paris, Stanford University)

## Translators
Steven Van Vaerenbergh (Universidad de Cantabria) and Lara Lloret Iglesias (Instituto de Física de Cantabria - CSIC/UC)
--------------------------------------------------------------------------------
/es/cheatsheet-transformers-large-language-models.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/afshinea/stanford-cme-295-transformers-large-language-models/b5eef9a0168f4f460b0382ed1fbb506fe8beaf2d/es/cheatsheet-transformers-large-language-models.pdf
--------------------------------------------------------------------------------
/fa/README.md:
--------------------------------------------------------------------------------
# Transformers & Large Language Models cheatsheet for Stanford's CME 295
Available in: [العربية](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ar) - [English](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/en) - [Español](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/es) - **فارسی** - [Français](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fr) - [Italiano](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/it) - [日本語](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ja) - [한국어](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ko) - [ไทย](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/th) - [Türkçe](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/tr) - [中文](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/zh)

## Goal
This repository offers a focused summary of the key concepts of Stanford's CME 295 course on Transformers and Large Language Models, including:
- **Transformers**: self-attention, architecture, variants, optimization techniques (sparse attention, low-rank attention, flash attention)
- **LLMs**: prompting, finetuning (SFT, LoRA), preference tuning, optimization techniques (mixture of experts, distillation, quantization)
- **Applications**: LLM-as-a-judge, RAG, agents, reasoning models (train-time and test-time scaling from DeepSeek-R1)

## Content
### VIP Cheatsheet


## Class textbook
This VIP cheatsheet gives an overview of the content of the "Super Study Guide: Transformers & Large Language Models" book, which contains ~600 illustrations over 250 pages and covers these concepts in depth. More details are available at https://superstudy.guide.

## Class website
[cme295.stanford.edu](https://cme295.stanford.edu/)

## Authors
[Afshine Amidi](https://www.linkedin.com/in/afshineamidi/) (Ecole Centrale Paris, MIT) and [Shervine Amidi](https://www.linkedin.com/in/shervineamidi/) (Ecole Centrale Paris, Stanford University)

## Translator
امیر ضیائی (University of California, Berkeley)
--------------------------------------------------------------------------------
/fa/cheatsheet-transformers-large-language-models.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/afshinea/stanford-cme-295-transformers-large-language-models/b5eef9a0168f4f460b0382ed1fbb506fe8beaf2d/fa/cheatsheet-transformers-large-language-models.pdf
--------------------------------------------------------------------------------
/fr/README.md:
--------------------------------------------------------------------------------
# Transformers & LLMs cheatsheet for Stanford's CME 295
Available in [العربية](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ar) - [English](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/en) - [Español](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/es) - [فارسی](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fa) - **Français** - [Italiano](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/it) - [日本語](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ja) - [한국어](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ko) - [ไทย](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/th) - [Türkçe](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/tr) - [中文](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/zh)

## Goal
This repo aims to sum up all the important notions of Stanford's Transformers & LLMs course (CME 295). In particular, it includes:
- **Transformers**: self-attention, architecture, variants, optimization techniques (sparse attention, low-rank attention, flash attention)
- **LLMs**: prompting, finetuning (SFT, LoRA), preference tuning, optimization techniques (mixture of experts, distillation, quantization)
- **Applications**: LLM-as-a-judge, RAG, agents, reasoning models (train-time and test-time scaling from DeepSeek-R1)

## Content
### VIP Cheatsheet


## Class textbook
This VIP cheatsheet gives an overview of the content of the book "Super Study Guide: Transformers & Large Language Models", which contains about 600 illustrations over 250 pages. For more information, please visit https://superstudy.guide.

## Class website
[cme295.stanford.edu](https://cme295.stanford.edu/)

## Authors
[Afshine Amidi](https://www.linkedin.com/in/afshineamidi/) (École Centrale Paris, MIT) and [Shervine Amidi](https://www.linkedin.com/in/shervineamidi/) (École Centrale Paris, Stanford University)
--------------------------------------------------------------------------------
/fr/cheatsheet-transformers-large-language-models.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/afshinea/stanford-cme-295-transformers-large-language-models/b5eef9a0168f4f460b0382ed1fbb506fe8beaf2d/fr/cheatsheet-transformers-large-language-models.pdf
--------------------------------------------------------------------------------
/it/README.md:
--------------------------------------------------------------------------------
# Transformers & Large Language Models cheatsheet for Stanford's CME 295
Available in: [العربية](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ar) - [English](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/en) - [Español](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/es) - [فارسی](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fa) - [Français](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fr) - **Italiano** - [日本語](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ja) - [한국어](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ko) - [ไทย](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/th) - [Türkçe](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/tr) - [中文](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/zh)

## Goal
This repository aims to sum up in one place all the important notions covered in Stanford's CME 295 Transformers & Large Language Models course. It includes:
- **Transformers**: self-attention, architecture, variants, optimization techniques (sparse attention, low-rank attention, flash attention)
- **LLMs**: prompting, finetuning (SFT, LoRA), preference tuning, optimization techniques (mixture of experts, distillation, quantization)
- **Applications**: LLM-as-a-judge, RAG, agents, reasoning models (train-time and test-time scaling from DeepSeek-R1)

## Content
### VIP Cheatsheet


## Class textbook
This VIP cheatsheet gives an overview of the content of the "Super Study Guide: Transformers & Large Language Models" book, which contains about 600 illustrations over 250 pages. Further details are available at https://superstudy.guide.

## Class website
[cme295.stanford.edu](https://cme295.stanford.edu/)

## Authors
[Afshine Amidi](https://www.linkedin.com/in/afshineamidi/) (Ecole Centrale Paris, MIT) and [Shervine Amidi](https://www.linkedin.com/in/shervineamidi/) (Ecole Centrale Paris, Stanford University)

## Translator
[Gianluca Guzzetta](https://www.linkedin.com/in/gianluca-guzzetta-1a778015b/) (PoliTO, UniSalento)
--------------------------------------------------------------------------------
/it/cheatsheet-transformers-large-language-models.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/afshinea/stanford-cme-295-transformers-large-language-models/b5eef9a0168f4f460b0382ed1fbb506fe8beaf2d/it/cheatsheet-transformers-large-language-models.pdf
--------------------------------------------------------------------------------
/ja/README.md:
--------------------------------------------------------------------------------
# Transformers & Large Language Models cheatsheet for Stanford's CME 295
Available in: [العربية](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ar) - [English](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/en) - [Español](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/es) - [فارسی](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fa) - [Français](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fr) - [Italiano](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/it) - **日本語** - [한국어](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ko) - [ไทย](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/th) - [Türkçe](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/tr) - [中文](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/zh)

## Goal
This repository aims to sum up in one place the important concepts covered in Stanford's CME 295 Transformers & Large Language Models course. It includes:
- **Transformers**: self-attention, architecture, variants, optimization techniques (sparse attention, low-rank attention, flash attention)
- **LLMs**: prompting, finetuning (SFT, LoRA), preference tuning, optimization techniques (mixture of experts, distillation, quantization)
- **Applications**: LLM-as-a-judge, RAG, agents, reasoning models (train-time and test-time scaling from DeepSeek-R1)

## Content
### VIP Cheatsheet


## Class textbook
This VIP cheatsheet is an overview of the "Super Study Guide: Transformers & Large Language Models" book, which contains about 600 illustrations over 250 pages. See https://superstudy.guide for details.

## Class website
[cme295.stanford.edu](https://cme295.stanford.edu/)

## Authors
[Afshine Amidi](https://www.linkedin.com/in/afshineamidi/) (Ecole Centrale Paris, MIT) and [Shervine Amidi](https://www.linkedin.com/in/shervineamidi/) (Ecole Centrale Paris, Stanford University)

## Translator
中井 喜之
--------------------------------------------------------------------------------
/ja/cheatsheet-transformers-large-language-models.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/afshinea/stanford-cme-295-transformers-large-language-models/b5eef9a0168f4f460b0382ed1fbb506fe8beaf2d/ja/cheatsheet-transformers-large-language-models.pdf
--------------------------------------------------------------------------------
/ko/README.md:
--------------------------------------------------------------------------------
# Transformers & Large Language Models cheatsheet for Stanford's CME 295
Available in: [العربية](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ar) - [English](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/en) - [Español](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/es) - [فارسی](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fa) - [Français](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fr) - [Italiano](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/it) - [日本語](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ja) - **한국어** - [ไทย](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/th) - [Türkçe](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/tr) - [中文](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/zh)

## Goal
This repository gathers in one convenient place the important concepts covered in Stanford's CME 295 Transformers & Large Language Models course. It includes:
- **Transformers**: self-attention, architecture, variants, optimization techniques (sparse attention, low-rank attention, flash attention)
- **LLMs**: prompting, finetuning (SFT, LoRA), preference tuning, optimization techniques (mixture of experts, distillation, quantization)
- **Applications**: LLM-as-a-judge (using an LLM to evaluate other models), RAG, agents, reasoning models (train-time and test-time scaling from DeepSeek-R1)

## Content
### VIP Cheatsheet


## Class textbook
This VIP cheatsheet briefly summarizes the content of the "Super Study Guide: Transformers & Large Language Models" book, which contains about 600 illustrations over 250 pages. You can find more details at https://superstudy.guide.

## Class website
[cme295.stanford.edu](https://cme295.stanford.edu/)

## Authors
[Afshine Amidi](https://www.linkedin.com/in/afshineamidi/) (Ecole Centrale Paris, MIT) and [Shervine Amidi](https://www.linkedin.com/in/shervineamidi/) (Ecole Centrale Paris, Stanford University)

## Translator
YJ (Yongjin) Kim 김용진
--------------------------------------------------------------------------------
/ko/cheatsheet-transformers-large-language-models.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/afshinea/stanford-cme-295-transformers-large-language-models/b5eef9a0168f4f460b0382ed1fbb506fe8beaf2d/ko/cheatsheet-transformers-large-language-models.pdf
--------------------------------------------------------------------------------
/th/README.md:
--------------------------------------------------------------------------------
# Transformers & Large Language Models cheatsheet for Stanford's CME 295
Available in: [العربية](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ar) - [English](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/en) - [Español](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/es) - [فارسی](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fa) - [Français](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fr) - [Italiano](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/it) - [日本語](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ja) - [한국어](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ko) - **ไทย** - [Türkçe](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/tr) - [中文](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/zh)

## Goal
This repository aims to gather in one place all the important topics covered in Stanford's CME 295 Transformers & Large Language Models course. It includes:
- **Transformers**: self-attention, architecture, variants, optimization techniques (sparse attention, low-rank attention, flash attention)
- **LLMs**: prompting, finetuning (SFT, LoRA), preference tuning, optimization techniques (mixture of experts, distillation, quantization)
- **Applications**: LLM-as-a-judge, RAG, agents, reasoning models, train-time and test-time scaling from DeepSeek-R1

## Content
### VIP Cheatsheet


## Class textbook
This VIP cheatsheet gives an overview of the content of the "Super Study Guide: Transformers & Large Language Models" book, which contains about 600 illustrations over 250 pages. More details are available at https://superstudy.guide.

## Class website
[cme295.stanford.edu](https://cme295.stanford.edu/)

## Authors
[Afshine Amidi](https://www.linkedin.com/in/afshineamidi/) (Ecole Centrale Paris, MIT) and [Shervine Amidi](https://www.linkedin.com/in/shervineamidi/) (Ecole Centrale Paris, Stanford University)

## Translators
บดินทร์ พรวิลาวัณย์ (freelance translator)
ชารินทร์ พลภาณุมาศ (freelance translator)
--------------------------------------------------------------------------------
/th/cheatsheet-transformers-large-language-models.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/afshinea/stanford-cme-295-transformers-large-language-models/b5eef9a0168f4f460b0382ed1fbb506fe8beaf2d/th/cheatsheet-transformers-large-language-models.pdf
--------------------------------------------------------------------------------
/tr/README.md:
--------------------------------------------------------------------------------
# Transformers & Large Language Models cheatsheet for Stanford's CME 295
Available in: [العربية](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ar) - [English](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/en) - [Español](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/es) - [فارسی](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fa) - [Français](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fr) - [Italiano](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/it) - [日本語](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ja) - [한국어](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ko) - [ไทย](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/th) - **Türkçe** - [中文](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/zh)

## Goal
This repo aims to gather in one place all the important concepts covered in Stanford's CME 295 Transformers & Large Language Models course. It includes:
- **Transformers**: self-attention, architecture, variants, optimization techniques (sparse attention, low-rank attention, flash attention)
- **LLMs**: prompting, finetuning (SFT, LoRA), preference tuning, optimization techniques (mixture of experts, distillation, quantization)
- **Applications**: LLM-as-a-judge, RAG, agents, reasoning models (train-time and test-time scaling from DeepSeek-R1)

## Content
### VIP Cheatsheet


## Class textbook
This VIP cheatsheet gives an overview of what is in the "Super Study Guide: Transformers & Large Language Models" book, which consists of 250 pages and contains about 600 illustrations. You can find more details at https://superstudy.guide.

## Class website
[cme295.stanford.edu](https://cme295.stanford.edu/)

## Authors
[Afshine Amidi](https://www.linkedin.com/in/afshineamidi/) (Ecole Centrale Paris, MIT) and [Shervine Amidi](https://www.linkedin.com/in/shervineamidi/) (Ecole Centrale Paris, Stanford University)

## Translator
[Merve Ayyüce Kızrak](https://www.linkedin.com/in/merve-ayyuce-kizrak/) (Bahçeşehir University, Department of AI Engineering)
--------------------------------------------------------------------------------
/tr/cheatsheet-transformers-large-language-models.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/afshinea/stanford-cme-295-transformers-large-language-models/b5eef9a0168f4f460b0382ed1fbb506fe8beaf2d/tr/cheatsheet-transformers-large-language-models.pdf
--------------------------------------------------------------------------------
/zh/README.md:
--------------------------------------------------------------------------------
# Transformers & Large Language Models cheatsheet for Stanford's CME 295
Available in: [العربية](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ar) - [English](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/en) - [Español](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/es) - [فارسی](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fa) - [Français](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/fr) - [Italiano](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/it) - [日本語](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ja) - [한국어](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/ko) - [ไทย](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/th) - [Türkçe](https://github.com/afshinea/stanford-cme-295-transformers-large-language-models/tree/main/tr) - **中文**

## Goal
This repository aims to gather in one place all the core concepts covered in Stanford's CME 295 Transformers & Large Language Models course. It includes:
- **Transformers**: self-attention, architecture, variants, optimization techniques (sparse attention, low-rank attention, flash attention)
- **LLMs**: prompting, finetuning (SFT, LoRA), preference tuning, optimization techniques (mixture of experts, distillation, quantization)
- **Applications**: LLM-as-a-judge, RAG, agents, reasoning models (train-time and test-time scaling from DeepSeek-R1)

## Content
### VIP Cheatsheet


## Class textbook
This VIP cheatsheet outlines the core content of the "Super Study Guide: Transformers & Large Language Models" book, which contains about 600 illustrations over 250 pages. For more details, visit https://superstudy.guide.

## Class website
[cme295.stanford.edu](https://cme295.stanford.edu/)

## Authors
[Afshine Amidi](https://www.linkedin.com/in/afshineamidi/) (Ecole Centrale Paris, MIT) and [Shervine Amidi](https://www.linkedin.com/in/shervineamidi/) (Ecole Centrale Paris, Stanford University)

## Translators
[Tao Yu (俞涛)](https://www.linkedin.com/in/taoyucmu/) and [Binbin Xiong (熊斌斌)](https://www.linkedin.com/in/binbin-xiong-51ab8a43/)
--------------------------------------------------------------------------------
/zh/cheatsheet-transformers-large-language-models.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/afshinea/stanford-cme-295-transformers-large-language-models/b5eef9a0168f4f460b0382ed1fbb506fe8beaf2d/zh/cheatsheet-transformers-large-language-models.pdf
--------------------------------------------------------------------------------