# Fine-Tuning LLaMA 3.2 (3B) with Unsloth

## 📌 Project Overview
This repository contains the implementation for fine-tuning the **LLaMA 3.2 (3B)** model using **PEFT (Parameter-Efficient Fine-Tuning)** techniques. The goal is to improve the model's response quality on a **Hugging Face** dataset while keeping memory usage low.

## 🚀 Features
- **FastLanguageModel from Unsloth** for optimized loading and training speed
- **PEFT with LoRA (Low-Rank Adaptation)** for memory-efficient fine-tuning
- **Standardized chat formatting** using Unsloth's chat templates
- **Fine-tuning with SFTTrainer** for better response generation
- **Inference on the fine-tuned model**

## 📂 Dataset
Training uses the **mlabonne/FineTome-100k** dataset from Hugging Face.

## 📌 Installation
Ensure you have Python 3.8+ and install the dependencies:
```bash
pip install unsloth transformers trl datasets
```

## 🛠 Fine-Tuning Process
1. **Load Base Model & Tokenizer**
   - Initializes from "unsloth/Llama-3.2-3B-Instruct".
2. **Apply PEFT & LoRA**
   - Attaches low-rank adapters to selected layers so only a small fraction of the parameters is trained.
3. **Preprocess Dataset**
   - Uses Unsloth's `standardize_sharegpt()` to convert the conversations into a consistent chat format.
4. **Train the Model**
   - Trains with `SFTTrainer`, gradient accumulation, and mixed precision (FP16/BF16).
5. **Save and Deploy**
   - Saves the fine-tuned model and reloads it for inference.

The sketches at the end of this README show how these steps fit together in code.

## 🔬 Inference Example
After fine-tuning, you can run inference with:
```python
prompts = ["Explain the principles of investment."]

# Render the user message with the chat template and append the assistant header
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompts[0]}],
    tokenize=False,
    add_generation_prompt=True,
)
output = model.generate(
    **tokenizer(inputs, return_tensors="pt").to("cuda"),
    max_new_tokens=256,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

## 📌 Training Configuration
- **Batch Size:** 2 per device
- **Gradient Accumulation:** 4 steps (effective batch size of 8)
- **Warmup Steps:** 5
- **Max Steps:** 60
- **Learning Rate:** 2e-4
- **Mixed Precision:** FP16/BF16 (depending on GPU support)
- **Logging:** Every step
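
## 🧩 Pipeline Sketch
The following is a minimal sketch of steps 1–3, assuming the standard Unsloth workflow (`FastLanguageModel`, `get_chat_template`, `standardize_sharegpt`). The `max_seq_length`, the LoRA rank `r=16`, and the `llama-3.1` chat-template name are illustrative assumptions, not settings confirmed by this repository.

```python
from unsloth import FastLanguageModel
from unsloth.chat_templates import get_chat_template, standardize_sharegpt
from datasets import load_dataset

# 1. Load the base model and tokenizer (4-bit loading keeps memory usage low)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-3B-Instruct",
    max_seq_length=2048,   # assumed context length
    load_in_4bit=True,
)

# 2. Attach LoRA adapters so only a small fraction of parameters is trained
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                  # assumed LoRA rank
    lora_alpha=16,
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing="unsloth",
)

# 3. Standardize the ShareGPT-style conversations and render them as chat text
tokenizer = get_chat_template(tokenizer, chat_template="llama-3.1")
dataset = load_dataset("mlabonne/FineTome-100k", split="train")
dataset = standardize_sharegpt(dataset)

def to_text(examples):
    texts = [
        tokenizer.apply_chat_template(convo, tokenize=False, add_generation_prompt=False)
        for convo in examples["conversations"]
    ]
    return {"text": texts}

dataset = dataset.map(to_text, batched=True)
```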
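
Continuing from the sketch above, step 4 wires the dataset into `SFTTrainer` with the values from the Training Configuration section. Argument placement differs between `trl` versions (newer releases move `dataset_text_field` and `max_seq_length` into `SFTConfig`), so treat this as a sketch rather than a drop-in script; `output_dir` and the 8-bit optimizer are assumptions.

```python
from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported

# 4. Train with SFTTrainer using the configuration listed above
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,     # batch size per device
        gradient_accumulation_steps=4,     # effective batch size of 8
        warmup_steps=5,
        max_steps=60,
        learning_rate=2e-4,
        fp16=not is_bfloat16_supported(),  # fall back to FP16 on older GPUs
        bf16=is_bfloat16_supported(),
        logging_steps=1,                   # log every step
        optim="adamw_8bit",                # assumed optimizer choice
        output_dir="outputs",              # assumed output directory
    ),
)
trainer.train()
```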
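
For step 5, one way to save the result and reload it for inference is to persist only the LoRA adapters alongside the tokenizer; the `lora_model` directory name is an arbitrary assumption.

```python
from unsloth import FastLanguageModel

# 5. Save the LoRA adapters and tokenizer (adapters only, not the full base model)
model.save_pretrained("lora_model")
tokenizer.save_pretrained("lora_model")

# Reload for inference; Unsloth resolves the base model from the adapter config
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="lora_model",
    max_seq_length=2048,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # switch to Unsloth's faster inference mode
```

Saving only the adapters keeps the artifact small relative to the full model and lets the same base model serve multiple fine-tunes.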