├── Images
    ├── cars.png
    ├── frogs.png
    ├── horse.png
    └── trucks.png
└── README.md


/Images/cars.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/1mutaalk/CIFAR-10_ImageClassifier/HEAD/Images/cars.png


--------------------------------------------------------------------------------
/Images/frogs.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/1mutaalk/CIFAR-10_ImageClassifier/HEAD/Images/frogs.png


--------------------------------------------------------------------------------
/Images/horse.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/1mutaalk/CIFAR-10_ImageClassifier/HEAD/Images/horse.png


--------------------------------------------------------------------------------
/Images/trucks.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/1mutaalk/CIFAR-10_ImageClassifier/HEAD/Images/trucks.png


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# CIFAR-10: Classical ML vs Deep Learning (PyTorch)

## Project Title & Team Members

**Title:** CIFAR‑10 Image Classification: Classical Machine Learning vs Deep Learning (PyTorch)
**Team Members:**
- Muhammad Mutaal Khan (CMS ID: 522455) – Classical ML, deep learning implementation, overall pipeline
- Saif Ullah Farooqi (CMS ID: 511676) – Evaluation, plots

---

## Abstract

This project investigates how well **classical machine learning** methods perform on the CIFAR‑10 image classification task compared to a modern **deep learning** architecture, and how performance changes when classical models are combined with deep features.
Using PyTorch and scikit‑learn, we fine‑tune a ResNet‑18 model and train SVM and k‑NN classifiers on
(1) hand‑crafted HOG features,
(2) raw pixels, and
(3) ResNet‑based deep features. Results show that ResNet‑18 and classical models trained on deep features significantly outperform traditional pipelines based on raw pixels or HOG descriptors, demonstrating the value of representation learning for complex vision tasks.

---

## Introduction

Image classification is a core problem in computer vision, with applications ranging from autonomous driving to content moderation. CIFAR‑10 is a widely used benchmark for evaluating algorithms on small natural images.

The objective of this project is to:
1. Implement and evaluate **at least two classical ML methods** on CIFAR‑10.
2. Implement an **advanced deep learning model** in PyTorch (ResNet‑18).
3. Design a **fair comparative framework** to analyze the strengths and weaknesses of classical vs deep learning approaches.
4. Quantify the **business / practical impact** of moving from simpler models to deep learning in terms of accuracy and potential decision quality.

---

## Dataset Description

### Source and Size

- **Dataset:** CIFAR‑10
- **Source:** Canadian Institute for Advanced Research (CIFAR), available via `torchvision.datasets.CIFAR10`.
- **Total samples:** 60,000 color images.
- **Image size:** 32×32 pixels, 3 channels (RGB).
- **Classes (10):** plane, car, bird, cat, deer, dog, frog, horse, ship, truck.

### Splits Used

- **Training set:** 50,000 images (official training split).
- **Validation set:** 5,000 images (50% of official test split).
- **Test set:** 5,000 images (remaining 50% of official test split).

The same validation and test indices are used across **all** models (classical + deep learning) to ensure a fair comparison.
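
The README does not say how the official test split is divided, so the following is only a minimal sketch of one way to get fixed, reusable validation/test indices. The function name `split_test_indices` and the seed value `42` are assumptions made for illustration, not taken from the project notebook.

```python
import random


def split_test_indices(n_test=10_000, val_fraction=0.5, seed=42):
    """Return (val_idx, test_idx): disjoint, sorted index lists over the official test split.

    The name and the default seed are hypothetical; any fixed seed works as long as
    every model (classical and deep) reuses the same one.
    """
    indices = list(range(n_test))
    random.Random(seed).shuffle(indices)  # seeded shuffle, so the split is reproducible
    n_val = int(n_test * val_fraction)
    return sorted(indices[:n_val]), sorted(indices[n_val:])


val_idx, test_idx = split_test_indices()
print(len(val_idx), len(test_idx))  # 5000 5000
```

With PyTorch installed, the two index lists could be passed to `torch.utils.data.Subset` around the official test dataset, and the same lists could index the NumPy feature arrays used by the scikit‑learn models.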

### Features and Preprocessing

- **For Deep Learning (PyTorch):**
  - Training transforms:
    - `RandomCrop(32, padding=4)`
    - `RandomHorizontalFlip()`
    - `ToTensor()`
    - `Normalize` with CIFAR‑10 mean and std: (0.4914, 0.4822, 0.4465) and (0.2470, 0.2435, 0.2616)
  - Validation/Test transforms:
    - `ToTensor()`
    - Same normalization (no augmentation).

- **For Classical ML (scikit‑learn):**
  - **HOG features:**
    - Convert images to `(H, W, C)` layout and `uint8` dtype.
    - Extract HOG descriptors using `skimage.feature.hog` (8 orientations, 8×8 pixels per cell, 2×2 cells per block, `channel_axis=-1`).
    - Standardize using `StandardScaler`.
  - **Raw pixels:**
    - Flatten images to 3072‑dimensional vectors and standardize with `StandardScaler`.
  - **Deep features:**
    - Pass images through a ResNet‑18 feature extractor (all layers except final FC).
    - Obtain 512‑dimensional embeddings.
    - Standardize embeddings with `StandardScaler`.

---

## Methodology

### Classical ML Approaches

1. **HOG + SVM**
   - Feature engineering: Histogram of Oriented Gradients on each RGB image.
   - Classifier: Support Vector Machine (`sklearn.svm.SVC`) with RBF kernel.
   - Preprocessing: `StandardScaler` on HOG vectors.
   - Hyperparameter tuning:
     - GridSearchCV on a subset of 5,000 training images.
     - Search space: `C ∈ {1, 10}`, `gamma = 'scale'`.
     - 3‑fold cross‑validation.
   - Final model retrained on all HOG features using the best hyperparameters.

2. **k‑NN on Raw Pixels**
   - Features: Flattened pixel vectors (3072 dimensions).
   - Preprocessing: `StandardScaler` on pixels.
   - Classifier: `KNeighborsClassifier` (scikit‑learn).
   - Hyperparameter tuning:
     - GridSearchCV on a 5,000‑sample subset.
     - Search space: `n_neighbors ∈ {3, 7}`, `weights ∈ {'distance'}`.
     - 3‑fold cross‑validation.
   - Final k‑NN trained on all standardized training examples.

3. **Classical ML on Deep Features**
   - Features: 512‑D embeddings from the trained ResNet‑18 feature extractor.
   - Preprocessing: `StandardScaler`.
   - Models:
     - SVM (RBF kernel, `C = 5`).
     - k‑NN (`n_neighbors = 5`, `weights = 'distance'`).

This “hybrid” setting evaluates how much classical performance improves when given high‑level deep representations.

### Deep Learning Architectures Implemented

1. **ResNet‑18 (Transfer Learning, PyTorch)**
   - Base model: `torchvision.models.resnet18(weights='DEFAULT')`.
   - Adaptation:
     - Replace final FC layer with `Linear(512 → 10)` for CIFAR‑10.
     - Move model to GPU (`cuda`) when available.
   - Training:
     - Loss: `nn.CrossEntropyLoss`.
     - Optimizer: `Adam` with learning rate `3e-4`.
     - Scheduler: `ReduceLROnPlateau` (monitor validation loss).
     - Epochs: up to 25, with best‑model checkpointing.
   - Data: Training loader with augmentation; validation loader without augmentation.

2. **FeatureExtractor Module**
   - Wraps trained ResNet‑18 and removes final classification layer.
   - Outputs 512‑D embeddings used as input to classical models.

### Hyperparameter Tuning Strategy

- **Classical models:**
  - Use `GridSearchCV` with 3‑fold CV on a manageable subset of the training data to reduce computation.
  - After selecting best hyperparameters, retrain models on full training data.

- **Deep learning:**
  - Learning rate (`3e-4`) and number of epochs (25) selected based on validation loss curves.
  - `ReduceLROnPlateau` automatically decreases LR when validation loss stops improving.
  - Early stopping is implemented by keeping the checkpoint with the best validation loss.

---

## Results & Analysis

The following are results for some of the classes.

- Cars

![images/gen_cars.png](Images/cars.png)

- Frogs

![images/frogs_gen.png](Images/frogs.png)

- Trucks

![images/trucks_gen.png](Images/trucks.png)

### Performance Comparison (Test Set)

| Model                           | Precision | Recall | F1-Score | ROC AUC |
|---------------------------------|-----------|--------|----------|---------|
| Artificial Neural Network (ANN) | 0.862     | 0.87   | 0.865    | 0.9894  |
| Support Vector Machine (SVM)    | 0.861     | 0.86   | 0.864    | 0.9813  |
| k-Nearest Neighbors (k-NN)      | 0.860     | 0.83   | 0.861    | 0.9565  |

Interpretation:
- ResNet‑18 achieves the strongest overall performance.
- Classical models on deep features closely approach (or sometimes match) ResNet‑18 accuracy, confirming that good representations matter more than the classifier itself.
- Classical models on hand‑crafted or raw features lag behind, especially on fine‑grained classes like cats vs dogs.

### Visualizations

The notebook generates several visual aids:

- **Confusion matrices** for:
  - ResNet‑18.
  - SVM on deep features.
  - k‑NN on deep features.
- **Per‑class precision/recall/F1 plots** using the `classification_report`.
- **Training curves** for ResNet‑18:
  - Validation accuracy vs epoch.
  - Validation loss vs epoch.

These visualizations help identify:

- Which classes are most frequently confused (e.g., truck vs car).
- Whether the model is under‑ or over‑fitting.

### Business Impact Analysis

From a practical perspective:

- **Higher accuracy** directly translates to fewer misclassifications.
  - On a 5,000‑image test set, a 5‑percentage‑point accuracy improvement means 250 fewer wrong decisions (0.05 × 5,000).
  - In real applications (e.g., defect detection, product tagging, content filtering), this can mean:
    - Fewer false positives → less manual review and customer frustration.
    - Fewer false negatives → fewer critical errors slipping through.
- **Cost vs benefit:**
  - Classical models are simpler to train but plateau at lower accuracy when using raw features.
  - ResNet‑18 (and classical models on deep features) require GPU resources and longer training but offer a clear performance gain that is likely worth the cost in most high‑value applications.

Overall, the project shows that deep learning (and hybrid deep‑feature pipelines) provide a **meaningful business advantage** over pure classical baselines on complex vision tasks.

---

## Conclusion & Future Work

### Conclusion

- Classical models like HOG + SVM and k‑NN provide useful baselines, but their performance on CIFAR‑10 is limited when relying on raw or hand‑crafted features.
- Fine‑tuning a modern deep architecture (ResNet‑18) yields substantially better results, thanks to end‑to‑end representation learning.
- When classical models are trained on **deep features** extracted from ResNet‑18, their performance increases dramatically and can approach that of the deep model itself.
- This demonstrates that **representations matter more than the classifier**, and that classical ML can still play an important role when paired with deep feature extractors.

### Future Work

Possible extensions include:

- Trying deeper or more recent architectures (e.g., ResNet‑50, DenseNet, or Vision Transformers).
- Performing more systematic hyperparameter tuning for classical models and the ResNet fine‑tuning process.
- Exploring other feature extraction strategies (e.g., self‑supervised learning, contrastive learning) for classical models.
- Investigating model compression / distillation to deploy smaller models with near‑ResNet performance.
- Extending the comparison to other datasets (e.g., CIFAR‑100, Tiny ImageNet) to generalize findings.

---

## References

- Krizhevsky, A. (2009). *Learning Multiple Layers of Features from Tiny Images*. CIFAR‑10 Dataset.
- He, K., Zhang, X., Ren, S., & Sun, J. (2016). *Deep Residual Learning for Image Recognition*. CVPR.
- Paszke, A., et al. (2019). *PyTorch: An Imperative Style, High‑Performance Deep Learning Library*. NeurIPS.
- Pedregosa, F., et al. (2011). *Scikit‑learn: Machine Learning in Python*. JMLR.
- Official PyTorch documentation: https://pytorch.org/
- Official scikit‑learn documentation: https://scikit-learn.org/
- CIFAR‑10 dataset description: https://www.cs.toronto.edu/~kriz/cifar.html
- A very helpful GitHub repository: https://github.com/Hydrino/ACGAN_cifar10/tree/master

--------------------------------------------------------------------------------