└── README.md

/README.md:
--------------------------------------------------------------------------------
# Multi-class-Classification-Logistic-Regression

The notebook is a complete **hands-on lab** for **multi-class classification** on a real-world dataset of **obesity levels**. Its content is organized as follows:

---

### 🔍 **Main Purpose**
To demonstrate how to implement **multi-class classification strategies** in Python with scikit-learn on a labeled obesity dataset.

---

### 📂 **Dataset Used**
- Source: https://www.kaggle.com/datasets/ezzaldeenesmail/obesitydataset-raw-and-data-sinthetic
- Loaded with: `pandas.read_csv()`
- Target column: `NObeyesdad` (the obesity category of each record)

---

### 📌 **Notebook Structure**

#### 1. **Setup and Imports**
```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pylab as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.multiclass import OneVsOneClassifier
from sklearn.metrics import accuracy_score
```
These are the main tools used for:
- Data processing
- Visualization
- Training and evaluating classification models

---

#### 2. **Data Loading & Exploration**
- Load the dataset into a DataFrame
- Display the first few records (`data.head()`)
- Visualize the distribution of the target variable:
  ```python
  sns.countplot(y='NObeyesdad', data=data)
  ```

---

#### 3. **Preprocessing**
- Apply **One-Hot Encoding** to categorical variables
- Use **StandardScaler** to standardize numerical features (zero mean, unit variance)
- Split the data into **training** and **testing sets**

---

#### 4. **Modeling**
- Implement a **Logistic Regression** model using:
  - **One-vs-Rest (OvR)** strategy
  - **One-vs-One (OvO)** strategy

```python
# Example: wrap a binary logistic regression in the One-vs-One strategy.
# X_train and y_train come from the train/test split in the preprocessing step.
model = LogisticRegression()
ovo = OneVsOneClassifier(model)
ovo.fit(X_train, y_train)
```

---

#### 5. **Evaluation**
- Evaluate model performance using **accuracy_score**
- Compare results from the OvO and OvR classifiers

---

#### 6. **Conclusion & Summary**
- Analyze which strategy works better for the dataset
- Report the final accuracy and insights

--------------------------------------------------------------------------------
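The preprocessing, modeling, and evaluation steps listed in the README can be sketched end to end as below. This is a minimal illustration and not the notebook's actual code: the data is synthetic, the feature columns `Weight`, `Height`, and `Gender` and the BMI-based labeling rule are assumptions made so the example runs without the Kaggle file, and `OneVsRestClassifier`, `ColumnTransformer`, and `make_pipeline` are scikit-learn utilities that do not appear in the README's import list. Only the target name `NObeyesdad` and the OvR/OvO comparison with `accuracy_score` come from the README.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for the obesity data (hypothetical columns and values).
rng = np.random.default_rng(0)
n = 300
weight = rng.normal(80, 20, n)       # kg
height = rng.normal(1.70, 0.10, n)   # m
data = pd.DataFrame({
    "Weight": weight,
    "Height": height,
    "Gender": rng.choice(["Female", "Male"], n),
})

# Assumed labeling rule: three classes derived from BMI thresholds 25 and 30.
bmi = weight / height**2
labels = np.array(["Normal_Weight", "Overweight", "Obesity"])
data["NObeyesdad"] = labels[np.digitize(bmi, [25, 30])]

# Split features and target, keeping class proportions in both subsets.
X = data.drop(columns="NObeyesdad")
y = data["NObeyesdad"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)


def make_preprocessor():
    """Standardize numeric columns and one-hot encode the categorical one."""
    return ColumnTransformer([
        ("num", StandardScaler(), ["Weight", "Height"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["Gender"]),
    ])


# The two multi-class strategies, each wrapping a binary logistic regression.
strategies = {
    "OvR": OneVsRestClassifier(LogisticRegression(max_iter=1000)),
    "OvO": OneVsOneClassifier(LogisticRegression(max_iter=1000)),
}

# Fit each strategy on the training set and score it on the held-out test set.
scores = {}
for name, clf in strategies.items():
    pipe = make_pipeline(make_preprocessor(), clf)
    pipe.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, pipe.predict(X_test))
    print(f"{name} accuracy: {scores[name]:.3f}")
```

Placing the scaler and encoder inside the pipeline means they are fitted on the training split only, so no statistics from the test set leak into preprocessing. With K classes, OvR trains K binary classifiers while OvO trains K(K-1)/2, which is worth keeping in mind when the number of obesity categories is larger than in this three-class sketch.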