├── README.md
├── Screenshot 2025-04-11 011919.png
├── Screenshot 2025-04-11 013724.png
├── Screenshot 2025-04-11 015559.png
├── Screenshot 2025-04-11 041749.png
├── Screenshot 2025-04-11 041802.png
├── Screenshot 2025-04-11 042816.png
├── Screenshot 2025-04-12 004141.png
├── Screenshot 2025-04-12 005726.png
├── Screenshot 2025-04-12 015610.png
├── Screenshot 2025-04-12 022721.png
├── Screenshot 2025-04-12 022738.png
├── Screenshot 2025-04-12 022803.png
├── Screenshot 2025-04-12 022836.png
└── personalizedLearningDataset.py


/README.md:
--------------------------------------------------------------------------------
 1 | # Data-Visualisation-on-Personalized-Leaning-Dataset-using-Python-libraries
 2 | This project analyzes a **Personalized Learning Dataset** in Python to understand student behavior, academic engagement, and dropout likelihood. It combines exploratory data analysis, a weighted engagement score, and hypothesis tests (T-Test, Chi-Square) with an interactive dashboard that educators can use to monitor and support student performance.
 3 | 
 4 | 🔍 Objective: The primary goals of this project include:
 5 | 
 6 | - Performing Exploratory Data Analysis (EDA)
 7 | - Creating a custom Engagement Score
 8 | - Studying learning styles and academic success
 9 | - Testing statistical hypotheses (T-Test, Chi-Square)
10 | - Identifying key dropout factors
11 | - Designing an interactive dashboard for educators
12 | 
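As a rough illustration of the custom Engagement Score, the sketch below uses the same weights as `personalizedLearningDataset.py` (0.4 for video time, 0.3 for forum participation, 0.3 for assignment completion). The `normalize` option and the sample rows are assumptions added here for illustration: min-max scaling is not part of the original script, and the sample values are made up rather than taken from the dataset.

```python
import pandas as pd

# Weights taken from the script's Engagement_Score formula.
WEIGHTS = {
    "Time_Spent_on_Videos": 0.4,
    "Forum_Participation": 0.3,
    "Assignment_Completion_Rate": 0.3,
}


def engagement_score(df: pd.DataFrame, normalize: bool = True) -> pd.Series:
    """Weighted sum of the engagement columns.

    normalize=True min-max scales each column to [0, 1] first (an assumption,
    not part of the original script); normalize=False reproduces the raw sum.
    """
    score = pd.Series(0.0, index=df.index)
    for column, weight in WEIGHTS.items():
        values = df[column].astype(float)
        if normalize:
            span = values.max() - values.min()
            # A constant column carries no information; map it to 0.
            values = (values - values.min()) / span if span else values * 0.0
        score += weight * values
    return score


# Tiny made-up sample, only to show the call; not taken from the dataset.
sample = pd.DataFrame({
    "Time_Spent_on_Videos": [100, 300, 500],
    "Forum_Participation": [0, 25, 50],
    "Assignment_Completion_Rate": [50, 75, 100],
})
sample["Engagement_Score"] = engagement_score(sample)
print(sample)
```

With `normalize=False` the function returns the same raw weighted sum the script computes. Scaling first keeps a column with a large numeric range from outweighing the others, which may or may not be what the analysis intends.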
13 | 🧠 Technologies & Libraries Used: 
14 | 
15 | - Python
16 | - **Pandas** – Data manipulation
17 | - **Matplotlib & Seaborn** – Data visualization
18 | - **Plotly** – Interactive graphing
19 | - **Dash** – Web-based dashboard creation
20 | - **SciPy** – Statistical analysis
21 | 
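To give a feel for how SciPy supports the statistical analysis, here is a minimal sketch of the two tests named in the objectives. All numbers are made up for illustration and do not come from the dataset. The `alternative="greater"` argument (available in SciPy 1.6 and later) is a choice made for this sketch, so that the test asks a directional question.

```python
from scipy.stats import chi2_contingency, ttest_ind

# Made-up exam scores for two engagement groups (illustrative only).
low_scores = [52, 58, 61, 55, 60, 57]
high_scores = [70, 75, 68, 80, 77, 73]

# Welch's t-test (equal_var=False) does not assume equal group variances.
# alternative="greater" asks whether the first sample's mean is larger.
t_stat, t_p = ttest_ind(high_scores, low_scores, equal_var=False, alternative="greater")
print(f"t = {t_stat:.2f}, one-sided p = {t_p:.5f}")

# Made-up counts: rows are engagement levels (Low, Medium, High),
# columns are Dropout_Likelihood (No, Yes).
observed = [
    [30, 20],
    [35, 15],
    [45, 5],
]
chi2_stat, chi2_p, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2_stat:.2f}, dof = {dof}, p = {chi2_p:.5f}")
```

A small chi-square p-value indicates an association between the two categorical variables. On its own it does not show that one causes the other.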
22 | 📂 Dataset
23 | 
24 | - **Source**: [Kaggle - Personalized Learning and Adaptive Education Dataset](https://www.kaggle.com/datasets/adilshamim8/personalized-learning-and-adaptive-education-dataset)
25 | - **Size**: ~10,000 rows and 15 columns
26 | - **Columns** include:
27 |   - Student_ID
28 |   - Age
29 |   - Gender
30 |   - Education_Level
31 |   - Time_Spent_on_Videos
32 |   - Quiz_Attempts
 33 |   - Quiz_Scores
34 |   - Forum_Participation
35 |   - Assignment_Completion_Rate
36 |   - Learning_Style
37 |   - Dropout_Likelihood
38 | 
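The script reads several of these columns by name, plus `Final_Exam_Score`, which the list above does not mention. A quick header check before running the analysis can catch a mismatched CSV early. The sketch below is one possible way to do that. The `missing_columns` helper and the in-memory CSV are hypothetical and are not part of the repository.

```python
import io

import pandas as pd

# Columns the analysis script reads by name; the CSV may contain more.
REQUIRED_COLUMNS = (
    "Student_ID", "Time_Spent_on_Videos", "Forum_Participation",
    "Assignment_Completion_Rate", "Final_Exam_Score",
    "Learning_Style", "Dropout_Likelihood",
)


def missing_columns(csv_source, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the CSV header."""
    header = pd.read_csv(csv_source, nrows=0)  # nrows=0 reads only the header row
    return [name for name in required if name not in header.columns]


# Hypothetical in-memory CSV standing in for the Kaggle file.
fake_csv = io.StringIO(
    "Student_ID,Time_Spent_on_Videos,Forum_Participation,Learning_Style\n"
    "S001,120,3,Visual\n"
)
print(missing_columns(fake_csv))
```

In practice `csv_source` would be the path to the downloaded file. An empty list means every column the script needs is present.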


--------------------------------------------------------------------------------
/Screenshot 2025-04-11 011919.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jyotsnak1603/Data-Visualisation-on-Personalized-Leaning-Dataset-using-Python-libraries/b34e95ce3abe592c5b1bd3feb3e92c377497dc34/Screenshot 2025-04-11 011919.png


--------------------------------------------------------------------------------
/Screenshot 2025-04-11 013724.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jyotsnak1603/Data-Visualisation-on-Personalized-Leaning-Dataset-using-Python-libraries/b34e95ce3abe592c5b1bd3feb3e92c377497dc34/Screenshot 2025-04-11 013724.png


--------------------------------------------------------------------------------
/Screenshot 2025-04-11 015559.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jyotsnak1603/Data-Visualisation-on-Personalized-Leaning-Dataset-using-Python-libraries/b34e95ce3abe592c5b1bd3feb3e92c377497dc34/Screenshot 2025-04-11 015559.png


--------------------------------------------------------------------------------
/Screenshot 2025-04-11 041749.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jyotsnak1603/Data-Visualisation-on-Personalized-Leaning-Dataset-using-Python-libraries/b34e95ce3abe592c5b1bd3feb3e92c377497dc34/Screenshot 2025-04-11 041749.png


--------------------------------------------------------------------------------
/Screenshot 2025-04-11 041802.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jyotsnak1603/Data-Visualisation-on-Personalized-Leaning-Dataset-using-Python-libraries/b34e95ce3abe592c5b1bd3feb3e92c377497dc34/Screenshot 2025-04-11 041802.png


--------------------------------------------------------------------------------
/Screenshot 2025-04-11 042816.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jyotsnak1603/Data-Visualisation-on-Personalized-Leaning-Dataset-using-Python-libraries/b34e95ce3abe592c5b1bd3feb3e92c377497dc34/Screenshot 2025-04-11 042816.png


--------------------------------------------------------------------------------
/Screenshot 2025-04-12 004141.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jyotsnak1603/Data-Visualisation-on-Personalized-Leaning-Dataset-using-Python-libraries/b34e95ce3abe592c5b1bd3feb3e92c377497dc34/Screenshot 2025-04-12 004141.png


--------------------------------------------------------------------------------
/Screenshot 2025-04-12 005726.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jyotsnak1603/Data-Visualisation-on-Personalized-Leaning-Dataset-using-Python-libraries/b34e95ce3abe592c5b1bd3feb3e92c377497dc34/Screenshot 2025-04-12 005726.png


--------------------------------------------------------------------------------
/Screenshot 2025-04-12 015610.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jyotsnak1603/Data-Visualisation-on-Personalized-Leaning-Dataset-using-Python-libraries/b34e95ce3abe592c5b1bd3feb3e92c377497dc34/Screenshot 2025-04-12 015610.png


--------------------------------------------------------------------------------
/Screenshot 2025-04-12 022721.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jyotsnak1603/Data-Visualisation-on-Personalized-Leaning-Dataset-using-Python-libraries/b34e95ce3abe592c5b1bd3feb3e92c377497dc34/Screenshot 2025-04-12 022721.png


--------------------------------------------------------------------------------
/Screenshot 2025-04-12 022738.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jyotsnak1603/Data-Visualisation-on-Personalized-Leaning-Dataset-using-Python-libraries/b34e95ce3abe592c5b1bd3feb3e92c377497dc34/Screenshot 2025-04-12 022738.png


--------------------------------------------------------------------------------
/Screenshot 2025-04-12 022803.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jyotsnak1603/Data-Visualisation-on-Personalized-Leaning-Dataset-using-Python-libraries/b34e95ce3abe592c5b1bd3feb3e92c377497dc34/Screenshot 2025-04-12 022803.png


--------------------------------------------------------------------------------
/Screenshot 2025-04-12 022836.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jyotsnak1603/Data-Visualisation-on-Personalized-Leaning-Dataset-using-Python-libraries/b34e95ce3abe592c5b1bd3feb3e92c377497dc34/Screenshot 2025-04-12 022836.png


--------------------------------------------------------------------------------
/personalizedLearningDataset.py:
--------------------------------------------------------------------------------
  1 | #Personalized Learning Dataset
  2 | 
  3 | import pandas as pd
  4 | import matplotlib.pyplot as plt
  5 | import seaborn as sns
  6 | from scipy.stats import ttest_ind, chi2_contingency
  7 | import scipy.stats as stats
  8 | import dash
  9 | from dash import dcc, html
 10 | import plotly.express as px
 11 | 
 12 | # Load the dataset (download the CSV from Kaggle; the filename below is an assumed local name, adjust as needed)
 13 | 
 14 | df = pd.read_csv("personalized_learning_dataset.csv")
 15 | 
 16 | 
 17 | #OBJECTIVE -1: Analyze and Preprocess data by performing Exploratory Data Analysis
 18 | 
 19 | #Displaying basic info
 20 | print(df.head())
 21 | df.info()  # info() prints its report itself and returns None
 22 | 
 23 | missing_values = df.isnull().sum()
 24 | print("Missing Values:\n", missing_values)
 25 | 
 26 | duplicate_rows = df.duplicated().sum()
 27 | print("Duplicate Rows:", duplicate_rows)
 28 | 
 29 | #Summary statistics for numerical values
 30 | numerical_summary = df.describe()
 31 | print("Numerical Summary: \n", numerical_summary)
 32 | 
 33 | # Summary for categorical columns
 34 | categorical_summary = df.describe(include=["object"])
 35 | print("Categorical Summary:\n", categorical_summary)
 36 | 
 37 | #Plot heatmap of correlations
 38 | plt.figure(figsize=(10, 6))
 39 | 
 40 | correlation_matrix = df.select_dtypes(include=['float64', 'int64']).corr()
 41 | sns.heatmap(correlation_matrix, annot=True)
 42 | plt.title("Correlation Matrix of Numerical Features")
 43 | plt.show()
 44 | 
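 45 | # OBJECTIVE -2: Develop a Personalized Engagement Score
 46 | # Note: the inputs are combined on their raw scales, so a column with a larger numeric range outweighs its nominal weight.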
 47 | df["Engagement_Score"] = (0.4 * df["Time_Spent_on_Videos"] +
 48 |                           0.3 * df["Forum_Participation"] +
 49 |                           0.3 * df["Assignment_Completion_Rate"])
 50 | print("Engagement Score: \n", df[["Student_ID", "Engagement_Score"]].head())
 51 | #Scatter plot
 52 | sns.scatterplot(x="Engagement_Score", y="Final_Exam_Score", data=df, hue="Dropout_Likelihood", palette="coolwarm")
 53 | plt.title("Engagement Score vs Final Exam Score")
 54 | plt.xlabel("Engagement Score")
 55 | plt.ylabel("Final Exam Score")
 56 | plt.grid(True)
 57 | plt.tight_layout()
 58 | plt.show()
 59 | 
 60 | # Plot histograms of numerical features
 61 | sns.set(style="whitegrid")
 62 | 
 63 | numerical_columns = [
 64 |     'Age', 'Time_Spent_on_Videos', 'Quiz_Attempts', 'Quiz_Scores',
 65 |     'Forum_Participation', 'Assignment_Completion_Rate',
 66 |     'Final_Exam_Score', 'Feedback_Score'
 67 | ]
 68 | 
 69 | df[numerical_columns].hist(figsize=(16, 10), bins=20, edgecolor='black')
 70 | plt.suptitle("Histograms of Numerical Features", fontsize=16, fontweight='bold')
 71 | plt.tight_layout(rect=[0, 0.03, 1, 0.95])
 72 | plt.show()
 73 | 
 74 | ####################################################################################
 75 | 
 76 | #OBJECTIVE -3: Analyze the Impact of Learning Styles on Academic Success
 77 | #Box plot
 78 | plt.figure(figsize=(10, 6))
 79 | sns.set_style("whitegrid")
 80 | sns.boxplot(x="Learning_Style", y="Final_Exam_Score", data=df, hue="Learning_Style", palette="Set2", legend=False)
 81 | plt.title("Box Plot - Final Exam Scores by Learning Style", fontsize=16)
 82 | plt.xlabel("Learning Style", fontsize=12)
 83 | plt.ylabel("Final Exam Score", fontsize=12)
 84 | 
 85 | plt.tight_layout()
 86 | plt.show()
 87 | 
 88 | #Bar plot
 89 | learning_style_avg = df.groupby("Learning_Style")["Final_Exam_Score"].mean().reset_index()
 90 | 
 91 | plt.figure(figsize=(8, 5))
 92 | sns.barplot(x="Learning_Style", y="Final_Exam_Score", data=learning_style_avg, hue="Learning_Style", palette="Blues_d", legend=False)
 93 | 
 94 | plt.title("Bar Plot - Average Final Exam Score per Learning Style", fontsize=15)
 95 | plt.xlabel("Learning Style")
 96 | plt.ylabel("Average Final Exam Score")
 97 | 
 98 | plt.tight_layout()
 99 | plt.show()
100 | 
101 | ####################################################################################
102 | 
103 | #OBJECTIVE -4: Hypothesis Testing - Chi Square Test and T-Test
104 | 
105 | # T-Test -> Do highly engaged students perform better?
106 | # H0: mean Final_Exam_Score is the same for the High and Low engagement groups.
107 | df["Engagement_Category"] = pd.qcut(df["Engagement_Score"], q=3, labels=["Low", "Medium", "High"])
108 | 
109 | low_engagement = df[df["Engagement_Category"] == "Low"]["Final_Exam_Score"]
110 | high_engagement = df[df["Engagement_Category"] == "High"]["Final_Exam_Score"]
111 | 
112 | # One-sided Welch's t-test (unequal variances): is the High group's mean greater than the Low group's?
113 | t_stat, p_value = ttest_ind(high_engagement, low_engagement, equal_var=False, alternative="greater")
114 | 
115 | print(f"T-Statistic: {t_stat:.2f}")
116 | print(f"P-Value: {p_value:.5f}\n")
117 | 
118 | if p_value < 0.05:
119 |     print("Reject H0: High engagement students perform significantly better.")
120 | else:
121 |     print("Fail to Reject H0: No evidence that high engagement students perform better.")
122 | 
123 | 
124 | # Chi-Square Test -> Is engagement level associated with dropout?
125 | # H0: Engagement_Category and Dropout_Likelihood are independent.
126 | # Engagement_Category (Low, Medium, High) was already derived from
127 | # Engagement_Score for the t-test above, so it is reused here.
128 | 
129 | # Contingency table
130 | contingency_table = pd.crosstab(df["Engagement_Category"], df["Dropout_Likelihood"])
131 | 
132 | # Perform Chi-Square test of independence
133 | chi2_stat, p_value, dof, expected = chi2_contingency(contingency_table)
134 | 
135 | print(f"Chi-Square Statistic: {chi2_stat:.2f}")
136 | print(f"P-Value: {p_value:.5f}\n")
137 | 
138 | if p_value < 0.05:
139 |     print("Reject H0: Engagement Level is significantly associated with Dropout Likelihood.")
140 | else:
141 |     print("Fail to Reject H0: No significant relationship between Engagement Level and Dropout.")
145 | 
146 | #Grouped bar chart
147 | contingency_table.plot(kind='bar', figsize=(9, 6), colormap='viridis')
148 | plt.title("Dropout Likelihood by Engagement Level", fontsize=16)
149 | plt.xlabel("Engagement Level", fontsize=12)
150 | plt.ylabel("Number of Students", fontsize=12)
151 | plt.xticks(rotation=0)
152 | plt.legend(title="Dropout Likelihood")
153 | plt.grid(axis='y', linestyle='--', alpha=0.7)
154 | plt.tight_layout()
155 | plt.show()
156 | 
157 | #OBJECTIVE -5: Identify key factors influencing Dropout Likelihood using charts and plots
158 | 
159 | #Distribution Analysis (Box plot)
160 | sns.set(style="whitegrid")
161 | 
162 | important_features = [
163 |     'Time_Spent_on_Videos',
164 |     'Assignment_Completion_Rate',
165 |     'Forum_Participation',
166 |     'Final_Exam_Score'
167 | ]
168 | 
169 | fig, axes = plt.subplots(1, 4, figsize=(16, 6))
170 | 
171 | sns.boxplot(y=df['Time_Spent_on_Videos'], ax=axes[0], color='skyblue')
172 | axes[0].set_title("Time Spent on Videos")
173 | axes[0].set_xticks([])
174 | 
175 | sns.boxplot(y=df['Assignment_Completion_Rate'], ax=axes[1], color='lightgreen')
176 | axes[1].set_title("Assignment Completion Rate")
177 | axes[1].set_xticks([])
178 | 
179 | sns.boxplot(y=df['Forum_Participation'], ax=axes[2], color='salmon')
180 | axes[2].set_title("Forum Participation")
181 | axes[2].set_xticks([])
182 | 
183 | sns.boxplot(y=df['Final_Exam_Score'], ax=axes[3], color='plum')
184 | axes[3].set_title("Final Exam Score")
185 | axes[3].set_xticks([])
186 | 
187 | plt.suptitle("Boxplots of Key Features", fontsize=16, fontweight='bold')
188 | plt.tight_layout(rect=[0, 0.03, 1, 0.90])
189 | plt.show()
190 | 
191 | # Visualization of categorical features
192 | sns.set(style="whitegrid")
193 | categorical_features = ['Gender', 'Education_Level', 'Learning_Style', 'Dropout_Likelihood']
194 | plt.figure(figsize=(16, 10))
195 | plt.subplot(2, 2, 1)
196 | sns.countplot(x='Gender', data=df, hue="Gender", palette='pastel', legend=False)
197 | plt.title('Gender Distribution')
198 | 
199 | plt.subplot(2, 2, 2)
200 | sns.countplot(x='Education_Level', data=df, hue="Education_Level", palette='Set2', legend=False)
201 | plt.title('Education Level Distribution')
202 | plt.xticks(rotation=30)
203 | 
204 | plt.subplot(2, 2, 3)
205 | sns.countplot(x='Learning_Style', data=df, hue="Learning_Style", palette='Set3', legend=False)
206 | plt.title('Learning Style Distribution')
207 | plt.xticks(rotation=30)
208 | 
209 | plt.subplot(2, 2, 4)
210 | sns.countplot(x='Dropout_Likelihood', data=df, hue="Dropout_Likelihood", palette='coolwarm', legend=False)
211 | plt.title('Dropout Likelihood Distribution')
212 | 
213 | plt.suptitle("Distribution of Categorical Features", fontsize=16, fontweight='bold')
214 | plt.tight_layout(rect=[0, 0.03, 1, 0.95])
215 | plt.show()
216 | 
217 | # OBJECTIVE -6: Develop an Interactive Dashboard for Educators
218 | 
219 | # PIE CHART – Learning Style Distribution
220 | pie_chart = px.pie(
221 |     df, names="Learning_Style", title="Distribution of Learning Styles",
222 |     color_discrete_sequence=px.colors.sequential.RdBu
223 | )
224 | 
225 | # DONUT CHART – Dropout Distribution
226 | donut_chart = px.pie(
227 |     df, names="Dropout_Likelihood", title="Dropout Likelihood (Yes/No)",
228 |     hole=0.5, color_discrete_sequence=px.colors.sequential.Mint
229 | )
230 | 
231 | # Dash app layout
232 | app = dash.Dash(__name__)
233 | 
234 | # Define a dark background style
235 | dark_style = {
236 |     'backgroundColor': '#1e1e2f',
237 |     'color': '#f0f0f0',
238 |     'fontFamily': 'Arial',
239 |     'padding': '20px'
240 | }
241 | 
242 | app.layout = html.Div(style=dark_style, children=[
243 |     html.H1("Personalized Learning Dashboard", style={
244 |         'textAlign': 'center',
245 |         'color': dark_style['color'],
246 |         'paddingTop': '20px'
247 |     }),
248 | 
249 |     html.Div([
250 |         dcc.Graph(figure=pie_chart)
251 |     ], style={'width': '48%', 'display': 'inline-block', 'padding': '10px'}),
252 | 
253 |     html.Div([
254 |         dcc.Graph(figure=donut_chart)
255 |     ], style={'width': '48%', 'display': 'inline-block', 'padding': '10px'}),
256 | 
257 |     html.Footer("Developed by Jyotsna Chaudhary", style={
258 |         'textAlign': 'center',
259 |         'padding': '20px',
260 |         'color': dark_style['color'],
261 |         'fontSize': '14px'
262 |     })
263 | ])
264 |     
265 | if __name__ == '__main__':
266 |     app.run(debug=True)  # app.run replaces the older app.run_server in current Dash releases
267 | 


--------------------------------------------------------------------------------