├── 01-TMDB-Dataset-Analysis ├── README.md ├── TMDB-dataset-analysis.ipynb └── tmdb-movies.csv ├── 02-Auto-MPG-Dataset-Analysis ├── README.md ├── auto-mpg.csv └── mpg-dataset-analysis.ipynb ├── 03-Medical-Appointment-No-Show ├── README.md ├── medical-appointment-dataset-analysis.ipynb └── noshowappointments-kagglev2-may-2016.csv ├── 04-9000+-Movies-Dataset-Analysis ├── 9000-movies-dataset-analysis.ipynb ├── README.md └── mymoviedb.csv ├── 05-Wine-Quality-Dataset ├── README.md ├── wine-quality-analysis.ipynb ├── wineQualityReds.csv ├── wineQualityWhites.csv └── wine_full.csv ├── 06-Query-a-Digital-Music-Store-Database ├── Chinook-SQL-Project-Report.pdf ├── Chinook-SQL-Queries.sql ├── README.md └── img │ ├── q1.png │ ├── q2.png │ ├── q3.png │ └── q4.png ├── 07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company ├── README.md ├── SSBC-Data-Model.png ├── SSBC-Project.pbix ├── SSBC-Report-Tab1.jpg ├── SSBC-Report-Tab2.jpg ├── SSBC-Report.pdf └── Source Files │ ├── CFO Metrics Tracker.xlsx │ ├── Customer List (as of FY2021).txt │ ├── Monthly Sales Logs │ ├── SSBC - Apr 2021 Sales.xlsx │ ├── SSBC - Aug 2021 Sales.xlsx │ ├── SSBC - Dec 2020 Sales.xlsx │ ├── SSBC - Feb 2021 Sales.xlsx │ ├── SSBC - Jan 2021 Sales.xlsx │ ├── SSBC - Jul 2021 Sales.xlsx │ ├── SSBC - Jun 2021 Sales.xlsx │ ├── SSBC - Mar 2021 Sales.xlsx │ ├── SSBC - May 2021 Sales.xlsx │ ├── SSBC - Nov 2020 Sales.xlsx │ ├── SSBC - Oct 2020 Sales.xlsx │ └── SSBC - Sep 2021 Sales.xlsx │ ├── SSBC Product Offerings.pdf │ └── USD-CAD Exchange Rates.csv ├── 08-Building-Power-BI-Report-for-Waggle ├── README.md ├── Waggle-Project.pbix ├── Waggle-Theme.json ├── Waggle-color-palette.png ├── Waggle-dashboard │ ├── Waggle-Project.pdf │ ├── Waggle-tab1.jfif │ ├── Waggle-tab2.jfif │ └── Waggle-tab3.jfif ├── Waggle-data-model.png ├── Waggle-datasets.xlsx └── marketing_collateral │ ├── cat_face_icon_blue.png │ ├── cat_face_icon_gray.png │ ├── cat_face_icon_green.png │ ├── cat_face_icon_pink.png │ ├── cat_face_icon_teal.png │ ├── 
cat_face_icon_violet.png │ ├── cat_face_icon_yellow.png │ ├── color_palette.png │ ├── dog_face_icon_blue.png │ ├── dog_face_icon_gray.png │ ├── dog_face_icon_green.png │ ├── dog_face_icon_pink.png │ ├── dog_face_icon_teal.png │ ├── dog_face_icon_violet.png │ ├── dog_face_icon_yellow.png │ ├── lapcat_logo_blue_background.png │ ├── lapcat_logo_green_background.png │ ├── lapcat_logo_pink_background.png │ ├── lapcat_logo_teal_background.png │ ├── lapcat_logo_transparent_blue.png │ ├── lapcat_logo_transparent_darker_gray.png │ ├── lapcat_logo_transparent_gray.png │ ├── lapcat_logo_transparent_green.png │ ├── lapcat_logo_transparent_pink.png │ ├── lapcat_logo_transparent_teal.png │ ├── lapcat_logo_transparent_violet.png │ ├── lapcat_logo_transparent_yellow.png │ ├── lapcat_logo_violet_background.png │ ├── lapcat_logo_white_transparent_blue.png │ ├── lapcat_logo_white_transparent_green.png │ ├── lapcat_logo_white_transparent_pink.png │ ├── lapcat_logo_white_transparent_teal.png │ ├── lapcat_logo_white_transparent_violet.png │ ├── lapcat_logo_white_transparent_wine.png │ ├── lapcat_logo_white_transparent_yellow.png │ ├── lapcat_logo_yellow_background.png │ ├── lapdog_logo_blue_background.png │ ├── lapdog_logo_green_background.png │ ├── lapdog_logo_pink_background.png │ ├── lapdog_logo_teal_background.png │ ├── lapdog_logo_transparent_blue.png │ ├── lapdog_logo_transparent_darker_gray.png │ ├── lapdog_logo_transparent_gray.png │ ├── lapdog_logo_transparent_green.png │ ├── lapdog_logo_transparent_pink.png │ ├── lapdog_logo_transparent_teal.png │ ├── lapdog_logo_transparent_violet.png │ ├── lapdog_logo_transparent_yellow.png │ ├── lapdog_logo_violet_background.png │ ├── lapdog_logo_white_transparent_blue.png │ ├── lapdog_logo_white_transparent_green.png │ ├── lapdog_logo_white_transparent_pink.png │ ├── lapdog_logo_white_transparent_teal.png │ ├── lapdog_logo_white_transparent_violet.png │ ├── lapdog_logo_white_transparent_yellow.png │ ├── lapdog_logo_yellow_background.png │ 
├── waggle_logo_black.png │ ├── waggle_logo_blue.png │ ├── waggle_logo_green.png │ ├── waggle_logo_pink.png │ ├── waggle_logo_red.png │ ├── waggle_logo_teal.png │ ├── waggle_logo_violet.png │ ├── waggle_logo_white.png │ ├── waggle_logo_wine.png │ └── waggle_logo_yellow.png ├── 09-Market-Analysis-Report-for-National-Clothing-Chain ├── Data-Source │ ├── census-data.xlsx │ ├── customer-list.xlsx │ ├── purchase-list.xlsx │ └── state-list.xlsx ├── National-Clothing-Chain-Data-Model.png ├── National-Clothing-Chain-Project.pbix ├── National-Clothing-Chain-Report.pdf ├── National-Clothing-Chain-Summary.doc ├── README.md └── img │ ├── avg-income-by-state.png │ ├── customer-return-rate.png │ ├── customers-by-income.png │ ├── predicted-income-by-state.png │ ├── product-by-price.png │ ├── product-instock.png │ ├── product-recomm.png │ └── sales-income-corr.png ├── 10-Coursera-Sales-Analysis-in-Power-BI-Guided-Project └── README.md ├── 11-dyslexia-and-music-notes-paper-analysis └── README.md └── README.md /01-TMDB-Dataset-Analysis/README.md: -------------------------------------------------------------------------------- 1 | # TMDB Movies Dataset Analysis 2 | ### Udacity Become a Data Analyst Nanodegree | Project 2 3 | 4 | | Contents | 5 | | -------- | 6 | | [Dataset Description](#Dataset-Description) | 7 | | [Columns Descreption](#Columns-Descreption) | 8 | | [Questions for Analysis](#Questions-for-Analysis) | 9 | | [Data Wrangling](#Data-Wrangling) | 10 | | [Data Cleaning](#Data-Cleaning) | 11 | | [Exploratory Data Analysis](#Exploratory-Data-Analysis) | 12 | | [Built with](#Built-with) | 13 | 14 | ## Dataset Description: 15 | This data set contains information about 10,000 movies extracted from [TMDB](https://www.themoviedb.org/). The dataset contains movies from 1960 to 2015. Including user ratings and revenue. 
The original data is from [Kaggle](https://www.kaggle.com/tmdb/tmdb-movie-metadata). 16 | 17 | ## Columns Descreption: 18 | - `id, imdb_id`: unique id and IMDb id for each movie on TMDB. 19 | - `popularity`: a metric used to measure the popularity of the movie. 20 | - `budget`: the total budget of the movie in USD. 21 | - `revenue`: the total revenue of the movie in USD. 22 | - `original_title`: the original title of the movie. 23 | - `cast`: the names of the cast of the movie, separated by "|". 24 | - `homepage`: the website of the movie (if it exists). 25 | - `director`: name(s) of the director(s) of the movie (separated by "|" if there is more than one director). 26 | - `tagline`: a catchphrase describing the movie. 27 | - `keywords`: keywords related to the movie. 28 | - `overview`: summary of the plot of the movie. 29 | - `runtime`: total runtime of the movie in minutes. 30 | - `genres`: genres of the movie, separated by "|". 31 | - `production_companies`: production company or companies of the movie. 32 | - `release_date`: release date of the movie. 33 | - `vote_count`: number of votes for the movie. 34 | - `vote_average`: the average user rating of the movie. 35 | - `release_year`: release year of the movie (from 1960 to 2015). 36 | - `budget_adj`: the total budget of the movie in 2010 US dollars, accounting for inflation over time. 37 | - `revenue_adj`: the total revenue of the movie in 2010 US dollars, accounting for inflation over time. 38 | 39 | ## Questions for Analysis: 40 | - Do movies with high popularity achieve high revenue? 41 | - What are the most filmed genres in this whole dataset? 42 | - Is there a correlation between a movie's budget and its revenue? 43 | 44 | ## Data Wrangling: 45 | The data is in the `tmdb-movies.csv` file provided in this repository. It is an edited version of the original Kaggle [TMDB 5000 Movie Dataset](https://www.kaggle.com/tmdb/tmdb-movie-metadata), provided by Udacity as part of the Become a Data Analyst Nanodegree program. 
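As a minimal sketch of this loading step, the snippet below reads a CSV with pandas and reports its shape, duplicate count, and missing values per column. The helper name `summarize` and the tiny inline sample are illustrative assumptions; in the project itself the source would be `tmdb-movies.csv`.

```python
import io

import pandas as pd


def summarize(df: pd.DataFrame) -> dict:
    """Return basic facts that guide the cleaning step (hypothetical helper)."""
    return {
        "shape": df.shape,
        "duplicates": int(df.duplicated().sum()),
        "missing": df.isnull().sum().to_dict(),
    }


# Tiny inline sample standing in for tmdb-movies.csv (illustrative data only).
sample_csv = io.StringIO(
    "original_title,genres,budget,revenue\n"
    "Movie A,Action|Adventure,100,300\n"
    "Movie A,Action|Adventure,100,300\n"
    "Movie B,,50,20\n"
)

# In the project this would be: df = pd.read_csv("tmdb-movies.csv")
df = pd.read_csv(sample_csv)
summary = summarize(df)
print(summary["shape"], summary["duplicates"], summary["missing"]["genres"])
```

Checking shape, duplicates, and missing values up front is what motivates the observations listed in the next section.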
46 | 47 | ## Data Cleaning: 48 | **Main Observations:** 49 | 1. The dataset consisted of 10866 rows and 21 columns. 50 | 2. There was only 1 duplicated row, which was dropped. 51 | 3. Some columns were not useful for answering our questions, so they were dropped. 52 | 4. A few columns had many missing values that needed to be handled. 53 | 5. The `cast`, `director`, and `genres` columns had values separated by '|'. 54 | 6. The data type of `release_date` needed to be converted. 55 | 7. We could add a `profit` column for each movie using the formula: $profit = revenue - budget$. 56 | 8. `vote_average` is better represented as a categorical variable that groups multiple rating values. 57 | 9. We might also categorize the `profit` column for better EDA. 58 | 59 | ## Exploratory Data Analysis: 60 | After cleaning the dataset, we ended up with a total of 10840 records and 10 columns. The dataset now has no duplicates or null values, and the data types are consistent, with suitable categorical variables to address our questions. 61 | We then performed some analysis and created visualizations to answer our target questions. 62 | ### Q1: Do movies with high popularity achieve high revenue? 63 | > More popular movies receive far more revenue than less popular movies. 64 | 65 | ### Q2: What are the most filmed genres in this whole dataset? 66 | > - `Drama`, `Comedy`, and `Action` are the three most filmed genres among the 10839 movies in our dataset. 67 | > - The `Drama` genre alone accounts for 22.6% of the movies in our dataset. 68 | 69 | ### Q3: Is there a correlation between a movie's budget and its revenue? 70 | > There is a positive correlation between `budget` and `revenue`, indicating a relationship between them, with few outliers found. 
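The cleaning steps and the Q2/Q3 calculations can be sketched roughly as follows. This is a simplified illustration, not the notebook's exact code: the function name `clean_movies` and the sample rows are assumptions, and missing values are dropped only on `genres` here.

```python
import pandas as pd


def clean_movies(df: pd.DataFrame) -> pd.DataFrame:
    """Apply a simplified version of the cleaning steps to a copy of the data."""
    out = df.drop_duplicates().dropna(subset=["genres"]).copy()
    # Keep only the first genre of each '|'-separated list.
    out["genres"] = out["genres"].str.split("|").str[0]
    # profit = revenue - budget
    out["profit"] = out["revenue"] - out["budget"]
    return out


# Illustrative rows only; not taken from the real dataset.
raw = pd.DataFrame(
    {
        "genres": ["Drama|Romance", "Drama|Romance", "Comedy", None, "Action|Crime"],
        "budget": [10, 10, 20, 5, 40],
        "revenue": [30, 30, 25, 1, 90],
    }
)

movies = clean_movies(raw)

# Q2: share of movies per (first) genre.
genre_share = movies["genres"].value_counts(normalize=True)

# Q3: Pearson correlation between budget and revenue.
budget_revenue_corr = movies["budget"].corr(movies["revenue"])

print(genre_share)
print(round(budget_revenue_corr, 3))
```

Keeping only the first genre mirrors the simplification used in the notebook; counting every listed genre per movie would give different shares.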
71 | 72 | ## Built with: 73 | - JupyterLab 74 | - Python3 75 | - Pandas 76 | - NumPy 77 | -------------------------------------------------------------------------------- /01-TMDB-Dataset-Analysis/TMDB-dataset-analysis.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "\"Kaggle\"" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "# Project: Investigate a Dataset - [TMDB movie data]\n", 15 | "\n", 16 | "## Table of Contents\n", 17 | "\n", 24 | "\n", 25 | "___" 26 | ] 27 | }, 28 | { 29 | "cell_type": "markdown", 30 | "metadata": { 31 | "tags": [] 32 | }, 33 | "source": [ 34 | "\n", 35 | "## Introduction\n", 36 | "\n", 37 | "### Dataset Description \n", 38 | "This dataset contains information about 10,000 movies extracted from [TMDB](https://www.themoviedb.org/). It covers movies released from 1960 to 2015 and includes user ratings and revenue. 
The original data is from [Kaggle](https://www.kaggle.com/tmdb/tmdb-movie-metadata).\n", 39 | "\n", 40 | "### Columns Description:\n", 41 | "- `id, imdb_id`: unique id and IMDb id for each movie on TMDB.\n", 42 | "- `popularity`: a metric used to measure the popularity of the movie.\n", 43 | "- `budget`: the total budget of the movie in USD.\n", 44 | "- `revenue`: the total revenue of the movie in USD.\n", 45 | "- `original_title`: the original title of the movie.\n", 46 | "- `cast`: the names of the cast of the movie, separated by \"|\".\n", 47 | "- `homepage`: the website of the movie (if it exists).\n", 48 | "- `director`: name(s) of the director(s) of the movie (separated by \"|\" if there is more than one director).\n", 49 | "- `tagline`: a catchphrase describing the movie.\n", 50 | "- `keywords`: keywords related to the movie.\n", 51 | "- `overview`: summary of the plot of the movie.\n", 52 | "- `runtime`: total runtime of the movie in minutes.\n", 53 | "- `genres`: genres of the movie, separated by \"|\".\n", 54 | "- `production_companies`: production company or companies of the movie.\n", 55 | "- `release_date`: release date of the movie.\n", 56 | "- `vote_count`: number of votes for the movie.\n", 57 | "- `vote_average`: the average user rating of the movie.\n", 58 | "- `release_year`: release year of the movie (from 1960 to 2015).\n", 59 | "- `budget_adj`: the total budget of the movie in 2010 US dollars, accounting for inflation over time.\n", 60 | "- `revenue_adj`: the total revenue of the movie in 2010 US dollars, accounting for inflation over time.\n", 61 | "\n", 62 | "### Questions for Analysis:\n", 63 | "- Do movies with high popularity achieve high revenue?\n", 64 | "- What are the most filmed genres in this whole dataset?\n", 65 | "- Is there a correlation between a movie's budget and its revenue?" 
66 | ] 67 | }, 68 | { 69 | "cell_type": "markdown", 70 | "metadata": {}, 71 | "source": [ 72 | "___\n", 73 | "## Environment set-up" 74 | ] 75 | }, 76 | { 77 | "cell_type": "code", 78 | "execution_count": 1, 79 | "metadata": {}, 80 | "outputs": [], 81 | "source": [ 82 | "# importing libraries\n", 83 | "import pandas as pd\n", 84 | "import numpy as np\n", 85 | "import matplotlib.pyplot as plt\n", 86 | "import seaborn as sns" 87 | ] 88 | }, 89 | { 90 | "cell_type": "markdown", 91 | "metadata": {}, 92 | "source": [ 93 | "\n", 94 | "___\n", 95 | "## Data Wrangling\n", 96 | "In this section, we load the data from a flat CSV file using `pandas` so that we can explore it further." 97 | ] 98 | }, 99 | { 100 | "cell_type": "code", 101 | "execution_count": 2, 102 | "metadata": {}, 103 | "outputs": [ 104 | { 105 | "data": { 106 | "text/html": [ 107 | "
" 273 | ], 274 | "text/plain": [ 275 | " id imdb_id popularity budget revenue \\\n", 276 | "0 135397 tt0369610 32.985763 150000000 1513528810 \n", 277 | "1 76341 tt1392190 28.419936 150000000 378436354 \n", 278 | "2 262500 tt2908446 13.112507 110000000 295238201 \n", 279 | "3 140607 tt2488496 11.173104 200000000 2068178225 \n", 280 | "4 168259 tt2820852 9.335014 190000000 1506249360 \n", 281 | "\n", 282 | " original_title \\\n", 283 | "0 Jurassic World \n", 284 | "1 Mad Max: Fury Road \n", 285 | "2 Insurgent \n", 286 | "3 Star Wars: The Force Awakens \n", 287 | "4 Furious 7 \n", 288 | "\n", 289 | " cast \\\n", 290 | "0 Chris Pratt|Bryce Dallas Howard|Irrfan Khan|Vi... \n", 291 | "1 Tom Hardy|Charlize Theron|Hugh Keays-Byrne|Nic... \n", 292 | "2 Shailene Woodley|Theo James|Kate Winslet|Ansel... \n", 293 | "3 Harrison Ford|Mark Hamill|Carrie Fisher|Adam D... \n", 294 | "4 Vin Diesel|Paul Walker|Jason Statham|Michelle ... \n", 295 | "\n", 296 | " homepage director \\\n", 297 | "0 http://www.jurassicworld.com/ Colin Trevorrow \n", 298 | "1 http://www.madmaxmovie.com/ George Miller \n", 299 | "2 http://www.thedivergentseries.movie/#insurgent Robert Schwentke \n", 300 | "3 http://www.starwars.com/films/star-wars-episod... J.J. Abrams \n", 301 | "4 http://www.furious7.com/ James Wan \n", 302 | "\n", 303 | " tagline ... \\\n", 304 | "0 The park is open. ... \n", 305 | "1 What a Lovely Day. ... \n", 306 | "2 One Choice Can Destroy You ... \n", 307 | "3 Every generation has a story. ... \n", 308 | "4 Vengeance Hits Home ... \n", 309 | "\n", 310 | " overview runtime \\\n", 311 | "0 Twenty-two years after the events of Jurassic ... 124 \n", 312 | "1 An apocalyptic story set in the furthest reach... 120 \n", 313 | "2 Beatrice Prior must confront her inner demons ... 119 \n", 314 | "3 Thirty years after defeating the Galactic Empi... 136 \n", 315 | "4 Deckard Shaw seeks revenge against Dominic Tor... 
137 \n", 316 | "\n", 317 | " genres \\\n", 318 | "0 Action|Adventure|Science Fiction|Thriller \n", 319 | "1 Action|Adventure|Science Fiction|Thriller \n", 320 | "2 Adventure|Science Fiction|Thriller \n", 321 | "3 Action|Adventure|Science Fiction|Fantasy \n", 322 | "4 Action|Crime|Thriller \n", 323 | "\n", 324 | " production_companies release_date vote_count \\\n", 325 | "0 Universal Studios|Amblin Entertainment|Legenda... 6/9/15 5562 \n", 326 | "1 Village Roadshow Pictures|Kennedy Miller Produ... 5/13/15 6185 \n", 327 | "2 Summit Entertainment|Mandeville Films|Red Wago... 3/18/15 2480 \n", 328 | "3 Lucasfilm|Truenorth Productions|Bad Robot 12/15/15 5292 \n", 329 | "4 Universal Pictures|Original Film|Media Rights ... 4/1/15 2947 \n", 330 | "\n", 331 | " vote_average release_year budget_adj revenue_adj \n", 332 | "0 6.5 2015 1.379999e+08 1.392446e+09 \n", 333 | "1 7.1 2015 1.379999e+08 3.481613e+08 \n", 334 | "2 6.3 2015 1.012000e+08 2.716190e+08 \n", 335 | "3 7.5 2015 1.839999e+08 1.902723e+09 \n", 336 | "4 7.3 2015 1.747999e+08 1.385749e+09 \n", 337 | "\n", 338 | "[5 rows x 21 columns]" 339 | ] 340 | }, 341 | "execution_count": 2, 342 | "metadata": {}, 343 | "output_type": "execute_result" 344 | } 345 | ], 346 | "source": [ 347 | "# loading the data and showing its first 5 rows\n", 348 | "df = pd.read_csv('tmdb-movies.csv')\n", 349 | "df.head()" 350 | ] 351 | }, 352 | { 353 | "cell_type": "markdown", 354 | "metadata": {}, 355 | "source": [ 356 | "___\n", 357 | "\n", 358 | "\n", 359 | "## Data Cleaning\n", 360 | "In this section, we explore the dataset in more depth and perform cleaning operations (dropping columns, handling NaNs, converting data types), all of which help us answer our analysis questions more accurately." 
361 | ] 362 | }, 363 | { 364 | "cell_type": "code", 365 | "execution_count": 3, 366 | "metadata": {}, 367 | "outputs": [ 368 | { 369 | "name": "stdout", 370 | "output_type": "stream", 371 | "text": [ 372 | "<class 'pandas.core.frame.DataFrame'>\n", 373 | "RangeIndex: 10866 entries, 0 to 10865\n", 374 | "Data columns (total 21 columns):\n", 375 | " # Column Non-Null Count Dtype \n", 376 | "--- ------ -------------- ----- \n", 377 | " 0 id 10866 non-null int64 \n", 378 | " 1 imdb_id 10856 non-null object \n", 379 | " 2 popularity 10866 non-null float64\n", 380 | " 3 budget 10866 non-null int64 \n", 381 | " 4 revenue 10866 non-null int64 \n", 382 | " 5 original_title 10866 non-null object \n", 383 | " 6 cast 10790 non-null object \n", 384 | " 7 homepage 2936 non-null object \n", 385 | " 8 director 10822 non-null object \n", 386 | " 9 tagline 8042 non-null object \n", 387 | " 10 keywords 9373 non-null object \n", 388 | " 11 overview 10862 non-null object \n", 389 | " 12 runtime 10866 non-null int64 \n", 390 | " 13 genres 10843 non-null object \n", 391 | " 14 production_companies 9836 non-null object \n", 392 | " 15 release_date 10866 non-null object \n", 393 | " 16 vote_count 10866 non-null int64 \n", 394 | " 17 vote_average 10866 non-null float64\n", 395 | " 18 release_year 10866 non-null int64 \n", 396 | " 19 budget_adj 10866 non-null float64\n", 397 | " 20 revenue_adj 10866 non-null float64\n", 398 | "dtypes: float64(4), int64(6), object(11)\n", 399 | "memory usage: 1.7+ MB\n" 400 | ] 401 | } 402 | ], 403 | "source": [ 404 | "# printing dataframe columns' info\n", 405 | "df.info()" 406 | ] 407 | }, 408 | { 409 | "cell_type": "code", 410 | "execution_count": 4, 411 | "metadata": {}, 412 | "outputs": [ 413 | { 414 | "data": { 415 | "text/plain": [ 416 | "id 10865\n", 417 | "imdb_id 10855\n", 418 | "popularity 10814\n", 419 | "budget 557\n", 420 | "revenue 4702\n", 421 | "original_title 10571\n", 422 | "cast 10719\n", 423 | "homepage 2896\n", 424 | "director 5067\n", 425 | "tagline 7997\n", 426 | 
"keywords 8804\n", 427 | "overview 10847\n", 428 | "runtime 247\n", 429 | "genres 2039\n", 430 | "production_companies 7445\n", 431 | "release_date 5909\n", 432 | "vote_count 1289\n", 433 | "vote_average 72\n", 434 | "release_year 56\n", 435 | "budget_adj 2614\n", 436 | "revenue_adj 4840\n", 437 | "dtype: int64" 438 | ] 439 | }, 440 | "execution_count": 4, 441 | "metadata": {}, 442 | "output_type": "execute_result" 443 | } 444 | ], 445 | "source": [ 446 | "# taking a look at the number of unique values\n", 447 | "df.nunique()" 448 | ] 449 | }, 450 | { 451 | "cell_type": "code", 452 | "execution_count": 5, 453 | "metadata": {}, 454 | "outputs": [ 455 | { 456 | "data": { 457 | "text/plain": [ 458 | "0 6.5\n", 459 | "1 7.1\n", 460 | "2 6.3\n", 461 | "3 7.5\n", 462 | "4 7.3\n", 463 | " ... \n", 464 | "10861 7.4\n", 465 | "10862 5.7\n", 466 | "10863 6.5\n", 467 | "10864 5.4\n", 468 | "10865 1.5\n", 469 | "Name: vote_average, Length: 10866, dtype: float64" 470 | ] 471 | }, 472 | "execution_count": 5, 473 | "metadata": {}, 474 | "output_type": "execute_result" 475 | } 476 | ], 477 | "source": [ 478 | "# closer look at the vote_average values\n", 479 | "df.vote_average" 480 | ] 481 | }, 482 | { 483 | "cell_type": "code", 484 | "execution_count": 6, 485 | "metadata": {}, 486 | "outputs": [ 487 | { 488 | "data": { 489 | "text/plain": [ 490 | "1" 491 | ] 492 | }, 493 | "execution_count": 6, 494 | "metadata": {}, 495 | "output_type": "execute_result" 496 | } 497 | ], 498 | "source": [ 499 | "# counting duplicated rows \n", 500 | "df.duplicated().sum()" 501 | ] 502 | }, 503 | { 504 | "cell_type": "markdown", 505 | "metadata": {}, 506 | "source": [ 507 | "\n", 508 | "#### As we can see from the above output: \n", 509 | "1. Our dataset consists of a total of 10866 rows and 21 columns.\n", 510 | "2. We have only 1 duplicated row, which will be dropped.\n", 511 | "3. Some columns won't be useful in answering our questions.\n", 512 | "4. 
A few columns have many missing values that need to be handled.\n", 513 | "5. The `cast`, `director`, and `genres` columns have values separated by '|'.\n", 514 | "6. The data type of `release_date` needs to be converted.\n", 515 | "7. We can add a `profit` column for each movie using the formula $profit = revenue - budget$.\n", 516 | "8. `vote_average` is better represented as a categorical variable that groups multiple rating values.\n", 517 | "9. We may also categorize the `profit` column for better EDA." 518 | ] 519 | }, 520 | { 521 | "cell_type": "markdown", 522 | "metadata": {}, 523 | "source": [ 524 | "___\n", 525 | "**Start by dropping the duplicated row**" 526 | ] 527 | }, 528 | { 529 | "cell_type": "code", 530 | "execution_count": 7, 531 | "metadata": {}, 532 | "outputs": [ 533 | { 534 | "data": { 535 | "text/plain": [ 536 | "False" 537 | ] 538 | }, 539 | "execution_count": 7, 540 | "metadata": {}, 541 | "output_type": "execute_result" 542 | } 543 | ], 544 | "source": [ 545 | "# dropping duplicates and validating the execution\n", 546 | "df.drop_duplicates(inplace = True)\n", 547 | "df.duplicated().any()" 548 | ] 549 | }, 550 | { 551 | "cell_type": "markdown", 552 | "metadata": { 553 | "tags": [] 554 | }, 555 | "source": [ 556 | "___\n", 557 | "**Check the data frame columns** " 558 | ] 559 | }, 560 | { 561 | "cell_type": "code", 562 | "execution_count": 8, 563 | "metadata": {}, 564 | "outputs": [ 565 | { 566 | "data": { 567 | "text/plain": [ 568 | "Index(['id', 'imdb_id', 'popularity', 'budget', 'revenue', 'original_title',\n", 569 | " 'cast', 'homepage', 'director', 'tagline', 'keywords', 'overview',\n", 570 | " 'runtime', 'genres', 'production_companies', 'release_date',\n", 571 | " 'vote_count', 'vote_average', 'release_year', 'budget_adj',\n", 572 | " 'revenue_adj'],\n", 573 | " dtype='object')" 574 | ] 575 | }, 576 | "execution_count": 8, 577 | "metadata": {}, 578 | "output_type": "execute_result" 579 | } 580 | ], 581 | "source": [ 582 | "df.columns" 583 | ] 584 | }, 585 | 
{ 586 | "cell_type": "markdown", 587 | "metadata": {}, 588 | "source": [ 589 | "\n", 590 | "After going through all the columns, we decided that the columns `id` `imdb_id` `homepage` `revenue_adj` `budget_adj` `tagline` `cast` `overview` `keywords` `production_companies` `director` `release_date` will not be very useful, so we drop them and continue with our analysis." 591 | ] 592 | }, 593 | { 594 | "cell_type": "code", 595 | "execution_count": 9, 596 | "metadata": {}, 597 | "outputs": [], 598 | "source": [ 599 | "# dropping columns that are not useful\n", 600 | "df.drop(['id', 'imdb_id', 'homepage', 'revenue_adj', 'budget_adj', 'tagline', 'cast', 'overview', 'keywords', 'production_companies', 'director', 'release_date'], axis = 1, inplace = True)" 601 | ] 602 | }, 603 | { 604 | "cell_type": "code", 605 | "execution_count": 10, 606 | "metadata": {}, 607 | "outputs": [ 608 | { 609 | "data": { 610 | "text/plain": [ 611 | "(10865, 9)" 612 | ] 613 | }, 614 | "execution_count": 10, 615 | "metadata": {}, 616 | "output_type": "execute_result" 617 | } 618 | ], 619 | "source": [ 620 | "# check the data frame shape\n", 621 | "df.shape" 622 | ] 623 | }, 624 | { 625 | "cell_type": "markdown", 626 | "metadata": {}, 627 | "source": [ 628 | "\n", 629 | "Now we only have 9 columns to preprocess\n", 630 | "___\n", 631 | "\n", 632 | "**Check for null values**" 633 | ] 634 | }, 635 | { 636 | "cell_type": "code", 637 | "execution_count": 11, 638 | "metadata": {}, 639 | "outputs": [ 640 | { 641 | "data": { 642 | "text/plain": [ 643 | "popularity 0\n", 644 | "budget 0\n", 645 | "revenue 0\n", 646 | "original_title 0\n", 647 | "runtime 0\n", 648 | "genres 23\n", 649 | "vote_count 0\n", 650 | "vote_average 0\n", 651 | "release_year 0\n", 652 | "dtype: int64" 653 | ] 654 | }, 655 | "execution_count": 11, 656 | "metadata": {}, 657 | "output_type": "execute_result" 658 | } 659 | ], 660 | "source": [ 661 | "df.isnull().sum()" 662 | ] 663 | }, 664 | { 665 | "cell_type": "markdown", 666 | 
"metadata": {}, 667 | "source": [ 668 | "\n", 669 | "We can see that all of our columns are clean except for the `genres` column. Since it is very important for answering our questions, we drop all rows where it is NaN. \n" 670 | ] 671 | }, 672 | { 673 | "cell_type": "code", 674 | "execution_count": 12, 675 | "metadata": {}, 676 | "outputs": [ 677 | { 678 | "data": { 679 | "text/plain": [ 680 | "0" 681 | ] 682 | }, 683 | "execution_count": 12, 684 | "metadata": {}, 685 | "output_type": "execute_result" 686 | } 687 | ], 688 | "source": [ 689 | "# dropping NaNs and confirming execution\n", 690 | "df.dropna(inplace = True)\n", 691 | "df.isnull().any().sum()" 692 | ] 693 | }, 694 | { 695 | "cell_type": "markdown", 696 | "metadata": {}, 697 | "source": [ 698 | "___\n", 699 | "**Dealing with the '|'-separated values in the `genres` column**\n", 700 | "\n", 701 | "We do our analysis with the first genre for each movie. We take each hybrid row, extract the first genre (before the '|'), and save it to our dataframe." 702 | ] 703 | }, 704 | { 705 | "cell_type": "code", 706 | "execution_count": 13, 707 | "metadata": {}, 708 | "outputs": [ 709 | { 710 | "data": { 711 | "text/html": [ 712 | "
" 805 | ], 806 | "text/plain": [ 807 | " popularity budget revenue original_title runtime \\\n", 808 | "0 32.985763 150000000 1513528810 Jurassic World 124 \n", 809 | "1 28.419936 150000000 378436354 Mad Max: Fury Road 120 \n", 810 | "2 13.112507 110000000 295238201 Insurgent 119 \n", 811 | "3 11.173104 200000000 2068178225 Star Wars: The Force Awakens 136 \n", 812 | "4 9.335014 190000000 1506249360 Furious 7 137 \n", 813 | "\n", 814 | " genres vote_count vote_average release_year \n", 815 | "0 Action 5562 6.5 2015 \n", 816 | "1 Action 6185 7.1 2015 \n", 817 | "2 Adventure 2480 6.3 2015 \n", 818 | "3 Action 5292 7.5 2015 \n", 819 | "4 Action 2947 7.3 2015 " 820 | ] 821 | }, 822 | "execution_count": 13, 823 | "metadata": {}, 824 | "output_type": "execute_result" 825 | } 826 | ], 827 | "source": [ 828 | "# keep only the first genre of each '|'-separated value\n", 829 | "df['genres'] = df['genres'].apply(lambda x: x.split(\"|\")[0])\n", 830 | "df.head()" 831 | ] 832 | }, 833 | { 834 | "cell_type": "markdown", 835 | "metadata": {}, 836 | "source": [ 837 | "___\n", 838 | "**Calculating movie profit. $profit = revenue - budget$**" 839 | ] 840 | }, 841 | { 842 | "cell_type": "code", 843 | "execution_count": 14, 844 | "metadata": {}, 845 | "outputs": [ 846 | { 847 | "data": { 848 | "text/html": [ 849 | "
\n", 850 | "\n", 863 | "\n", 864 | " \n", 865 | " \n", 866 | " \n", 867 | " \n", 868 | " \n", 869 | " \n", 870 | " \n", 871 | " \n", 872 | " \n", 873 | " \n", 874 | " \n", 875 | " \n", 876 | " \n", 877 | " \n", 878 | " \n", 879 | " \n", 880 | " \n", 881 | " \n", 882 | " \n", 883 | " \n", 884 | " \n", 885 | " \n", 886 | " \n", 887 | " \n", 888 | " \n", 889 | " \n", 890 | " \n", 891 | " \n", 892 | " \n", 893 | " \n", 894 | " \n", 895 | " \n", 896 | " \n", 897 | " \n", 898 | " \n", 899 | " \n", 900 | " \n", 901 | " \n", 902 | " \n", 903 | " \n", 904 | " \n", 905 | " \n", 906 | " \n", 907 | " \n", 908 | " \n", 909 | " \n", 910 | " \n", 911 | " \n", 912 | " \n", 913 | " \n", 914 | " \n", 915 | " \n", 916 | " \n", 917 | " \n", 918 | " \n", 919 | " \n", 920 | " \n", 921 | " \n", 922 | " \n", 923 | " \n", 924 | " \n", 925 | " \n", 926 | " \n", 927 | " \n", 928 | " \n", 929 | " \n", 930 | " \n", 931 | " \n", 932 | " \n", 933 | " \n", 934 | " \n", 935 | " \n", 936 | " \n", 937 | " \n", 938 | " \n", 939 | " \n", 940 | " \n", 941 | " \n", 942 | " \n", 943 | " \n", 944 | " \n", 945 | " \n", 946 | "
popularitybudgetrevenueoriginal_titleruntimegenresvote_countvote_averagerelease_yearprofit
032.9857631500000001513528810Jurassic World124Action55626.520151363528810
128.419936150000000378436354Mad Max: Fury Road120Action61857.12015228436354
213.112507110000000295238201Insurgent119Adventure24806.32015185238201
311.1731042000000002068178225Star Wars: The Force Awakens136Action52927.520151868178225
49.3350141900000001506249360Furious 7137Action29477.320151316249360
\n", 947 | "
" 948 | ], 949 | "text/plain": [ 950 | " popularity budget revenue original_title runtime \\\n", 951 | "0 32.985763 150000000 1513528810 Jurassic World 124 \n", 952 | "1 28.419936 150000000 378436354 Mad Max: Fury Road 120 \n", 953 | "2 13.112507 110000000 295238201 Insurgent 119 \n", 954 | "3 11.173104 200000000 2068178225 Star Wars: The Force Awakens 136 \n", 955 | "4 9.335014 190000000 1506249360 Furious 7 137 \n", 956 | "\n", 957 | " genres vote_count vote_average release_year profit \n", 958 | "0 Action 5562 6.5 2015 1363528810 \n", 959 | "1 Action 6185 7.1 2015 228436354 \n", 960 | "2 Adventure 2480 6.3 2015 185238201 \n", 961 | "3 Action 5292 7.5 2015 1868178225 \n", 962 | "4 Action 2947 7.3 2015 1316249360 " 963 | ] 964 | }, 965 | "execution_count": 14, 966 | "metadata": {}, 967 | "output_type": "execute_result" 968 | } 969 | ], 970 | "source": [ 971 | "# adding new column for movie profit (revenue - budget)\n", 972 | "df['profit'] = df.revenue - df.budget\n", 973 | "df.head()" 974 | ] 975 | }, 976 | { 977 | "cell_type": "markdown", 978 | "metadata": {}, 979 | "source": [ 980 | "___\n", 981 | "**Categorizing the `vote_average` and `profit` columns**\n", 982 | "\n", 983 | "For usability, we convert these columns into categorical variables using a helper function." 
984 | ] 985 | }, 986 | { 987 | "cell_type": "code", 988 | "execution_count": 15, 989 | "metadata": {}, 990 | "outputs": [], 991 | "source": [ 992 | "def catigorize_col (df, col, labels):\n", 993 | " \"\"\"\n", 994 | " categorizes a column based on its quartiles\n", 995 | " \n", 996 | " Args:\n", 997 | " (df) df - dataframe we are processing\n", 998 | " (col) str - name of the column to categorize\n", 999 | " (labels) list - list of labels from min to max\n", 1000 | " \n", 1001 | " Returns:\n", 1002 | " (df) df - dataframe with the categorized col\n", 1003 | " \"\"\"\n", 1004 | " \n", 1005 | " # setting the edges to cut the column accordingly\n", 1006 | " edges = [df[col].describe()['min'],\n", 1007 | " df[col].describe()['25%'],\n", 1008 | " df[col].describe()['50%'],\n", 1009 | " df[col].describe()['75%'],\n", 1010 | " df[col].describe()['max']]\n", 1011 | " \n", 1012 | " df[col] = pd.cut(df[col], edges, labels = labels, duplicates='drop')\n", 1013 | " return df\n", 1014 | " " 1015 | ] 1016 | }, 1017 | { 1018 | "cell_type": "markdown", 1019 | "metadata": {}, 1020 | "source": [ 1021 | "___\n", 1022 | "**Converting `vote_average` into a categorical variable** " 1023 | ] 1024 | }, 1025 | { 1026 | "cell_type": "markdown", 1027 | "metadata": {}, 1028 | "source": [ 1029 | "We cut the `vote_average` values into 4 categories (`popular`, `average`, `below_avg`, `not_popular`) using the `catigorize_col()` function defined above." 
1030 | ] 1031 | }, 1032 | { 1033 | "cell_type": "code", 1034 | "execution_count": 16, 1035 | "metadata": {}, 1036 | "outputs": [ 1037 | { 1038 | "data": { 1039 | "text/plain": [ 1040 | "['average', 'popular', 'below_avg', 'not_popular', NaN]\n", 1041 | "Categories (4, object): ['not_popular' < 'below_avg' < 'average' < 'popular']" 1042 | ] 1043 | }, 1044 | "execution_count": 16, 1045 | "metadata": {}, 1046 | "output_type": "execute_result" 1047 | } 1048 | ], 1049 | "source": [ 1050 | "# define labels for these edges\n", 1051 | "labels = ['not_popular', 'below_avg', 'average', 'popular']\n", 1052 | "catigorize_col(df, 'vote_average', labels)\n", 1053 | "\n", 1054 | "df['vote_average'].unique()" 1055 | ] 1056 | }, 1057 | { 1058 | "cell_type": "markdown", 1059 | "metadata": {}, 1060 | "source": [ 1061 | "We ended up with some NaNs, so we drop them. They appear because `pd.cut` excludes the left edge of the first bin by default, so rows equal to the column minimum fall outside every bin." 1062 | ] 1063 | }, 1064 | { 1065 | "cell_type": "code", 1066 | "execution_count": 17, 1067 | "metadata": {}, 1068 | "outputs": [ 1069 | { 1070 | "data": { 1071 | "text/plain": [ 1072 | "popularity 0\n", 1073 | "budget 0\n", 1074 | "revenue 0\n", 1075 | "original_title 0\n", 1076 | "runtime 0\n", 1077 | "genres 0\n", 1078 | "vote_count 0\n", 1079 | "vote_average 0\n", 1080 | "release_year 0\n", 1081 | "profit 0\n", 1082 | "dtype: int64" 1083 | ] 1084 | }, 1085 | "execution_count": 17, 1086 | "metadata": {}, 1087 | "output_type": "execute_result" 1088 | } 1089 | ], 1090 | "source": [ 1091 | "# dropping nans and confirming\n", 1092 | "df.dropna(inplace = True)\n", 1093 | "df.isnull().sum()" 1094 | ] 1095 | }, 1096 | { 1097 | "cell_type": "markdown", 1098 | "metadata": {}, 1099 | "source": [ 1100 | "___\n", 1101 | "**Converting `profit` into a categorical variable** " 1102 | ] 1103 | }, 1104 | { 1105 | "cell_type": "markdown", 1106 | "metadata": {}, 1107 | "source": [ 1108 | "We cut the `profit` values into 3 categories (`high`, `average`, `low`) using the `catigorize_col()` function defined 
above." 1109 | ] 1110 | }, 1111 | { 1112 | "cell_type": "code", 1113 | "execution_count": 18, 1114 | "metadata": {}, 1115 | "outputs": [ 1116 | { 1117 | "data": { 1118 | "text/plain": [ 1119 | "['high', 'average', 'low', NaN]\n", 1120 | "Categories (3, object): ['low' < 'average' < 'high']" 1121 | ] 1122 | }, 1123 | "execution_count": 18, 1124 | "metadata": {}, 1125 | "output_type": "execute_result" 1126 | } 1127 | ], 1128 | "source": [ 1129 | "# define labels for these edges\n", 1130 | "labels = ['low', 'average', 'high']\n", 1131 | "catigorize_col(df, 'profit', labels)\n", 1132 | "\n", 1133 | "df['profit'].unique()" 1134 | ] 1135 | }, 1136 | { 1137 | "cell_type": "code", 1138 | "execution_count": 19, 1139 | "metadata": {}, 1140 | "outputs": [ 1141 | { 1142 | "data": { 1143 | "text/plain": [ 1144 | "1" 1145 | ] 1146 | }, 1147 | "execution_count": 19, 1148 | "metadata": {}, 1149 | "output_type": "execute_result" 1150 | } 1151 | ], 1152 | "source": [ 1153 | "df.profit.isnull().sum()" 1154 | ] 1155 | }, 1156 | { 1157 | "cell_type": "markdown", 1158 | "metadata": {}, 1159 | "source": [ 1160 | "We ended up with one NaN value, which we will drop." 1161 | ] 1162 | }, 1163 | { 1164 | "cell_type": "code", 1165 | "execution_count": 20, 1166 | "metadata": {}, 1167 | "outputs": [ 1168 | { 1169 | "data": { 1170 | "text/plain": [ 1171 | "popularity 0\n", 1172 | "budget 0\n", 1173 | "revenue 0\n", 1174 | "original_title 0\n", 1175 | "runtime 0\n", 1176 | "genres 0\n", 1177 | "vote_count 0\n", 1178 | "vote_average 0\n", 1179 | "release_year 0\n", 1180 | "profit 0\n", 1181 | "dtype: int64" 1182 | ] 1183 | }, 1184 | "execution_count": 20, 1185 | "metadata": {}, 1186 | "output_type": "execute_result" 1187 | } 1188 | ], 1189 | "source": [ 1190 | "# dropping NaNs in profit column and confirming\n", 1191 | "df.dropna(inplace = True)\n", 1192 | "df.isnull().sum()" 1193 | ] 1194 | }, 1195 | { 1196 | "cell_type": "markdown", 1197 | "metadata": {}, 1198 | "source": [ 1199 | "___" 1200 | ] 
1201 | }, 1202 | { 1203 | "cell_type": "code", 1204 | "execution_count": 21, 1205 | "metadata": {}, 1206 | "outputs": [ 1207 | { 1208 | "data": { 1209 | "text/html": [ 1210 | "
\n", 1211 | "\n", 1224 | "\n", 1225 | " \n", 1226 | " \n", 1227 | " \n", 1228 | " \n", 1229 | " \n", 1230 | " \n", 1231 | " \n", 1232 | " \n", 1233 | " \n", 1234 | " \n", 1235 | " \n", 1236 | " \n", 1237 | " \n", 1238 | " \n", 1239 | " \n", 1240 | " \n", 1241 | " \n", 1242 | " \n", 1243 | " \n", 1244 | " \n", 1245 | " \n", 1246 | " \n", 1247 | " \n", 1248 | " \n", 1249 | " \n", 1250 | " \n", 1251 | " \n", 1252 | " \n", 1253 | " \n", 1254 | " \n", 1255 | " \n", 1256 | " \n", 1257 | " \n", 1258 | " \n", 1259 | " \n", 1260 | " \n", 1261 | " \n", 1262 | " \n", 1263 | " \n", 1264 | " \n", 1265 | " \n", 1266 | " \n", 1267 | " \n", 1268 | " \n", 1269 | " \n", 1270 | " \n", 1271 | " \n", 1272 | " \n", 1273 | " \n", 1274 | " \n", 1275 | " \n", 1276 | " \n", 1277 | " \n", 1278 | " \n", 1279 | " \n", 1280 | " \n", 1281 | " \n", 1282 | " \n", 1283 | " \n", 1284 | " \n", 1285 | " \n", 1286 | " \n", 1287 | " \n", 1288 | " \n", 1289 | " \n", 1290 | " \n", 1291 | " \n", 1292 | " \n", 1293 | " \n", 1294 | " \n", 1295 | " \n", 1296 | " \n", 1297 | " \n", 1298 | " \n", 1299 | " \n", 1300 | " \n", 1301 | " \n", 1302 | " \n", 1303 | " \n", 1304 | " \n", 1305 | " \n", 1306 | " \n", 1307 | "
popularitybudgetrevenueoriginal_titleruntimegenresvote_countvote_averagerelease_yearprofit
032.9857631500000001513528810Jurassic World124Action5562average2015high
128.419936150000000378436354Mad Max: Fury Road120Action6185popular2015high
213.112507110000000295238201Insurgent119Adventure2480average2015high
311.1731042000000002068178225Star Wars: The Force Awakens136Action5292popular2015high
49.3350141900000001506249360Furious 7137Action2947popular2015high
\n", 1308 | "
" 1309 | ], 1310 | "text/plain": [ 1311 | " popularity budget revenue original_title runtime \\\n", 1312 | "0 32.985763 150000000 1513528810 Jurassic World 124 \n", 1313 | "1 28.419936 150000000 378436354 Mad Max: Fury Road 120 \n", 1314 | "2 13.112507 110000000 295238201 Insurgent 119 \n", 1315 | "3 11.173104 200000000 2068178225 Star Wars: The Force Awakens 136 \n", 1316 | "4 9.335014 190000000 1506249360 Furious 7 137 \n", 1317 | "\n", 1318 | " genres vote_count vote_average release_year profit \n", 1319 | "0 Action 5562 average 2015 high \n", 1320 | "1 Action 6185 popular 2015 high \n", 1321 | "2 Adventure 2480 average 2015 high \n", 1322 | "3 Action 5292 popular 2015 high \n", 1323 | "4 Action 2947 popular 2015 high " 1324 | ] 1325 | }, 1326 | "execution_count": 21, 1327 | "metadata": {}, 1328 | "output_type": "execute_result" 1329 | } 1330 | ], 1331 | "source": [ 1332 | "df.head()" 1333 | ] 1334 | }, 1335 | { 1336 | "cell_type": "code", 1337 | "execution_count": 22, 1338 | "metadata": {}, 1339 | "outputs": [ 1340 | { 1341 | "name": "stdout", 1342 | "output_type": "stream", 1343 | "text": [ 1344 | "\n", 1345 | "Int64Index: 10839 entries, 0 to 10864\n", 1346 | "Data columns (total 10 columns):\n", 1347 | " # Column Non-Null Count Dtype \n", 1348 | "--- ------ -------------- ----- \n", 1349 | " 0 popularity 10839 non-null float64 \n", 1350 | " 1 budget 10839 non-null int64 \n", 1351 | " 2 revenue 10839 non-null int64 \n", 1352 | " 3 original_title 10839 non-null object \n", 1353 | " 4 runtime 10839 non-null int64 \n", 1354 | " 5 genres 10839 non-null object \n", 1355 | " 6 vote_count 10839 non-null int64 \n", 1356 | " 7 vote_average 10839 non-null category\n", 1357 | " 8 release_year 10839 non-null int64 \n", 1358 | " 9 profit 10839 non-null category\n", 1359 | "dtypes: category(2), float64(1), int64(5), object(2)\n", 1360 | "memory usage: 783.6+ KB\n" 1361 | ] 1362 | } 1363 | ], 1364 | "source": [ 1365 | "df.info()" 1366 | ] 1367 | }, 1368 | { 1369 | 
"cell_type": "markdown", 1370 | "metadata": {}, 1371 | "source": [ 1372 | "Now that data cleaning is finished, our dataset consists of 10839 records and 10 columns. It has no duplicates or null values, and the data types are consistent, with suitable categorical variables to address our questions. We are ready to move to the next step!\n", 1373 | "___" 1374 | ] 1375 | }, 1376 | { 1377 | "cell_type": "markdown", 1378 | "metadata": { 1379 | "tags": [] 1380 | }, 1381 | "source": [ 1382 | "\n", 1383 | "## Exploratory Data Analysis\n", 1384 | "In this section, we use descriptive statistics and visuals to address the following questions about our dataset." 1385 | ] 1386 | }, 1387 | { 1388 | "cell_type": "markdown", 1389 | "metadata": {}, 1390 | "source": [ 1391 | "### Q1: Do movies with high popularity achieve high revenue?" 1392 | ] 1393 | }, 1394 | { 1395 | "cell_type": "code", 1396 | "execution_count": 23, 1397 | "metadata": {}, 1398 | "outputs": [ 1399 | { 1400 | "data": { 1401 | "text/plain": [ 1402 | "0.6476021913460651" 1403 | ] 1404 | }, 1405 | "execution_count": 23, 1406 | "metadata": {}, 1407 | "output_type": "execute_result" 1408 | } 1409 | ], 1410 | "source": [ 1411 | "# get the mean popularity to use as the split point\n", 1412 | "df['popularity'].mean()" 1413 | ] 1414 | }, 1415 | { 1416 | "cell_type": "code", 1417 | "execution_count": 24, 1418 | "metadata": {}, 1419 | "outputs": [], 1420 | "source": [ 1421 | "# split movies into two groups around the mean popularity\n", 1422 | "less_popular = df.query('popularity <= 0.647')\n", 1423 | "more_popular = df.query('popularity > 0.647')" 1424 | ] 1425 | }, 1426 | { 1427 | "cell_type": "code", 1428 | "execution_count": 25, 1429 | "metadata": {}, 1430 | "outputs": [ 1431 | { 1432 | "name": "stdout", 1433 | "output_type": "stream", 1434 | "text": [ 1435 | "7689823.871224779 121933819.08567691\n" 1436 | ] 1437 | } 1438 | ], 1439 | "source": [ 1440 | "# get mean revenue for the less popular and more popular 
groups\n", 1441 | "print(less_popular.revenue.mean(), more_popular.revenue.mean())" 1442 | ] 1443 | }, 1444 | { 1445 | "cell_type": "markdown", 1446 | "metadata": {}, 1447 | "source": [ 1448 | "From the above calculations, movies with above-average popularity earn far more revenue on average (about 121.9 million) than less popular movies (about 7.7 million).\n", 1449 | "___" 1450 | ] 1451 | }, 1452 | { 1453 | "cell_type": "markdown", 1454 | "metadata": {}, 1455 | "source": [ 1456 | "### Q2: What are the most filmed genres in this whole dataset?" 1457 | ] 1458 | }, 1459 | { 1460 | "cell_type": "code", 1461 | "execution_count": 26, 1462 | "metadata": {}, 1463 | "outputs": [ 1464 | { 1465 | "data": { 1466 | "image/png": "\n", 1467 | "text/plain": [ 1468 | "
" 1469 | ] 1470 | }, 1471 | "metadata": { 1472 | "needs_background": "light" 1473 | }, 1474 | "output_type": "display_data" 1475 | } 1476 | ], 1477 | "source": [ 1478 | "# visualising genres distribution\n", 1479 | "plt.figure(figsize=(8,5))\n", 1480 | "df['genres'].value_counts().plot(kind=\"bar\")\n", 1481 | "plt.title(\"The most filmed genres\", fontsize=(10))\n", 1482 | "plt.xlabel(\"genres\", fontsize=10)\n", 1483 | "plt.ylabel(\"Count\",fontsize=10)\n", 1484 | "plt.show()" 1485 | ] 1486 | }, 1487 | { 1488 | "cell_type": "code", 1489 | "execution_count": 27, 1490 | "metadata": {}, 1491 | "outputs": [ 1492 | { 1493 | "data": { 1494 | "text/plain": [ 1495 | "count 10839\n", 1496 | "unique 20\n", 1497 | "top Drama\n", 1498 | "freq 2453\n", 1499 | "Name: genres, dtype: object" 1500 | ] 1501 | }, 1502 | "execution_count": 27, 1503 | "metadata": {}, 1504 | "output_type": "execute_result" 1505 | } 1506 | ], 1507 | "source": [ 1508 | "df['genres'].describe()" 1509 | ] 1510 | }, 1511 | { 1512 | "cell_type": "markdown", 1513 | "metadata": {}, 1514 | "source": [ 1515 | "From the above graph, we can see that `Drama`, `Comedy` and `Action` are the three most filmed genres among the 10839 movies in our dataset, and that `Drama` accounts for 22.6% of them.\n", 1516 | "___" 1517 | ] 1518 | }, 1519 | { 1520 | "cell_type": "markdown", 1521 | "metadata": {}, 1522 | "source": [ 1523 | "### Q3: Is there a correlation between a movie budget and its revenue?" 1524 | ] 1525 | }, 1526 | { 1527 | "cell_type": "code", 1528 | "execution_count": 28, 1529 | "metadata": {}, 1530 | "outputs": [ 1531 | { 1532 | "data": { 1533 | "image/png": "\n", 1534 | "text/plain": [ 1535 | "
" 1536 | ] 1537 | }, 1538 | "metadata": { 1539 | "needs_background": "light" 1540 | }, 1541 | "output_type": "display_data" 1542 | } 1543 | ], 1544 | "source": [ 1545 | "# plotting budget against revenue\n", 1546 | "plt.scatter(df['budget'],df['revenue']);\n", 1547 | "plt.title(\"movie budget against its revenue\");\n", 1548 | "plt.xlabel('budget', fontsize=10);\n", 1549 | "plt.ylabel('revenue',fontsize=10);" 1550 | ] 1551 | }, 1552 | { 1553 | "cell_type": "markdown", 1554 | "metadata": {}, 1555 | "source": [ 1556 | "We can see a positive correlation between `budget` and `revenue`, indicating a relationship between them, with few outliers.\n", 1557 | "___" 1558 | ] 1559 | }, 1560 | { 1561 | "cell_type": "markdown", 1562 | "metadata": { 1563 | "tags": [] 1564 | }, 1565 | "source": [ 1566 | "\n", 1567 | "## Conclusions\n", 1568 | "\n", 1569 | "### Q1: Do movies with high popularity achieve high revenue?\n", 1570 | "> More popular movies earn far more revenue on average than less popular movies.\n", 1571 | "### Q2: What are the most filmed genres in this whole dataset?\n", 1572 | "> `Drama`, `Comedy` and `Action` are the three most filmed genres among the 10839 movies in our dataset, and `Drama` accounts for 22.6% of them.\n", 1573 | "### Q3: Is there a correlation between a movie budget and its revenue?\n", 1574 | "> There is a positive correlation between `budget` and `revenue`, indicating a relationship between them, with few outliers.
\n", 1575 | "___\n" 1576 | ] 1577 | } 1578 | ], 1579 | "metadata": { 1580 | "kernelspec": { 1581 | "display_name": "Python 3 (ipykernel)", 1582 | "language": "python", 1583 | "name": "python3" 1584 | }, 1585 | "language_info": { 1586 | "codemirror_mode": { 1587 | "name": "ipython", 1588 | "version": 3 1589 | }, 1590 | "file_extension": ".py", 1591 | "mimetype": "text/x-python", 1592 | "name": "python", 1593 | "nbconvert_exporter": "python", 1594 | "pygments_lexer": "ipython3", 1595 | "version": "3.8.3" 1596 | } 1597 | }, 1598 | "nbformat": 4, 1599 | "nbformat_minor": 4 1600 | } 1601 | -------------------------------------------------------------------------------- /02-Auto-MPG-Dataset-Analysis/README.md: -------------------------------------------------------------------------------- 1 | # Auto-MPG Dataset Analysis 2 | 3 | | Contents | 4 | | -------- | 5 | | [Dataset Description](#Dataset-Description) | 6 | | [Columns Description](#Columns-Description) | 7 | | [Data Wrangling](#Data-Wrangling) | 8 | | [Data Cleaning](#Data-Cleaning) | 9 | | [Data Visualization](#Data-Visualization) | 10 | | [Conclusion](#Conclusion) | 11 | | [Built with](#Built-with) | 12 | 13 | ## Dataset Description: 14 | The MPG dataset contains technical specifications of cars. It was originally provided by the [UCI Machine Learning Repository](https://archive.ics.uci.edu/ml/datasets/auto+mpg) and can be found on Kaggle [here](https://www.kaggle.com/uciml/autompg-dataset). 15 | The data concerns city-cycle fuel consumption in miles per gallon to be analyzed in terms of 3 multivalued discrete and 5 continuous attributes. 16 | 17 | ## Columns Description: 18 | 1. `mpg`: miles per gallon of fuel (continuous variable). 19 | 2. `cylinders`: number of engine cylinders (multi-valued discrete variable). 20 | 3. `displacement`: engine displacement (continuous variable) 21 | 4. `horsepower`: the power produced by the engine to move the car (continuous variable) 22 | 5. `weight`: car weight (continuous variable) 23 | 6. 
`acceleration`: the time the car takes to accelerate from 0 to 60 mph, in seconds (continuous variable) 24 | 7. `model year`: car release year from 1970 to 1982 (multi-valued discrete variable) 25 | 8. `origin`: car manufacturing place (1 -> USA, 2 -> Europe, 3 -> Asia) (multi-valued discrete variable) 26 | 9. `car name`: car model name (unique for each instance) 27 | 28 | ## Data Wrangling: 29 | Our data can be found in the `auto-mpg.csv` file provided in this repository, downloaded from [Kaggle](https://www.kaggle.com/uciml/autompg-dataset). 30 | 31 | ## Data Cleaning: 32 | **Exploration Summary** 33 | 1. Our dataset had a total of 398 records and 9 columns. 34 | 2. We had no NaNs or duplicated rows in our dataset. 35 | 3. The `horsepower` column had an inconsistent data type that needed to be handled and cast to `int`. 36 | 4. `origin` needed to be parsed and cast into a categorical datatype. 37 | 5. No columns needed to be dropped. 38 | 39 | ## Data Visualization 40 | Using `Matplotlib` and `Seaborn`, we made several visuals and charts to help us gain insights into any correlation between attributes in our dataset, which are discussed in the next section. 41 | 42 | ## Conclusion 43 | These conclusions were derived after completing our data visualization phase. 44 | 1. After `1973`, there is a noticeable increase in `mpg` over the years. 45 | 2. As the number of `cylinders` in the engine increases above 4, `mpg` decreases and engine `horsepower` increases. That indicates a negative correlation between `mpg` and `horsepower`. 46 | 3. `mpg` increases as `weight` decreases over time, which also indicates a strong negative correlation between them. 47 | 4. Although `USA` has the largest count of produced cars, its cars have relatively low `mpg` and the highest `weight` compared to `Asia` and `Europe`. 48 | 5. `Asia` is the leading region in producing cars with high `mpg`, with a mean close to 30, and it produces the lightest cars. 49 | 6. 
We can spot a negative correlation between `acceleration` and `horsepower`, which means that `acceleration` has a positive one with `mpg`. 50 | 51 | ## Built with: 52 | | Tools | 53 | | -------- | 54 | | JupyterLab | 55 | | Python3 | 56 | | Pandas | 57 | | NumPy | 58 | | Matplotlib | 59 | | Seaborn | 60 | -------------------------------------------------------------------------------- /02-Auto-MPG-Dataset-Analysis/auto-mpg.csv: -------------------------------------------------------------------------------- 1 | mpg,cylinders,displacement,horsepower,weight,acceleration,model year,origin,car name 2 | 18,8,307,130,3504,12,70,1,chevrolet chevelle malibu 3 | 15,8,350,165,3693,11.5,70,1,buick skylark 320 4 | 18,8,318,150,3436,11,70,1,plymouth satellite 5 | 16,8,304,150,3433,12,70,1,amc rebel sst 6 | 17,8,302,140,3449,10.5,70,1,ford torino 7 | 15,8,429,198,4341,10,70,1,ford galaxie 500 8 | 14,8,454,220,4354,9,70,1,chevrolet impala 9 | 14,8,440,215,4312,8.5,70,1,plymouth fury iii 10 | 14,8,455,225,4425,10,70,1,pontiac catalina 11 | 15,8,390,190,3850,8.5,70,1,amc ambassador dpl 12 | 15,8,383,170,3563,10,70,1,dodge challenger se 13 | 14,8,340,160,3609,8,70,1,plymouth 'cuda 340 14 | 15,8,400,150,3761,9.5,70,1,chevrolet monte carlo 15 | 14,8,455,225,3086,10,70,1,buick estate wagon (sw) 16 | 24,4,113,95,2372,15,70,3,toyota corona mark ii 17 | 22,6,198,95,2833,15.5,70,1,plymouth duster 18 | 18,6,199,97,2774,15.5,70,1,amc hornet 19 | 21,6,200,85,2587,16,70,1,ford maverick 20 | 27,4,97,88,2130,14.5,70,3,datsun pl510 21 | 26,4,97,46,1835,20.5,70,2,volkswagen 1131 deluxe sedan 22 | 25,4,110,87,2672,17.5,70,2,peugeot 504 23 | 24,4,107,90,2430,14.5,70,2,audi 100 ls 24 | 25,4,104,95,2375,17.5,70,2,saab 99e 25 | 26,4,121,113,2234,12.5,70,2,bmw 2002 26 | 21,6,199,90,2648,15,70,1,amc gremlin 27 | 10,8,360,215,4615,14,70,1,ford f250 28 | 10,8,307,200,4376,15,70,1,chevy c20 29 | 11,8,318,210,4382,13.5,70,1,dodge d200 30 | 9,8,304,193,4732,18.5,70,1,hi 1200d 31 | 27,4,97,88,2130,14.5,71,3,datsun pl510 32 
| 28,4,140,90,2264,15.5,71,1,chevrolet vega 2300 33 | 25,4,113,95,2228,14,71,3,toyota corona 34 | 25,4,98,?,2046,19,71,1,ford pinto 35 | 19,6,232,100,2634,13,71,1,amc gremlin 36 | 16,6,225,105,3439,15.5,71,1,plymouth satellite custom 37 | 17,6,250,100,3329,15.5,71,1,chevrolet chevelle malibu 38 | 19,6,250,88,3302,15.5,71,1,ford torino 500 39 | 18,6,232,100,3288,15.5,71,1,amc matador 40 | 14,8,350,165,4209,12,71,1,chevrolet impala 41 | 14,8,400,175,4464,11.5,71,1,pontiac catalina brougham 42 | 14,8,351,153,4154,13.5,71,1,ford galaxie 500 43 | 14,8,318,150,4096,13,71,1,plymouth fury iii 44 | 12,8,383,180,4955,11.5,71,1,dodge monaco (sw) 45 | 13,8,400,170,4746,12,71,1,ford country squire (sw) 46 | 13,8,400,175,5140,12,71,1,pontiac safari (sw) 47 | 18,6,258,110,2962,13.5,71,1,amc hornet sportabout (sw) 48 | 22,4,140,72,2408,19,71,1,chevrolet vega (sw) 49 | 19,6,250,100,3282,15,71,1,pontiac firebird 50 | 18,6,250,88,3139,14.5,71,1,ford mustang 51 | 23,4,122,86,2220,14,71,1,mercury capri 2000 52 | 28,4,116,90,2123,14,71,2,opel 1900 53 | 30,4,79,70,2074,19.5,71,2,peugeot 304 54 | 30,4,88,76,2065,14.5,71,2,fiat 124b 55 | 31,4,71,65,1773,19,71,3,toyota corolla 1200 56 | 35,4,72,69,1613,18,71,3,datsun 1200 57 | 27,4,97,60,1834,19,71,2,volkswagen model 111 58 | 26,4,91,70,1955,20.5,71,1,plymouth cricket 59 | 24,4,113,95,2278,15.5,72,3,toyota corona hardtop 60 | 25,4,97.5,80,2126,17,72,1,dodge colt hardtop 61 | 23,4,97,54,2254,23.5,72,2,volkswagen type 3 62 | 20,4,140,90,2408,19.5,72,1,chevrolet vega 63 | 21,4,122,86,2226,16.5,72,1,ford pinto runabout 64 | 13,8,350,165,4274,12,72,1,chevrolet impala 65 | 14,8,400,175,4385,12,72,1,pontiac catalina 66 | 15,8,318,150,4135,13.5,72,1,plymouth fury iii 67 | 14,8,351,153,4129,13,72,1,ford galaxie 500 68 | 17,8,304,150,3672,11.5,72,1,amc ambassador sst 69 | 11,8,429,208,4633,11,72,1,mercury marquis 70 | 13,8,350,155,4502,13.5,72,1,buick lesabre custom 71 | 12,8,350,160,4456,13.5,72,1,oldsmobile delta 88 royale 72 | 
13,8,400,190,4422,12.5,72,1,chrysler newport royal 73 | 19,3,70,97,2330,13.5,72,3,mazda rx2 coupe 74 | 15,8,304,150,3892,12.5,72,1,amc matador (sw) 75 | 13,8,307,130,4098,14,72,1,chevrolet chevelle concours (sw) 76 | 13,8,302,140,4294,16,72,1,ford gran torino (sw) 77 | 14,8,318,150,4077,14,72,1,plymouth satellite custom (sw) 78 | 18,4,121,112,2933,14.5,72,2,volvo 145e (sw) 79 | 22,4,121,76,2511,18,72,2,volkswagen 411 (sw) 80 | 21,4,120,87,2979,19.5,72,2,peugeot 504 (sw) 81 | 26,4,96,69,2189,18,72,2,renault 12 (sw) 82 | 22,4,122,86,2395,16,72,1,ford pinto (sw) 83 | 28,4,97,92,2288,17,72,3,datsun 510 (sw) 84 | 23,4,120,97,2506,14.5,72,3,toyouta corona mark ii (sw) 85 | 28,4,98,80,2164,15,72,1,dodge colt (sw) 86 | 27,4,97,88,2100,16.5,72,3,toyota corolla 1600 (sw) 87 | 13,8,350,175,4100,13,73,1,buick century 350 88 | 14,8,304,150,3672,11.5,73,1,amc matador 89 | 13,8,350,145,3988,13,73,1,chevrolet malibu 90 | 14,8,302,137,4042,14.5,73,1,ford gran torino 91 | 15,8,318,150,3777,12.5,73,1,dodge coronet custom 92 | 12,8,429,198,4952,11.5,73,1,mercury marquis brougham 93 | 13,8,400,150,4464,12,73,1,chevrolet caprice classic 94 | 13,8,351,158,4363,13,73,1,ford ltd 95 | 14,8,318,150,4237,14.5,73,1,plymouth fury gran sedan 96 | 13,8,440,215,4735,11,73,1,chrysler new yorker brougham 97 | 12,8,455,225,4951,11,73,1,buick electra 225 custom 98 | 13,8,360,175,3821,11,73,1,amc ambassador brougham 99 | 18,6,225,105,3121,16.5,73,1,plymouth valiant 100 | 16,6,250,100,3278,18,73,1,chevrolet nova custom 101 | 18,6,232,100,2945,16,73,1,amc hornet 102 | 18,6,250,88,3021,16.5,73,1,ford maverick 103 | 23,6,198,95,2904,16,73,1,plymouth duster 104 | 26,4,97,46,1950,21,73,2,volkswagen super beetle 105 | 11,8,400,150,4997,14,73,1,chevrolet impala 106 | 12,8,400,167,4906,12.5,73,1,ford country 107 | 13,8,360,170,4654,13,73,1,plymouth custom suburb 108 | 12,8,350,180,4499,12.5,73,1,oldsmobile vista cruiser 109 | 18,6,232,100,2789,15,73,1,amc gremlin 110 | 20,4,97,88,2279,19,73,3,toyota carina 111 
| 21,4,140,72,2401,19.5,73,1,chevrolet vega 112 | 22,4,108,94,2379,16.5,73,3,datsun 610 113 | 18,3,70,90,2124,13.5,73,3,maxda rx3 114 | 19,4,122,85,2310,18.5,73,1,ford pinto 115 | 21,6,155,107,2472,14,73,1,mercury capri v6 116 | 26,4,98,90,2265,15.5,73,2,fiat 124 sport coupe 117 | 15,8,350,145,4082,13,73,1,chevrolet monte carlo s 118 | 16,8,400,230,4278,9.5,73,1,pontiac grand prix 119 | 29,4,68,49,1867,19.5,73,2,fiat 128 120 | 24,4,116,75,2158,15.5,73,2,opel manta 121 | 20,4,114,91,2582,14,73,2,audi 100ls 122 | 19,4,121,112,2868,15.5,73,2,volvo 144ea 123 | 15,8,318,150,3399,11,73,1,dodge dart custom 124 | 24,4,121,110,2660,14,73,2,saab 99le 125 | 20,6,156,122,2807,13.5,73,3,toyota mark ii 126 | 11,8,350,180,3664,11,73,1,oldsmobile omega 127 | 20,6,198,95,3102,16.5,74,1,plymouth duster 128 | 21,6,200,?,2875,17,74,1,ford maverick 129 | 19,6,232,100,2901,16,74,1,amc hornet 130 | 15,6,250,100,3336,17,74,1,chevrolet nova 131 | 31,4,79,67,1950,19,74,3,datsun b210 132 | 26,4,122,80,2451,16.5,74,1,ford pinto 133 | 32,4,71,65,1836,21,74,3,toyota corolla 1200 134 | 25,4,140,75,2542,17,74,1,chevrolet vega 135 | 16,6,250,100,3781,17,74,1,chevrolet chevelle malibu classic 136 | 16,6,258,110,3632,18,74,1,amc matador 137 | 18,6,225,105,3613,16.5,74,1,plymouth satellite sebring 138 | 16,8,302,140,4141,14,74,1,ford gran torino 139 | 13,8,350,150,4699,14.5,74,1,buick century luxus (sw) 140 | 14,8,318,150,4457,13.5,74,1,dodge coronet custom (sw) 141 | 14,8,302,140,4638,16,74,1,ford gran torino (sw) 142 | 14,8,304,150,4257,15.5,74,1,amc matador (sw) 143 | 29,4,98,83,2219,16.5,74,2,audi fox 144 | 26,4,79,67,1963,15.5,74,2,volkswagen dasher 145 | 26,4,97,78,2300,14.5,74,2,opel manta 146 | 31,4,76,52,1649,16.5,74,3,toyota corona 147 | 32,4,83,61,2003,19,74,3,datsun 710 148 | 28,4,90,75,2125,14.5,74,1,dodge colt 149 | 24,4,90,75,2108,15.5,74,2,fiat 128 150 | 26,4,116,75,2246,14,74,2,fiat 124 tc 151 | 24,4,120,97,2489,15,74,3,honda civic 152 | 26,4,108,93,2391,15.5,74,3,subaru 153 | 
31,4,79,67,2000,16,74,2,fiat x1.9 154 | 19,6,225,95,3264,16,75,1,plymouth valiant custom 155 | 18,6,250,105,3459,16,75,1,chevrolet nova 156 | 15,6,250,72,3432,21,75,1,mercury monarch 157 | 15,6,250,72,3158,19.5,75,1,ford maverick 158 | 16,8,400,170,4668,11.5,75,1,pontiac catalina 159 | 15,8,350,145,4440,14,75,1,chevrolet bel air 160 | 16,8,318,150,4498,14.5,75,1,plymouth grand fury 161 | 14,8,351,148,4657,13.5,75,1,ford ltd 162 | 17,6,231,110,3907,21,75,1,buick century 163 | 16,6,250,105,3897,18.5,75,1,chevroelt chevelle malibu 164 | 15,6,258,110,3730,19,75,1,amc matador 165 | 18,6,225,95,3785,19,75,1,plymouth fury 166 | 21,6,231,110,3039,15,75,1,buick skyhawk 167 | 20,8,262,110,3221,13.5,75,1,chevrolet monza 2+2 168 | 13,8,302,129,3169,12,75,1,ford mustang ii 169 | 29,4,97,75,2171,16,75,3,toyota corolla 170 | 23,4,140,83,2639,17,75,1,ford pinto 171 | 20,6,232,100,2914,16,75,1,amc gremlin 172 | 23,4,140,78,2592,18.5,75,1,pontiac astro 173 | 24,4,134,96,2702,13.5,75,3,toyota corona 174 | 25,4,90,71,2223,16.5,75,2,volkswagen dasher 175 | 24,4,119,97,2545,17,75,3,datsun 710 176 | 18,6,171,97,2984,14.5,75,1,ford pinto 177 | 29,4,90,70,1937,14,75,2,volkswagen rabbit 178 | 19,6,232,90,3211,17,75,1,amc pacer 179 | 23,4,115,95,2694,15,75,2,audi 100ls 180 | 23,4,120,88,2957,17,75,2,peugeot 504 181 | 22,4,121,98,2945,14.5,75,2,volvo 244dl 182 | 25,4,121,115,2671,13.5,75,2,saab 99le 183 | 33,4,91,53,1795,17.5,75,3,honda civic cvcc 184 | 28,4,107,86,2464,15.5,76,2,fiat 131 185 | 25,4,116,81,2220,16.9,76,2,opel 1900 186 | 25,4,140,92,2572,14.9,76,1,capri ii 187 | 26,4,98,79,2255,17.7,76,1,dodge colt 188 | 27,4,101,83,2202,15.3,76,2,renault 12tl 189 | 17.5,8,305,140,4215,13,76,1,chevrolet chevelle malibu classic 190 | 16,8,318,150,4190,13,76,1,dodge coronet brougham 191 | 15.5,8,304,120,3962,13.9,76,1,amc matador 192 | 14.5,8,351,152,4215,12.8,76,1,ford gran torino 193 | 22,6,225,100,3233,15.4,76,1,plymouth valiant 194 | 22,6,250,105,3353,14.5,76,1,chevrolet nova 195 | 
24,6,200,81,3012,17.6,76,1,ford maverick 196 | 22.5,6,232,90,3085,17.6,76,1,amc hornet 197 | 29,4,85,52,2035,22.2,76,1,chevrolet chevette 198 | 24.5,4,98,60,2164,22.1,76,1,chevrolet woody 199 | 29,4,90,70,1937,14.2,76,2,vw rabbit 200 | 33,4,91,53,1795,17.4,76,3,honda civic 201 | 20,6,225,100,3651,17.7,76,1,dodge aspen se 202 | 18,6,250,78,3574,21,76,1,ford granada ghia 203 | 18.5,6,250,110,3645,16.2,76,1,pontiac ventura sj 204 | 17.5,6,258,95,3193,17.8,76,1,amc pacer d/l 205 | 29.5,4,97,71,1825,12.2,76,2,volkswagen rabbit 206 | 32,4,85,70,1990,17,76,3,datsun b-210 207 | 28,4,97,75,2155,16.4,76,3,toyota corolla 208 | 26.5,4,140,72,2565,13.6,76,1,ford pinto 209 | 20,4,130,102,3150,15.7,76,2,volvo 245 210 | 13,8,318,150,3940,13.2,76,1,plymouth volare premier v8 211 | 19,4,120,88,3270,21.9,76,2,peugeot 504 212 | 19,6,156,108,2930,15.5,76,3,toyota mark ii 213 | 16.5,6,168,120,3820,16.7,76,2,mercedes-benz 280s 214 | 16.5,8,350,180,4380,12.1,76,1,cadillac seville 215 | 13,8,350,145,4055,12,76,1,chevy c10 216 | 13,8,302,130,3870,15,76,1,ford f108 217 | 13,8,318,150,3755,14,76,1,dodge d100 218 | 31.5,4,98,68,2045,18.5,77,3,honda accord cvcc 219 | 30,4,111,80,2155,14.8,77,1,buick opel isuzu deluxe 220 | 36,4,79,58,1825,18.6,77,2,renault 5 gtl 221 | 25.5,4,122,96,2300,15.5,77,1,plymouth arrow gs 222 | 33.5,4,85,70,1945,16.8,77,3,datsun f-10 hatchback 223 | 17.5,8,305,145,3880,12.5,77,1,chevrolet caprice classic 224 | 17,8,260,110,4060,19,77,1,oldsmobile cutlass supreme 225 | 15.5,8,318,145,4140,13.7,77,1,dodge monaco brougham 226 | 15,8,302,130,4295,14.9,77,1,mercury cougar brougham 227 | 17.5,6,250,110,3520,16.4,77,1,chevrolet concours 228 | 20.5,6,231,105,3425,16.9,77,1,buick skylark 229 | 19,6,225,100,3630,17.7,77,1,plymouth volare custom 230 | 18.5,6,250,98,3525,19,77,1,ford granada 231 | 16,8,400,180,4220,11.1,77,1,pontiac grand prix lj 232 | 15.5,8,350,170,4165,11.4,77,1,chevrolet monte carlo landau 233 | 15.5,8,400,190,4325,12.2,77,1,chrysler cordoba 234 | 
16,8,351,149,4335,14.5,77,1,ford thunderbird 235 | 29,4,97,78,1940,14.5,77,2,volkswagen rabbit custom 236 | 24.5,4,151,88,2740,16,77,1,pontiac sunbird coupe 237 | 26,4,97,75,2265,18.2,77,3,toyota corolla liftback 238 | 25.5,4,140,89,2755,15.8,77,1,ford mustang ii 2+2 239 | 30.5,4,98,63,2051,17,77,1,chevrolet chevette 240 | 33.5,4,98,83,2075,15.9,77,1,dodge colt m/m 241 | 30,4,97,67,1985,16.4,77,3,subaru dl 242 | 30.5,4,97,78,2190,14.1,77,2,volkswagen dasher 243 | 22,6,146,97,2815,14.5,77,3,datsun 810 244 | 21.5,4,121,110,2600,12.8,77,2,bmw 320i 245 | 21.5,3,80,110,2720,13.5,77,3,mazda rx-4 246 | 43.1,4,90,48,1985,21.5,78,2,volkswagen rabbit custom diesel 247 | 36.1,4,98,66,1800,14.4,78,1,ford fiesta 248 | 32.8,4,78,52,1985,19.4,78,3,mazda glc deluxe 249 | 39.4,4,85,70,2070,18.6,78,3,datsun b210 gx 250 | 36.1,4,91,60,1800,16.4,78,3,honda civic cvcc 251 | 19.9,8,260,110,3365,15.5,78,1,oldsmobile cutlass salon brougham 252 | 19.4,8,318,140,3735,13.2,78,1,dodge diplomat 253 | 20.2,8,302,139,3570,12.8,78,1,mercury monarch ghia 254 | 19.2,6,231,105,3535,19.2,78,1,pontiac phoenix lj 255 | 20.5,6,200,95,3155,18.2,78,1,chevrolet malibu 256 | 20.2,6,200,85,2965,15.8,78,1,ford fairmont (auto) 257 | 25.1,4,140,88,2720,15.4,78,1,ford fairmont (man) 258 | 20.5,6,225,100,3430,17.2,78,1,plymouth volare 259 | 19.4,6,232,90,3210,17.2,78,1,amc concord 260 | 20.6,6,231,105,3380,15.8,78,1,buick century special 261 | 20.8,6,200,85,3070,16.7,78,1,mercury zephyr 262 | 18.6,6,225,110,3620,18.7,78,1,dodge aspen 263 | 18.1,6,258,120,3410,15.1,78,1,amc concord d/l 264 | 19.2,8,305,145,3425,13.2,78,1,chevrolet monte carlo landau 265 | 17.7,6,231,165,3445,13.4,78,1,buick regal sport coupe (turbo) 266 | 18.1,8,302,139,3205,11.2,78,1,ford futura 267 | 17.5,8,318,140,4080,13.7,78,1,dodge magnum xe 268 | 30,4,98,68,2155,16.5,78,1,chevrolet chevette 269 | 27.5,4,134,95,2560,14.2,78,3,toyota corona 270 | 27.2,4,119,97,2300,14.7,78,3,datsun 510 271 | 30.9,4,105,75,2230,14.5,78,1,dodge omni 272 | 
21.1,4,134,95,2515,14.8,78,3,toyota celica gt liftback 273 | 23.2,4,156,105,2745,16.7,78,1,plymouth sapporo 274 | 23.8,4,151,85,2855,17.6,78,1,oldsmobile starfire sx 275 | 23.9,4,119,97,2405,14.9,78,3,datsun 200-sx 276 | 20.3,5,131,103,2830,15.9,78,2,audi 5000 277 | 17,6,163,125,3140,13.6,78,2,volvo 264gl 278 | 21.6,4,121,115,2795,15.7,78,2,saab 99gle 279 | 16.2,6,163,133,3410,15.8,78,2,peugeot 604sl 280 | 31.5,4,89,71,1990,14.9,78,2,volkswagen scirocco 281 | 29.5,4,98,68,2135,16.6,78,3,honda accord lx 282 | 21.5,6,231,115,3245,15.4,79,1,pontiac lemans v6 283 | 19.8,6,200,85,2990,18.2,79,1,mercury zephyr 6 284 | 22.3,4,140,88,2890,17.3,79,1,ford fairmont 4 285 | 20.2,6,232,90,3265,18.2,79,1,amc concord dl 6 286 | 20.6,6,225,110,3360,16.6,79,1,dodge aspen 6 287 | 17,8,305,130,3840,15.4,79,1,chevrolet caprice classic 288 | 17.6,8,302,129,3725,13.4,79,1,ford ltd landau 289 | 16.5,8,351,138,3955,13.2,79,1,mercury grand marquis 290 | 18.2,8,318,135,3830,15.2,79,1,dodge st. regis 291 | 16.9,8,350,155,4360,14.9,79,1,buick estate wagon (sw) 292 | 15.5,8,351,142,4054,14.3,79,1,ford country squire (sw) 293 | 19.2,8,267,125,3605,15,79,1,chevrolet malibu classic (sw) 294 | 18.5,8,360,150,3940,13,79,1,chrysler lebaron town @ country (sw) 295 | 31.9,4,89,71,1925,14,79,2,vw rabbit custom 296 | 34.1,4,86,65,1975,15.2,79,3,maxda glc deluxe 297 | 35.7,4,98,80,1915,14.4,79,1,dodge colt hatchback custom 298 | 27.4,4,121,80,2670,15,79,1,amc spirit dl 299 | 25.4,5,183,77,3530,20.1,79,2,mercedes benz 300d 300 | 23,8,350,125,3900,17.4,79,1,cadillac eldorado 301 | 27.2,4,141,71,3190,24.8,79,2,peugeot 504 302 | 23.9,8,260,90,3420,22.2,79,1,oldsmobile cutlass salon brougham 303 | 34.2,4,105,70,2200,13.2,79,1,plymouth horizon 304 | 34.5,4,105,70,2150,14.9,79,1,plymouth horizon tc3 305 | 31.8,4,85,65,2020,19.2,79,3,datsun 210 306 | 37.3,4,91,69,2130,14.7,79,2,fiat strada custom 307 | 28.4,4,151,90,2670,16,79,1,buick skylark limited 308 | 28.8,6,173,115,2595,11.3,79,1,chevrolet citation 309 | 
26.8,6,173,115,2700,12.9,79,1,oldsmobile omega brougham 310 | 33.5,4,151,90,2556,13.2,79,1,pontiac phoenix 311 | 41.5,4,98,76,2144,14.7,80,2,vw rabbit 312 | 38.1,4,89,60,1968,18.8,80,3,toyota corolla tercel 313 | 32.1,4,98,70,2120,15.5,80,1,chevrolet chevette 314 | 37.2,4,86,65,2019,16.4,80,3,datsun 310 315 | 28,4,151,90,2678,16.5,80,1,chevrolet citation 316 | 26.4,4,140,88,2870,18.1,80,1,ford fairmont 317 | 24.3,4,151,90,3003,20.1,80,1,amc concord 318 | 19.1,6,225,90,3381,18.7,80,1,dodge aspen 319 | 34.3,4,97,78,2188,15.8,80,2,audi 4000 320 | 29.8,4,134,90,2711,15.5,80,3,toyota corona liftback 321 | 31.3,4,120,75,2542,17.5,80,3,mazda 626 322 | 37,4,119,92,2434,15,80,3,datsun 510 hatchback 323 | 32.2,4,108,75,2265,15.2,80,3,toyota corolla 324 | 46.6,4,86,65,2110,17.9,80,3,mazda glc 325 | 27.9,4,156,105,2800,14.4,80,1,dodge colt 326 | 40.8,4,85,65,2110,19.2,80,3,datsun 210 327 | 44.3,4,90,48,2085,21.7,80,2,vw rabbit c (diesel) 328 | 43.4,4,90,48,2335,23.7,80,2,vw dasher (diesel) 329 | 36.4,5,121,67,2950,19.9,80,2,audi 5000s (diesel) 330 | 30,4,146,67,3250,21.8,80,2,mercedes-benz 240d 331 | 44.6,4,91,67,1850,13.8,80,3,honda civic 1500 gl 332 | 40.9,4,85,?,1835,17.3,80,2,renault lecar deluxe 333 | 33.8,4,97,67,2145,18,80,3,subaru dl 334 | 29.8,4,89,62,1845,15.3,80,2,vokswagen rabbit 335 | 32.7,6,168,132,2910,11.4,80,3,datsun 280-zx 336 | 23.7,3,70,100,2420,12.5,80,3,mazda rx-7 gs 337 | 35,4,122,88,2500,15.1,80,2,triumph tr7 coupe 338 | 23.6,4,140,?,2905,14.3,80,1,ford mustang cobra 339 | 32.4,4,107,72,2290,17,80,3,honda accord 340 | 27.2,4,135,84,2490,15.7,81,1,plymouth reliant 341 | 26.6,4,151,84,2635,16.4,81,1,buick skylark 342 | 25.8,4,156,92,2620,14.4,81,1,dodge aries wagon (sw) 343 | 23.5,6,173,110,2725,12.6,81,1,chevrolet citation 344 | 30,4,135,84,2385,12.9,81,1,plymouth reliant 345 | 39.1,4,79,58,1755,16.9,81,3,toyota starlet 346 | 39,4,86,64,1875,16.4,81,1,plymouth champ 347 | 35.1,4,81,60,1760,16.1,81,3,honda civic 1300 348 | 
32.3,4,97,67,2065,17.8,81,3,subaru 349 | 37,4,85,65,1975,19.4,81,3,datsun 210 mpg 350 | 37.7,4,89,62,2050,17.3,81,3,toyota tercel 351 | 34.1,4,91,68,1985,16,81,3,mazda glc 4 352 | 34.7,4,105,63,2215,14.9,81,1,plymouth horizon 4 353 | 34.4,4,98,65,2045,16.2,81,1,ford escort 4w 354 | 29.9,4,98,65,2380,20.7,81,1,ford escort 2h 355 | 33,4,105,74,2190,14.2,81,2,volkswagen jetta 356 | 34.5,4,100,?,2320,15.8,81,2,renault 18i 357 | 33.7,4,107,75,2210,14.4,81,3,honda prelude 358 | 32.4,4,108,75,2350,16.8,81,3,toyota corolla 359 | 32.9,4,119,100,2615,14.8,81,3,datsun 200sx 360 | 31.6,4,120,74,2635,18.3,81,3,mazda 626 361 | 28.1,4,141,80,3230,20.4,81,2,peugeot 505s turbo diesel 362 | 30.7,6,145,76,3160,19.6,81,2,volvo diesel 363 | 25.4,6,168,116,2900,12.6,81,3,toyota cressida 364 | 24.2,6,146,120,2930,13.8,81,3,datsun 810 maxima 365 | 22.4,6,231,110,3415,15.8,81,1,buick century 366 | 26.6,8,350,105,3725,19,81,1,oldsmobile cutlass ls 367 | 20.2,6,200,88,3060,17.1,81,1,ford granada gl 368 | 17.6,6,225,85,3465,16.6,81,1,chrysler lebaron salon 369 | 28,4,112,88,2605,19.6,82,1,chevrolet cavalier 370 | 27,4,112,88,2640,18.6,82,1,chevrolet cavalier wagon 371 | 34,4,112,88,2395,18,82,1,chevrolet cavalier 2-door 372 | 31,4,112,85,2575,16.2,82,1,pontiac j2000 se hatchback 373 | 29,4,135,84,2525,16,82,1,dodge aries se 374 | 27,4,151,90,2735,18,82,1,pontiac phoenix 375 | 24,4,140,92,2865,16.4,82,1,ford fairmont futura 376 | 23,4,151,?,3035,20.5,82,1,amc concord dl 377 | 36,4,105,74,1980,15.3,82,2,volkswagen rabbit l 378 | 37,4,91,68,2025,18.2,82,3,mazda glc custom l 379 | 31,4,91,68,1970,17.6,82,3,mazda glc custom 380 | 38,4,105,63,2125,14.7,82,1,plymouth horizon miser 381 | 36,4,98,70,2125,17.3,82,1,mercury lynx l 382 | 36,4,120,88,2160,14.5,82,3,nissan stanza xe 383 | 36,4,107,75,2205,14.5,82,3,honda accord 384 | 34,4,108,70,2245,16.9,82,3,toyota corolla 385 | 38,4,91,67,1965,15,82,3,honda civic 386 | 32,4,91,67,1965,15.7,82,3,honda civic (auto) 387 | 38,4,91,67,1995,16.2,82,3,datsun 
310 gx 388 | 25,6,181,110,2945,16.4,82,1,buick century limited 389 | 38,6,262,85,3015,17,82,1,oldsmobile cutlass ciera (diesel) 390 | 26,4,156,92,2585,14.5,82,1,chrysler lebaron medallion 391 | 22,6,232,112,2835,14.7,82,1,ford granada l 392 | 32,4,144,96,2665,13.9,82,3,toyota celica gt 393 | 36,4,135,84,2370,13,82,1,dodge charger 2.2 394 | 27,4,151,90,2950,17.3,82,1,chevrolet camaro 395 | 27,4,140,86,2790,15.6,82,1,ford mustang gl 396 | 44,4,97,52,2130,24.6,82,2,vw pickup 397 | 32,4,135,84,2295,11.6,82,1,dodge rampage 398 | 28,4,120,79,2625,18.6,82,1,ford ranger 399 | 31,4,119,82,2720,19.4,82,1,chevy s-10 400 | -------------------------------------------------------------------------------- /03-Medical-Appointment-No-Show/README.md: -------------------------------------------------------------------------------- 1 | # **Medical Appointment No Show Dataset Analysis** 2 | 3 | | Contents | 4 | | -------- | 5 | | [Dataset Description](#Dataset-Description) | 6 | | [Columns Description](#Columns-Description) | 7 | | [EDA Questions](#eda-questions) | 8 | | [Data Wrangling](#Data-Wrangling) | 9 | | [Data Cleaning](#Data-Cleaning) | 10 | | [Data Visualization](#Data-Visualization) | 11 | | [Conclusion](#Conclusion) | 12 | | [Built with](#Built-with) | 13 | 14 | ## Dataset Description: 15 | A person makes a doctor's appointment, receives all the instructions, and then does not show up. Who is to blame? 16 | This dataset collects information from 100k medical appointments in Brazil and is focused on the question of whether or not patients show up for their appointment. A number of characteristics about the patient are included in each row. 17 | 18 | ## Columns Description: 19 | 1. `PatientId`: Identification of a patient. 20 | 2. `AppointmentID`: Identification of each appointment. 21 | 3. `Gender`: Male or Female. 22 | 4. `AppointmentDay`: The day of the actual appointment, when they have to visit the doctor. 23 | 5. 
`ScheduledDay`: The day someone called or registered the appointment; this is, of course, before the appointment. 24 | 6. `Age`: How old the patient is. 25 | 7. `Neighbourhood`: Where the appointment takes place. 26 | 8. `Scholarship`: True or False. Note: this is a broad topic, consider reading this article https://en.wikipedia.org/wiki/Bolsa_Fam%C3%ADlia 27 | 9. `Hipertension`: True or False. 28 | 10. `Diabetes`: True or False. 29 | 11. `Alcoholism`: True or False. 30 | 12. `Handcap`: True or False. 31 | 13. `SMS_received`: 1 or more messages sent to the patient. 32 | 14. `No-show`: True or False. 33 | 34 | ## EDA Questions: 35 | - Q1: How often do men go to hospitals compared to women? Which of them is more likely to show up? 36 | - Q2: Does receiving an SMS as a reminder affect whether or not a patient may show up? Is it correlated with the number of days before the appointment? 37 | - Q3: Does having a scholarship affect showing up to a hospital appointment? What are the age groups affected by this? 38 | - Q4: Does having certain diseases affect whether or not a patient may show up to their appointment? Is it affected by gender? 39 | 40 | ## Data Wrangling: 41 | Our data can be found in the `noshowappointments-kagglev2-may-2016.csv` file provided in this repository, downloaded from [Kaggle](https://www.kaggle.com/datasets/joniarroba/noshowappointments). 42 | 43 | ## Data Cleaning: 44 | ### Exploration Summary 45 | 1. our dataset consists of 110527 rows with 14 columns, and has no NaNs or duplicated values. 46 | 2. `PatientId` and `AppointmentID` columns wouldn't be helpful during analysis. 47 | 3. `ScheduledDay` and `AppointmentDay` need to be cast to a date data type. 48 | 4. we may append a new column for days until appointment. 49 | 5. `Gender` needs to be cast to a category type. 50 | 6. `Scholarship`, `Hipertension`, `Diabetes`, `Alcoholism` and `SMS_received` are better stored as a boolean data type. 51 | 7. 
`No-show` column needs to be parsed and cast to a boolean type. 52 | 8. `Handcap` column needs to be cleaned to have only `0` and `1` values. 53 | 9. `Age` column has inconsistent unique values that need to be handled. 54 | 55 | We ended up with a dataframe of 110521 rows and 11 columns after completing the cleaning process. 56 | 57 | ## Data Visualization 58 | Using `Matplotlib` and `Seaborn`, we made several visuals and charts to help us gain insights into any correlation between attributes in our dataset; these are discussed in the next section. 59 | 60 | ## Conclusion 61 | These are the conclusions derived after completing our data visualisation phase. 62 | 63 | ### Q1: How often do men go to hospitals compared to women? Which of them is more likely to show up? 64 | - Nearly half of our dataset consists of women, with a wider age distribution and some outliers, all of which achieve a rate higher than men. 65 | - 79.8% of our patients did show up to their appointments and only 20.1% of them did not. 66 | - Women show up to their appointments more often than men do, but this may be affected by the percentage of women in this dataset. 67 | ___ 68 | ### Q2: Does receiving an SMS as a reminder affect whether or not a patient may show up? Is it correlated with the number of days before the appointment? 69 | - 67.8% of our patients did not receive any SMS reminder of their appointments, yet they showed up to their appointments. 70 | - There is a positive correlation between the number of due days and whether a patient shows up or not. 71 | - Patients with appointments from 0 to 30 days away tend to show up more regularly, while patients with a higher number of days tend not to show up. 72 | - Gender does not affect the number of due days and showing up at an appointment that much. 73 | ___ 74 | ### Q3: Does having a scholarship affect showing up to a hospital appointment? What are the age groups affected by this? 
75 | - Having a scholarship does not affect showing up to a doctor's appointment that much. 76 | - A wide range of age groups is enrolled in the scholarship, and enrollees also enroll their babies. 77 | ___ 78 | ### Q4: Does having certain diseases affect whether or not a patient may show up to their appointment? Is it affected by gender? 79 | - We can conclude that the vast majority of our dataset does not have chronic diseases, yet they exist in many young people. 80 | - Having a chronic disease may affect whether a patient shows up to a hospital appointment. 81 | 82 | ## Built with: 83 | - JupyterLab 84 | - Python3 85 | - Pandas 86 | - Numpy 87 | - Matplotlib 88 | - Seaborn 89 | -------------------------------------------------------------------------------- /04-9000+-Movies-Dataset-Analysis/README.md: -------------------------------------------------------------------------------- 1 | # **9000+ Movies Dataset Analysis** 2 | 3 | | Contents | 4 | | -------- | 5 | | [Dataset Description](#Dataset-Description) | 6 | | [Columns Description](#Columns-Description) | 7 | | [EDA Questions](#eda-questions) | 8 | | [Data Wrangling](#Data-Wrangling) | 9 | | [Data Cleaning](#Data-Cleaning) | 10 | | [Data Visualization](#Data-Visualization) | 11 | | [Conclusion](#Conclusion) | 12 | | [Built with](#Built-with) | 13 | 14 | ## Dataset Description: 15 | This data set contains information about 9000+ movies extracted from the TMDB API. 16 | 17 | ## Columns Description: 18 | 1. `Release_Date`: Date when the movie was released. 19 | 2. `Title`: Name of the movie. 20 | 3. `Overview`: Brief summary of the movie. 21 | 4. `Popularity`: A very important metric computed by TMDB developers based on the number of views per day, votes per day, the number of users who marked the movie as "favorite" or added it to their "watchlist", release date and other metrics. 22 | 5. `Vote_Count`: Total votes received from the viewers. 23 | 6. 
`Vote_Average`: Average rating out of 10, based on vote count and the number of viewers. 24 | 7. `Original_Language`: Original language of the movie. A dubbed version is not considered to be the original language. 25 | 8. `Genre`: Categories the movie can be classified as. 26 | 9. `Poster_Url`: URL of the movie poster. 27 | 28 | ## EDA Questions: 29 | - Q1: What is the most frequent `genre` in the dataset? 30 | - Q2: Which `genres` have the highest `votes`? 31 | - Q3: Which movie got the highest `popularity`? What is its `genre`? 32 | - Q4: Which year has the most filmed movies? 33 | 34 | ## Data Wrangling: 35 | Our data can be found in the `mymoviedb.csv` file provided in this repository, downloaded from [Kaggle](https://www.kaggle.com/datasets/disham993/9000-movies-dataset). 36 | 37 | ## Data Cleaning: 38 | ### Exploration Summary 39 | - we have a dataframe consisting of 9827 rows and 9 columns. 40 | - our dataset looks fairly tidy, with no NaNs or duplicated values. 41 | - `Release_Date` column needs to be cast to datetime, keeping only the year value. 42 | - `Overview`, `Original_Language` and `Poster_Url` wouldn't be very useful during analysis, so we'll drop them. 43 | - there are noticeable outliers in the `Popularity` column. 44 | - `Vote_Average` is better categorised for proper analysis. 45 | - `Genre` column has comma-separated values and white spaces that need to be handled, and it needs to be cast to a category type. 46 | 47 | We ended up with a dataframe of 6 columns and 25551 rows to dig into during our analysis after completing the cleaning process. 48 | 49 | ## Data Visualization 50 | Using `Matplotlib` and `Seaborn`, we made several visuals and charts to help us gain insights into any correlation between attributes in our dataset; these are discussed in the next section. 51 | 52 | ## Conclusion 53 | These are the conclusions derived after completing our data visualisation phase. 54 | 55 | ### Q1: What is the most frequent `genre` in the dataset? 
56 | `Drama` is the most frequent genre in our dataset, appearing more than 14% of the time among 19 other genres. 57 | 58 | ### Q2: Which `genres` have the highest `votes`? 59 | 25.5% of our dataset (6520 rows) falls in the popular vote category. 60 | `Drama` again ranks highest among fans, accounting for more than 18.5% of the popular movies. 61 | 62 | ### Q3: Which movie got the highest `popularity`? What is its `genre`? 63 | `Spider-Man: No Way Home` has the highest popularity rate in our dataset and it has the genres `Action`, `Adventure` and `Science Fiction`. 64 | 65 | ### Q4: Which year has the most filmed movies? 66 | Year `2020` has the highest filming rate in our dataset. 67 | 68 | ## Built with: 69 | - JupyterLab 70 | - Python3 71 | - Pandas 72 | - Numpy 73 | - Matplotlib 74 | - Seaborn 75 | -------------------------------------------------------------------------------- /05-Wine-Quality-Dataset/README.md: -------------------------------------------------------------------------------- 1 | # **Wine Quality Dataset Analysis and EDA** 2 | 3 | | Contents | 4 | | -------- | 5 | | [Dataset Description](#Dataset-Description) | 6 | | [Columns Description](#Columns-Description) | 7 | | [EDA Questions](#eda-questions) | 8 | | [Data Wrangling](#Data-Wrangling) | 9 | | [Data Cleaning](#Data-Cleaning) | 10 | | [Data Visualization](#Data-Visualization) | 11 | | [Conclusion](#Conclusion) | 12 | | [Built with](#Built-with) | 13 | 14 | ## Dataset Description: 15 | There are two datasets that provide information on samples of red and white variants of the Portuguese "Vinho Verde" wine. 16 | Each sample of wine was rated for quality by wine experts and examined with physicochemical tests. Due to privacy and logistic issues, 17 | only data on these physicochemical properties and quality ratings are available (e.g. there is no data about grape types, wine brand, wine selling price, etc.). 
18 | The data is originally from the [UCI Machine Learning Repository](https://archive.ics.uci.edu/ml/datasets/Wine+Quality). 19 | 20 | 21 | ## Columns Description: 22 | 1. `fixed acidity`: most acids involved with wine are fixed or nonvolatile (do not evaporate readily) 23 | 2. `volatile acidity`: the amount of acetic acid in wine, which at too high of levels can lead to an unpleasant, vinegar taste 24 | 3. `citric acid`: found in small quantities, citric acid can add 'freshness' and flavor to wines 25 | 4. `residual sugar`: the amount of sugar remaining after fermentation stops, it's rare to find wines with less than 1 gram/liter and wines with greater than 45 grams/liter are considered sweet 26 | 5. `chlorides`: the amount of salt in the wine 27 | 6. `free sulfur dioxide`: the free form of SO2 exists in equilibrium between molecular SO2 (as a dissolved gas) and bisulfite ion; it prevents microbial growth and the oxidation of wine 28 | 7. `total sulfur dioxide`: amount of free and bound forms of SO2; in low concentrations, SO2 is mostly undetectable in wine, but at free SO2 concentrations over 50 ppm, SO2 becomes evident in the nose and taste of wine 29 | 8. `density`: the density of wine is close to that of water depending on the percent alcohol and sugar content 30 | 9. `pH`: describes how acidic or basic a wine is on a scale from 0 (very acidic) to 14 (very basic); most wines are between 3-4 on the pH scale 31 | 10. `sulphates`: a wine additive which can contribute to sulfur dioxide gas (SO2) levels, which acts as an antimicrobial and antioxidant 32 | 11. `alcohol`: the percent alcohol content of the wine 33 | 12. `quality`: (score between 0 and 10) 34 | 35 | 36 | ## EDA Questions: 37 | - Q1: What chemical characteristics are most important in predicting the quality of wine? 38 | - Q2: Is a certain type of wine (red or white) associated with higher quality? 39 | - Q3: Do wines with higher alcoholic content receive better ratings? 
40 | - Q4: Do sweeter wines (more residual sugar) receive better ratings? 41 | - Q5: What level of acidity (pH) is associated with the highest quality? 42 | 43 | 44 | ## Data Wrangling: 45 | Our data can be found in the `wineQualityReds.csv` and `wineQualityWhites.csv` files provided in this repository, 46 | downloaded from [Kaggle](https://www.kaggle.com/datasets/danielpanizzo/wine-quality) 47 | and originally from the [UCI Machine Learning Repository](https://archive.ics.uci.edu/ml/datasets/Wine+Quality). 48 | 49 | 50 | ## Data Cleaning: 51 | ### Exploration Summary 52 | - the red dataframe consists of 1599 records and 13 attributes, while the white dataframe consists of 4898 records and the same attributes. 53 | - both dataframes have no NaNs or duplicated values. 54 | - we would combine the two dataframes and append a new categorical column to indicate the wine color for better analysis. 55 | - column data types are consistent. 56 | - `Unnamed: 0` column would be dropped. 57 | 58 | We ended up with 13 columns and 6497 rows for our data to begin the analysis with. 59 | A new csv file containing our full data is saved in `wine_full.csv`. 60 | 61 | 62 | ## Data Visualization 63 | Using `Matplotlib` and `Seaborn`, we made several visuals and charts to help us gain insights into any correlation between attributes in our dataset; these are discussed in the next section. 64 | 65 | 66 | ## Conclusion 67 | These are the conclusions derived after completing our data visualisation phase. 68 | 69 | ### Q1: What chemical characteristics are most important in predicting the quality of wine? 70 | - the vast majority of the wine has a `quality` of 6, while very few wines have a `quality` of 9. 71 | - using a correlation plot, we can easily see whether certain attributes are correlated more strongly with wine `quality` than others. 
72 | 73 | - strongly correlated attributes: 74 | - `alcohol` and `quality`, and it's clear that this is the strongest relation that affects wine `quality`. 75 | - weakly correlated attributes (do not depend on each other): 76 | - `density` and `alcohol`. 77 | - `free.sulfur.dioxide` and `citric.acid` have almost no correlation with `quality`. 78 | - `density` has a strong positive correlation with `residual.sugar` and a strong negative correlation with `alcohol`. 79 | 80 | --- 81 | ### Q2: Is a certain type of wine (red or white) associated with higher quality? 82 | - there is a noticeable difference between `white` and `red` wine counts. 83 | - `white` wine forms the vast majority of our dataset, as it makes up more than 75% of the records. 84 | - most of the `white` wine has a `quality` of 6, while most of the `red` wine has a `quality` of 5. 85 | - the mean `quality` of `red` and `white` wine are very close. 86 | - `white` wine has a higher mean `quality` than `red` wine. 87 | 88 | --- 89 | ### Q3: Do wines with higher alcoholic content receive better ratings? 90 | - the highest `alcohol` content in our dataset is 14.9. 91 | - most of the wine has an `alcohol` content around 10.4. 92 | - most of our dataset that has a `quality` of 6 appears to have relatively low `alcohol` content, but it's still above the mean. 93 | - high `alcohol` content only appears in our dataset with high `quality` wine. 94 | 95 | --- 96 | ### Q4: Do sweeter wines (more residual sugar) receive better ratings? 97 | - we can see that the highest `sugar` content is tied to a `quality` of 5, while lower `sugar` content appears to have correspondingly higher `quality`. 98 | 99 | --- 100 | ### Q5: What level of acidity (pH) is associated with the highest quality? 101 | - most of the wine in our dataset has a high `acidity level`. 102 | - all four acidity levels have a close mean `quality`, but the `Low acidity` level has the highest `quality` in our dataset. 
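The correlation comparison behind the Q1 conclusions above can be sketched roughly as follows. This is only an illustration and not the notebook's actual code: the helper names `pearson` and `rank_by_correlation` are hypothetical, the toy values are made up rather than taken from `wine_full.csv`, and the notebook most likely relies on Pandas and Seaborn instead of a hand-written Pearson coefficient.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rank_by_correlation(columns, target):
    """Return (column name, r) pairs sorted by |r| with the target, strongest first."""
    scores = [(name, pearson(values, target)) for name, values in columns.items()]
    return sorted(scores, key=lambda pair: abs(pair[1]), reverse=True)


# Made-up toy values, for illustration only (not real rows from the dataset).
toy_quality = [5, 6, 7, 8]
toy_columns = {
    "alcohol": [9.0, 10.0, 11.0, 12.0],
    "density": [0.998, 0.996, 0.994, 0.992],
    "residual.sugar": [1.0, 3.0, 3.0, 1.0],
}

for name, r in rank_by_correlation(toy_columns, toy_quality):
    print(f"{name}: r = {r:+.2f}")
```

Sorting by the absolute value of the coefficient keeps strong negative relations (such as the one reported between `density` and `alcohol`) next to strong positive ones instead of pushing them to the bottom of the list.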
103 | 104 | 105 | ## Built with: 106 | - JupyterLab 107 | - Python3 108 | - Pandas 109 | - Numpy 110 | - Matplotlib 111 | - Seaborn 112 | -------------------------------------------------------------------------------- /06-Query-a-Digital-Music-Store-Database/Chinook-SQL-Project-Report.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/06-Query-a-Digital-Music-Store-Database/Chinook-SQL-Project-Report.pdf -------------------------------------------------------------------------------- /06-Query-a-Digital-Music-Store-Database/Chinook-SQL-Queries.sql: -------------------------------------------------------------------------------- 1 | /* Question 1: Who is the top customer? */ 2 | SELECT (Customer.FirstName || ' ' || Customer.LastName) AS customer_name, 3 | SUM(Invoice.Total) AS total_spending 4 | FROM Customer 5 | JOIN Invoice 6 | ON Customer.CustomerId = Invoice.CustomerId 7 | GROUP BY customer_name 8 | ORDER BY total_spending DESC 9 | LIMIT 5; 10 | 11 | /* Question 2: Who is the best selling artist? */ 12 | SELECT Artist.ArtistId, Artist.Name AS artist_name, 13 | SUM(InvoiceLine.UnitPrice * InvoiceLine.Quantity) AS total_sales 14 | FROM InvoiceLine 15 | JOIN Track 16 | ON Track.TrackId = InvoiceLine.TrackId 17 | JOIN Album 18 | ON Album.AlbumId = Track.AlbumId 19 | JOIN Artist 20 | ON Artist.ArtistId = Album.ArtistId 21 | GROUP BY Artist.ArtistId 22 | ORDER BY total_sales DESC 23 | LIMIT 5; 24 | 25 | /* Question 3: What is the most frequent genre? */ 26 | SELECT Genre.Name AS genre_name, COUNT(Track.GenreId) AS Total 27 | FROM Genre 28 | JOIN Track 29 | ON Genre.GenreId = Track.GenreId 30 | GROUP BY Genre.Name 31 | ORDER BY COUNT(Track.GenreId) DESC 32 | LIMIT 5; 33 | 34 | /* Question 4: Who is the top rock artist? 
*/ 35 | SELECT Artist.ArtistId, Artist.Name AS artist_name, 36 | COUNT(Genre.Name) AS rock_songs_count 37 | FROM Artist 38 | JOIN Album 39 | ON Artist.ArtistId = Album.ArtistId 40 | JOIN Track 41 | ON Album.AlbumId = Track.AlbumId 42 | JOIN Genre 43 | ON Track.GenreId = Genre.GenreId 44 | WHERE Genre.Name LIKE 'Rock' 45 | GROUP BY Artist.ArtistId 46 | ORDER BY COUNT(Genre.Name) DESC 47 | LIMIT 10; -------------------------------------------------------------------------------- /06-Query-a-Digital-Music-Store-Database/README.md: -------------------------------------------------------------------------------- 1 | # **Query a Digital Music Store Database** 2 | 3 | | Contents | 4 | | -------- | 5 | | [Project Description](#Project-Description) | 6 | | [Database ERD](#Database-ERD) | 7 | | [Questions](#questions) | 8 | | [Conclusion](#Conclusion) | 9 | | [Built with](#Built-with) | 10 | 11 | ## Project Description: 12 | This project is part of Udacity's Data Analysis Nanodegree Program. In this project, we will query the Chinook Database for a digital music store. 13 | We will be assisting the Chinook team with understanding the media in their store, their customers and employees, 14 | and their invoice information. The schema for the Chinook Database is provided below. 15 | 16 | 17 | ## Database ERD: 18 | ![Chinook Music Store ERD](https://video.udacity-data.com/topher/2017/June/5956d5ee_screen-shot-2017-06-29-at-10.51.15-pm/screen-shot-2017-06-29-at-10.51.15-pm.png) 19 | 20 | 21 | ## Questions: 22 | - Q1: Who is the top customer at Chinook? 23 | - Q2: Who is the best selling artist? 24 | - Q3: What is the most frequent genre in the database? 25 | - Q4: Who is the top rock artist in the database? 26 | 27 | 28 | ## Conclusion 29 | These are the conclusions derived after querying the Chinook database. 30 | 31 | ### Q1: Who is the top customer at Chinook? 32 | Helena Holy is the top customer at Chinook by total spending. 
33 | ![Top 5 Customers](https://github.com/xShaimaa/Data-Analysis-Projects/blob/main/06-Query-a-Digital-Music-Store-Database/img/q1.png) 34 | 35 | ### Q2: Who is the best selling artist? 36 | Iron Maiden is the best selling artist at Chinook. 37 | ![Top 5 Best Selling Artists](https://github.com/xShaimaa/Data-Analysis-Projects/blob/main/06-Query-a-Digital-Music-Store-Database/img/q2.png) 38 | 39 | ### Q3: What is the most frequent genre in the database? 40 | Rock is the most frequent genre in the Chinook database. 41 | ![Most Frequent Genres](https://github.com/xShaimaa/Data-Analysis-Projects/blob/main/06-Query-a-Digital-Music-Store-Database/img/q3.png) 42 | 43 | ### Q4: Who is the top rock artist in the database? 44 | Led Zeppelin is the top rock artist in the Chinook database. 45 | ![Top 10 Rock Artists](https://github.com/xShaimaa/Data-Analysis-Projects/blob/main/06-Query-a-Digital-Music-Store-Database/img/q4.png) 46 | 47 | 48 | ## Built with: 49 | - DB Browser for SQLite 50 | - Python3 51 | - Pandas 52 | - Matplotlib 53 | - Seaborn 54 | - Google Slides 55 | -------------------------------------------------------------------------------- /06-Query-a-Digital-Music-Store-Database/img/q1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/06-Query-a-Digital-Music-Store-Database/img/q1.png -------------------------------------------------------------------------------- /06-Query-a-Digital-Music-Store-Database/img/q2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/06-Query-a-Digital-Music-Store-Database/img/q2.png -------------------------------------------------------------------------------- /06-Query-a-Digital-Music-Store-Database/img/q3.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/06-Query-a-Digital-Music-Store-Database/img/q3.png -------------------------------------------------------------------------------- /06-Query-a-Digital-Music-Store-Database/img/q4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/06-Query-a-Digital-Music-Store-Database/img/q4.png -------------------------------------------------------------------------------- /07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/README.md: -------------------------------------------------------------------------------- 1 | # **Create a Data Model for Seven Sages Brewing Company** 2 | 3 | Project 1 of Udacity's [Data Analysis and Visualization with Microsoft Power BI Nanodegree Program](https://www.udacity.com/course/data-analysis-and-visualization-with-power-BI-nanodegree--nd331) 4 | in the **Introduction to Preparing and Modeling Data** course. 5 | 6 | ## Project Description: 7 | The mission is to tame the datasets and create an efficient data model for a small brewing company that will help them better understand 8 | what products are popular and profitable so they can make smart decisions about what products to prioritize as the company continues to grow. 9 | The project demonstrates an understanding of core data modeling principles, including the ability to clean, organize and structure data in Power Query, 10 | to make a date table, to build a data model with the appropriate relationships and filters, and to create a simple report 11 | using common visualizations and DAX measures. 12 | 13 | Below is a quick walkthrough of the project steps. 
14 |
15 | ### Get Data:
16 | The files used are `CFO Metrics Tracker.xlsx`, `Customer List (as of FY2021).txt`,
17 | `SSBC Product Offerings.pdf`, `USD-CAD Exchange Rates.csv`, and
18 | `Monthly Sales Logs/`. They were downloaded from Udacity and can be found in the `Source Files/` folder in this repo.
19 |
20 |
21 | ### ETL with Power Query:
22 | We used Power Query to clean and pre-process the datasets. This included:
23 | - Combining the 12 monthly sales files into a single `Full 2021 Sales` query for easier analysis.
24 | - Merging `Customer List (as of FY2021).txt` and `SSBC Product Offerings.pdf` into the `Product_CP` query to include all product-relevant attributes.
25 | - Promoting first rows to headers.
26 | - Removing NULL values in all datasets.
27 | - Renaming queries and columns with descriptive names.
28 | - Changing column data types to suitable ones.
29 | - Building a dynamic date table, described in the next section.
30 |
31 |
32 | ### Creating Date Table:
33 | A date table was created in Power Query and is set to update dynamically based on the fact table’s start and end dates.
34 | The date table includes standard fields:
35 |
36 | - Calendar month name and number
37 | - Calendar year
38 | - Fiscal period
39 | - Fiscal year
40 | - Fiscal quarter, formatted as Quarter - FY (e.g., Q1 - FY2021)
41 |
42 | > Note: Seven Sages' fiscal year begins on October 1st and runs until September 30th. A transaction on Sept 20th 2020 would fall in FY 2020, but a transaction on October 20th 2020 would land in FY 2021.
43 |
44 |
45 | ### Create Data Model (build relationships between tables):
46 | We ended up with one fact table, `Full 2021 Sales`, and four dimension tables, each connected to it through an active one-to-many relationship.
47 | A snapshot of the data model is provided below and can be found in `SSBC-Data-Model.png` in this repo.
48 |
49 | ![SSBC Data Model](https://github.com/xShaimaa/Udacity-Data-Analysis-and-Viz-with-Microsoft-Power-BI/blob/master/01-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/SSBC-Data-Model.png)
50 |
51 |
52 | ### Writing DAX Measures:
53 | To satisfy the CFO's requirements, we needed to write six measures to calculate Sales,
54 | Cost of Sales and Gross Profit Margin in two different currencies.
55 | The following measures were created using DAX, are present in the data model, and are clearly labeled:
56 |
57 | - Sales in USD ($)
58 | - Cost of Sales USD ($)
59 | - Gross Profit Margin (or GPM) in USD (%)
60 | - Sales in CAD ($)
61 | - Unit Sales by Product (%)
62 | - Share of gross profit by Product type (%)
63 |
64 |
65 | ### Build a Report
66 | To satisfy the CFO's requirements, our basic version of the report has two tabs. The first, labeled `Sales and GPM`, summarizes sales by customer and customer type across quarters.
67 | The second, labeled `Gross Profit and Unit Sales`, summarizes the percentages of gross profit and unit sales by product.
68 | Both tabs have a very brief executive summary at the bottom.
69 | The full PDF report can be found in the `SSBC-Report.pdf` file provided in this repo.
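
The fiscal calendar rule and the currency and margin measures described in the sections above were built in Power Query and DAX. The Python sketch below is not the project's code; it only illustrates the arithmetic behind them. All function names are hypothetical. It assumes that fiscal quarters are consecutive three-month blocks starting in October, that the `CAD` column of `USD-CAD Exchange Rates.csv` is the number of Canadian dollars per US dollar, and that gross profit margin is defined as (sales - cost of sales) / sales.

```python
from datetime import date

# Fiscal year starts on October 1st (stated in the README note).
FISCAL_YEAR_START_MONTH = 10


def fiscal_year(d: date) -> int:
    """Return the fiscal year, named after the calendar year in which it ends."""
    return d.year + 1 if d.month >= FISCAL_YEAR_START_MONTH else d.year


def fiscal_quarter(d: date) -> int:
    """Assumed mapping: Q1 = Oct-Dec, Q2 = Jan-Mar, Q3 = Apr-Jun, Q4 = Jul-Sep."""
    months_into_fiscal_year = (d.month - FISCAL_YEAR_START_MONTH) % 12
    return months_into_fiscal_year // 3 + 1


def fiscal_quarter_label(d: date) -> str:
    """Format the quarter the way the date table does, e.g. 'Q1 - FY2021'."""
    return f"Q{fiscal_quarter(d)} - FY{fiscal_year(d)}"


def to_cad(amount_usd: float, cad_per_usd: float) -> float:
    """Convert a USD amount using a CAD-per-USD rate (assumed meaning of the CSV column)."""
    return amount_usd * cad_per_usd


def gross_profit_margin(sales: float, cost_of_sales: float) -> float:
    """Assumed definition: share of sales left after cost of sales."""
    return (sales - cost_of_sales) / sales


if __name__ == "__main__":
    # The two dates from the README note.
    print(fiscal_quarter_label(date(2020, 9, 20)))   # Q4 - FY2020
    print(fiscal_quarter_label(date(2020, 10, 20)))  # Q1 - FY2021
    # 1.352 is the rate listed for 10/1/2020 in the exchange-rate file.
    print(round(to_cad(100.0, 1.352), 2))
    print(gross_profit_margin(200.0, 150.0))         # 0.25
```

In the report itself the conversion would use the rate matching each transaction date rather than a single fixed rate; the fixed rate here only keeps the sketch short.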
70 | 71 | ![SSBC Report Tab 1](https://github.com/xShaimaa/Udacity-Data-Analysis-and-Viz-with-Microsoft-Power-BI/blob/master/01-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/SSBC-Report-Tab1.jpg) 72 | 73 | 74 | ![SSBC Report Tab 2](https://github.com/xShaimaa/Udacity-Data-Analysis-and-Viz-with-Microsoft-Power-BI/blob/master/01-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/SSBC-Report-Tab2.jpg) 75 | -------------------------------------------------------------------------------- /07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/SSBC-Data-Model.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/SSBC-Data-Model.png -------------------------------------------------------------------------------- /07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/SSBC-Project.pbix: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/SSBC-Project.pbix -------------------------------------------------------------------------------- /07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/SSBC-Report-Tab1.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/SSBC-Report-Tab1.jpg -------------------------------------------------------------------------------- /07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/SSBC-Report-Tab2.jpg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/SSBC-Report-Tab2.jpg -------------------------------------------------------------------------------- /07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/SSBC-Report.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/SSBC-Report.pdf -------------------------------------------------------------------------------- /07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/Source Files/CFO Metrics Tracker.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/Source Files/CFO Metrics Tracker.xlsx -------------------------------------------------------------------------------- /07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/Source Files/Customer List (as of FY2021).txt: -------------------------------------------------------------------------------- 1 | PKCustomerID Customer CustType City State/Province Country 2 | BC100 Saanich Pub Bar Victoria British Columbia Canada 3 | BC101 Toques and Blokes Bar Vancouver British Columbia Canada 4 | BC102 Queens Arms Bar Victoria British Columbia Canada 5 | BC103 McDougal's Distributor Vancouver British Columbia Canada 6 | BC105 The Black Bear Bar Victoria British Columbia Canarda 7 | BC109 Ale's What Cures You Distributor Victoria British Columbia Canada 8 | BC110 Barrel's Best Distributor Victoria British Columbia Canada 9 | BC111 Tequila Mockinbird Bar Surry British Columbia Canada 10 | WA001 Cash Purchase SSBC Tsting Room Redmond Washington United States 11 | WA003 Beergatory 
Bar Tacoma Washington United States 12 | WA004 Moe's Bar Tacoma Washington United States 13 | WA005 The Killer Well Barn Seattle Washington United States 14 | WA006 Bacchus' Best Distributor Everett Washington United States 15 | WA007 Bike n' Brew Bar Everett Washington United States 16 | WA012 Puget's Finest Bar Seattle Washington United States 17 | WA014 Brew Ha Ha Distributor Tacoma Washington United States 18 | WA055 Rainier & Co. Distributor Seattle Washington United States 19 | -------------------------------------------------------------------------------- /07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/Source Files/Monthly Sales Logs/SSBC - Apr 2021 Sales.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/Source Files/Monthly Sales Logs/SSBC - Apr 2021 Sales.xlsx -------------------------------------------------------------------------------- /07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/Source Files/Monthly Sales Logs/SSBC - Aug 2021 Sales.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/Source Files/Monthly Sales Logs/SSBC - Aug 2021 Sales.xlsx -------------------------------------------------------------------------------- /07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/Source Files/Monthly Sales Logs/SSBC - Dec 2020 Sales.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/Source Files/Monthly Sales Logs/SSBC - Dec 2020 Sales.xlsx 
-------------------------------------------------------------------------------- /07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/Source Files/Monthly Sales Logs/SSBC - Feb 2021 Sales.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/Source Files/Monthly Sales Logs/SSBC - Feb 2021 Sales.xlsx -------------------------------------------------------------------------------- /07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/Source Files/Monthly Sales Logs/SSBC - Jan 2021 Sales.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/Source Files/Monthly Sales Logs/SSBC - Jan 2021 Sales.xlsx -------------------------------------------------------------------------------- /07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/Source Files/Monthly Sales Logs/SSBC - Jul 2021 Sales.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/Source Files/Monthly Sales Logs/SSBC - Jul 2021 Sales.xlsx -------------------------------------------------------------------------------- /07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/Source Files/Monthly Sales Logs/SSBC - Jun 2021 Sales.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/Source Files/Monthly Sales Logs/SSBC - Jun 2021 Sales.xlsx 
-------------------------------------------------------------------------------- /07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/Source Files/Monthly Sales Logs/SSBC - Mar 2021 Sales.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/Source Files/Monthly Sales Logs/SSBC - Mar 2021 Sales.xlsx -------------------------------------------------------------------------------- /07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/Source Files/Monthly Sales Logs/SSBC - May 2021 Sales.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/Source Files/Monthly Sales Logs/SSBC - May 2021 Sales.xlsx -------------------------------------------------------------------------------- /07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/Source Files/Monthly Sales Logs/SSBC - Nov 2020 Sales.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/Source Files/Monthly Sales Logs/SSBC - Nov 2020 Sales.xlsx -------------------------------------------------------------------------------- /07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/Source Files/Monthly Sales Logs/SSBC - Oct 2020 Sales.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/Source Files/Monthly Sales Logs/SSBC - Oct 2020 Sales.xlsx 
-------------------------------------------------------------------------------- /07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/Source Files/Monthly Sales Logs/SSBC - Sep 2021 Sales.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/Source Files/Monthly Sales Logs/SSBC - Sep 2021 Sales.xlsx -------------------------------------------------------------------------------- /07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/Source Files/SSBC Product Offerings.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/Source Files/SSBC Product Offerings.pdf -------------------------------------------------------------------------------- /07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/Source Files/USD-CAD Exchange Rates.csv: -------------------------------------------------------------------------------- 1 | Date,USD,CAD 2 | 10/1/2020,1,1.352 3 | 10/2/2020,1,1.353 4 | 10/3/2020,1,1.353 5 | 10/4/2020,1,1.353 6 | 10/5/2020,1,1.353 7 | 10/6/2020,1,1.353 8 | 10/7/2020,1,1.353 9 | 10/8/2020,1,1.354 10 | 10/9/2020,1,1.354 11 | 10/10/2020,1,1.354 12 | 10/11/2020,1,1.354 13 | 10/12/2020,1,1.354 14 | 10/13/2020,1,1.354 15 | 10/14/2020,1,1.354 16 | 10/15/2020,1,1.355 17 | 10/16/2020,1,1.355 18 | 10/17/2020,1,1.355 19 | 10/18/2020,1,1.355 20 | 10/19/2020,1,1.355 21 | 10/20/2020,1,1.355 22 | 10/21/2020,1,1.356 23 | 10/22/2020,1,1.356 24 | 10/23/2020,1,1.356 25 | 10/24/2020,1,1.356 26 | 10/25/2020,1,1.356 27 | 10/26/2020,1,1.356 28 | 10/27/2020,1,1.357 29 | 10/28/2020,1,1.357 30 | 10/29/2020,1,1.357 31 | 10/30/2020,1,1.357 32 | 10/31/2020,1,1.357 33 | 11/1/2020,1,1.357 34 | 
11/2/2020,1,1.357 35 | 11/3/2020,1,1.358 36 | 11/4/2020,1,1.358 37 | 11/5/2020,1,1.358 38 | 11/6/2020,1,1.358 39 | 11/7/2020,1,1.358 40 | 11/8/2020,1,1.358 41 | 11/9/2020,1,1.359 42 | 11/10/2020,1,1.359 43 | 11/11/2020,1,1.359 44 | 11/12/2020,1,1.359 45 | 11/13/2020,1,1.359 46 | 11/14/2020,1,1.359 47 | 11/15/2020,1,1.359 48 | 11/16/2020,1,1.360 49 | 11/17/2020,1,1.360 50 | 11/18/2020,1,1.360 51 | 11/19/2020,1,1.360 52 | 11/20/2020,1,1.360 53 | 11/21/2020,1,1.360 54 | 11/22/2020,1,1.361 55 | 11/23/2020,1,1.361 56 | 11/24/2020,1,1.361 57 | 11/25/2020,1,1.361 58 | 11/26/2020,1,1.361 59 | 11/27/2020,1,1.361 60 | 11/28/2020,1,1.362 61 | 11/29/2020,1,1.362 62 | 11/30/2020,1,1.362 63 | 12/1/2020,1,1.362 64 | 12/2/2020,1,1.362 65 | 12/3/2020,1,1.362 66 | 12/4/2020,1,1.362 67 | 12/5/2020,1,1.363 68 | 12/6/2020,1,1.363 69 | 12/7/2020,1,1.363 70 | 12/8/2020,1,1.363 71 | 12/9/2020,1,1.363 72 | 12/10/2020,1,1.363 73 | 12/11/2020,1,1.364 74 | 12/12/2020,1,1.364 75 | 12/13/2020,1,1.364 76 | 12/14/2020,1,1.364 77 | 12/15/2020,1,1.364 78 | 12/16/2020,1,1.364 79 | 12/17/2020,1,1.365 80 | 12/18/2020,1,1.365 81 | 12/19/2020,1,1.365 82 | 12/20/2020,1,1.365 83 | 12/21/2020,1,1.365 84 | 12/22/2020,1,1.365 85 | 12/23/2020,1,1.365 86 | 12/24/2020,1,1.366 87 | 12/25/2020,1,1.366 88 | 12/26/2020,1,1.366 89 | 12/27/2020,1,1.366 90 | 12/28/2020,1,1.366 91 | 12/29/2020,1,1.366 92 | 12/30/2020,1,1.367 93 | 12/31/2020,1,1.367 94 | 1/1/2021,1,1.297 95 | 1/2/2021,1,1.298 96 | 1/3/2021,1,1.299 97 | 1/4/2021,1,1.298 98 | 1/5/2021,1,1.299 99 | 1/6/2021,1,1.297 100 | 1/7/2021,1,1.299 101 | 1/8/2021,1,1.304 102 | 1/9/2021,1,1.302 103 | 1/10/2021,1,1.302 104 | 1/11/2021,1,1.303 105 | 1/12/2021,1,1.303 106 | 1/13/2021,1,1.304 107 | 1/14/2021,1,1.305 108 | 1/15/2021,1,1.305 109 | 1/16/2021,1,1.306 110 | 1/17/2021,1,1.306 111 | 1/18/2021,1,1.307 112 | 1/19/2021,1,1.308 113 | 1/20/2021,1,1.308 114 | 1/21/2021,1,1.309 115 | 1/22/2021,1,1.310 116 | 1/23/2021,1,1.310 117 | 1/24/2021,1,1.311 118 | 
1/25/2021,1,1.311 119 | 1/26/2021,1,1.312 120 | 1/27/2021,1,1.313 121 | 1/28/2021,1,1.313 122 | 1/29/2021,1,1.314 123 | 1/30/2021,1,1.314 124 | 1/31/2021,1,1.315 125 | 2/1/2021,1,1.316 126 | 2/2/2021,1,1.316 127 | 2/3/2021,1,1.317 128 | 2/4/2021,1,1.317 129 | 2/5/2021,1,1.318 130 | 2/6/2021,1,1.319 131 | 2/7/2021,1,1.319 132 | 2/8/2021,1,1.320 133 | 2/9/2021,1,1.320 134 | 2/10/2021,1,1.321 135 | 2/11/2021,1,1.322 136 | 2/12/2021,1,1.322 137 | 2/13/2021,1,1.323 138 | 2/14/2021,1,1.323 139 | 2/15/2021,1,1.324 140 | 2/16/2021,1,1.325 141 | 2/17/2021,1,1.325 142 | 2/18/2021,1,1.326 143 | 2/19/2021,1,1.327 144 | 2/20/2021,1,1.327 145 | 2/21/2021,1,1.328 146 | 2/22/2021,1,1.328 147 | 2/23/2021,1,1.329 148 | 2/24/2021,1,1.330 149 | 2/25/2021,1,1.330 150 | 2/26/2021,1,1.331 151 | 2/27/2021,1,1.331 152 | 2/28/2021,1,1.332 153 | 3/1/2021,1,1.333 154 | 3/2/2021,1,1.333 155 | 3/3/2021,1,1.334 156 | 3/4/2021,1,1.334 157 | 3/5/2021,1,1.335 158 | 3/6/2021,1,1.336 159 | 3/7/2021,1,1.336 160 | 3/8/2021,1,1.337 161 | 3/9/2021,1,1.337 162 | 3/10/2021,1,1.338 163 | 3/11/2021,1,1.339 164 | 3/12/2021,1,1.339 165 | 3/13/2021,1,1.340 166 | 3/14/2021,1,1.340 167 | 3/15/2021,1,1.341 168 | 3/16/2021,1,1.342 169 | 3/17/2021,1,1.342 170 | 3/18/2021,1,1.343 171 | 3/19/2021,1,1.344 172 | 3/20/2021,1,1.344 173 | 3/21/2021,1,1.345 174 | 3/22/2021,1,1.345 175 | 3/23/2021,1,1.346 176 | 3/24/2021,1,1.347 177 | 3/25/2021,1,1.347 178 | 3/26/2021,1,1.348 179 | 3/27/2021,1,1.348 180 | 3/28/2021,1,1.349 181 | 3/29/2021,1,1.350 182 | 3/30/2021,1,1.350 183 | 4/1/2021,1,1.297 184 | 4/2/2021,1,1.298 185 | 4/3/2021,1,1.299 186 | 4/4/2021,1,1.298 187 | 4/5/2021,1,1.299 188 | 4/6/2021,1,1.297 189 | 4/7/2021,1,1.299 190 | 4/8/2021,1,1.304 191 | 4/9/2021,1,1.302 192 | 4/10/2021,1,1.302 193 | 4/11/2021,1,1.303 194 | 4/12/2021,1,1.303 195 | 4/13/2021,1,1.304 196 | 4/14/2021,1,1.305 197 | 4/15/2021,1,1.305 198 | 4/16/2021,1,1.306 199 | 4/17/2021,1,1.306 200 | 4/18/2021,1,1.307 201 | 4/19/2021,1,1.308 202 | 
4/20/2021,1,1.308 203 | 4/21/2021,1,1.309 204 | 4/22/2021,1,1.310 205 | 4/23/2021,1,1.310 206 | 4/24/2021,1,1.311 207 | 4/25/2021,1,1.311 208 | 4/26/2021,1,1.312 209 | 4/27/2021,1,1.313 210 | 4/28/2021,1,1.313 211 | 4/29/2021,1,1.314 212 | 4/30/2021,1,1.314 213 | 5/1/2021,1,1.315 214 | 5/2/2021,1,1.316 215 | 5/3/2021,1,1.316 216 | 5/4/2021,1,1.317 217 | 5/5/2021,1,1.317 218 | 5/6/2021,1,1.318 219 | 5/7/2021,1,1.319 220 | 5/8/2021,1,1.319 221 | 5/9/2021,1,1.320 222 | 5/10/2021,1,1.320 223 | 5/11/2021,1,1.321 224 | 5/12/2021,1,1.322 225 | 5/13/2021,1,1.322 226 | 5/14/2021,1,1.323 227 | 5/15/2021,1,1.323 228 | 5/16/2021,1,1.324 229 | 5/17/2021,1,1.325 230 | 5/18/2021,1,1.325 231 | 5/19/2021,1,1.326 232 | 5/20/2021,1,1.327 233 | 5/21/2021,1,1.327 234 | 5/22/2021,1,1.328 235 | 5/23/2021,1,1.328 236 | 5/24/2021,1,1.329 237 | 5/25/2021,1,1.330 238 | 5/26/2021,1,1.330 239 | 5/27/2021,1,1.331 240 | 5/28/2021,1,1.331 241 | 5/29/2021,1,1.332 242 | 5/30/2021,1,1.333 243 | 5/31/2021,1,1.333 244 | 6/1/2021,1,1.334 245 | 6/2/2021,1,1.334 246 | 6/3/2021,1,1.335 247 | 6/4/2021,1,1.336 248 | 6/5/2021,1,1.336 249 | 6/6/2021,1,1.337 250 | 6/7/2021,1,1.337 251 | 6/8/2021,1,1.338 252 | 6/9/2021,1,1.339 253 | 6/10/2021,1,1.339 254 | 6/11/2021,1,1.340 255 | 6/12/2021,1,1.340 256 | 6/13/2021,1,1.341 257 | 6/14/2021,1,1.342 258 | 6/15/2021,1,1.342 259 | 6/16/2021,1,1.343 260 | 6/17/2021,1,1.344 261 | 6/18/2021,1,1.344 262 | 6/19/2021,1,1.345 263 | 6/20/2021,1,1.345 264 | 6/21/2021,1,1.346 265 | 6/22/2021,1,1.347 266 | 6/23/2021,1,1.347 267 | 6/24/2021,1,1.348 268 | 6/25/2021,1,1.348 269 | 6/26/2021,1,1.349 270 | 6/27/2021,1,1.350 271 | 6/28/2021,1,1.350 272 | 6/29/2021,1,1.351 273 | 6/30/2021,1,1.351 274 | 7/1/2021,1,1.338 275 | 7/2/2021,1,1.338 276 | 7/3/2021,1,1.338 277 | 7/4/2021,1,1.338 278 | 7/5/2021,1,1.339 279 | 7/6/2021,1,1.339 280 | 7/7/2021,1,1.339 281 | 7/8/2021,1,1.339 282 | 7/9/2021,1,1.339 283 | 7/10/2021,1,1.339 284 | 7/11/2021,1,1.340 285 | 7/12/2021,1,1.340 286 | 
7/13/2021,1,1.340 287 | 7/14/2021,1,1.340 288 | 7/15/2021,1,1.340 289 | 7/16/2021,1,1.340 290 | 7/17/2021,1,1.341 291 | 7/18/2021,1,1.341 292 | 7/19/2021,1,1.341 293 | 7/20/2021,1,1.341 294 | 7/21/2021,1,1.341 295 | 7/22/2021,1,1.341 296 | 7/23/2021,1,1.341 297 | 7/24/2021,1,1.342 298 | 7/25/2021,1,1.342 299 | 7/26/2021,1,1.342 300 | 7/27/2021,1,1.342 301 | 7/28/2021,1,1.342 302 | 7/29/2021,1,1.342 303 | 7/30/2021,1,1.343 304 | 7/31/2021,1,1.343 305 | 8/1/2021,1,1.343 306 | 8/2/2021,1,1.343 307 | 8/3/2021,1,1.343 308 | 8/4/2021,1,1.343 309 | 8/5/2021,1,1.343 310 | 8/6/2021,1,1.344 311 | 8/7/2021,1,1.344 312 | 8/8/2021,1,1.344 313 | 8/9/2021,1,1.344 314 | 8/10/2021,1,1.344 315 | 8/11/2021,1,1.344 316 | 8/12/2021,1,1.345 317 | 8/13/2021,1,1.345 318 | 8/14/2021,1,1.345 319 | 8/15/2021,1,1.345 320 | 8/16/2021,1,1.345 321 | 8/17/2021,1,1.345 322 | 8/18/2021,1,1.346 323 | 8/19/2021,1,1.346 324 | 8/20/2021,1,1.346 325 | 8/21/2021,1,1.346 326 | 8/22/2021,1,1.346 327 | 8/23/2021,1,1.346 328 | 8/24/2021,1,1.346 329 | 8/25/2021,1,1.347 330 | 8/26/2021,1,1.347 331 | 8/27/2021,1,1.347 332 | 8/28/2021,1,1.347 333 | 8/29/2021,1,1.347 334 | 8/30/2021,1,1.347 335 | 8/31/2021,1,1.348 336 | 9/1/2021,1,1.348 337 | 9/2/2021,1,1.348 338 | 9/3/2021,1,1.348 339 | 9/4/2021,1,1.348 340 | 9/5/2021,1,1.348 341 | 9/6/2021,1,1.349 342 | 9/7/2021,1,1.349 343 | 9/8/2021,1,1.349 344 | 9/9/2021,1,1.349 345 | 9/10/2021,1,1.349 346 | 9/11/2021,1,1.349 347 | 9/12/2021,1,1.349 348 | 9/13/2021,1,1.350 349 | 9/14/2021,1,1.350 350 | 9/15/2021,1,1.350 351 | 9/16/2021,1,1.350 352 | 9/17/2021,1,1.350 353 | 9/18/2021,1,1.350 354 | 9/19/2021,1,1.351 355 | 9/20/2021,1,1.351 356 | 9/21/2021,1,1.351 357 | 9/22/2021,1,1.351 358 | 9/23/2021,1,1.351 359 | 9/24/2021,1,1.351 360 | 9/25/2021,1,1.351 361 | 9/26/2021,1,1.352 362 | 9/27/2021,1,1.352 363 | 9/28/2021,1,1.352 364 | 9/29/2021,1,1.352 365 | 9/30/2021,1,1.352 366 | -------------------------------------------------------------------------------- 
/08-Building-Power-BI-Report-for-Waggle/README.md:
--------------------------------------------------------------------------------
1 | # **Building a Power BI Report for Waggle**
2 |
3 | Project 2 of Udacity's [Data Analysis and Visualization with Microsoft Power BI Nanodegree Program](https://www.udacity.com/course/data-analysis-and-visualization-with-power-BI-nanodegree--nd331)
4 | in the **Creating Visualizations with Power BI** course.
5 |
6 | ## Project Description:
7 | Waggle is a startup that makes smart devices for pets. Recently, they have been thrilled by the success of their new Lapdog device, a fitness collar that lets owners track their dog’s steps, alerts them when it’s time for a walk, and even repels fleas! Reviews have been fantastic, sales are growing, and, best of all, the product really works!
8 |
9 | The product team distributed 1,000 Lapcat prototypes for field testing. Now, after months of data collection, we have been tasked with delivering a boardroom-ready Power BI report that tells the story of how the Lapcat data compares to findings from the Lapdog dog collar devices, to help convince the CEO that Lapcat is either the next big thing or a costly mistake to be avoided.
10 |
11 | Below is a quick demonstration of the project components.
12 |
13 | ### Data Model:
14 | A snapshot of the data model is provided below and can be found in `Waggle-data-model.png` in this repo.
15 |
16 | ![Waggle Data Model](https://github.com/xShaimaa/Udacity-Data-Analysis-and-Viz-with-Microsoft-Power-BI/blob/master/02-Building-Power-BI-Report-for-Waggle/Waggle-data-model.png)
17 |
18 |
19 | ### Report Requirements:
20 | - The CEO is curious about the following questions:
21 |   - Did the average daily steps increase for cats wearing the device as they did for dogs?
22 |   - Were owners of Lapcat devices as satisfied with the product as Lapdog owners?
23 | - The Chief Marketing Officer would like the report to be “on-brand” by including only colors from the Waggle color palette, the Waggle logo, and other approved company logos and icons.
24 |
25 | ![Waggle color palette](https://github.com/xShaimaa/Udacity-Data-Analysis-and-Viz-with-Microsoft-Power-BI/blob/master/02-Building-Power-BI-Report-for-Waggle/Waggle-color-palette.png)
26 |
27 |
28 | - The product team trusts us to incorporate other visuals and insights as we see fit but is most interested in comparisons between the dogs and cats using Waggle devices as well as any information about the families who own the pets. They would also like slicers to help them filter and explore on their own.
29 | - The report should include:
30 |   - at least five slicers on each page, with at least one example of a drop-down slicer, at least one example of a slider slicer, at least one example of a hierarchy slicer, at least one example of a slicer with “Select All” enabled, and one example of a slicer with the search box enabled.
31 |   - at least two bookmark features. One must allow users to dynamically swap one visual out for a different one, and another must reset all applied filters on the page.
32 |   - buttons that help users navigate the report tabs. They must respond when users hover over them by changing color or size.
33 |
34 | The report is to include 3 tabs:
35 | - The first page should highlight the CEO’s business questions, specifically calling out the differences in average step count and average user rating between Lapdog and Lapcat devices.
36 | - The second page should focus on insights related to pets using the device.
37 | - The third page should focus on insights related to the families that own the pets.
38 |
39 | The full PDF report can be found in the `Waggle-dashboard/Waggle-Project.pdf` file provided in this repo.
40 |
41 |
42 | ### Report Tab 1:
43 | To address the CEO’s questions:
44 | - 2 visualizations were plotted to highlight the difference between the `average daily steps` recorded on Lapdog devices vs. Lapcat devices, displaying the trend over time by year and month.
45 | - 2 visualizations highlighted the difference between the customer `ratings` for Lapdog devices vs. Lapcat devices, in addition to the number of ratings.
46 |
47 |
48 | ![Waggle Report Tab 1](https://github.com/xShaimaa/Udacity-Data-Analysis-and-Viz-with-Microsoft-Power-BI/blob/master/02-Building-Power-BI-Report-for-Waggle/Waggle-dashboard/Waggle-tab1.jfif)
49 |
50 | ___
51 |
52 | ### Report Tab 2:
53 | To derive insights from the `pets` dataset, the second tab included:
54 | - A visualization that shows the `breed` distribution of cats and dogs.
55 | - 2 visualizations that highlighted both the `gender` and `age` distributions across the dataset, with `pet type` as hue.
56 |
57 |
58 | ![Waggle Report Tab 2](https://github.com/xShaimaa/Udacity-Data-Analysis-and-Viz-with-Microsoft-Power-BI/blob/master/02-Building-Power-BI-Report-for-Waggle/Waggle-dashboard/Waggle-tab2.jfif)
59 |
60 | ___
61 |
62 | ### Report Tab 3:
63 | To derive insights from the `family` dataset, the third tab included:
64 | - A table that shows important family data.
65 | - A card that shows the count of total pets in the dataset, and has 2 bookmark buttons to show only cat or dog counts.
66 | - A visualization that shows the relation between `house hold income` and `number of owned pets` across the dataset, with `pet type` as hue.
67 | 68 | 69 | ![Waggle Report Tab 3](https://github.com/xShaimaa/Udacity-Data-Analysis-and-Viz-with-Microsoft-Power-BI/blob/master/02-Building-Power-BI-Report-for-Waggle/Waggle-dashboard/Waggle-tab3.jfif) 70 | 71 | ___ -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/Waggle-Project.pbix: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/Waggle-Project.pbix -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/Waggle-Theme.json: -------------------------------------------------------------------------------- 1 | {"name":"Waggle","dataColors":["#FFFFFF","#404040","#4381AB","#00B5CC","#978FC7","#765C69","#F1828D","#FAD859"]} -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/Waggle-color-palette.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/Waggle-color-palette.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/Waggle-dashboard/Waggle-Project.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/Waggle-dashboard/Waggle-Project.pdf -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/Waggle-dashboard/Waggle-tab1.jfif: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/Waggle-dashboard/Waggle-tab1.jfif -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/Waggle-dashboard/Waggle-tab2.jfif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/Waggle-dashboard/Waggle-tab2.jfif -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/Waggle-dashboard/Waggle-tab3.jfif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/Waggle-dashboard/Waggle-tab3.jfif -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/Waggle-data-model.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/Waggle-data-model.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/Waggle-datasets.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/Waggle-datasets.xlsx -------------------------------------------------------------------------------- 
/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/cat_face_icon_blue.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/cat_face_icon_blue.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/cat_face_icon_gray.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/cat_face_icon_gray.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/cat_face_icon_green.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/cat_face_icon_green.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/cat_face_icon_pink.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/cat_face_icon_pink.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/cat_face_icon_teal.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/cat_face_icon_teal.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/cat_face_icon_violet.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/cat_face_icon_violet.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/cat_face_icon_yellow.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/cat_face_icon_yellow.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/color_palette.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/color_palette.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/dog_face_icon_blue.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/dog_face_icon_blue.png 
-------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/dog_face_icon_gray.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/dog_face_icon_gray.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/dog_face_icon_green.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/dog_face_icon_green.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/dog_face_icon_pink.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/dog_face_icon_pink.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/dog_face_icon_teal.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/dog_face_icon_teal.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/dog_face_icon_violet.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/dog_face_icon_violet.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/dog_face_icon_yellow.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/dog_face_icon_yellow.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_blue_background.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_blue_background.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_green_background.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_green_background.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_pink_background.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_pink_background.png 
-------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_teal_background.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_teal_background.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_transparent_blue.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_transparent_blue.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_transparent_darker_gray.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_transparent_darker_gray.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_transparent_gray.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_transparent_gray.png -------------------------------------------------------------------------------- 
/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_transparent_green.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_transparent_green.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_transparent_pink.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_transparent_pink.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_transparent_teal.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_transparent_teal.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_transparent_violet.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_transparent_violet.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_transparent_yellow.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_transparent_yellow.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_violet_background.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_violet_background.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_white_transparent_blue.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_white_transparent_blue.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_white_transparent_green.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_white_transparent_green.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_white_transparent_pink.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_white_transparent_pink.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_white_transparent_teal.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_white_transparent_teal.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_white_transparent_violet.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_white_transparent_violet.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_white_transparent_wine.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_white_transparent_wine.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_white_transparent_yellow.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_white_transparent_yellow.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_yellow_background.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapcat_logo_yellow_background.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_blue_background.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_blue_background.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_green_background.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_green_background.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_pink_background.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_pink_background.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_teal_background.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_teal_background.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_transparent_blue.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_transparent_blue.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_transparent_darker_gray.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_transparent_darker_gray.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_transparent_gray.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_transparent_gray.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_transparent_green.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_transparent_green.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_transparent_pink.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_transparent_pink.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_transparent_teal.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_transparent_teal.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_transparent_violet.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_transparent_violet.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_transparent_yellow.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_transparent_yellow.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_violet_background.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_violet_background.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_white_transparent_blue.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_white_transparent_blue.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_white_transparent_green.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_white_transparent_green.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_white_transparent_pink.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_white_transparent_pink.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_white_transparent_teal.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_white_transparent_teal.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_white_transparent_violet.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_white_transparent_violet.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_white_transparent_yellow.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_white_transparent_yellow.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_yellow_background.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/lapdog_logo_yellow_background.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/waggle_logo_black.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/waggle_logo_black.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/waggle_logo_blue.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/waggle_logo_blue.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/waggle_logo_green.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/waggle_logo_green.png 
-------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/waggle_logo_pink.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/waggle_logo_pink.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/waggle_logo_red.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/waggle_logo_red.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/waggle_logo_teal.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/waggle_logo_teal.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/waggle_logo_violet.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/waggle_logo_violet.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/waggle_logo_white.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/waggle_logo_white.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/waggle_logo_wine.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/waggle_logo_wine.png -------------------------------------------------------------------------------- /08-Building-Power-BI-Report-for-Waggle/marketing_collateral/waggle_logo_yellow.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/08-Building-Power-BI-Report-for-Waggle/marketing_collateral/waggle_logo_yellow.png -------------------------------------------------------------------------------- /09-Market-Analysis-Report-for-National-Clothing-Chain/Data-Source/census-data.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/09-Market-Analysis-Report-for-National-Clothing-Chain/Data-Source/census-data.xlsx -------------------------------------------------------------------------------- /09-Market-Analysis-Report-for-National-Clothing-Chain/Data-Source/customer-list.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/09-Market-Analysis-Report-for-National-Clothing-Chain/Data-Source/customer-list.xlsx 
-------------------------------------------------------------------------------- /09-Market-Analysis-Report-for-National-Clothing-Chain/Data-Source/purchase-list.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/09-Market-Analysis-Report-for-National-Clothing-Chain/Data-Source/purchase-list.xlsx -------------------------------------------------------------------------------- /09-Market-Analysis-Report-for-National-Clothing-Chain/Data-Source/state-list.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/09-Market-Analysis-Report-for-National-Clothing-Chain/Data-Source/state-list.xlsx -------------------------------------------------------------------------------- /09-Market-Analysis-Report-for-National-Clothing-Chain/National-Clothing-Chain-Data-Model.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/09-Market-Analysis-Report-for-National-Clothing-Chain/National-Clothing-Chain-Data-Model.png -------------------------------------------------------------------------------- /09-Market-Analysis-Report-for-National-Clothing-Chain/National-Clothing-Chain-Project.pbix: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/09-Market-Analysis-Report-for-National-Clothing-Chain/National-Clothing-Chain-Project.pbix -------------------------------------------------------------------------------- /09-Market-Analysis-Report-for-National-Clothing-Chain/National-Clothing-Chain-Report.pdf: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/09-Market-Analysis-Report-for-National-Clothing-Chain/National-Clothing-Chain-Report.pdf -------------------------------------------------------------------------------- /09-Market-Analysis-Report-for-National-Clothing-Chain/National-Clothing-Chain-Summary.doc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/09-Market-Analysis-Report-for-National-Clothing-Chain/National-Clothing-Chain-Summary.doc -------------------------------------------------------------------------------- /09-Market-Analysis-Report-for-National-Clothing-Chain/README.md: -------------------------------------------------------------------------------- 1 | # **Market Analysis Report for National Clothing Chain** 2 | 3 | Project 3 of Udacity's [Data Analysis and Visualization with Microsoft Power BI Nanodegree Program](https://www.udacity.com/course/data-analysis-and-visualization-with-power-BI-nanodegree--nd331) 4 | in **Advanced Data Analysis with Power BI** course. 5 | 6 | ## Project Description: 7 | An online national clothing chain needs help on creating a targeted marketing campaign. 8 | 9 | Sales have been flat and they want to lure lost customers back. They want to advertise specific products to specific customers in specific locations, 10 | but they don’t know who to target. They have three products in mind: 11 | - Shirt: $25 12 | - Sweater: $100 13 | - Leather Bag: $1,000 14 | They need us to conduct an analysis to determine the best product to advertise to each customer. 
15 | ___ 16 | 17 | ## Data Sources 18 | The project will use a variety of data sources, including: 19 | - US Census Bureau 20 | - Average income 21 | - Location 22 | - Population 23 | - Industry 24 | 25 | - Business Data 26 | - Product inventory 27 | - Product prices 28 | - Customer rating 29 | - Product return rate 30 | 31 | - Customer Data 32 | - Customer ID 33 | - Names 34 | - Location 35 | - Date of birth 36 | - Purchase history 37 | 38 | - Additional Data 39 | - Weather 40 | - Economics 41 | - Demographics 42 | - Competition 43 | ____ 44 | 45 | ## Project Instruction 46 | In this project, we will use population statistics from the US Census Bureau to determine where the greatest income exists around the country 47 | and whether there is a correlation between sales and income. We don’t know the incomes of our customers, but we should be able to predict them 48 | by looking at their purchase history and locations and comparing those against the census data. 49 | Additionally, we want to analyze our inventory, specifically customer ratings and return rates, to see if there’s a correlation between the two. 50 | ___ 51 | 52 | ## Data Model 53 | A snapshot of the data model is provided below and can be found in `National-Clothing-Chain-Data-Model.png` in this repo. 54 | 55 | ![National Clothing Chain Data Model](https://github.com/xShaimaa/Udacity-Data-Analysis-and-Viz-with-Microsoft-Power-BI/blob/master/03-Market-Analysis-Report-for-National-Clothing-Chain/National-Clothing-Chain-Data-Model.png) 56 | 57 | 58 | ## Analysis Questions 59 | 1. What is the correlation (R² value) between sales and income? 60 | 2. What is the correlation (R² value) between customer ratings and product return rate? 61 | 3. What are the linear regression formulas to predict customer income from customer sales? 62 | 4. Which customer do you predict has the highest income? 63 | 5. Which product will be advertised the most?
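The regression and R² questions above can be illustrated outside Power BI with a small Python sketch. This is not the method used in the report: it is a plain ordinary least-squares fit, the function names `fit_line` and `r_squared` are hypothetical, and the sales and income figures are made-up placeholders rather than values from the project data.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and co-deviation of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Placeholder data (assumption): total sales per state and average income.
    sales = [100.0, 250.0, 400.0, 1000.0, 1125.0]
    income = [32000.0, 41000.0, 47000.0, 83000.0, 90000.0]

    slope, intercept = fit_line(sales, income)
    print(f"income ~= {slope:.2f} * sales + {intercept:.2f}")
    print(f"R^2 = {r_squared(sales, income, slope, intercept):.3f}")

    # Predict income for a hypothetical customer who spent 600 in total.
    print(f"predicted income: {slope * 600.0 + intercept:.2f}")
```

Ranking customers by their predicted value from such a fitted line is one way to approach question 4. Comparing each predicted income against the three price points would be one possible basis for question 5.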
64 | 65 | The full report can be found in `National-Clothing-Chain-Report.pdf` and a summary of the findings in `National-Clothing-Chain-Summary.doc`, 66 | both provided in this repo. The corresponding visuals are grouped below. 67 | 68 | ![avg-income-by-state](https://github.com/xShaimaa/Udacity-Data-Analysis-and-Viz-with-Microsoft-Power-BI/blob/master/03-Market-Analysis-Report-for-National-Clothing-Chain/img/avg-income-by-state.png) 69 | ___ 70 | ![predicted-income-by-state](https://github.com/xShaimaa/Udacity-Data-Analysis-and-Viz-with-Microsoft-Power-BI/blob/master/03-Market-Analysis-Report-for-National-Clothing-Chain/img/predicted-income-by-state.png) 71 | ___ 72 | ![sales-income-corr](https://github.com/xShaimaa/Udacity-Data-Analysis-and-Viz-with-Microsoft-Power-BI/blob/master/03-Market-Analysis-Report-for-National-Clothing-Chain/img/sales-income-corr.png) 73 | ___ 74 | ![customer-return-rate](https://github.com/xShaimaa/Udacity-Data-Analysis-and-Viz-with-Microsoft-Power-BI/blob/master/03-Market-Analysis-Report-for-National-Clothing-Chain/img/customer-return-rate.png) 75 | ___ 76 | ![product-recomm](https://github.com/xShaimaa/Udacity-Data-Analysis-and-Viz-with-Microsoft-Power-BI/blob/master/03-Market-Analysis-Report-for-National-Clothing-Chain/img/product-recomm.png) 77 | ___ 78 | ![product-by-price](https://github.com/xShaimaa/Udacity-Data-Analysis-and-Viz-with-Microsoft-Power-BI/blob/master/03-Market-Analysis-Report-for-National-Clothing-Chain/img/product-by-price.png) 79 | ___ 80 | ![product-instock](https://github.com/xShaimaa/Udacity-Data-Analysis-and-Viz-with-Microsoft-Power-BI/blob/master/03-Market-Analysis-Report-for-National-Clothing-Chain/img/product-instock.png) -------------------------------------------------------------------------------- /09-Market-Analysis-Report-for-National-Clothing-Chain/img/avg-income-by-state.png: --------------------------------------------------------------------------------
https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/09-Market-Analysis-Report-for-National-Clothing-Chain/img/avg-income-by-state.png -------------------------------------------------------------------------------- /09-Market-Analysis-Report-for-National-Clothing-Chain/img/customer-return-rate.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/09-Market-Analysis-Report-for-National-Clothing-Chain/img/customer-return-rate.png -------------------------------------------------------------------------------- /09-Market-Analysis-Report-for-National-Clothing-Chain/img/customers-by-income.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/09-Market-Analysis-Report-for-National-Clothing-Chain/img/customers-by-income.png -------------------------------------------------------------------------------- /09-Market-Analysis-Report-for-National-Clothing-Chain/img/predicted-income-by-state.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/09-Market-Analysis-Report-for-National-Clothing-Chain/img/predicted-income-by-state.png -------------------------------------------------------------------------------- /09-Market-Analysis-Report-for-National-Clothing-Chain/img/product-by-price.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/09-Market-Analysis-Report-for-National-Clothing-Chain/img/product-by-price.png 
-------------------------------------------------------------------------------- /09-Market-Analysis-Report-for-National-Clothing-Chain/img/product-instock.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/09-Market-Analysis-Report-for-National-Clothing-Chain/img/product-instock.png -------------------------------------------------------------------------------- /09-Market-Analysis-Report-for-National-Clothing-Chain/img/product-recomm.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/09-Market-Analysis-Report-for-National-Clothing-Chain/img/product-recomm.png -------------------------------------------------------------------------------- /09-Market-Analysis-Report-for-National-Clothing-Chain/img/sales-income-corr.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xShaimaa/Data-Analysis-Projects/3815513144dcf699470b73e25b062a674cf479a4/09-Market-Analysis-Report-for-National-Clothing-Chain/img/sales-income-corr.png -------------------------------------------------------------------------------- /10-Coursera-Sales-Analysis-in-Power-BI-Guided-Project/README.md: -------------------------------------------------------------------------------- 1 | # **Coursera Sales Analysis in Power BI: Clean and Analyze Sales Data Guided Project** 2 | 3 | You may find all materials and files for my [Sales Analysis in Power BI: Clean and Analyze Sales Data](https://www.coursera.org/teach/tahlil-albayanat-biistikhdam-power-bi-maealijih-warasm-bayanat-almabieat/course/overview) guided project on Coursera platform 4 | presented by [Shaimaa Elbadrawy](https://www.linkedin.com/in/xshaimaa/) (Me ^w^) 5 | here on this 
[repo](https://github.com/xShaimaa/Coursera-Sales-Analysis-in-Power-BI-Clean-and-Visualize-Sales-Data-Guided-Project) 6 | 7 | --- 8 | 9 | ## Main Outcome 10 | By the end of this project, the learner will have gone through the entire data analysis life cycle: connecting to datasets, 11 | transforming and cleaning them, building a data model, asking valuable questions, and then answering them through effective data visualization 12 | that will be used in building a sales report. 13 | 14 | --- 15 | 16 | ## Scenario 17 | Learners are tasked with analyzing warehouse sales data that includes information about both its national and international sales, 18 | its sold products, and their manufacturing data. 19 | They are to deliver a detailed interactive sales report for the stakeholders summarizing the answers to vital business questions. 20 | 21 | --- 22 | 23 | ## Guided Project Final Report 24 |
25 | Sales Analysis Power BI Report 26 |
Sales Analysis Power BI Report
27 |
28 | 29 | --- 30 | 31 | ## Capstone Project 32 | The learner goes through all the steps of the data analysis life cycle. 33 | They are tasked with loading and transforming Facebook paid ads data from a cosmetics company, 34 | starting by connecting to the dataset, exploring and cleaning/transforming it, and finally building 35 | interactive visuals to gain insights from these campaigns and answer the manager's questions. 36 | 37 | --- 38 | 39 | ## Capstone Project Outcome 40 |
41 | Capstone Question 1 42 |
Capstone Question 1
43 |
44 | 45 | --- 46 | 47 |
48 | Capstone Question 2 49 |
Capstone Question 2
50 |
51 | 52 | --- -------------------------------------------------------------------------------- /11-dyslexia-and-music-notes-paper-analysis/README.md: -------------------------------------------------------------------------------- 1 | # **Dyslexia and Music Notes Paper Data Analysis** 2 | 3 | You may find all project files and materials in this repo: [Dyslexia and Music Notes Paper Data Analysis](https://github.com/xShaimaa/dyslexia-and-music-notes-paper) 4 | 5 | This project presents a data analysis of the paper 6 | [Effects of the design of written music on the readability for children with dyslexia](https://journals.sagepub.com/doi/10.1177/0255761414546245) 7 | to find meaningful insights concerning the difficulties dyslexic people have reading traditional music notes. 8 | 9 | ## Table of Contents 10 | | No. | Topic | 11 | |-----|------------ | 12 | | 01 | [Overview](#overview) | 13 | | 02 | [Paper Summary](#paper-summary) | 14 | | 03 | [Findings](#findings) | 15 | | 04 | [BI Report](#BI-report) | 16 | 17 | 18 | --- 19 | 20 | ## Overview 21 | Dyslexia is a learning disorder that affects approximately 5-10% of children worldwide. 22 | It is characterized by difficulties with reading fluency and phonological processing, which can significantly impact academic and social development. 23 | Music interventions have been proposed as a potential treatment for dyslexia, 24 | as they can target some of the same cognitive processes that are impaired in dyslexia, such as phonological awareness and auditory memory. 25 | 26 | --- 27 | 28 | ## Paper Summary 29 | David Gómez Oropeza et al.'s (2021) paper focuses on the effects of different music notation designs on the readability 30 | of music for children with dyslexia. They compare the traditional music notation system to a newly designed notation system 31 | that is based on the Gestalt laws of perception. They find that the newly designed notation system is more readable for children with dyslexia.
32 | 33 | 34 | This paper investigates the relationship between the design of written music and the number of mistakes dyslexic and non-dyslexic 35 | children make in reading music. The study used a public dataset of musical notation examples and a group of children with and without 36 | dyslexia to evaluate the readability of different music notation designs. 37 | 38 | --- 39 | 40 | ## Findings 41 | The study found that children with dyslexia made significantly more mistakes than non-dyslexic children when reading music with traditional notation. 42 | However, children with dyslexia made fewer mistakes when reading music with a newly designed notation system that was based on the Gestalt laws of perception. 43 | 44 | The authors of the paper conclude that the design of written music can have a significant impact on the readability of music for children with dyslexia. 45 | They suggest that music notation systems should be designed to be more intuitive and easier to learn for all children, including those with dyslexia. 46 | 47 | - Children with dyslexia found a newly designed music notation system to be more readable than traditional music notation. 48 | - The newly designed notation system was based on the Gestalt principles of perception. 49 | - The Gestalt principles of perception include principles of proximity, similarity, and closure. 50 | - The newly designed notation system used larger note sizes, more note spacing, and a different font type than traditional music notation. 51 | 52 | --- 53 | 54 | ## BI Report 55 |
56 | 57 |
Paper Dataset Stats
58 |
59 | 60 | --- 61 | 62 |
63 | 64 |
Paper test results
65 |
66 | 67 | --- 68 | 69 |
70 | 71 |
sample groups preferences
72 |
73 | 74 | --- 75 | 76 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Data Analysis and EDA Projects 2 | This repo contains notebooks on various datasets as practice in data analysis. All notebooks include: 3 | 4 | 1. Data Cleaning. 5 | 2. Data Visualization. 6 | 3. Exploratory Data Analysis. 7 | 8 | Each project contains a detailed README file with an informative description of the dataset and its columns, 9 | a summary of exploring the dataframe and the steps taken in the data cleaning phase, visualization findings, and EDA conclusions. 10 | 11 | --- 12 | 13 | ## Table of contents: 14 | | No. | Datasets | Tool | 15 | |--- | --- | --- | 16 | |01 |[TMDB movies' ratings](/01-TMDB-Dataset-Analysis) | Python | 17 | |02 |[Auto MPG](/02-Auto-MPG-Dataset-Analysis) | Python | 18 | |03 |[Medical Appointment No Show](/03-Medical-Appointment-No-Show) | Python | 19 | |04 |[9000+ Movies Dataset Analysis](/04-9000+-Movies-Dataset-Analysis) | Python | 20 | |05 |[Wine Quality Dataset EDA](/05-Wine-Quality-Dataset) | Python | 21 | |06 |[Query a Digital Music Store Database](/06-Query-a-Digital-Music-Store-Database) | SQL | 22 | | 07 |[Build a Data Model for Seven Sages Brewing Company](/07-Create-a-Data-Model-for-Seven-Sages-Brewing-Company/) | Power BI | 23 | | 08 |[Building a Power BI Report for Waggle](/08-Building-Power-BI-Report-for-Waggle/) | Power BI | 24 | | 09 |[Market Analysis Report for National Clothing Chain](/09-Market-Analysis-Report-for-National-Clothing-Chain/) | Power BI | 25 | | 10 |[Sales Analysis in Power BI: Clean and Analyze Sales Data Coursera Guided Project](/10-Coursera-Sales-Analysis-in-Power-BI-Guided-Project) | Power BI| 26 | | 11 |[Dyslexia and Music Notes Paper Data Analysis](https://github.com/xShaimaa/dyslexia-and-music-notes-paper) | Power BI| 27 | 28 | 29 |
--------------------------------------------------------------------------------