├── Stage_A_Quiz
    ├── FoodBalanceSheets_E_Africa_NOFLAG.csv
    ├── .ipynb_checkpoints
    │   ├── Course_Practice-checkpoint.ipynb
    │   └── Hamoye_Stage_A_Quiz-checkpoint.ipynb
    └── Stage_A_Quiz.ipynb
└── README.md

/Stage_A_Quiz/FoodBalanceSheets_E_Africa_NOFLAG.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Chisomnwa/Hamoye_Data_Engineering_Internship/main/Stage_A_Quiz/FoodBalanceSheets_E_Africa_NOFLAG.csv
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # This is a repository for my Hamoye Data Engineering Internship Stage Quizzes
2 |
3 | ## For Stage A:
4 |
5 | ### Food Balance Sheet Analysis
6 |
7 | This [notebook](https://github.com/Chisomnwa/Hamoye_Data_Engineering_Stage_Quizzes/blob/main/Stage_A_Quiz.ipynb) contains my tag-along code for the Stage A assessment quiz.
8 |
9 | This quiz assessed our understanding of the **Introduction to Python for Machine Learning** course. In the 60-minute quiz, which had sections numbered 1–20, we imported and wrangled the African Food Production dataset provided on the FAO website, then answered questions about it.
10 |
11 | This Food Balance Sheet, compiled by the Food and Agriculture Organization of the United Nations (FAO), covers the food supply of a selection of African countries from 2014 to 2018.
12 |
13 | It lists the food items available for human consumption, along with how much of each was produced, used, imported, or exported, and the resulting per capita supply.
14 |
15 | The dataset can be obtained via this link: [FoodBalanceSheets_E_Africa_NOFLAG.csv](https://github.com/HamoyeHQ/HDSC-Introduction-to-Python-for-machine-learning/files/7768140/FoodBalanceSheets_E_Africa_NOFLAG.csv)
16 |
17 | **Please note:** GitHub imposes a 5-second timeout for rich rendering.
If a notebook cannot be rendered within the 5-second timeout, GitHub does not display it. This can happen with both large and small files. If you cannot view the code in Stage_A_Quiz.ipynb, you can view it on Jupyter nbviewer [here](https://nbviewer.org/github/Chisomnwa/Hamoye_Data_Engineering_Stage_Quizzes/blob/main/Stage_A_Quiz.ipynb).
18 |
19 | ## For Stage B:
20 |
21 | ### Appliances Energy Prediction Data
22 |
23 | The dataset for the remainder of this quiz (from question 18) is the Appliances Energy Prediction data. The data are recorded at 10-minute intervals over about 4.5 months. The house temperature and humidity conditions were monitored with a ZigBee wireless sensor network. Each wireless node transmitted the temperature and humidity conditions roughly every 3.3 minutes. The wireless data were then averaged over 10-minute periods. The energy data were logged every 10 minutes with m-bus energy meters. Weather data from the nearest airport weather station (Chievres Airport, Belgium) were downloaded from a public data set from Reliable Prognosis (rp5.ru) and merged with the experimental data sets using the date and time column. Two random variables have been included in the data set for testing the regression models and for filtering out non-predictive attributes (parameters).
24 |
25 | In this course, I developed a multivariate multiple regression model to study the effect of eight input variables on two output variables.
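The 10-minute averaging step described above can be illustrated with a minimal sketch in plain Python. This is only an illustration under stated assumptions: the function name `average_by_10_min`, the start date, and the sample readings are all hypothetical and are not taken from the dataset or from the quiz notebook.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def average_by_10_min(readings):
    """Average (timestamp, value) pairs into 10-minute buckets.

    Each bucket is keyed by its start time, i.e. the timestamp with the
    minutes floored to a multiple of 10 and seconds set to zero.
    """
    buckets = defaultdict(list)
    for ts, value in readings:
        start = ts.replace(minute=ts.minute - ts.minute % 10,
                           second=0, microsecond=0)
        buckets[start].append(value)
    # Return the buckets in chronological order with their mean value
    return {start: sum(vals) / len(vals)
            for start, vals in sorted(buckets.items())}


# Hypothetical temperature readings arriving every 200 s (about 3.3 min)
t0 = datetime(2016, 1, 11, 17, 0)  # made-up start time
readings = [(t0 + timedelta(seconds=200 * i), 20.0 + i) for i in range(6)]

averaged = average_by_10_min(readings)
for start, mean in averaged.items():
    print(start.strftime("%H:%M"), mean)
# prints:
# 17:00 21.0
# 17:10 24.0
```

With pandas, the same step would more commonly be written by resampling a datetime-indexed DataFrame, for example `df.resample("10min").mean()`; the plain-Python version is shown only to make the bucketing logic explicit.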
26 |
27 |
28 | If you cannot view the code in Stage_B_Quiz.ipynb, you can view it on Jupyter nbviewer [here](https://nbviewer.org/github/Chisomnwa/Hamoye_Data_Engineering_Stage_Quizzes/blob/main/Stage_B_Quiz/Machine%20Learning%20Regression%20-%20Predicting%20Energy%20Efficiency.ipynb).
29 |
--------------------------------------------------------------------------------
/Stage_A_Quiz/.ipynb_checkpoints/Course_Practice-checkpoint.ipynb:
--------------------------------------------------------------------------------
1 | {
2 |  "cells": [
3 |   {
4 |    "cell_type": "code",
5 |    "execution_count": 1,
6 |    "id": "4a946a91",
7 |    "metadata": {},
8 |    "outputs": [
9 |     {
10 |      "name": "stdout",
11 |      "output_type": "stream",
12 |      "text": [
13 |       "The Zen of Python, by Tim Peters\n",
14 |       "\n",
15 |       "Beautiful is better than ugly.\n",
16 |       "Explicit is better than implicit.\n",
17 |       "Simple is better than complex.\n",
18 |       "Complex is better than complicated.\n",
19 |       "Flat is better than nested.\n",
20 |       "Sparse is better than dense.\n",
21 |       "Readability counts.\n",
22 |       "Special cases aren't special enough to break the rules.\n",
23 |       "Although practicality beats purity.\n",
24 |       "Errors should never pass silently.\n",
25 |       "Unless explicitly silenced.\n",
26 |       "In the face of ambiguity, refuse the temptation to guess.\n",
27 |       "There should be one-- and preferably only one --obvious way to do it.\n",
28 |       "Although that way may not be obvious at first unless you're Dutch.\n",
29 |       "Now is better than never.\n",
30 |       "Although never is often better than *right* now.\n",
31 |       "If the implementation is hard to explain, it's a bad idea.\n",
32 |       "If the implementation is easy to explain, it may be a good idea.\n",
33 |       "Namespaces are one honking great idea -- let's do more of those!\n"
34 |      ]
35 |     }
36 |    ],
37 |    "source": [
38 |     "import this"
39 |    ]
40 |   },
41 |   {
42 |    "cell_type": "code",
43 |    "execution_count": 2,
44 |    "id": "4b3d123d",
45 |    "metadata": {},
46 |
"outputs": [ 47 | { 48 | "data": { 49 | "application/json": { 50 | "cell": { 51 | "!": "OSMagics", 52 | "HTML": "Other", 53 | "SVG": "Other", 54 | "bash": "Other", 55 | "capture": "ExecutionMagics", 56 | "cmd": "Other", 57 | "debug": "ExecutionMagics", 58 | "file": "Other", 59 | "html": "DisplayMagics", 60 | "javascript": "DisplayMagics", 61 | "js": "DisplayMagics", 62 | "latex": "DisplayMagics", 63 | "markdown": "DisplayMagics", 64 | "perl": "Other", 65 | "prun": "ExecutionMagics", 66 | "pypy": "Other", 67 | "python": "Other", 68 | "python2": "Other", 69 | "python3": "Other", 70 | "ruby": "Other", 71 | "script": "ScriptMagics", 72 | "sh": "Other", 73 | "svg": "DisplayMagics", 74 | "sx": "OSMagics", 75 | "system": "OSMagics", 76 | "time": "ExecutionMagics", 77 | "timeit": "ExecutionMagics", 78 | "writefile": "OSMagics" 79 | }, 80 | "line": { 81 | "alias": "OSMagics", 82 | "alias_magic": "BasicMagics", 83 | "autoawait": "AsyncMagics", 84 | "autocall": "AutoMagics", 85 | "automagic": "AutoMagics", 86 | "autosave": "KernelMagics", 87 | "bookmark": "OSMagics", 88 | "cd": "OSMagics", 89 | "clear": "KernelMagics", 90 | "cls": "KernelMagics", 91 | "colors": "BasicMagics", 92 | "conda": "PackagingMagics", 93 | "config": "ConfigMagics", 94 | "connect_info": "KernelMagics", 95 | "copy": "Other", 96 | "ddir": "Other", 97 | "debug": "ExecutionMagics", 98 | "dhist": "OSMagics", 99 | "dirs": "OSMagics", 100 | "doctest_mode": "BasicMagics", 101 | "echo": "Other", 102 | "ed": "Other", 103 | "edit": "KernelMagics", 104 | "env": "OSMagics", 105 | "gui": "BasicMagics", 106 | "hist": "Other", 107 | "history": "HistoryMagics", 108 | "killbgscripts": "ScriptMagics", 109 | "ldir": "Other", 110 | "less": "KernelMagics", 111 | "load": "CodeMagics", 112 | "load_ext": "ExtensionMagics", 113 | "loadpy": "CodeMagics", 114 | "logoff": "LoggingMagics", 115 | "logon": "LoggingMagics", 116 | "logstart": "LoggingMagics", 117 | "logstate": "LoggingMagics", 118 | "logstop": "LoggingMagics", 119 | 
"ls": "Other", 120 | "lsmagic": "BasicMagics", 121 | "macro": "ExecutionMagics", 122 | "magic": "BasicMagics", 123 | "matplotlib": "PylabMagics", 124 | "mkdir": "Other", 125 | "more": "KernelMagics", 126 | "notebook": "BasicMagics", 127 | "page": "BasicMagics", 128 | "pastebin": "CodeMagics", 129 | "pdb": "ExecutionMagics", 130 | "pdef": "NamespaceMagics", 131 | "pdoc": "NamespaceMagics", 132 | "pfile": "NamespaceMagics", 133 | "pinfo": "NamespaceMagics", 134 | "pinfo2": "NamespaceMagics", 135 | "pip": "PackagingMagics", 136 | "popd": "OSMagics", 137 | "pprint": "BasicMagics", 138 | "precision": "BasicMagics", 139 | "prun": "ExecutionMagics", 140 | "psearch": "NamespaceMagics", 141 | "psource": "NamespaceMagics", 142 | "pushd": "OSMagics", 143 | "pwd": "OSMagics", 144 | "pycat": "OSMagics", 145 | "pylab": "PylabMagics", 146 | "qtconsole": "KernelMagics", 147 | "quickref": "BasicMagics", 148 | "recall": "HistoryMagics", 149 | "rehashx": "OSMagics", 150 | "reload_ext": "ExtensionMagics", 151 | "ren": "Other", 152 | "rep": "Other", 153 | "rerun": "HistoryMagics", 154 | "reset": "NamespaceMagics", 155 | "reset_selective": "NamespaceMagics", 156 | "rmdir": "Other", 157 | "run": "ExecutionMagics", 158 | "save": "CodeMagics", 159 | "sc": "OSMagics", 160 | "set_env": "OSMagics", 161 | "store": "StoreMagics", 162 | "sx": "OSMagics", 163 | "system": "OSMagics", 164 | "tb": "ExecutionMagics", 165 | "time": "ExecutionMagics", 166 | "timeit": "ExecutionMagics", 167 | "unalias": "OSMagics", 168 | "unload_ext": "ExtensionMagics", 169 | "who": "NamespaceMagics", 170 | "who_ls": "NamespaceMagics", 171 | "whos": "NamespaceMagics", 172 | "xdel": "NamespaceMagics", 173 | "xmode": "BasicMagics" 174 | } 175 | }, 176 | "text/plain": [ 177 | "Available line magics:\n", 178 | "%alias %alias_magic %autoawait %autocall %automagic %autosave %bookmark %cd %clear %cls %colors %conda %config %connect_info %copy %ddir %debug %dhist %dirs %doctest_mode %echo %ed %edit %env %gui %hist %history 
%killbgscripts %ldir %less %load %load_ext %loadpy %logoff %logon %logstart %logstate %logstop %ls %lsmagic %macro %magic %matplotlib %mkdir %more %notebook %page %pastebin %pdb %pdef %pdoc %pfile %pinfo %pinfo2 %pip %popd %pprint %precision %prun %psearch %psource %pushd %pwd %pycat %pylab %qtconsole %quickref %recall %rehashx %reload_ext %ren %rep %rerun %reset %reset_selective %rmdir %run %save %sc %set_env %store %sx %system %tb %time %timeit %unalias %unload_ext %who %who_ls %whos %xdel %xmode\n", 179 | "\n", 180 | "Available cell magics:\n", 181 | "%%! %%HTML %%SVG %%bash %%capture %%cmd %%debug %%file %%html %%javascript %%js %%latex %%markdown %%perl %%prun %%pypy %%python %%python2 %%python3 %%ruby %%script %%sh %%svg %%sx %%system %%time %%timeit %%writefile\n", 182 | "\n", 183 | "Automagic is ON, % prefix IS NOT needed for line magics." 184 | ] 185 | }, 186 | "execution_count": 2, 187 | "metadata": {}, 188 | "output_type": "execute_result" 189 | } 190 | ], 191 | "source": [ 192 | "%lsmagic" 193 | ] 194 | }, 195 | { 196 | "cell_type": "code", 197 | "execution_count": null, 198 | "id": "baeeeab5", 199 | "metadata": {}, 200 | "outputs": [], 201 | "source": [ 202 | "## Introduction to Python for Machine Learning" 203 | ] 204 | }, 205 | { 206 | "cell_type": "code", 207 | "execution_count": null, 208 | "id": "1e8b51d2", 209 | "metadata": {}, 210 | "outputs": [], 211 | "source": [] 212 | }, 213 | { 214 | "cell_type": "code", 215 | "execution_count": null, 216 | "id": "e8cb7532", 217 | "metadata": {}, 218 | "outputs": [], 219 | "source": [] 220 | }, 221 | { 222 | "cell_type": "code", 223 | "execution_count": null, 224 | "id": "77c3f93d", 225 | "metadata": {}, 226 | "outputs": [], 227 | "source": [] 228 | }, 229 | { 230 | "cell_type": "code", 231 | "execution_count": null, 232 | "id": "f1425b5c", 233 | "metadata": {}, 234 | "outputs": [], 235 | "source": [] 236 | }, 237 | { 238 | "cell_type": "code", 239 | "execution_count": null, 240 | "id": "80329667", 241 | 
"metadata": {}, 242 | "outputs": [], 243 | "source": [] 244 | }, 245 | { 246 | "cell_type": "code", 247 | "execution_count": null, 248 | "id": "4e588e05", 249 | "metadata": {}, 250 | "outputs": [], 251 | "source": [] 252 | } 253 | ], 254 | "metadata": { 255 | "kernelspec": { 256 | "display_name": "Python 3 (ipykernel)", 257 | "language": "python", 258 | "name": "python3" 259 | }, 260 | "language_info": { 261 | "codemirror_mode": { 262 | "name": "ipython", 263 | "version": 3 264 | }, 265 | "file_extension": ".py", 266 | "mimetype": "text/x-python", 267 | "name": "python", 268 | "nbconvert_exporter": "python", 269 | "pygments_lexer": "ipython3", 270 | "version": "3.11.4" 271 | } 272 | }, 273 | "nbformat": 4, 274 | "nbformat_minor": 5 275 | } 276 | -------------------------------------------------------------------------------- /Stage_A_Quiz/.ipynb_checkpoints/Hamoye_Stage_A_Quiz-checkpoint.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "629963e4", 6 | "metadata": {}, 7 | "source": [ 8 | "## Food Balance Sheet Analysis" 9 | ] 10 | }, 11 | { 12 | "cell_type": "code", 13 | "execution_count": 1, 14 | "id": "2f535948", 15 | "metadata": {}, 16 | "outputs": [], 17 | "source": [ 18 | "# import all packages and set plots to be embedded inline\n", 19 | "import numpy as np\n", 20 | "import pandas as pd\n", 21 | "import matplotlib.pyplot as plt\n", 22 | "import seaborn as sb\n", 23 | "\n", 24 | "%matplotlib inline\n", 25 | "\n", 26 | "# suppress warnings from final output\n", 27 | "import warnings\n", 28 | "warnings.simplefilter(\"ignore\")" 29 | ] 30 | }, 31 | { 32 | "cell_type": "code", 33 | "execution_count": 2, 34 | "id": "ef43ca9c", 35 | "metadata": {}, 36 | "outputs": [ 37 | { 38 | "name": "stdout", 39 | "output_type": "stream", 40 | "text": [ 41 | "(60943, 12)\n" 42 | ] 43 | }, 44 | { 45 | "data": { 46 | "text/html": [ 47 | "
" 158 | ], 159 | "text/plain": [ 160 | " Area Code Area Item Code Item Element Code \\\n", 161 | "0 4 Algeria 2501 Population 511 \n", 162 | "1 4 Algeria 2501 Population 5301 \n", 163 | "2 4 Algeria 2901 Grand Total 664 \n", 164 | "3 4 Algeria 2901 Grand Total 674 \n", 165 | "4 4 Algeria 2901 Grand Total 684 \n", 166 | "\n", 167 | " Element Unit Y2014 \\\n", 168 | "0 Total Population - Both sexes 1000 persons 38924.00 \n", 169 | "1 Domestic supply quantity 1000 tonnes 0.00 \n", 170 | "2 Food supply (kcal/capita/day) kcal/capita/day 3377.00 \n", 171 | "3 Protein supply quantity (g/capita/day) g/capita/day 94.90 \n", 172 | "4 Fat supply quantity (g/capita/day) g/capita/day 80.06 \n", 173 | "\n", 174 | " Y2015 Y2016 Y2017 Y2018 \n", 175 | "0 39728.00 40551.00 41389.00 42228.00 \n", 176 | "1 0.00 0.00 0.00 0.00 \n", 177 | "2 3379.00 3372.00 3341.00 3322.00 \n", 178 | "3 94.35 94.72 92.82 91.83 \n", 179 | "4 79.36 77.40 80.19 77.28 " 180 | ] 181 | }, 182 | "execution_count": 2, 183 | "metadata": {}, 184 | "output_type": "execute_result" 185 | } 186 | ], 187 | "source": [ 188 | "# import dataset to pandas dataframe\n", 189 | "df_FBS = pd.read_csv('FoodBalanceSheets_E_Africa_NOFLAG.csv', encoding='Latin-1')\n", 190 | "\n", 191 | "# Checking the total number of rows and columns in our dataframe\n", 192 | "print(df_FBS.shape)\n", 193 | "\n", 194 | "#Looking at the first five rows in our dataframe\n", 195 | "df_FBS.head()" 196 | ] 197 | }, 198 | { 199 | "cell_type": "code", 200 | "execution_count": 3, 201 | "id": "c3a2e817", 202 | "metadata": {}, 203 | "outputs": [ 204 | { 205 | "data": { 206 | "text/html": [ 207 | "
" 318 | ], 319 | "text/plain": [ 320 | " Area Code Area Item Code Item Element Code \\\n", 321 | "60938 181 Zimbabwe 2899 Miscellaneous 5142 \n", 322 | "60939 181 Zimbabwe 2899 Miscellaneous 645 \n", 323 | "60940 181 Zimbabwe 2899 Miscellaneous 664 \n", 324 | "60941 181 Zimbabwe 2899 Miscellaneous 674 \n", 325 | "60942 181 Zimbabwe 2899 Miscellaneous 684 \n", 326 | "\n", 327 | " Element Unit Y2014 Y2015 \\\n", 328 | "60938 Food 1000 tonnes 42.00 46.00 \n", 329 | "60939 Food supply quantity (kg/capita/yr) kg 3.06 3.33 \n", 330 | "60940 Food supply (kcal/capita/day) kcal/capita/day 3.00 4.00 \n", 331 | "60941 Protein supply quantity (g/capita/day) g/capita/day 0.10 0.11 \n", 332 | "60942 Fat supply quantity (g/capita/day) g/capita/day 0.04 0.05 \n", 333 | "\n", 334 | " Y2016 Y2017 Y2018 \n", 335 | "60938 33.00 19.00 16.00 \n", 336 | "60939 2.35 1.33 1.08 \n", 337 | "60940 3.00 1.00 1.00 \n", 338 | "60941 0.08 0.04 0.04 \n", 339 | "60942 0.03 0.02 0.01 " 340 | ] 341 | }, 342 | "execution_count": 3, 343 | "metadata": {}, 344 | "output_type": "execute_result" 345 | } 346 | ], 347 | "source": [ 348 | "# Looking at the last five rows in our dataframe\n", 349 | "df_FBS.tail()" 350 | ] 351 | }, 352 | { 353 | "cell_type": "code", 354 | "execution_count": 4, 355 | "id": "79025bfc", 356 | "metadata": {}, 357 | "outputs": [ 358 | { 359 | "name": "stdout", 360 | "output_type": "stream", 361 | "text": [ 362 | "\n", 363 | "RangeIndex: 60943 entries, 0 to 60942\n", 364 | "Data columns (total 12 columns):\n", 365 | " # Column Non-Null Count Dtype \n", 366 | "--- ------ -------------- ----- \n", 367 | " 0 Area Code 60943 non-null int64 \n", 368 | " 1 Area 60943 non-null object \n", 369 | " 2 Item Code 60943 non-null int64 \n", 370 | " 3 Item 60943 non-null object \n", 371 | " 4 Element Code 60943 non-null int64 \n", 372 | " 5 Element 60943 non-null object \n", 373 | " 6 Unit 60943 non-null object \n", 374 | " 7 Y2014 59354 non-null float64\n", 375 | " 8 Y2015 59395 non-null 
float64\n", 376 | " 9 Y2016 59408 non-null float64\n", 377 | " 10 Y2017 59437 non-null float64\n", 378 | " 11 Y2018 59507 non-null float64\n", 379 | "dtypes: float64(5), int64(3), object(4)\n", 380 | "memory usage: 5.6+ MB\n" 381 | ] 382 | } 383 | ], 384 | "source": [ 385 | "# Looking at the information of our data\n", 386 | "df_FBS.info()" 387 | ] 388 | }, 389 | { 390 | "cell_type": "code", 391 | "execution_count": 5, 392 | "id": "2c26aebf", 393 | "metadata": {}, 394 | "outputs": [ 395 | { 396 | "data": { 397 | "text/html": [ 398 | "
" 518 | ], 519 | "text/plain": [ 520 | " Area Code Item Code Element Code Y2014 Y2015 \\\n", 521 | "count 60943.000000 60943.000000 60943.000000 59354.000000 59395.000000 \n", 522 | "mean 134.265576 2687.176706 3814.856456 134.196282 135.235966 \n", 523 | "std 72.605709 146.055739 2212.007033 1567.663696 1603.403984 \n", 524 | "min 4.000000 2501.000000 511.000000 -1796.000000 -3161.000000 \n", 525 | "25% 74.000000 2562.000000 684.000000 0.000000 0.000000 \n", 526 | "50% 136.000000 2630.000000 5142.000000 0.090000 0.080000 \n", 527 | "75% 195.000000 2775.000000 5511.000000 8.340000 8.460000 \n", 528 | "max 276.000000 2961.000000 5911.000000 176405.000000 181137.000000 \n", 529 | "\n", 530 | " Y2016 Y2017 Y2018 \n", 531 | "count 59408.000000 59437.000000 59507.000000 \n", 532 | "mean 136.555222 140.917765 143.758381 \n", 533 | "std 1640.007194 1671.862359 1710.782658 \n", 534 | "min -3225.000000 -1582.000000 -3396.000000 \n", 535 | "25% 0.000000 0.000000 0.000000 \n", 536 | "50% 0.080000 0.100000 0.070000 \n", 537 | "75% 8.430000 9.000000 9.000000 \n", 538 | "max 185960.000000 190873.000000 195875.000000 " 539 | ] 540 | }, 541 | "execution_count": 5, 542 | "metadata": {}, 543 | "output_type": "execute_result" 544 | } 545 | ], 546 | "source": [ 547 | "# Looking at the statistical description of our data\n", 548 | "df_FBS.describe()" 549 | ] 550 | }, 551 | { 552 | "cell_type": "code", 553 | "execution_count": 6, 554 | "id": "02f2f527", 555 | "metadata": {}, 556 | "outputs": [ 557 | { 558 | "data": { 559 | "text/plain": [ 560 | "Area Code 0\n", 561 | "Area 0\n", 562 | "Item Code 0\n", 563 | "Item 0\n", 564 | "Element Code 0\n", 565 | "Element 0\n", 566 | "Unit 0\n", 567 | "Y2014 1589\n", 568 | "Y2015 1548\n", 569 | "Y2016 1535\n", 570 | "Y2017 1506\n", 571 | "Y2018 1436\n", 572 | "dtype: int64" 573 | ] 574 | }, 575 | "execution_count": 6, 576 | "metadata": {}, 577 | "output_type": "execute_result" 578 | } 579 | ], 580 | "source": [ 581 | "# Checking for missing values 
in our dataframe\n", 582 | "df_FBS.isnull().sum()" 583 | ] 584 | }, 585 | { 586 | "cell_type": "code", 587 | "execution_count": 7, 588 | "id": "a02277ed", 589 | "metadata": {}, 590 | "outputs": [ 591 | { 592 | "data": { 593 | "text/plain": [ 594 | "False" 595 | ] 596 | }, 597 | "execution_count": 7, 598 | "metadata": {}, 599 | "output_type": "execute_result" 600 | } 601 | ], 602 | "source": [ 603 | "# Checking if we have any duplicated rows\n", 604 | "df_FBS.duplicated().any()" 605 | ] 606 | }, 607 | { 608 | "cell_type": "code", 609 | "execution_count": null, 610 | "id": "adca0331", 611 | "metadata": {}, 612 | "outputs": [], 613 | "source": [] 614 | }, 615 | { 616 | "cell_type": "markdown", 617 | "id": "eea55dee", 618 | "metadata": {}, 619 | "source": [ 620 | "## Answering the Quiz Questions" 621 | ] 622 | }, 623 | { 624 | "cell_type": "markdown", 625 | "id": "75509885", 626 | "metadata": {}, 627 | "source": [ 628 | "#### Question 1\n", 629 | "\n", 630 | "What is the total Protein supply quantity in Madagascar in 2015?\n", 631 | "\n", 632 | "#### Ans:\n", 633 | " \n", 634 | "B - 173.05" 635 | ] 636 | }, 637 | { 638 | "cell_type": "code", 639 | "execution_count": 8, 640 | "id": "2b67fe96", 641 | "metadata": {}, 642 | "outputs": [ 643 | { 644 | "name": "stdout", 645 | "output_type": "stream", 646 | "text": [ 647 | "Total Protein supply quantity in Madagascar in 2015: 173.04999999999998 g/capita/day\n" 648 | ] 649 | } 650 | ], 651 | "source": [ 652 | "# Filter the dataframe to select rows for Madagascar and the year 2015\n", 653 | "madagascar_2015 = df_FBS[(df_FBS['Area'] == 'Madagascar') & (df_FBS['Element'] == 'Protein supply quantity (g/capita/day)') & (df_FBS['Y2015'] != 0)]\n", 654 | "\n", 655 | "# Calculate the total protein supply quantity in Madagascar for 2015\n", 656 | "total_protein_supply_2015 = madagascar_2015['Y2015'].sum()\n", 657 | "\n", 658 | "# Print the result\n", 659 | "print(\"Total Protein supply quantity in Madagascar in 2015:\", 
total_protein_supply_2015, \"g/capita/day\")\n" 660 | ] 661 | }, 662 | { 663 | "cell_type": "code", 664 | "execution_count": null, 665 | "id": "0ab458db", 666 | "metadata": {}, 667 | "outputs": [], 668 | "source": [] 669 | }, 670 | { 671 | "cell_type": "markdown", 672 | "id": "ed6981a3", 673 | "metadata": {}, 674 | "source": [ 675 | "#### Question 2\n", 676 | "\n", 677 | "What is the mean and standard deviation across the whole dataset for the year 2017 to 2 decimal places?\n", 678 | "\n", 679 | "#### Ans:\n", 680 | " \n", 681 | "E - 140.92 and 1671.86" 682 | ] 683 | }, 684 | { 685 | "cell_type": "code", 686 | "execution_count": 9, 687 | "id": "49c627f0", 688 | "metadata": {}, 689 | "outputs": [ 690 | { 691 | "data": { 692 | "text/html": [ 693 | "
" 813 | ], 814 | "text/plain": [ 815 | " Area Code Item Code Element Code Y2014 Y2015 \\\n", 816 | "count 60943.000000 60943.000000 60943.000000 59354.000000 59395.000000 \n", 817 | "mean 134.265576 2687.176706 3814.856456 134.196282 135.235966 \n", 818 | "std 72.605709 146.055739 2212.007033 1567.663696 1603.403984 \n", 819 | "min 4.000000 2501.000000 511.000000 -1796.000000 -3161.000000 \n", 820 | "25% 74.000000 2562.000000 684.000000 0.000000 0.000000 \n", 821 | "50% 136.000000 2630.000000 5142.000000 0.090000 0.080000 \n", 822 | "75% 195.000000 2775.000000 5511.000000 8.340000 8.460000 \n", 823 | "max 276.000000 2961.000000 5911.000000 176405.000000 181137.000000 \n", 824 | "\n", 825 | " Y2016 Y2017 Y2018 \n", 826 | "count 59408.000000 59437.000000 59507.000000 \n", 827 | "mean 136.555222 140.917765 143.758381 \n", 828 | "std 1640.007194 1671.862359 1710.782658 \n", 829 | "min -3225.000000 -1582.000000 -3396.000000 \n", 830 | "25% 0.000000 0.000000 0.000000 \n", 831 | "50% 0.080000 0.100000 0.070000 \n", 832 | "75% 8.430000 9.000000 9.000000 \n", 833 | "max 185960.000000 190873.000000 195875.000000 " 834 | ] 835 | }, 836 | "execution_count": 9, 837 | "metadata": {}, 838 | "output_type": "execute_result" 839 | } 840 | ], 841 | "source": [ 842 | "# Get the descriptive statistcs of the columns\n", 843 | "df_FBS.describe()" 844 | ] 845 | }, 846 | { 847 | "cell_type": "code", 848 | "execution_count": null, 849 | "id": "e043e507", 850 | "metadata": {}, 851 | "outputs": [], 852 | "source": [] 853 | }, 854 | { 855 | "cell_type": "markdown", 856 | "id": "faa62caa", 857 | "metadata": {}, 858 | "source": [ 859 | "#### Question 3\n", 860 | "\n", 861 | "Perform a groupby operation on ‘Element’. 
What year has the highest sum of Stock Variation?\n", 862 | "\n", 863 | "#### Ans:\n", 864 | " \n", 865 | "A - 2014" 866 | ] 867 | }, 868 | { 869 | "cell_type": "code", 870 | "execution_count": 10, 871 | "id": "f3b782b1", 872 | "metadata": {}, 873 | "outputs": [ 874 | { 875 | "data": { 876 | "text/html": [ 877 | "
Element                                      Y2014       Y2015       Y2016       Y2017       Y2018
Domestic supply quantity                1996716.35  2021493.55  2044842.70  2088198.10  2161192.10
Export Quantity                          150020.64   157614.47   151920.46   182338.80   181594.80
Fat supply quantity (g/capita/day)        10225.56    10235.74    10102.77    10253.84    10258.69
Feed                                     216927.89   225050.22   228958.65   223705.68   233489.68
Food                                    1212332.49  1232361.10  1247022.17  1258888.28  1303841.28
Food supply (kcal/capita/day)            454257.00   453383.00   451810.00   454681.00   455261.00
Food supply quantity (kg/capita/yr)       49650.63    49345.13    48985.28    48690.04    49056.85
Import Quantity                          274144.48   267018.46   286582.78   294559.09   287997.09
Losses                                   153223.00   155439.00   157787.00   160614.00   163902.00
Other uses (non-food)                     78718.13    66254.41    69563.68    91645.97    91300.97
Processing                               282923.00   287929.00   280631.00   292836.00   308429.00
Production                              1931287.75  1947019.39  1943537.15  2030056.89  2075072.89
Protein supply quantity (g/capita/day)    11836.46    11833.95    11779.69    11842.45    11833.56
Residuals                                 30149.00    30045.00    37224.00    35500.00    34864.00
Seed                                      21922.92    23976.82    23389.20    24870.14    25263.14
Stock Variation                           58749.83    34910.99    33140.12    54316.91    20577.91
Total Population - Both sexes           1031585.00  1058081.00  1085107.00  1112641.00  1140605.00
Tourist consumption                         416.00      349.00       89.00       91.00       90.00
" 1058 | ], 1059 | "text/plain": [ 1060 | " Y2014 Y2015 Y2016 \\\n", 1061 | "Element \n", 1062 | "Domestic supply quantity 1996716.35 2021493.55 2044842.70 \n", 1063 | "Export Quantity 150020.64 157614.47 151920.46 \n", 1064 | "Fat supply quantity (g/capita/day) 10225.56 10235.74 10102.77 \n", 1065 | "Feed 216927.89 225050.22 228958.65 \n", 1066 | "Food 1212332.49 1232361.10 1247022.17 \n", 1067 | "Food supply (kcal/capita/day) 454257.00 453383.00 451810.00 \n", 1068 | "Food supply quantity (kg/capita/yr) 49650.63 49345.13 48985.28 \n", 1069 | "Import Quantity 274144.48 267018.46 286582.78 \n", 1070 | "Losses 153223.00 155439.00 157787.00 \n", 1071 | "Other uses (non-food) 78718.13 66254.41 69563.68 \n", 1072 | "Processing 282923.00 287929.00 280631.00 \n", 1073 | "Production 1931287.75 1947019.39 1943537.15 \n", 1074 | "Protein supply quantity (g/capita/day) 11836.46 11833.95 11779.69 \n", 1075 | "Residuals 30149.00 30045.00 37224.00 \n", 1076 | "Seed 21922.92 23976.82 23389.20 \n", 1077 | "Stock Variation 58749.83 34910.99 33140.12 \n", 1078 | "Total Population - Both sexes 1031585.00 1058081.00 1085107.00 \n", 1079 | "Tourist consumption 416.00 349.00 89.00 \n", 1080 | "\n", 1081 | " Y2017 Y2018 \n", 1082 | "Element \n", 1083 | "Domestic supply quantity 2088198.10 2161192.10 \n", 1084 | "Export Quantity 182338.80 181594.80 \n", 1085 | "Fat supply quantity (g/capita/day) 10253.84 10258.69 \n", 1086 | "Feed 223705.68 233489.68 \n", 1087 | "Food 1258888.28 1303841.28 \n", 1088 | "Food supply (kcal/capita/day) 454681.00 455261.00 \n", 1089 | "Food supply quantity (kg/capita/yr) 48690.04 49056.85 \n", 1090 | "Import Quantity 294559.09 287997.09 \n", 1091 | "Losses 160614.00 163902.00 \n", 1092 | "Other uses (non-food) 91645.97 91300.97 \n", 1093 | "Processing 292836.00 308429.00 \n", 1094 | "Production 2030056.89 2075072.89 \n", 1095 | "Protein supply quantity (g/capita/day) 11842.45 11833.56 \n", 1096 | "Residuals 35500.00 34864.00 \n", 1097 | "Seed 24870.14 
25263.14 \n", 1098 | "Stock Variation 54316.91 20577.91 \n", 1099 | "Total Population - Both sexes 1112641.00 1140605.00 \n", 1100 | "Tourist consumption 91.00 90.00 " 1101 | ] 1102 | }, 1103 | "execution_count": 10, 1104 | "metadata": {}, 1105 | "output_type": "execute_result" 1106 | } 1107 | ], 1108 | "source": [ 1109 | "# Get the list of years\n", 1110 | "year_list = ['Y2014', 'Y2015', 'Y2016', 'Y2017', 'Y2018']\n", 1111 | "\n", 1112 | "# Group by Element to find the year with the highest sum of Stock Variation\n", 1113 | "df_FBS.groupby(['Element'])[year_list].agg(np.sum)" 1114 | ] 1115 | }, 1116 | { 1117 | "cell_type": "code", 1118 | "execution_count": null, 1119 | "id": "92b90a86", 1120 | "metadata": {}, 1121 | "outputs": [], 1122 | "source": [] 1123 | }, 1124 | { 1125 | "cell_type": "markdown", 1126 | "id": "3a478719", 1127 | "metadata": {}, 1128 | "source": [ 1129 | "#### Question 4\n", 1130 | "\n", 1131 | "Answer the following questions based on the African food production dataset provided by the FAO website.\n", 1132 | "\n", 1133 | "What is the total sum of Wine produced in 2015 and 2018 respectively?\n", 1134 | "\n", 1135 | "Hint: Perform a groupby sum aggregation on ‘Item’\n", 1136 | "\n", 1137 | "#### Ans:\n", 1138 | " \n", 1139 | "B - 4251.81 and 4039.32" 1140 | ] 1141 | }, 1142 | { 1143 | "cell_type": "code", 1144 | "execution_count": 11, 1145 | "id": "cb373e12", 1146 | "metadata": {}, 1147 | "outputs": [ 1148 | { 1149 | "data": { 1150 | "text/html": [ 1151 | "
Item                     Y2015      Y2018
Alcohol, Non-Food      2180.00    2293.00
Alcoholic Beverages   98783.72   97847.27
Animal Products       11811.73   11578.61
Animal fats          200675.72  269648.27
Apples and products   10559.15    9640.51
...                        ...        ...
Vegetables, Other    158104.08  163987.21
Vegetal Products     107064.17  107775.39
Wheat and products   234710.51  242645.19
Wine                   4251.81    4039.32
Yams                 203151.78  221272.09

119 rows × 2 columns
" 1238 | ], 1239 | "text/plain": [ 1240 | " Y2015 Y2018\n", 1241 | "Item \n", 1242 | "Alcohol, Non-Food 2180.00 2293.00\n", 1243 | "Alcoholic Beverages 98783.72 97847.27\n", 1244 | "Animal Products 11811.73 11578.61\n", 1245 | "Animal fats 200675.72 269648.27\n", 1246 | "Apples and products 10559.15 9640.51\n", 1247 | "... ... ...\n", 1248 | "Vegetables, Other 158104.08 163987.21\n", 1249 | "Vegetal Products 107064.17 107775.39\n", 1250 | "Wheat and products 234710.51 242645.19\n", 1251 | "Wine 4251.81 4039.32\n", 1252 | "Yams 203151.78 221272.09\n", 1253 | "\n", 1254 | "[119 rows x 2 columns]" 1255 | ] 1256 | }, 1257 | "execution_count": 11, 1258 | "metadata": {}, 1259 | "output_type": "execute_result" 1260 | } 1261 | ], 1262 | "source": [ 1263 | "# Group by Item and sum the 2015 and 2018 year columns\n", 1264 | "wine_2015_and_2018 = df_FBS.groupby('Item')[['Y2015', 'Y2018']].sum()\n", 1265 | "wine_2015_and_2018" 1266 | ] 1267 | }, 1268 | { 1269 | "cell_type": "code", 1270 | "execution_count": null, 1271 | "id": "1afa88a2", 1272 | "metadata": {}, 1273 | "outputs": [], 1274 | "source": [] 1275 | }, 1276 | { 1277 | "cell_type": "markdown", 1278 | "id": "fe6e9cc0", 1279 | "metadata": {}, 1280 | "source": [ 1281 | "#### Question 5\n", 1282 | "\n", 1283 | "Given the following NumPy array \n", 1284 | "\n", 1285 | "array = ([[94, 89, 63],\n", 1286 | "\n", 1287 | " [93, 92, 48],\n", 1288 | "\n", 1289 | " [92, 94, 56]])\n", 1290 | "\n", 1291 | "How would you select the elements in bold and italics from the array?\n", 1292 | "\n", 1293 | "#### Ans:\n", 1294 | " \n", 1295 | "C - array[ : 2, 1 : ] " 1296 | ] 1297 | }, 1298 | { 1299 | "cell_type": "code", 1300 | "execution_count": null, 1301 | "id": "27dc9ffc", 1302 | "metadata": {}, 1303 | "outputs": [], 1304 | "source": [] 1305 | }, 1306 | { 1307 | "cell_type": "markdown", 1308 | "id": "493a77cd", 1309 | "metadata": {}, 1310 | "source": [ 1311 | "#### Question 6\n", 1312 | "\n", 1313 | "Given the following 
python code, what would the output of the code give?\n", 1314 | "\n", 1315 | "my_tuppy = (1,2,5,8)\n", 1316 | "\n", 1317 | "my_tuppy[2] = 6\n", 1318 | "\n", 1319 | "#### Ans:\n", 1320 | " \n", 1321 | "B - Type Error" 1322 | ] 1323 | }, 1324 | { 1325 | "cell_type": "code", 1326 | "execution_count": null, 1327 | "id": "fb11ec1f", 1328 | "metadata": {}, 1329 | "outputs": [], 1330 | "source": [] 1331 | }, 1332 | { 1333 | "cell_type": "markdown", 1334 | "id": "2db2a42a", 1335 | "metadata": {}, 1336 | "source": [ 1337 | "#### Question 7\n", 1338 | "\n", 1339 | "Select columns ‘Y2017’ and ‘Area’, Perform a groupby operation on ‘Area’. Which of these Areas had the 7th lowest\n", 1340 | "sum in 2017?\n", 1341 | "\n", 1342 | "#### Ans:\n", 1343 | " \n", 1344 | "B - Guinea-Bissau" 1345 | ] 1346 | }, 1347 | { 1348 | "cell_type": "code", 1349 | "execution_count": 12, 1350 | "id": "d085c27f", 1351 | "metadata": {}, 1352 | "outputs": [ 1353 | { 1354 | "name": "stdout", 1355 | "output_type": "stream", 1356 | "text": [ 1357 | " Area Y2017\n", 1358 | "42 Sudan (former) 0.00\n", 1359 | "16 Ethiopia PDR 0.00\n", 1360 | "9 Comoros 59.84\n", 1361 | "38 Seychelles 442.34\n", 1362 | "36 Sao Tome and Principe 12662.63\n", 1363 | "5 Cabo Verde 14650.74\n", 1364 | "21 Guinea-Bissau 19102.77\n", 1365 | "23 Lesotho 21267.96\n", 1366 | "3 Botswana 22101.30\n", 1367 | "12 Djibouti 22729.91\n", 1368 | "18 Gambia 23154.18\n", 1369 | "17 Gabon 27979.64\n", 1370 | "24 Liberia 29342.20\n", 1371 | "32 Namibia 29874.89\n", 1372 | "7 Central African Republic 29937.00\n", 1373 | "10 Congo 41181.68\n", 1374 | "43 Togo 49841.88\n", 1375 | "29 Mauritius 51114.83\n", 1376 | "14 Eswatini 54343.33\n", 1377 | "39 Sierra Leone 55311.33\n", 1378 | "8 Chad 71594.68\n", 1379 | "35 Rwanda 73663.69\n", 1380 | "48 Zimbabwe 75919.34\n", 1381 | "37 Senegal 95681.15\n", 1382 | "20 Guinea 98138.87\n", 1383 | "4 Burkina Faso 101855.07\n", 1384 | "47 Zambia 103223.77\n", 1385 | "44 Tunisia 124167.20\n", 1386 | "2 Benin 
124771.22\n", 1387 | "33 Niger 126707.58\n", 1388 | "25 Madagascar 131197.73\n", 1389 | "27 Mali 149928.33\n", 1390 | "28 Mauritania 156665.46\n", 1391 | "31 Mozambique 161407.98\n", 1392 | "26 Malawi 181098.71\n", 1393 | "45 Uganda 213950.38\n", 1394 | "11 Côte d'Ivoire 224599.01\n", 1395 | "1 Angola 229159.57\n", 1396 | "6 Cameroon 232030.43\n", 1397 | "41 Sudan 239931.92\n", 1398 | "22 Kenya 264660.66\n", 1399 | "46 United Republic of Tanzania 322616.85\n", 1400 | "0 Algeria 325644.27\n", 1401 | "19 Ghana 337599.06\n", 1402 | "30 Morocco 388495.36\n", 1403 | "15 Ethiopia 448683.76\n", 1404 | "40 South Africa 517590.54\n", 1405 | "13 Egypt 866379.92\n", 1406 | "34 Nigeria 1483268.23\n", 1407 | "The Area with the 7th lowest sum in 2017 is: Guinea-Bissau\n" 1408 | ] 1409 | } 1410 | ], 1411 | "source": [ 1412 | "# Select columns 'Y2017' and 'Area'\n", 1413 | "selected_columns = df_FBS[['Y2017', 'Area']]\n", 1414 | "\n", 1415 | "# Group by 'Area' and calculate the sum of 'Y2017' for each Area\n", 1416 | "grouped_data = selected_columns.groupby('Area')['Y2017'].sum().reset_index()\n", 1417 | "\n", 1418 | "# Sort the data by the sum of 'Y2017' in ascending order\n", 1419 | "sorted_data = grouped_data.sort_values(by='Y2017', ascending=True)\n", 1420 | "print(sorted_data)\n", 1421 | "\n", 1422 | "# Get the Area with the 7th lowest sum in 2017\n", 1423 | "seventh_lowest_area = sorted_data.iloc[6]['Area']\n", 1424 | "\n", 1425 | "# Print the result\n", 1426 | "print(\"The Area with the 7th lowest sum in 2017 is:\", seventh_lowest_area)" 1427 | ] 1428 | }, 1429 | { 1430 | "cell_type": "code", 1431 | "execution_count": null, 1432 | "id": "5db878e7", 1433 | "metadata": {}, 1434 | "outputs": [], 1435 | "source": [] 1436 | }, 1437 | { 1438 | "cell_type": "markdown", 1439 | "id": "4e38ed0b", 1440 | "metadata": {}, 1441 | "source": [ 1442 | "#### Question 8\n", 1443 | "\n", 1444 | "Which year had the least correlation with ‘Element Code’?\n", 1445 | "\n", 1446 | "#### Ans:\n", 
1447 | " \n", 1448 | "D - 2016" 1449 | ] 1450 | }, 1451 | { 1452 | "cell_type": "code", 1453 | "execution_count": 13, 1454 | "id": "038e18ef", 1455 | "metadata": {}, 1456 | "outputs": [ 1457 | { 1458 | "data": { 1459 | "text/html": [ 1460 | "
              Element Code     Y2014     Y2015     Y2016     Y2017     Y2018
Element Code      1.000000  0.024457  0.023889  0.023444  0.024254  0.024279
Y2014             0.024457  1.000000  0.994647  0.996081  0.995230  0.994872
Y2015             0.023889  0.994647  1.000000  0.995739  0.988048  0.988208
Y2016             0.023444  0.996081  0.995739  1.000000  0.992785  0.992757
Y2017             0.024254  0.995230  0.988048  0.992785  1.000000  0.998103
Y2018             0.024279  0.994872  0.988208  0.992757  0.998103  1.000000
" 1544 | ], 1545 | "text/plain": [ 1546 | " Element Code Y2014 Y2015 Y2016 Y2017 Y2018\n", 1547 | "Element Code 1.000000 0.024457 0.023889 0.023444 0.024254 0.024279\n", 1548 | "Y2014 0.024457 1.000000 0.994647 0.996081 0.995230 0.994872\n", 1549 | "Y2015 0.023889 0.994647 1.000000 0.995739 0.988048 0.988208\n", 1550 | "Y2016 0.023444 0.996081 0.995739 1.000000 0.992785 0.992757\n", 1551 | "Y2017 0.024254 0.995230 0.988048 0.992785 1.000000 0.998103\n", 1552 | "Y2018 0.024279 0.994872 0.988208 0.992757 0.998103 1.000000" 1553 | ] 1554 | }, 1555 | "execution_count": 13, 1556 | "metadata": {}, 1557 | "output_type": "execute_result" 1558 | } 1559 | ], 1560 | "source": [ 1561 | "# Get a dataframe of the required columns: Element Code and the year columns\n", 1562 | "numeric_columns = df_FBS[['Element Code', 'Y2014', 'Y2015', 'Y2016', 'Y2017', 'Y2018']]\n", 1563 | "\n", 1564 | "# Get the correlation of the year columns with Element Code and assign the result to correlation_matrix\n", 1565 | "correlation_matrix = numeric_columns.corr()\n", 1566 | "correlation_matrix" 1567 | ] 1568 | }, 1569 | { 1570 | "cell_type": "code", 1571 | "execution_count": null, 1572 | "id": "dda0b2d8", 1573 | "metadata": {}, 1574 | "outputs": [], 1575 | "source": [] 1576 | }, 1577 | { 1578 | "cell_type": "markdown", 1579 | "id": "d42e992b", 1580 | "metadata": {}, 1581 | "source": [ 1582 | "#### Question 9\n", 1583 | "\n", 1584 | "Which of the following dataframe methods can be used to access elements across rows and columns?\n", 1585 | "\n", 1586 | "#### Ans:\n", 1587 | " \n", 1588 | "E - c and d" 1589 | ] 1590 | }, 1591 | { 1592 | "cell_type": "code", 1593 | "execution_count": null, 1594 | "id": "ce5bd980", 1595 | "metadata": {}, 1596 | "outputs": [], 1597 | "source": [] 1598 | }, 1599 | { 1600 | "cell_type": "markdown", 1601 | "id": "71adcf7e", 1602 | "metadata": {}, 1603 | "source": [ 1604 | "#### Question 10\n", 1605 | "\n", 1606 | "Which of these python data structures is unordered?\n", 1607 
| "\n", 1608 | "#### Ans:\n", 1609 | " \n", 1610 | "A - Set\n", 1611 | "\n", 1612 | "B - Dictionary" 1613 | ] 1614 | }, 1615 | { 1616 | "cell_type": "code", 1617 | "execution_count": null, 1618 | "id": "a6e92161", 1619 | "metadata": {}, 1620 | "outputs": [], 1621 | "source": [] 1622 | }, 1623 | { 1624 | "cell_type": "markdown", 1625 | "id": "980416e2", 1626 | "metadata": {}, 1627 | "source": [ 1628 | "#### Question 11\n", 1629 | "\n", 1630 | "If you have the following list\n", 1631 | "\n", 1632 | "lst = [[35, 'Portugal', 94], [33, 'Argentina', 93], [30 , 'Brazil', 92]]\n", 1633 | "\n", 1634 | "col = [‘Age’,’Nationality’,’Overall’]\n", 1635 | "\n", 1636 | "How do you create a pandas DataFrame using this list, to look like the table below?\n", 1637 | "\n", 1638 | "#### Ans:\n", 1639 | " \n", 1640 | "B - pd.Dataframe(lst, columns = col, index = [1,2,3])" 1641 | ] 1642 | }, 1643 | { 1644 | "cell_type": "code", 1645 | "execution_count": null, 1646 | "id": "a1b672d4", 1647 | "metadata": {}, 1648 | "outputs": [], 1649 | "source": [] 1650 | }, 1651 | { 1652 | "cell_type": "markdown", 1653 | "id": "20928f97", 1654 | "metadata": {}, 1655 | "source": [ 1656 | "#### Question 12\n", 1657 | "\n", 1658 | "Perform a groupby operation on ‘Element’. What is the total number of the sum of Processing in 2017?\n", 1659 | "\n", 1660 | "#### Ans:\n", 1661 | " \n", 1662 | "E - 292836.00" 1663 | ] 1664 | }, 1665 | { 1666 | "cell_type": "code", 1667 | "execution_count": 14, 1668 | "id": "3670d912", 1669 | "metadata": {}, 1670 | "outputs": [ 1671 | { 1672 | "data": { 1673 | "text/html": [ 1674 | "
Element                                      Y2014       Y2015       Y2016       Y2017       Y2018
Domestic supply quantity                1996716.35  2021493.55  2044842.70  2088198.10  2161192.10
Export Quantity                          150020.64   157614.47   151920.46   182338.80   181594.80
Fat supply quantity (g/capita/day)        10225.56    10235.74    10102.77    10253.84    10258.69
Feed                                     216927.89   225050.22   228958.65   223705.68   233489.68
Food                                    1212332.49  1232361.10  1247022.17  1258888.28  1303841.28
Food supply (kcal/capita/day)            454257.00   453383.00   451810.00   454681.00   455261.00
Food supply quantity (kg/capita/yr)       49650.63    49345.13    48985.28    48690.04    49056.85
Import Quantity                          274144.48   267018.46   286582.78   294559.09   287997.09
Losses                                   153223.00   155439.00   157787.00   160614.00   163902.00
Other uses (non-food)                     78718.13    66254.41    69563.68    91645.97    91300.97
Processing                               282923.00   287929.00   280631.00   292836.00   308429.00
Production                              1931287.75  1947019.39  1943537.15  2030056.89  2075072.89
Protein supply quantity (g/capita/day)    11836.46    11833.95    11779.69    11842.45    11833.56
Residuals                                 30149.00    30045.00    37224.00    35500.00    34864.00
Seed                                      21922.92    23976.82    23389.20    24870.14    25263.14
Stock Variation                           58749.83    34910.99    33140.12    54316.91    20577.91
Total Population - Both sexes           1031585.00  1058081.00  1085107.00  1112641.00  1140605.00
Tourist consumption                         416.00      349.00       89.00       91.00       90.00
" 1855 | ], 1856 | "text/plain": [ 1857 | " Y2014 Y2015 Y2016 \\\n", 1858 | "Element \n", 1859 | "Domestic supply quantity 1996716.35 2021493.55 2044842.70 \n", 1860 | "Export Quantity 150020.64 157614.47 151920.46 \n", 1861 | "Fat supply quantity (g/capita/day) 10225.56 10235.74 10102.77 \n", 1862 | "Feed 216927.89 225050.22 228958.65 \n", 1863 | "Food 1212332.49 1232361.10 1247022.17 \n", 1864 | "Food supply (kcal/capita/day) 454257.00 453383.00 451810.00 \n", 1865 | "Food supply quantity (kg/capita/yr) 49650.63 49345.13 48985.28 \n", 1866 | "Import Quantity 274144.48 267018.46 286582.78 \n", 1867 | "Losses 153223.00 155439.00 157787.00 \n", 1868 | "Other uses (non-food) 78718.13 66254.41 69563.68 \n", 1869 | "Processing 282923.00 287929.00 280631.00 \n", 1870 | "Production 1931287.75 1947019.39 1943537.15 \n", 1871 | "Protein supply quantity (g/capita/day) 11836.46 11833.95 11779.69 \n", 1872 | "Residuals 30149.00 30045.00 37224.00 \n", 1873 | "Seed 21922.92 23976.82 23389.20 \n", 1874 | "Stock Variation 58749.83 34910.99 33140.12 \n", 1875 | "Total Population - Both sexes 1031585.00 1058081.00 1085107.00 \n", 1876 | "Tourist consumption 416.00 349.00 89.00 \n", 1877 | "\n", 1878 | " Y2017 Y2018 \n", 1879 | "Element \n", 1880 | "Domestic supply quantity 2088198.10 2161192.10 \n", 1881 | "Export Quantity 182338.80 181594.80 \n", 1882 | "Fat supply quantity (g/capita/day) 10253.84 10258.69 \n", 1883 | "Feed 223705.68 233489.68 \n", 1884 | "Food 1258888.28 1303841.28 \n", 1885 | "Food supply (kcal/capita/day) 454681.00 455261.00 \n", 1886 | "Food supply quantity (kg/capita/yr) 48690.04 49056.85 \n", 1887 | "Import Quantity 294559.09 287997.09 \n", 1888 | "Losses 160614.00 163902.00 \n", 1889 | "Other uses (non-food) 91645.97 91300.97 \n", 1890 | "Processing 292836.00 308429.00 \n", 1891 | "Production 2030056.89 2075072.89 \n", 1892 | "Protein supply quantity (g/capita/day) 11842.45 11833.56 \n", 1893 | "Residuals 35500.00 34864.00 \n", 1894 | "Seed 24870.14 
25263.14 \n", 1895 | "Stock Variation 54316.91 20577.91 \n", 1896 | "Total Population - Both sexes 1112641.00 1140605.00 \n", 1897 | "Tourist consumption 91.00 90.00 " 1898 | ] 1899 | }, 1900 | "execution_count": 14, 1901 | "metadata": {}, 1902 | "output_type": "execute_result" 1903 | } 1904 | ], 1905 | "source": [ 1906 | "# Get the list of years\n", 1907 | "year_list = ['Y2014', 'Y2015', 'Y2016', 'Y2017', 'Y2018']\n", 1908 | "\n", 1909 | "# Use group by operation on 'Element' and get the total sum for each 'Element' values\n", 1910 | "df_FBS.groupby(['Element'])[year_list].agg(np.sum)" 1911 | ] 1912 | }, 1913 | { 1914 | "cell_type": "code", 1915 | "execution_count": null, 1916 | "id": "0be87109", 1917 | "metadata": {}, 1918 | "outputs": [], 1919 | "source": [] 1920 | }, 1921 | { 1922 | "cell_type": "markdown", 1923 | "id": "81c79403", 1924 | "metadata": {}, 1925 | "source": [ 1926 | "#### Question 13\n", 1927 | "\n", 1928 | "How would you check for the number of rows and columns in a pandas DataFrame named df?\n", 1929 | "\n", 1930 | "#### Ans:\n", 1931 | " \n", 1932 | "C - df.shape" 1933 | ] 1934 | }, 1935 | { 1936 | "cell_type": "code", 1937 | "execution_count": null, 1938 | "id": "ea2f98ee", 1939 | "metadata": {}, 1940 | "outputs": [], 1941 | "source": [] 1942 | }, 1943 | { 1944 | "cell_type": "markdown", 1945 | "id": "a4a4446f", 1946 | "metadata": {}, 1947 | "source": [ 1948 | "#### Question 14\n", 1949 | "\n", 1950 | "What is the total number and percentage of missing data in 2014 to 3 decimal places?\n", 1951 | "\n", 1952 | "#### Ans:\n", 1953 | " \n", 1954 | "E - 1589 and 2.607%" 1955 | ] 1956 | }, 1957 | { 1958 | "cell_type": "code", 1959 | "execution_count": 15, 1960 | "id": "bb50e610", 1961 | "metadata": {}, 1962 | "outputs": [ 1963 | { 1964 | "data": { 1965 | "text/plain": [ 1966 | "Area Code 0\n", 1967 | "Area 0\n", 1968 | "Item Code 0\n", 1969 | "Item 0\n", 1970 | "Element Code 0\n", 1971 | "Element 0\n", 1972 | "Unit 0\n", 1973 | "Y2014 1589\n", 1974 
| "Y2015 1548\n", 1975 | "Y2016 1535\n", 1976 | "Y2017 1506\n", 1977 | "Y2018 1436\n", 1978 | "dtype: int64" 1979 | ] 1980 | }, 1981 | "execution_count": 15, 1982 | "metadata": {}, 1983 | "output_type": "execute_result" 1984 | } 1985 | ], 1986 | "source": [ 1987 | "# Get the total sum of missing values in each column in the dataframe\n", 1988 | "df_FBS.isnull().sum()" 1989 | ] 1990 | }, 1991 | { 1992 | "cell_type": "code", 1993 | "execution_count": 16, 1994 | "id": "0af275a7", 1995 | "metadata": {}, 1996 | "outputs": [ 1997 | { 1998 | "data": { 1999 | "text/plain": [ 2000 | "2.6073544131401474" 2001 | ] 2002 | }, 2003 | "execution_count": 16, 2004 | "metadata": {}, 2005 | "output_type": "execute_result" 2006 | } 2007 | ], 2008 | "source": [ 2009 | "# percentage of missing values\n", 2010 | "df_FBS['Y2014'].isnull().sum() * 100 / len(df_FBS)" 2011 | ] 2012 | }, 2013 | { 2014 | "cell_type": "code", 2015 | "execution_count": null, 2016 | "id": "b63eac08", 2017 | "metadata": {}, 2018 | "outputs": [], 2019 | "source": [] 2020 | }, 2021 | { 2022 | "cell_type": "markdown", 2023 | "id": "0ded550f", 2024 | "metadata": {}, 2025 | "source": [ 2026 | "#### Question 15\n", 2027 | "\n", 2028 | "Select columns ‘Y2017’ and ‘Area’, Perform a groupby operation on ‘Area’. 
Which of these Areas had the highest sum in 2017?\n", 2029 | "\n", 2030 | "#### Ans:\n", 2031 | " \n", 2032 | "A - Nigeria" 2033 | ] 2034 | }, 2035 | { 2036 | "cell_type": "code", 2037 | "execution_count": 17, 2038 | "id": "cd3e2ab4", 2039 | "metadata": { 2040 | "scrolled": true 2041 | }, 2042 | "outputs": [ 2043 | { 2044 | "name": "stdout", 2045 | "output_type": "stream", 2046 | "text": [ 2047 | " Area Y2017\n", 2048 | "34 Nigeria 1483268.23\n", 2049 | "13 Egypt 866379.92\n", 2050 | "40 South Africa 517590.54\n", 2051 | "15 Ethiopia 448683.76\n", 2052 | "30 Morocco 388495.36\n", 2053 | "19 Ghana 337599.06\n", 2054 | "0 Algeria 325644.27\n", 2055 | "46 United Republic of Tanzania 322616.85\n", 2056 | "22 Kenya 264660.66\n", 2057 | "41 Sudan 239931.92\n", 2058 | "6 Cameroon 232030.43\n", 2059 | "1 Angola 229159.57\n", 2060 | "11 Côte d'Ivoire 224599.01\n", 2061 | "45 Uganda 213950.38\n", 2062 | "26 Malawi 181098.71\n", 2063 | "31 Mozambique 161407.98\n", 2064 | "28 Mauritania 156665.46\n", 2065 | "27 Mali 149928.33\n", 2066 | "25 Madagascar 131197.73\n", 2067 | "33 Niger 126707.58\n", 2068 | "2 Benin 124771.22\n", 2069 | "44 Tunisia 124167.20\n", 2070 | "47 Zambia 103223.77\n", 2071 | "4 Burkina Faso 101855.07\n", 2072 | "20 Guinea 98138.87\n", 2073 | "37 Senegal 95681.15\n", 2074 | "48 Zimbabwe 75919.34\n", 2075 | "35 Rwanda 73663.69\n", 2076 | "8 Chad 71594.68\n", 2077 | "39 Sierra Leone 55311.33\n", 2078 | "14 Eswatini 54343.33\n", 2079 | "29 Mauritius 51114.83\n", 2080 | "43 Togo 49841.88\n", 2081 | "10 Congo 41181.68\n", 2082 | "7 Central African Republic 29937.00\n", 2083 | "32 Namibia 29874.89\n", 2084 | "24 Liberia 29342.20\n", 2085 | "17 Gabon 27979.64\n", 2086 | "18 Gambia 23154.18\n", 2087 | "12 Djibouti 22729.91\n", 2088 | "3 Botswana 22101.30\n", 2089 | "23 Lesotho 21267.96\n", 2090 | "21 Guinea-Bissau 19102.77\n", 2091 | "5 Cabo Verde 14650.74\n", 2092 | "36 Sao Tome and Principe 12662.63\n", 2093 | "38 Seychelles 442.34\n", 2094 | "9 Comoros 59.84\n", 
2095 | "42 Sudan (former) 0.00\n", 2096 | "16 Ethiopia PDR 0.00\n", 2097 | "The Area with the highest sum in 2017 is: Nigeria\n" 2098 | ] 2099 | } 2100 | ], 2101 | "source": [ 2102 | "# Select columns 'Y2017' and 'Area'\n", 2103 | "selected_columns = df_FBS[['Y2017', 'Area']]\n", 2104 | "\n", 2105 | "# Group by 'Area' and calculate the sum of 'Y2017' for each Area\n", 2106 | "grouped_data = selected_columns.groupby('Area')['Y2017'].sum().reset_index()\n", 2107 | "\n", 2108 | "# Sort the data by the sum of 'Y2017' in descending order\n", 2109 | "sorted_data = grouped_data.sort_values(by='Y2017', ascending=False)\n", 2110 | "print(sorted_data)\n", 2111 | "\n", 2112 | "# Get the Area with the highest sum in 2017 (first row after the descending sort)\n", 2113 | "highest_sum_area = sorted_data.iloc[0]['Area']\n", 2114 | "\n", 2115 | "# Print the result\n", 2116 | "print(\"The Area with the highest sum in 2017 is:\", highest_sum_area)" 2117 | ] 2118 | }, 2119 | { 2120 | "cell_type": "code", 2121 | "execution_count": null, 2122 | "id": "20548412", 2123 | "metadata": {}, 2124 | "outputs": [], 2125 | "source": [] 2126 | }, 2127 | { 2128 | "cell_type": "markdown", 2129 | "id": "233087eb", 2130 | "metadata": {}, 2131 | "source": [ 2132 | "#### Question 16\n", 2133 | "\n", 2134 | "Consider the following list of tuples:\n", 2135 | "\n", 2136 | "y = [(2, 4), (7, 8), (1, 5, 9)]\n", 2137 | "\n", 2138 | "How would you assign element 8 from the list to a variable x?\n", 2139 | "\n", 2140 | "#### Ans:\n", 2141 | " \n", 2142 | "B - x = y[1][1]" 2143 | ] 2144 | }, 2145 | { 2146 | "cell_type": "code", 2147 | "execution_count": null, 2148 | "id": "b0e1ac71", 2149 | "metadata": {}, 2150 | "outputs": [], 2151 | "source": [] 2152 | }, 2153 | { 2154 | "cell_type": "markdown", 2155 | "id": "55249720", 2156 | "metadata": {}, 2157 | "source": [ 2158 | "#### Question 17\n", 2159 | "\n", 2160 | "A pandas DataFrame with dimensions (100,3) has how many features and observations?\n", 2161 | "\n", 2162 | "#### Ans:\n", 2163 | " \n", 
2164 | "B - 3 features, 100 observations" 2165 | ] 2166 | }, 2167 | { 2168 | "cell_type": "code", 2169 | "execution_count": null, 2170 | "id": "6e0e7026", 2171 | "metadata": {}, 2172 | "outputs": [], 2173 | "source": [] 2174 | }, 2175 | { 2176 | "cell_type": "markdown", 2177 | "id": "a2cb829b", 2178 | "metadata": {}, 2179 | "source": [ 2180 | "#### Question 18\n", 2181 | "\n", 2182 | "Which of the following is a python inbuilt module?\n", 2183 | "\n", 2184 | "#### Ans:\n", 2185 | " \n", 2186 | "C - Math" 2187 | ] 2188 | }, 2189 | { 2190 | "cell_type": "code", 2191 | "execution_count": null, 2192 | "id": "197ce575", 2193 | "metadata": {}, 2194 | "outputs": [], 2195 | "source": [] 2196 | }, 2197 | { 2198 | "cell_type": "markdown", 2199 | "id": "30f39308", 2200 | "metadata": {}, 2201 | "source": [ 2202 | "#### Question 19\n", 2203 | "\n", 2204 | "What is the total number of unique countries in the dataset?\n", 2205 | "\n", 2206 | "#### Ans:\n", 2207 | " \n", 2208 | "A - 49" 2209 | ] 2210 | }, 2211 | { 2212 | "cell_type": "code", 2213 | "execution_count": 18, 2214 | "id": "fe9cd365", 2215 | "metadata": {}, 2216 | "outputs": [ 2217 | { 2218 | "data": { 2219 | "text/plain": [ 2220 | "49" 2221 | ] 2222 | }, 2223 | "execution_count": 18, 2224 | "metadata": {}, 2225 | "output_type": "execute_result" 2226 | } 2227 | ], 2228 | "source": [ 2229 | "# Get the total number of unique 'Area' in the dataframe\n", 2230 | "df_FBS['Area'].nunique()" 2231 | ] 2232 | }, 2233 | { 2234 | "cell_type": "code", 2235 | "execution_count": null, 2236 | "id": "9a000af9", 2237 | "metadata": {}, 2238 | "outputs": [], 2239 | "source": [] 2240 | }, 2241 | { 2242 | "cell_type": "markdown", 2243 | "id": "ca723cc5", 2244 | "metadata": {}, 2245 | "source": [ 2246 | "#### Question 20\n", 2247 | "\n", 2248 | "What would be the output for?\n", 2249 | "\n", 2250 | "S = [['him', 'sell'], [90, 28, 43]]\n", 2251 | "\n", 2252 | "S[0][1][1]\n", 2253 | "\n", 2254 | "#### Ans:\n", 2255 | " \n", 2256 | "D - 'e'" 
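The answers to Questions 16, 17, 18 and 20 can be checked with a few lines of plain Python. The snippet below is a minimal sketch rather than part of the original quiz notebook: the names `shape`, `n_observations`, `n_features` and `letter` are introduced here for illustration, and the `(100, 3)` tuple only stands in for a `DataFrame.shape` value (no DataFrame is built). Note that the built-in module from Question 18 is imported under its lowercase name, `math`.

```python
import math  # Question 18: `math` ships with the Python standard library

# Question 16: the outer index picks the tuple, the inner index picks the element.
y = [(2, 4), (7, 8), (1, 5, 9)]
x = y[1][1]

# Question 17: a shape of (100, 3) reads as (rows, columns),
# i.e. 100 observations and 3 features.
shape = (100, 3)  # illustrative stand-in for DataFrame.shape
n_observations, n_features = shape

# Question 20: S[0] is the first inner list, S[0][1] is the string 'sell',
# and indexing a string returns a single character.
S = [['him', 'sell'], [90, 28, 43]]
letter = S[0][1][1]

print(x, n_observations, n_features, letter, math.sqrt(16))
```

Each index is applied left to right, so `S[0][1][1]` is just three successive lookups: list, then string, then character.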
2257 | ] 2258 | }, 2259 | { 2260 | "cell_type": "code", 2261 | "execution_count": null, 2262 | "id": "00188d0d", 2263 | "metadata": {}, 2264 | "outputs": [], 2265 | "source": [] 2266 | }, 2267 | { 2268 | "cell_type": "code", 2269 | "execution_count": null, 2270 | "id": "dce33880", 2271 | "metadata": {}, 2272 | "outputs": [], 2273 | "source": [] 2274 | } 2275 | ], 2276 | "metadata": { 2277 | "kernelspec": { 2278 | "display_name": "Python 3 (ipykernel)", 2279 | "language": "python", 2280 | "name": "python3" 2281 | }, 2282 | "language_info": { 2283 | "codemirror_mode": { 2284 | "name": "ipython", 2285 | "version": 3 2286 | }, 2287 | "file_extension": ".py", 2288 | "mimetype": "text/x-python", 2289 | "name": "python", 2290 | "nbconvert_exporter": "python", 2291 | "pygments_lexer": "ipython3", 2292 | "version": "3.11.5" 2293 | } 2294 | }, 2295 | "nbformat": 4, 2296 | "nbformat_minor": 5 2297 | } 2298 | -------------------------------------------------------------------------------- /Stage_A_Quiz/Stage_A_Quiz.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "629963e4", 6 | "metadata": {}, 7 | "source": [ 8 | "## Food Balance Sheet Analysis" 9 | ] 10 | }, 11 | { 12 | "cell_type": "code", 13 | "execution_count": 1, 14 | "id": "2f535948", 15 | "metadata": {}, 16 | "outputs": [], 17 | "source": [ 18 | "# import all packages and set plots to be embedded inline\n", 19 | "import numpy as np\n", 20 | "import pandas as pd\n", 21 | "import matplotlib.pyplot as plt\n", 22 | "import seaborn as sb\n", 23 | "\n", 24 | "%matplotlib inline\n", 25 | "\n", 26 | "# suppress warnings from final output\n", 27 | "import warnings\n", 28 | "warnings.simplefilter(\"ignore\")" 29 | ] 30 | }, 31 | { 32 | "cell_type": "markdown", 33 | "id": "4dd46ce2", 34 | "metadata": {}, 35 | "source": [ 36 | "## Gathering and Assessing the Data" 37 | ] 38 | }, 39 | { 40 | "cell_type": "code", 41 | 
"execution_count": 2, 42 | "id": "ef43ca9c", 43 | "metadata": {}, 44 | "outputs": [ 45 | { 46 | "name": "stdout", 47 | "output_type": "stream", 48 | "text": [ 49 | "(60943, 12)\n" 50 | ] 51 | }, 52 | { 53 | "data": { 54 | "text/html": [ 55 | "
" 166 | ], 167 | "text/plain": [ 168 | " Area Code Area Item Code Item Element Code \\\n", 169 | "0 4 Algeria 2501 Population 511 \n", 170 | "1 4 Algeria 2501 Population 5301 \n", 171 | "2 4 Algeria 2901 Grand Total 664 \n", 172 | "3 4 Algeria 2901 Grand Total 674 \n", 173 | "4 4 Algeria 2901 Grand Total 684 \n", 174 | "\n", 175 | " Element Unit Y2014 \\\n", 176 | "0 Total Population - Both sexes 1000 persons 38924.00 \n", 177 | "1 Domestic supply quantity 1000 tonnes 0.00 \n", 178 | "2 Food supply (kcal/capita/day) kcal/capita/day 3377.00 \n", 179 | "3 Protein supply quantity (g/capita/day) g/capita/day 94.90 \n", 180 | "4 Fat supply quantity (g/capita/day) g/capita/day 80.06 \n", 181 | "\n", 182 | " Y2015 Y2016 Y2017 Y2018 \n", 183 | "0 39728.00 40551.00 41389.00 42228.00 \n", 184 | "1 0.00 0.00 0.00 0.00 \n", 185 | "2 3379.00 3372.00 3341.00 3322.00 \n", 186 | "3 94.35 94.72 92.82 91.83 \n", 187 | "4 79.36 77.40 80.19 77.28 " 188 | ] 189 | }, 190 | "execution_count": 2, 191 | "metadata": {}, 192 | "output_type": "execute_result" 193 | } 194 | ], 195 | "source": [ 196 | "# import dataset to pandas dataframe\n", 197 | "df_FBS = pd.read_csv('FoodBalanceSheets_E_Africa_NOFLAG.csv', encoding='Latin-1')\n", 198 | "\n", 199 | "# Checking the total number of rows and columns in our dataframe\n", 200 | "print(df_FBS.shape)\n", 201 | "\n", 202 | "#Looking at the first five rows in our dataframe\n", 203 | "df_FBS.head()" 204 | ] 205 | }, 206 | { 207 | "cell_type": "code", 208 | "execution_count": 3, 209 | "id": "c3a2e817", 210 | "metadata": {}, 211 | "outputs": [ 212 | { 213 | "data": { 214 | "text/html": [ 215 | "
" 326 | ], 327 | "text/plain": [ 328 | " Area Code Area Item Code Item Element Code \\\n", 329 | "60938 181 Zimbabwe 2899 Miscellaneous 5142 \n", 330 | "60939 181 Zimbabwe 2899 Miscellaneous 645 \n", 331 | "60940 181 Zimbabwe 2899 Miscellaneous 664 \n", 332 | "60941 181 Zimbabwe 2899 Miscellaneous 674 \n", 333 | "60942 181 Zimbabwe 2899 Miscellaneous 684 \n", 334 | "\n", 335 | " Element Unit Y2014 Y2015 \\\n", 336 | "60938 Food 1000 tonnes 42.00 46.00 \n", 337 | "60939 Food supply quantity (kg/capita/yr) kg 3.06 3.33 \n", 338 | "60940 Food supply (kcal/capita/day) kcal/capita/day 3.00 4.00 \n", 339 | "60941 Protein supply quantity (g/capita/day) g/capita/day 0.10 0.11 \n", 340 | "60942 Fat supply quantity (g/capita/day) g/capita/day 0.04 0.05 \n", 341 | "\n", 342 | " Y2016 Y2017 Y2018 \n", 343 | "60938 33.00 19.00 16.00 \n", 344 | "60939 2.35 1.33 1.08 \n", 345 | "60940 3.00 1.00 1.00 \n", 346 | "60941 0.08 0.04 0.04 \n", 347 | "60942 0.03 0.02 0.01 " 348 | ] 349 | }, 350 | "execution_count": 3, 351 | "metadata": {}, 352 | "output_type": "execute_result" 353 | } 354 | ], 355 | "source": [ 356 | "# Looking at the last five rows in our dataframe\n", 357 | "df_FBS.tail()" 358 | ] 359 | }, 360 | { 361 | "cell_type": "code", 362 | "execution_count": 4, 363 | "id": "79025bfc", 364 | "metadata": {}, 365 | "outputs": [ 366 | { 367 | "name": "stdout", 368 | "output_type": "stream", 369 | "text": [ 370 | "\n", 371 | "RangeIndex: 60943 entries, 0 to 60942\n", 372 | "Data columns (total 12 columns):\n", 373 | " # Column Non-Null Count Dtype \n", 374 | "--- ------ -------------- ----- \n", 375 | " 0 Area Code 60943 non-null int64 \n", 376 | " 1 Area 60943 non-null object \n", 377 | " 2 Item Code 60943 non-null int64 \n", 378 | " 3 Item 60943 non-null object \n", 379 | " 4 Element Code 60943 non-null int64 \n", 380 | " 5 Element 60943 non-null object \n", 381 | " 6 Unit 60943 non-null object \n", 382 | " 7 Y2014 59354 non-null float64\n", 383 | " 8 Y2015 59395 non-null 
float64\n", 384 | " 9 Y2016 59408 non-null float64\n", 385 | " 10 Y2017 59437 non-null float64\n", 386 | " 11 Y2018 59507 non-null float64\n", 387 | "dtypes: float64(5), int64(3), object(4)\n", 388 | "memory usage: 5.6+ MB\n" 389 | ] 390 | } 391 | ], 392 | "source": [ 393 | "# Looking at the information of our data\n", 394 | "df_FBS.info()" 395 | ] 396 | }, 397 | { 398 | "cell_type": "code", 399 | "execution_count": 5, 400 | "id": "2c26aebf", 401 | "metadata": {}, 402 | "outputs": [ 403 | { 404 | "data": { 405 | "text/html": [ 406 | "
" 526 | ], 527 | "text/plain": [ 528 | " Area Code Item Code Element Code Y2014 Y2015 \\\n", 529 | "count 60943.000000 60943.000000 60943.000000 59354.000000 59395.000000 \n", 530 | "mean 134.265576 2687.176706 3814.856456 134.196282 135.235966 \n", 531 | "std 72.605709 146.055739 2212.007033 1567.663696 1603.403984 \n", 532 | "min 4.000000 2501.000000 511.000000 -1796.000000 -3161.000000 \n", 533 | "25% 74.000000 2562.000000 684.000000 0.000000 0.000000 \n", 534 | "50% 136.000000 2630.000000 5142.000000 0.090000 0.080000 \n", 535 | "75% 195.000000 2775.000000 5511.000000 8.340000 8.460000 \n", 536 | "max 276.000000 2961.000000 5911.000000 176405.000000 181137.000000 \n", 537 | "\n", 538 | " Y2016 Y2017 Y2018 \n", 539 | "count 59408.000000 59437.000000 59507.000000 \n", 540 | "mean 136.555222 140.917765 143.758381 \n", 541 | "std 1640.007194 1671.862359 1710.782658 \n", 542 | "min -3225.000000 -1582.000000 -3396.000000 \n", 543 | "25% 0.000000 0.000000 0.000000 \n", 544 | "50% 0.080000 0.100000 0.070000 \n", 545 | "75% 8.430000 9.000000 9.000000 \n", 546 | "max 185960.000000 190873.000000 195875.000000 " 547 | ] 548 | }, 549 | "execution_count": 5, 550 | "metadata": {}, 551 | "output_type": "execute_result" 552 | } 553 | ], 554 | "source": [ 555 | "# Looking at the statistical description of our data\n", 556 | "df_FBS.describe()" 557 | ] 558 | }, 559 | { 560 | "cell_type": "code", 561 | "execution_count": 6, 562 | "id": "02f2f527", 563 | "metadata": {}, 564 | "outputs": [ 565 | { 566 | "data": { 567 | "text/plain": [ 568 | "Area Code 0\n", 569 | "Area 0\n", 570 | "Item Code 0\n", 571 | "Item 0\n", 572 | "Element Code 0\n", 573 | "Element 0\n", 574 | "Unit 0\n", 575 | "Y2014 1589\n", 576 | "Y2015 1548\n", 577 | "Y2016 1535\n", 578 | "Y2017 1506\n", 579 | "Y2018 1436\n", 580 | "dtype: int64" 581 | ] 582 | }, 583 | "execution_count": 6, 584 | "metadata": {}, 585 | "output_type": "execute_result" 586 | } 587 | ], 588 | "source": [ 589 | "# Checking for missing values 
in our dataframe\n", 590 | "df_FBS.isnull().sum()" 591 | ] 592 | }, 593 | { 594 | "cell_type": "code", 595 | "execution_count": 7, 596 | "id": "a02277ed", 597 | "metadata": {}, 598 | "outputs": [ 599 | { 600 | "data": { 601 | "text/plain": [ 602 | "False" 603 | ] 604 | }, 605 | "execution_count": 7, 606 | "metadata": {}, 607 | "output_type": "execute_result" 608 | } 609 | ], 610 | "source": [ 611 | "# Checking if we have any duplicated rows\n", 612 | "df_FBS.duplicated().any()" 613 | ] 614 | }, 615 | { 616 | "cell_type": "code", 617 | "execution_count": null, 618 | "id": "adca0331", 619 | "metadata": {}, 620 | "outputs": [], 621 | "source": [] 622 | }, 623 | { 624 | "cell_type": "markdown", 625 | "id": "eea55dee", 626 | "metadata": {}, 627 | "source": [ 628 | "## Answering the Quiz Questions" 629 | ] 630 | }, 631 | { 632 | "cell_type": "markdown", 633 | "id": "75509885", 634 | "metadata": {}, 635 | "source": [ 636 | "#### Question 1\n", 637 | "\n", 638 | "What is the total Protein supply quantity in Madagascar in 2015?\n", 639 | "\n", 640 | "#### Ans:\n", 641 | " \n", 642 | "B - 173.05" 643 | ] 644 | }, 645 | { 646 | "cell_type": "code", 647 | "execution_count": 8, 648 | "id": "2b67fe96", 649 | "metadata": {}, 650 | "outputs": [ 651 | { 652 | "name": "stdout", 653 | "output_type": "stream", 654 | "text": [ 655 | "Total Protein supply quantity in Madagascar in 2015: 173.04999999999998 g/capita/day\n" 656 | ] 657 | } 658 | ], 659 | "source": [ 660 | "# Filter the dataframe to select rows for Madagascar and the year 2015\n", 661 | "madagascar_2015 = df_FBS[(df_FBS['Area'] == 'Madagascar') & (df_FBS['Element'] == 'Protein supply quantity (g/capita/day)') & (df_FBS['Y2015'] != 0)]\n", 662 | "\n", 663 | "# Calculate the total protein supply quantity in Madagascar for 2015\n", 664 | "total_protein_supply_2015 = madagascar_2015['Y2015'].sum()\n", 665 | "\n", 666 | "# Print the result\n", 667 | "print(\"Total Protein supply quantity in Madagascar in 2015:\", 
total_protein_supply_2015, \"g/capita/day\")\n" 668 | ] 669 | }, 670 | { 671 | "cell_type": "code", 672 | "execution_count": null, 673 | "id": "37b882a2", 674 | "metadata": {}, 675 | "outputs": [], 676 | "source": [] 677 | }, 678 | { 679 | "cell_type": "markdown", 680 | "id": "ed6981a3", 681 | "metadata": {}, 682 | "source": [ 683 | "#### Question 2\n", 684 | "\n", 685 | "What is the mean and standard deviation across the whole dataset for the year 2017 to 2 decimal places?\n", 686 | "\n", 687 | "#### Ans:\n", 688 | " \n", 689 | "E - 140.92 and 1671.86" 690 | ] 691 | }, 692 | { 693 | "cell_type": "code", 694 | "execution_count": 9, 695 | "id": "49c627f0", 696 | "metadata": {}, 697 | "outputs": [ 698 | { 699 | "data": { 700 | "text/html": [ 701 | "
" 821 | ], 822 | "text/plain": [ 823 | " Area Code Item Code Element Code Y2014 Y2015 \\\n", 824 | "count 60943.000000 60943.000000 60943.000000 59354.000000 59395.000000 \n", 825 | "mean 134.265576 2687.176706 3814.856456 134.196282 135.235966 \n", 826 | "std 72.605709 146.055739 2212.007033 1567.663696 1603.403984 \n", 827 | "min 4.000000 2501.000000 511.000000 -1796.000000 -3161.000000 \n", 828 | "25% 74.000000 2562.000000 684.000000 0.000000 0.000000 \n", 829 | "50% 136.000000 2630.000000 5142.000000 0.090000 0.080000 \n", 830 | "75% 195.000000 2775.000000 5511.000000 8.340000 8.460000 \n", 831 | "max 276.000000 2961.000000 5911.000000 176405.000000 181137.000000 \n", 832 | "\n", 833 | " Y2016 Y2017 Y2018 \n", 834 | "count 59408.000000 59437.000000 59507.000000 \n", 835 | "mean 136.555222 140.917765 143.758381 \n", 836 | "std 1640.007194 1671.862359 1710.782658 \n", 837 | "min -3225.000000 -1582.000000 -3396.000000 \n", 838 | "25% 0.000000 0.000000 0.000000 \n", 839 | "50% 0.080000 0.100000 0.070000 \n", 840 | "75% 8.430000 9.000000 9.000000 \n", 841 | "max 185960.000000 190873.000000 195875.000000 " 842 | ] 843 | }, 844 | "execution_count": 9, 845 | "metadata": {}, 846 | "output_type": "execute_result" 847 | } 848 | ], 849 | "source": [ 850 | "# Get the descriptive statistics of the numeric columns\n", 851 | "df_FBS.describe()" 852 | ] 853 | }, 854 | { 855 | "cell_type": "code", 856 | "execution_count": null, 857 | "id": "e043e507", 858 | "metadata": {}, 859 | "outputs": [], 860 | "source": [] 861 | }, 862 | { 863 | "cell_type": "markdown", 864 | "id": "faa62caa", 865 | "metadata": {}, 866 | "source": [ 867 | "#### Question 3\n", 868 | "\n", 869 | "Perform a groupby operation on ‘Element’. 
What year has the highest sum of Stock Variation?\n", 870 | "\n", 871 | "#### Ans:\n", 872 | " \n", 873 | "A - 2014" 874 | ] 875 | }, 876 | { 877 | "cell_type": "code", 878 | "execution_count": 10, 879 | "id": "f3b782b1", 880 | "metadata": {}, 881 | "outputs": [ 882 | { 883 | "data": { 884 | "text/html": [ 885 | "
" 1066 | ], 1067 | "text/plain": [ 1068 | " Y2014 Y2015 Y2016 \\\n", 1069 | "Element \n", 1070 | "Domestic supply quantity 1996716.35 2021493.55 2044842.70 \n", 1071 | "Export Quantity 150020.64 157614.47 151920.46 \n", 1072 | "Fat supply quantity (g/capita/day) 10225.56 10235.74 10102.77 \n", 1073 | "Feed 216927.89 225050.22 228958.65 \n", 1074 | "Food 1212332.49 1232361.10 1247022.17 \n", 1075 | "Food supply (kcal/capita/day) 454257.00 453383.00 451810.00 \n", 1076 | "Food supply quantity (kg/capita/yr) 49650.63 49345.13 48985.28 \n", 1077 | "Import Quantity 274144.48 267018.46 286582.78 \n", 1078 | "Losses 153223.00 155439.00 157787.00 \n", 1079 | "Other uses (non-food) 78718.13 66254.41 69563.68 \n", 1080 | "Processing 282923.00 287929.00 280631.00 \n", 1081 | "Production 1931287.75 1947019.39 1943537.15 \n", 1082 | "Protein supply quantity (g/capita/day) 11836.46 11833.95 11779.69 \n", 1083 | "Residuals 30149.00 30045.00 37224.00 \n", 1084 | "Seed 21922.92 23976.82 23389.20 \n", 1085 | "Stock Variation 58749.83 34910.99 33140.12 \n", 1086 | "Total Population - Both sexes 1031585.00 1058081.00 1085107.00 \n", 1087 | "Tourist consumption 416.00 349.00 89.00 \n", 1088 | "\n", 1089 | " Y2017 Y2018 \n", 1090 | "Element \n", 1091 | "Domestic supply quantity 2088198.10 2161192.10 \n", 1092 | "Export Quantity 182338.80 181594.80 \n", 1093 | "Fat supply quantity (g/capita/day) 10253.84 10258.69 \n", 1094 | "Feed 223705.68 233489.68 \n", 1095 | "Food 1258888.28 1303841.28 \n", 1096 | "Food supply (kcal/capita/day) 454681.00 455261.00 \n", 1097 | "Food supply quantity (kg/capita/yr) 48690.04 49056.85 \n", 1098 | "Import Quantity 294559.09 287997.09 \n", 1099 | "Losses 160614.00 163902.00 \n", 1100 | "Other uses (non-food) 91645.97 91300.97 \n", 1101 | "Processing 292836.00 308429.00 \n", 1102 | "Production 2030056.89 2075072.89 \n", 1103 | "Protein supply quantity (g/capita/day) 11842.45 11833.56 \n", 1104 | "Residuals 35500.00 34864.00 \n", 1105 | "Seed 24870.14 
25263.14 \n", 1106 | "Stock Variation 54316.91 20577.91 \n", 1107 | "Total Population - Both sexes 1112641.00 1140605.00 \n", 1108 | "Tourist consumption 91.00 90.00 " 1109 | ] 1110 | }, 1111 | "execution_count": 10, 1112 | "metadata": {}, 1113 | "output_type": "execute_result" 1114 | } 1115 | ], 1116 | "source": [ 1117 | "# Get the list of years\n", 1118 | "year_list = ['Y2014', 'Y2015', 'Y2016', 'Y2017', 'Y2018']\n", 1119 | "\n", 1120 | "# Group by Element to find the year with the highest sum of Stock Variation\n", 1121 | "df_FBS.groupby(['Element'])[year_list].agg(np.sum)" 1122 | ] 1123 | }, 1124 | { 1125 | "cell_type": "code", 1126 | "execution_count": null, 1127 | "id": "92b90a86", 1128 | "metadata": {}, 1129 | "outputs": [], 1130 | "source": [] 1131 | }, 1132 | { 1133 | "cell_type": "markdown", 1134 | "id": "3a478719", 1135 | "metadata": {}, 1136 | "source": [ 1137 | "#### Question 4\n", 1138 | "\n", 1139 | "Answer the following questions based on the African food production dataset provided by the FAO website already provided\n", 1140 | "\n", 1141 | "What is the total sum of Wine produced in 2015 and 2018 respectively?\n", 1142 | "\n", 1143 | "Hint: Perform a groupby sum aggregation on ‘Item’\n", 1144 | "\n", 1145 | "#### Ans:\n", 1146 | " \n", 1147 | "B - 4251.81 and 4039.32" 1148 | ] 1149 | }, 1150 | { 1151 | "cell_type": "code", 1152 | "execution_count": 11, 1153 | "id": "cb373e12", 1154 | "metadata": {}, 1155 | "outputs": [ 1156 | { 1157 | "data": { 1158 | "text/html": [ 1159 | "
" 1246 | ], 1247 | "text/plain": [ 1248 | " Y2015 Y2018\n", 1249 | "Item \n", 1250 | "Alcohol, Non-Food 2180.00 2293.00\n", 1251 | "Alcoholic Beverages 98783.72 97847.27\n", 1252 | "Animal Products 11811.73 11578.61\n", 1253 | "Animal fats 200675.72 269648.27\n", 1254 | "Apples and products 10559.15 9640.51\n", 1255 | "... ... ...\n", 1256 | "Vegetables, Other 158104.08 163987.21\n", 1257 | "Vegetal Products 107064.17 107775.39\n", 1258 | "Wheat and products 234710.51 242645.19\n", 1259 | "Wine 4251.81 4039.32\n", 1260 | "Yams 203151.78 221272.09\n", 1261 | "\n", 1262 | "[119 rows x 2 columns]" 1263 | ] 1264 | }, 1265 | "execution_count": 11, 1266 | "metadata": {}, 1267 | "output_type": "execute_result" 1268 | } 1269 | ], 1270 | "source": [ 1271 | "# Group by 'Item' and sum the Y2015 and Y2018 columns\n", 1272 | "wine_2015_and_2018 = df_FBS.groupby('Item')[['Y2015', 'Y2018']].sum()\n", 1273 | "wine_2015_and_2018" 1274 | ] 1275 | }, 1276 | { 1277 | "cell_type": "code", 1278 | "execution_count": null, 1279 | "id": "1afa88a2", 1280 | "metadata": {}, 1281 | "outputs": [], 1282 | "source": [] 1283 | }, 1284 | { 1285 | "cell_type": "markdown", 1286 | "id": "fe6e9cc0", 1287 | "metadata": {}, 1288 | "source": [ 1289 | "#### Question 5\n", 1290 | "\n", 1291 | "Given the following numpy array \n", 1292 | "\n", 1293 | "array = ([[94, 89, 63],\n", 1294 | "\n", 1295 | " [93, 92, 48],\n", 1296 | "\n", 1297 | " [92, 94, 56]])\n", 1298 | "\n", 1299 | "How would you select the elements in bold and italics from the array?\n", 1300 | "\n", 1301 | "#### Ans:\n", 1302 | " \n", 1303 | "C - array[ : 2, 1 : ] " 1304 | ] 1305 | }, 1306 | { 1307 | "cell_type": "code", 1308 | "execution_count": null, 1309 | "id": "27dc9ffc", 1310 | "metadata": {}, 1311 | "outputs": [], 1312 | "source": [] 1313 | }, 1314 | { 1315 | "cell_type": "markdown", 1316 | "id": "493a77cd", 1317 | "metadata": {}, 1318 | "source": [ 1319 | "#### Question 6\n", 1320 | "\n", 1321 | "Given the following 
python code, what would the output of the code give?\n", 1322 | "\n", 1323 | "my_tuppy = (1,2,5,8)\n", 1324 | "\n", 1325 | "my_tuppy[2] = 6\n", 1326 | "\n", 1327 | "#### Ans:\n", 1328 | " \n", 1329 | "B - Type Error" 1330 | ] 1331 | }, 1332 | { 1333 | "cell_type": "code", 1334 | "execution_count": null, 1335 | "id": "fb11ec1f", 1336 | "metadata": {}, 1337 | "outputs": [], 1338 | "source": [] 1339 | }, 1340 | { 1341 | "cell_type": "markdown", 1342 | "id": "2db2a42a", 1343 | "metadata": {}, 1344 | "source": [ 1345 | "#### Question 7\n", 1346 | "\n", 1347 | "Select columns ‘Y2017’ and ‘Area’, Perform a groupby operation on ‘Area’. Which of these Areas had the 7th lowest\n", 1348 | "sum in 2017?\n", 1349 | "\n", 1350 | "#### Ans:\n", 1351 | " \n", 1352 | "B - Guinea-Bissau" 1353 | ] 1354 | }, 1355 | { 1356 | "cell_type": "code", 1357 | "execution_count": 12, 1358 | "id": "d085c27f", 1359 | "metadata": {}, 1360 | "outputs": [ 1361 | { 1362 | "name": "stdout", 1363 | "output_type": "stream", 1364 | "text": [ 1365 | " Area Y2017\n", 1366 | "42 Sudan (former) 0.00\n", 1367 | "16 Ethiopia PDR 0.00\n", 1368 | "9 Comoros 59.84\n", 1369 | "38 Seychelles 442.34\n", 1370 | "36 Sao Tome and Principe 12662.63\n", 1371 | "5 Cabo Verde 14650.74\n", 1372 | "21 Guinea-Bissau 19102.77\n", 1373 | "23 Lesotho 21267.96\n", 1374 | "3 Botswana 22101.30\n", 1375 | "12 Djibouti 22729.91\n", 1376 | "18 Gambia 23154.18\n", 1377 | "17 Gabon 27979.64\n", 1378 | "24 Liberia 29342.20\n", 1379 | "32 Namibia 29874.89\n", 1380 | "7 Central African Republic 29937.00\n", 1381 | "10 Congo 41181.68\n", 1382 | "43 Togo 49841.88\n", 1383 | "29 Mauritius 51114.83\n", 1384 | "14 Eswatini 54343.33\n", 1385 | "39 Sierra Leone 55311.33\n", 1386 | "8 Chad 71594.68\n", 1387 | "35 Rwanda 73663.69\n", 1388 | "48 Zimbabwe 75919.34\n", 1389 | "37 Senegal 95681.15\n", 1390 | "20 Guinea 98138.87\n", 1391 | "4 Burkina Faso 101855.07\n", 1392 | "47 Zambia 103223.77\n", 1393 | "44 Tunisia 124167.20\n", 1394 | "2 Benin 
124771.22\n", 1395 | "33 Niger 126707.58\n", 1396 | "25 Madagascar 131197.73\n", 1397 | "27 Mali 149928.33\n", 1398 | "28 Mauritania 156665.46\n", 1399 | "31 Mozambique 161407.98\n", 1400 | "26 Malawi 181098.71\n", 1401 | "45 Uganda 213950.38\n", 1402 | "11 Côte d'Ivoire 224599.01\n", 1403 | "1 Angola 229159.57\n", 1404 | "6 Cameroon 232030.43\n", 1405 | "41 Sudan 239931.92\n", 1406 | "22 Kenya 264660.66\n", 1407 | "46 United Republic of Tanzania 322616.85\n", 1408 | "0 Algeria 325644.27\n", 1409 | "19 Ghana 337599.06\n", 1410 | "30 Morocco 388495.36\n", 1411 | "15 Ethiopia 448683.76\n", 1412 | "40 South Africa 517590.54\n", 1413 | "13 Egypt 866379.92\n", 1414 | "34 Nigeria 1483268.23\n", 1415 | "The Area with the 7th lowest sum in 2017 is: Guinea-Bissau\n" 1416 | ] 1417 | } 1418 | ], 1419 | "source": [ 1420 | "# Select columns 'Y2017' and 'Area'\n", 1421 | "selected_columns = df_FBS[['Y2017', 'Area']]\n", 1422 | "\n", 1423 | "# Group by 'Area' and calculate the sum of 'Y2017' for each Area\n", 1424 | "grouped_data = selected_columns.groupby('Area')['Y2017'].sum().reset_index()\n", 1425 | "\n", 1426 | "# Sort the data by the sum of 'Y2017' in ascending order\n", 1427 | "sorted_data = grouped_data.sort_values(by='Y2017', ascending=True)\n", 1428 | "print(sorted_data)\n", 1429 | "\n", 1430 | "# Get the Area with the 7th lowest sum in 2017\n", 1431 | "seventh_lowest_area = sorted_data.iloc[6]['Area']\n", 1432 | "\n", 1433 | "# Print the result\n", 1434 | "print(\"The Area with the 7th lowest sum in 2017 is:\", seventh_lowest_area)" 1435 | ] 1436 | }, 1437 | { 1438 | "cell_type": "code", 1439 | "execution_count": null, 1440 | "id": "5db878e7", 1441 | "metadata": {}, 1442 | "outputs": [], 1443 | "source": [] 1444 | }, 1445 | { 1446 | "cell_type": "markdown", 1447 | "id": "4e38ed0b", 1448 | "metadata": {}, 1449 | "source": [ 1450 | "#### Question 8\n", 1451 | "\n", 1452 | "Which year had the least correlation with ‘Element Code’?\n", 1453 | "\n", 1454 | "#### Ans:\n", 
1455 | " \n", 1456 | "D - 2016" 1457 | ] 1458 | }, 1459 | { 1460 | "cell_type": "code", 1461 | "execution_count": 13, 1462 | "id": "038e18ef", 1463 | "metadata": {}, 1464 | "outputs": [ 1465 | { 1466 | "data": { 1467 | "text/html": [ 1468 | "
\n", 1551 | "
" 1552 | ], 1553 | "text/plain": [ 1554 | " Element Code Y2014 Y2015 Y2016 Y2017 Y2018\n", 1555 | "Element Code 1.000000 0.024457 0.023889 0.023444 0.024254 0.024279\n", 1556 | "Y2014 0.024457 1.000000 0.994647 0.996081 0.995230 0.994872\n", 1557 | "Y2015 0.023889 0.994647 1.000000 0.995739 0.988048 0.988208\n", 1558 | "Y2016 0.023444 0.996081 0.995739 1.000000 0.992785 0.992757\n", 1559 | "Y2017 0.024254 0.995230 0.988048 0.992785 1.000000 0.998103\n", 1560 | "Y2018 0.024279 0.994872 0.988208 0.992757 0.998103 1.000000" 1561 | ] 1562 | }, 1563 | "execution_count": 13, 1564 | "metadata": {}, 1565 | "output_type": "execute_result" 1566 | } 1567 | ], 1568 | "source": [ 1569 | "# Get a dataframe of the required columns: 'Element Code' and the year columns\n", 1570 | "numeric_columns = df_FBS[['Element Code', 'Y2014', 'Y2015', 'Y2016', 'Y2017', 'Y2018']]\n", 1571 | "\n", 1572 | "# Compute the pairwise correlations between 'Element Code' and the year columns and assign the result to correlation_matrix\n", 1573 | "correlation_matrix = numeric_columns.corr()\n", 1574 | "correlation_matrix" 1575 | ] 1576 | }, 1577 | { 1578 | "cell_type": "code", 1579 | "execution_count": null, 1580 | "id": "dda0b2d8", 1581 | "metadata": {}, 1582 | "outputs": [], 1583 | "source": [] 1584 | }, 1585 | { 1586 | "cell_type": "markdown", 1587 | "id": "d42e992b", 1588 | "metadata": {}, 1589 | "source": [ 1590 | "#### Question 9\n", 1591 | "\n", 1592 | "Which of the following dataframe methods can be used to access elements across rows and columns?\n", 1593 | "\n", 1594 | "#### Ans:\n", 1595 | " \n", 1596 | "E - c and d" 1597 | ] 1598 | }, 1599 | { 1600 | "cell_type": "code", 1601 | "execution_count": null, 1602 | "id": "ce5bd980", 1603 | "metadata": {}, 1604 | "outputs": [], 1605 | "source": [] 1606 | }, 1607 | { 1608 | "cell_type": "markdown", 1609 | "id": "71adcf7e", 1610 | "metadata": {}, 1611 | "source": [ 1612 | "#### Question 10\n", 1613 | "\n", 1614 | "Which of these Python data structures is unordered?\n", 1615 
| "\n", 1616 | "#### Ans:\n", 1617 | " \n", 1618 | "A - Set\n", 1619 | "\n", 1620 | "B - Dictionary" 1621 | ] 1622 | }, 1623 | { 1624 | "cell_type": "code", 1625 | "execution_count": null, 1626 | "id": "a6e92161", 1627 | "metadata": {}, 1628 | "outputs": [], 1629 | "source": [] 1630 | }, 1631 | { 1632 | "cell_type": "markdown", 1633 | "id": "980416e2", 1634 | "metadata": {}, 1635 | "source": [ 1636 | "#### Question 11\n", 1637 | "\n", 1638 | "If you have the following list\n", 1639 | "\n", 1640 | "lst = [[35, 'Portugal', 94], [33, 'Argentina', 93], [30 , 'Brazil', 92]]\n", 1641 | "\n", 1642 | "col = ['Age', 'Nationality', 'Overall']\n", 1643 | "\n", 1644 | "How do you create a pandas DataFrame using this list, to look like the table below?\n", 1645 | "\n", 1646 | "#### Ans:\n", 1647 | " \n", 1648 | "B - pd.DataFrame(lst, columns = col, index = [1,2,3])" 1649 | ] 1650 | }, 1651 | { 1652 | "cell_type": "code", 1653 | "execution_count": null, 1654 | "id": "a1b672d4", 1655 | "metadata": {}, 1656 | "outputs": [], 1657 | "source": [] 1658 | }, 1659 | { 1660 | "cell_type": "markdown", 1661 | "id": "20928f97", 1662 | "metadata": {}, 1663 | "source": [ 1664 | "#### Question 12\n", 1665 | "\n", 1666 | "Perform a groupby operation on ‘Element’. What is the total sum of Processing in 2017?\n", 1667 | "\n", 1668 | "#### Ans:\n", 1669 | " \n", 1670 | "E - 292836.00" 1671 | ] 1672 | }, 1673 | { 1674 | "cell_type": "code", 1675 | "execution_count": 14, 1676 | "id": "3670d912", 1677 | "metadata": {}, 1678 | "outputs": [ 1679 | { 1680 | "data": { 1681 | "text/html": [ 1682 | "
\n", 1862 | "
" 1863 | ], 1864 | "text/plain": [ 1865 | " Y2014 Y2015 Y2016 \\\n", 1866 | "Element \n", 1867 | "Domestic supply quantity 1996716.35 2021493.55 2044842.70 \n", 1868 | "Export Quantity 150020.64 157614.47 151920.46 \n", 1869 | "Fat supply quantity (g/capita/day) 10225.56 10235.74 10102.77 \n", 1870 | "Feed 216927.89 225050.22 228958.65 \n", 1871 | "Food 1212332.49 1232361.10 1247022.17 \n", 1872 | "Food supply (kcal/capita/day) 454257.00 453383.00 451810.00 \n", 1873 | "Food supply quantity (kg/capita/yr) 49650.63 49345.13 48985.28 \n", 1874 | "Import Quantity 274144.48 267018.46 286582.78 \n", 1875 | "Losses 153223.00 155439.00 157787.00 \n", 1876 | "Other uses (non-food) 78718.13 66254.41 69563.68 \n", 1877 | "Processing 282923.00 287929.00 280631.00 \n", 1878 | "Production 1931287.75 1947019.39 1943537.15 \n", 1879 | "Protein supply quantity (g/capita/day) 11836.46 11833.95 11779.69 \n", 1880 | "Residuals 30149.00 30045.00 37224.00 \n", 1881 | "Seed 21922.92 23976.82 23389.20 \n", 1882 | "Stock Variation 58749.83 34910.99 33140.12 \n", 1883 | "Total Population - Both sexes 1031585.00 1058081.00 1085107.00 \n", 1884 | "Tourist consumption 416.00 349.00 89.00 \n", 1885 | "\n", 1886 | " Y2017 Y2018 \n", 1887 | "Element \n", 1888 | "Domestic supply quantity 2088198.10 2161192.10 \n", 1889 | "Export Quantity 182338.80 181594.80 \n", 1890 | "Fat supply quantity (g/capita/day) 10253.84 10258.69 \n", 1891 | "Feed 223705.68 233489.68 \n", 1892 | "Food 1258888.28 1303841.28 \n", 1893 | "Food supply (kcal/capita/day) 454681.00 455261.00 \n", 1894 | "Food supply quantity (kg/capita/yr) 48690.04 49056.85 \n", 1895 | "Import Quantity 294559.09 287997.09 \n", 1896 | "Losses 160614.00 163902.00 \n", 1897 | "Other uses (non-food) 91645.97 91300.97 \n", 1898 | "Processing 292836.00 308429.00 \n", 1899 | "Production 2030056.89 2075072.89 \n", 1900 | "Protein supply quantity (g/capita/day) 11842.45 11833.56 \n", 1901 | "Residuals 35500.00 34864.00 \n", 1902 | "Seed 24870.14 
25263.14 \n", 1903 | "Stock Variation 54316.91 20577.91 \n", 1904 | "Total Population - Both sexes 1112641.00 1140605.00 \n", 1905 | "Tourist consumption 91.00 90.00 " 1906 | ] 1907 | }, 1908 | "execution_count": 14, 1909 | "metadata": {}, 1910 | "output_type": "execute_result" 1911 | } 1912 | ], 1913 | "source": [ 1914 | "# Get the list of years\n", 1915 | "year_list = ['Y2014', 'Y2015', 'Y2016', 'Y2017', 'Y2018']\n", 1916 | "\n", 1917 | "# Group by 'Element' and sum each year column for every 'Element' value\n", 1918 | "df_FBS.groupby(['Element'])[year_list].sum()" 1919 | ] 1920 | }, 1921 | { 1922 | "cell_type": "code", 1923 | "execution_count": null, 1924 | "id": "0be87109", 1925 | "metadata": {}, 1926 | "outputs": [], 1927 | "source": [] 1928 | }, 1929 | { 1930 | "cell_type": "markdown", 1931 | "id": "81c79403", 1932 | "metadata": {}, 1933 | "source": [ 1934 | "#### Question 13\n", 1935 | "\n", 1936 | "How would you check for the number of rows and columns in a pandas DataFrame named df?\n", 1937 | "\n", 1938 | "#### Ans:\n", 1939 | " \n", 1940 | "C - df.shape" 1941 | ] 1942 | }, 1943 | { 1944 | "cell_type": "code", 1945 | "execution_count": null, 1946 | "id": "ea2f98ee", 1947 | "metadata": {}, 1948 | "outputs": [], 1949 | "source": [] 1950 | }, 1951 | { 1952 | "cell_type": "markdown", 1953 | "id": "a4a4446f", 1954 | "metadata": {}, 1955 | "source": [ 1956 | "#### Question 14\n", 1957 | "\n", 1958 | "What is the total number and percentage of missing data in 2014 to 3 decimal places?\n", 1959 | "\n", 1960 | "#### Ans:\n", 1961 | " \n", 1962 | "E - 1589 and 2.607%" 1963 | ] 1964 | }, 1965 | { 1966 | "cell_type": "code", 1967 | "execution_count": 15, 1968 | "id": "bb50e610", 1969 | "metadata": {}, 1970 | "outputs": [ 1971 | { 1972 | "data": { 1973 | "text/plain": [ 1974 | "Area Code 0\n", 1975 | "Area 0\n", 1976 | "Item Code 0\n", 1977 | "Item 0\n", 1978 | "Element Code 0\n", 1979 | "Element 0\n", 1980 | "Unit 0\n", 1981 | "Y2014 1589\n", 1982 
| "Y2015 1548\n", 1983 | "Y2016 1535\n", 1984 | "Y2017 1506\n", 1985 | "Y2018 1436\n", 1986 | "dtype: int64" 1987 | ] 1988 | }, 1989 | "execution_count": 15, 1990 | "metadata": {}, 1991 | "output_type": "execute_result" 1992 | } 1993 | ], 1994 | "source": [ 1995 | "# Get the total sum of missing values in each column in the dataframe\n", 1996 | "df_FBS.isnull().sum()" 1997 | ] 1998 | }, 1999 | { 2000 | "cell_type": "code", 2001 | "execution_count": 16, 2002 | "id": "0af275a7", 2003 | "metadata": {}, 2004 | "outputs": [ 2005 | { 2006 | "data": { 2007 | "text/plain": [ 2008 | "2.6073544131401474" 2009 | ] 2010 | }, 2011 | "execution_count": 16, 2012 | "metadata": {}, 2013 | "output_type": "execute_result" 2014 | } 2015 | ], 2016 | "source": [ 2017 | "# percentage of missing values\n", 2018 | "df_FBS['Y2014'].isnull().sum() * 100 / len(df_FBS)" 2019 | ] 2020 | }, 2021 | { 2022 | "cell_type": "code", 2023 | "execution_count": null, 2024 | "id": "b63eac08", 2025 | "metadata": {}, 2026 | "outputs": [], 2027 | "source": [] 2028 | }, 2029 | { 2030 | "cell_type": "markdown", 2031 | "id": "0ded550f", 2032 | "metadata": {}, 2033 | "source": [ 2034 | "#### Question 15\n", 2035 | "\n", 2036 | "Select columns ‘Y2017’ and ‘Area’, Perform a groupby operation on ‘Area’. 
Which of these Areas had the highest sum in 2017?\n", 2037 | "\n", 2038 | "#### Ans:\n", 2039 | " \n", 2040 | "A - Nigeria" 2041 | ] 2042 | }, 2043 | { 2044 | "cell_type": "code", 2045 | "execution_count": 17, 2046 | "id": "cd3e2ab4", 2047 | "metadata": { 2048 | "scrolled": true 2049 | }, 2050 | "outputs": [ 2051 | { 2052 | "name": "stdout", 2053 | "output_type": "stream", 2054 | "text": [ 2055 | " Area Y2017\n", 2056 | "34 Nigeria 1483268.23\n", 2057 | "13 Egypt 866379.92\n", 2058 | "40 South Africa 517590.54\n", 2059 | "15 Ethiopia 448683.76\n", 2060 | "30 Morocco 388495.36\n", 2061 | "19 Ghana 337599.06\n", 2062 | "0 Algeria 325644.27\n", 2063 | "46 United Republic of Tanzania 322616.85\n", 2064 | "22 Kenya 264660.66\n", 2065 | "41 Sudan 239931.92\n", 2066 | "6 Cameroon 232030.43\n", 2067 | "1 Angola 229159.57\n", 2068 | "11 Côte d'Ivoire 224599.01\n", 2069 | "45 Uganda 213950.38\n", 2070 | "26 Malawi 181098.71\n", 2071 | "31 Mozambique 161407.98\n", 2072 | "28 Mauritania 156665.46\n", 2073 | "27 Mali 149928.33\n", 2074 | "25 Madagascar 131197.73\n", 2075 | "33 Niger 126707.58\n", 2076 | "2 Benin 124771.22\n", 2077 | "44 Tunisia 124167.20\n", 2078 | "47 Zambia 103223.77\n", 2079 | "4 Burkina Faso 101855.07\n", 2080 | "20 Guinea 98138.87\n", 2081 | "37 Senegal 95681.15\n", 2082 | "48 Zimbabwe 75919.34\n", 2083 | "35 Rwanda 73663.69\n", 2084 | "8 Chad 71594.68\n", 2085 | "39 Sierra Leone 55311.33\n", 2086 | "14 Eswatini 54343.33\n", 2087 | "29 Mauritius 51114.83\n", 2088 | "43 Togo 49841.88\n", 2089 | "10 Congo 41181.68\n", 2090 | "7 Central African Republic 29937.00\n", 2091 | "32 Namibia 29874.89\n", 2092 | "24 Liberia 29342.20\n", 2093 | "17 Gabon 27979.64\n", 2094 | "18 Gambia 23154.18\n", 2095 | "12 Djibouti 22729.91\n", 2096 | "3 Botswana 22101.30\n", 2097 | "23 Lesotho 21267.96\n", 2098 | "21 Guinea-Bissau 19102.77\n", 2099 | "5 Cabo Verde 14650.74\n", 2100 | "36 Sao Tome and Principe 12662.63\n", 2101 | "38 Seychelles 442.34\n", 2102 | "9 Comoros 59.84\n", 
2103 | "42 Sudan (former) 0.00\n", 2104 | "16 Ethiopia PDR 0.00\n", 2105 | "The Area with the highest sum in 2017 is: Nigeria\n" 2106 | ] 2107 | } 2108 | ], 2109 | "source": [ 2110 | "# Select columns 'Y2017' and 'Area'\n", 2111 | "selected_columns = df_FBS[['Y2017', 'Area']]\n", 2112 | "\n", 2113 | "# Group by 'Area' and calculate the sum of 'Y2017' for each Area\n", 2114 | "grouped_data = selected_columns.groupby('Area')['Y2017'].sum().reset_index()\n", 2115 | "\n", 2116 | "# Sort the data by the sum of 'Y2017' in descending order\n", 2117 | "sorted_data = grouped_data.sort_values(by='Y2017', ascending=False)\n", 2118 | "print(sorted_data)\n", 2119 | "\n", 2120 | "# Get the Area with the highest sum in 2017 (first row after the descending sort)\n", 2121 | "highest_sum_area = sorted_data.iloc[0]['Area']\n", 2122 | "\n", 2123 | "# Print the result\n", 2124 | "print(\"The Area with the highest sum in 2017 is:\", highest_sum_area)" 2125 | ] 2126 | }, 2127 | { 2128 | "cell_type": "code", 2129 | "execution_count": null, 2130 | "id": "20548412", 2131 | "metadata": {}, 2132 | "outputs": [], 2133 | "source": [] 2134 | }, 2135 | { 2136 | "cell_type": "markdown", 2137 | "id": "233087eb", 2138 | "metadata": {}, 2139 | "source": [ 2140 | "#### Question 16\n", 2141 | "\n", 2142 | "Consider the following list of tuples:\n", 2143 | "\n", 2144 | "y = [(2, 4), (7, 8), (1, 5, 9)]\n", 2145 | "\n", 2146 | "How would you assign element 8 from the list to a variable x?\n", 2147 | "\n", 2148 | "#### Ans:\n", 2149 | " \n", 2150 | "B - x = y[1][1]" 2151 | ] 2152 | }, 2153 | { 2154 | "cell_type": "code", 2155 | "execution_count": null, 2156 | "id": "b0e1ac71", 2157 | "metadata": {}, 2158 | "outputs": [], 2159 | "source": [] 2160 | }, 2161 | { 2162 | "cell_type": "markdown", 2163 | "id": "55249720", 2164 | "metadata": {}, 2165 | "source": [ 2166 | "#### Question 17\n", 2167 | "\n", 2168 | "A pandas DataFrame with dimensions (100,3) has how many features and observations?\n", 2169 | "\n", 2170 | "#### Ans:\n", 2171 | " \n", 
2172 | "B - 3 features, 100 observations" 2173 | ] 2174 | }, 2175 | { 2176 | "cell_type": "code", 2177 | "execution_count": null, 2178 | "id": "6e0e7026", 2179 | "metadata": {}, 2180 | "outputs": [], 2181 | "source": [] 2182 | }, 2183 | { 2184 | "cell_type": "markdown", 2185 | "id": "a2cb829b", 2186 | "metadata": {}, 2187 | "source": [ 2188 | "#### Question 18\n", 2189 | "\n", 2190 | "Which of the following is a python inbuilt module?\n", 2191 | "\n", 2192 | "#### Ans:\n", 2193 | " \n", 2194 | "C - Math" 2195 | ] 2196 | }, 2197 | { 2198 | "cell_type": "code", 2199 | "execution_count": null, 2200 | "id": "197ce575", 2201 | "metadata": {}, 2202 | "outputs": [], 2203 | "source": [] 2204 | }, 2205 | { 2206 | "cell_type": "markdown", 2207 | "id": "30f39308", 2208 | "metadata": {}, 2209 | "source": [ 2210 | "#### Question 19\n", 2211 | "\n", 2212 | "What is the total number of unique countries in the dataset?\n", 2213 | "\n", 2214 | "#### Ans:\n", 2215 | " \n", 2216 | "A - 49" 2217 | ] 2218 | }, 2219 | { 2220 | "cell_type": "code", 2221 | "execution_count": 18, 2222 | "id": "fe9cd365", 2223 | "metadata": {}, 2224 | "outputs": [ 2225 | { 2226 | "data": { 2227 | "text/plain": [ 2228 | "49" 2229 | ] 2230 | }, 2231 | "execution_count": 18, 2232 | "metadata": {}, 2233 | "output_type": "execute_result" 2234 | } 2235 | ], 2236 | "source": [ 2237 | "# Get the total number of unique 'Area' in the dataframe\n", 2238 | "df_FBS['Area'].nunique()" 2239 | ] 2240 | }, 2241 | { 2242 | "cell_type": "code", 2243 | "execution_count": null, 2244 | "id": "9a000af9", 2245 | "metadata": {}, 2246 | "outputs": [], 2247 | "source": [] 2248 | }, 2249 | { 2250 | "cell_type": "markdown", 2251 | "id": "ca723cc5", 2252 | "metadata": {}, 2253 | "source": [ 2254 | "#### Question 20\n", 2255 | "\n", 2256 | "What would be the output for?\n", 2257 | "\n", 2258 | "S = [['him', 'sell'], [90, 28, 43]]\n", 2259 | "\n", 2260 | "S[0][1][1]\n", 2261 | "\n", 2262 | "#### Ans:\n", 2263 | " \n", 2264 | "D - 'e'" 
2265 | ] 2266 | }, 2267 | { 2268 | "cell_type": "code", 2269 | "execution_count": null, 2270 | "id": "00188d0d", 2271 | "metadata": {}, 2272 | "outputs": [], 2273 | "source": [] 2274 | }, 2275 | { 2276 | "cell_type": "code", 2277 | "execution_count": null, 2278 | "id": "dce33880", 2279 | "metadata": {}, 2280 | "outputs": [], 2281 | "source": [] 2282 | } 2283 | ], 2284 | "metadata": { 2285 | "kernelspec": { 2286 | "display_name": "Python 3 (ipykernel)", 2287 | "language": "python", 2288 | "name": "python3" 2289 | }, 2290 | "language_info": { 2291 | "codemirror_mode": { 2292 | "name": "ipython", 2293 | "version": 3 2294 | }, 2295 | "file_extension": ".py", 2296 | "mimetype": "text/x-python", 2297 | "name": "python", 2298 | "nbconvert_exporter": "python", 2299 | "pygments_lexer": "ipython3", 2300 | "version": "3.11.5" 2301 | } 2302 | }, 2303 | "nbformat": 4, 2304 | "nbformat_minor": 5 2305 | } 2306 | --------------------------------------------------------------------------------