├── Books ├── Dimensionality+Reduction.pdf ├── Python4DataAnalysis.pdf ├── Scikit_Learn_Cheat_Sheet_Python.pdf └── thinkstats2.pdf ├── README.md ├── _config.yml ├── cheatsheets ├── 21616132_483510435358709_4877869234411042695_n.jpg ├── Quandl-NumPy-SciPy-Pandas-Cheat-Sheet.pdf ├── Scikit-Learn-Infographic.pdf ├── bokeh(plot).pdf ├── importing data.pdf ├── keras.pdf ├── matplotlib.pdf ├── ml_02.jpg ├── numpy.pdf ├── pandas11.pdf ├── pyt.pdf ├── scikitlearn.pdf └── seeborn.pdf ├── notebooks ├── 01+Data+Representation+for+Machine+Learning (1).ipynb ├── 02+Training+and+Testing+Data.ipynb ├── 02b-Data-preprocessing.ipynb ├── 03-KNN.ipynb ├── 04-linear-models.ipynb ├── 08_cross_validation.ipynb ├── Decision+Trees+and+Random+Forest.ipynb ├── Week1.ipynb ├── Week2.ipynb ├── Week3.ipynb └── Week4.ipynb └── slides ├── Algorithms+in+DS.ppt ├── Correlations.ppt ├── Dimensionality+Reduction.ppt ├── Introduction+to+Regular+Expressions.ppt ├── Kmeans+Naive+Bayes.ppt ├── Python+Basics.ppt ├── Simple+NLP+Tasks.ppt ├── Statistical+Inference,+Exploratory+Data+Analysis,+and+the+Data+Science+Process.ppt ├── TF-IDF.ppt └── The+Data+Science+Process.ppt /Books/Dimensionality+Reduction.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/umer7/Data-Science-in-Python/da6203f959f1c7800823d2d37508b158c9809365/Books/Dimensionality+Reduction.pdf -------------------------------------------------------------------------------- /Books/Python4DataAnalysis.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/umer7/Data-Science-in-Python/da6203f959f1c7800823d2d37508b158c9809365/Books/Python4DataAnalysis.pdf -------------------------------------------------------------------------------- /Books/Scikit_Learn_Cheat_Sheet_Python.pdf: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/umer7/Data-Science-in-Python/da6203f959f1c7800823d2d37508b158c9809365/Books/Scikit_Learn_Cheat_Sheet_Python.pdf -------------------------------------------------------------------------------- /Books/thinkstats2.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/umer7/Data-Science-in-Python/da6203f959f1c7800823d2d37508b158c9809365/Books/thinkstats2.pdf -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Data-Science-in-Python 2 | 3 | ### Table of contents 4 | * [Installation](#installation ) 5 | * [Books](#books) 6 | * [Online courses](#online-courses) 7 | * [Youtube channels & Videos](#youtube-videos--channels ) 8 | * [Presentations](#presentations) 9 | * [Data Science using Python](#data-science-using-python) 10 | 11 | * [Competitions](#competitions) 12 | * [Data Science Ideas](#data-science-ideas) 13 | * [Data Sets](#data-sets) 14 | 15 | 16 | ## Installation 17 | 18 | * [Anaconda](https://www.anaconda.com/download/) 19 | 20 | or 21 | * [Winpy](https://sourceforge.net/projects/winpython/files/WinPython_3.6/3.6.3.0/) (alternative) 22 | 23 | 24 | ## What is Data Science? 
25 | 26 | * [What is Data Science @ O'reilly](https://www.oreilly.com/ideas/what-is-data-science) 27 | * [What is Data Science @ Quora](https://www.quora.com/Data-Science/What-is-data-science) 28 | * [The sexiest job of 21st century](https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century) 29 | * [What is data science](http://www.datascientists.net/what-is-data-science) 30 | * [What is a data scientist](http://www.becomingadatascientist.com/2014/02/14/what-is-a-data-scientist/) 31 | * [Wikipedia](https://en.wikipedia.org/wiki/Data_science) 32 | * [a very short history of #datascience](http://www.forbes.com/sites/gilpress/2013/05/28/a-very-short-history-of-data-science/) 33 | * [An Introduction to Data Science, PDF](https://ischool.syr.edu/media/documents/2012/3/DataScienceBook1_1.pdf). 34 | * [Data Science Methodology by John Rollins PhD](http://www.ibmbigdatahub.com/blog/why-we-need-methodology-data-science) 35 | * [A Day in the Life of a Data Scientist by Rutgers University](http://online.rutgers.edu/resources/articles/a-day-in-the-life-of-a-data-scientist/) 36 | 37 | 38 | ## Books 39 | - [Python Data Science Handbook](https://jakevdp.github.io/PythonDataScienceHandbook/) 40 | - [The Data Science Handbook](http://www.thedatasciencehandbook.com/) 41 | - [The Art of Data Usability](https://www.manning.com/books/the-art-of-data-usability) - Early access 42 | - [Think Like a Data Scientist](https://www.manning.com/books/think-like-a-data-scientist) 43 | - [R in Action, Second Edition](https://www.manning.com/books/r-in-action-second-edition) 44 | - [Introducing Data Science](https://www.manning.com/books/introducing-data-science) 45 | - [Practical Data Science with R](https://www.manning.com/books/practical-data-science-with-r) 46 | - [Exploring Data Science](https://www.manning.com/books/exploring-data-science) - free eBook sampler 47 | - [Exploring the Data Jungle](https://www.manning.com/books/exploring-the-data-jungle) - free eBook sampler 48 | 
49 | ## Online courses 50 | 51 | - [Applied Data Science with Python Specialization](https://www.coursera.org/specializations/data-science-python) 52 | - [Microsoft Professional Program in Data Science](https://www.edx.org/microsoft-professional-program-data-science) 53 | - [Intro to Data Science](https://www.udacity.com/course/intro-to-data-science--ud359) 54 | - [Python Data camp](https://www.datacamp.com/courses/tech:python) 55 | - [Introduction to Python for Data Science](https://www.edx.org/course/introduction-python-data-science-microsoft-dat208x-8) 56 | - [Intro to Data Science by Microsoft](https://www.edx.org/course/microsoft-professional-orientation-data-microsoft-dat101x-0) 57 | 58 | 59 | ## Youtube Videos & Channels 60 | 61 | - [What is machine learning?](https://www.youtube.com/watch?v=WXHM_i-fgGo) 62 | - [Andrew Ng: Deep Learning, Self-Taught Learning and Unsupervised Feature Learning](https://www.youtube.com/watch?v=n1ViNeWhC24) 63 | - [Deep Learning: Intelligence from Big Data](https://www.youtube.com/watch?v=czLI3oLDe8M) 64 | - [Interview with Google's AI and Deep Learning 'Godfather' Geoffrey Hinton](https://www.youtube.com/watch?v=1Wp3IIpssEc) 65 | - [Introduction to Deep Learning with Python](https://www.youtube.com/watch?v=S75EdAcXHKk) 66 | - [What is machine learning, and how does it work?](https://www.youtube.com/watch?v=elojMnjn4kk) 67 | - [Data School](https://www.youtube.com/channel/UCnVzApLJE2ljPZSeQylSEyg) - Data Science Education 68 | - [Neural Nets for Newbies by Melanie Warrick (May 2015)](https://www.youtube.com/watch?v=Cu6A96TUy_o) 69 | - [Neural Networks video series by Hugo Larochelle](https://www.youtube.com/playlist?list=PL6Xpj9I5qXYEcOhn7TqghAJ6NAPrNmUBH) 70 | - [Google DeepMind co-founder Shane Legg - Machine Super Intelligence](https://www.youtube.com/watch?v=evNCyRL3DOU) 71 | 72 | 73 | ## Presentations 74 | 75 | * [How to Become a Data Scientist](http://www.slideshare.net/ryanorban/how-to-become-a-data-scientist) 76 | * 
[Introduction to Data Science](http://www.slideshare.net/NikoVuokko/introduction-to-data-science-25391618) 77 | * [Intro to Data Science for Enterprise Big Data](http://www.slideshare.net/pacoid/intro-to-data-science-for-enterprise-big-data) 78 | * [How to Interview a Data Scientist](http://www.slideshare.net/dtunkelang/how-to-interview-a-data-scientist) 79 | * [How to Share Data with a Statistician](https://github.com/jtleek/datasharing) 80 | * [The Science of a Great Career in Data Science](http://www.slideshare.net/katemats/the-science-of-a-great-career-in-data-science) 81 | * [What Does a Data Scientist Do?](http://www.slideshare.net/datasciencelondon/big-data-sorry-data-science-what-does-a-data-scientist-do) 82 | * [Building Data Start-Ups: Fast, Big, and Focused](http://www.slideshare.net/medriscoll/driscoll-strata-buildingdatastartups25may2011clean) 83 | * [How to win data science competitions with Deep Learning](http://www.slideshare.net/0xdata/how-to-win-data-science-competitions-with-deep-learning) 84 | 85 | ## Data Science using Python 86 | This list covers only Python, as many readers are already familiar with the language. For R, see [Data Science tutorials using R](https://github.com/ujjwalkarn/DataScienceR). 87 | 88 | ### Learning Python 89 | 90 | - [YouTube tutorial series by sentdex](https://www.youtube.com/watch?v=oVp1vrfL_w4&list=PLQVvvaa0QuDe8XSftW-RAxdo6OmaeL85M) 91 | - [Interactive Python tutorial website](http://www.learnpython.org/) 92 | 93 | ### numpy 94 | [numpy](http://www.numpy.org/) is a Python library that provides large multidimensional arrays and fast mathematical operations on them. 95 | 96 | - [Numpy tutorial on DataCamp](https://www.datacamp.com/community/tutorials/python-numpy-tutorial#gs.h3DvLnk) 97 | 98 | ### pandas 99 | [pandas](http://pandas.pydata.org/index.html) provides efficient data structures and analysis tools for Python. It is built on top of numpy. 
100 | 101 | - [Introduction to pandas](http://www.synesthesiam.com/posts/an-introduction-to-pandas.html) 102 | - [DataCamp pandas foundations](https://www.datacamp.com/courses/pandas-foundations) - Paid course, but free for 30 days after account creation (enough to complete the course). 103 | - [Pandas cheatsheet](https://github.com/pandas-dev/pandas/blob/master/doc/cheatsheet/Pandas_Cheat_Sheet.pdf) - Quick overview of the most important functions. 104 | 105 | ### scikit-learn 106 | [scikit-learn](http://scikit-learn.org/stable/) is one of the most widely used libraries for Machine Learning and Data Science in Python. 107 | 108 | - [Introduction and first model application](https://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/05.02-Introducing-Scikit-Learn.ipynb) 109 | - [Rough guide for choosing estimators](http://scikit-learn.org/stable/tutorial/machine_learning_map/) 110 | - [Scikit-learn complete user guide](http://scikit-learn.org/stable/user_guide.html) 111 | - [Model ensemble: Implementation in Python](http://machinelearningmastery.com/ensemble-machine-learning-algorithms-python-scikit-learn/) 112 | 113 | ### Jupyter Notebook 114 | [Jupyter Notebook](https://jupyter.org/) is a web application for easy data visualisation and code presentation. 115 | 116 | - [Downloading and running first Jupyter notebook](https://jupyter.org/install.html) 117 | - [Example notebook for data exploration](https://www.kaggle.com/sudalairajkumar/simple-exploration-notebook-instacart) 118 | - [Seaborn data visualization tutorial](https://elitedatascience.com/python-seaborn-tutorial) - Plot library that works great with Jupyter. 119 | 120 | ### Common Algorithms and Procedures 121 | 122 | - [Supervised vs unsupervised learning](https://stackoverflow.com/questions/1832076/what-is-the-difference-between-supervised-learning-and-unsupervised-learning) - The two most common types of Machine Learning algorithms. 
123 | - [9 important Data Science algorithms and their implementation](https://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/05.05-Naive-Bayes.ipynb) 124 | - [Cross validation](https://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/05.03-Hyperparameters-and-Model-Validation.ipynb) - Evaluate the performance of your algorithm / model. 125 | - [Feature engineering](https://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/05.04-Feature-Engineering.ipynb) - Modify the data to improve model predictions. 126 | - [Scientific introduction to 10 important Data Science algorithms](http://www.cs.umd.edu/%7Esamir/498/10Algorithms-08.pdf) 127 | - [Model ensemble: Explanation](https://www.analyticsvidhya.com/blog/2017/02/introduction-to-ensembling-along-with-implementation-in-r/) - Combine multiple models into one for better performance. 128 | 129 | 130 | 131 | 132 | ## Competitions 133 | 134 | Some data mining competition platforms: 135 | * [Kaggle](https://www.kaggle.com/) 136 | * [DrivenData](https://www.drivendata.org/) 137 | * [Analytics Vidhya](https://www.analyticsvidhya.com/blog/tag/data-science-competitions/) 138 | * [The Data Science Game](http://www.datasciencegame.com/) 139 | * [InnoCentive](https://www.innocentive.com/) 140 | * [TunedIT](http://tunedit.org/challenges) 141 | 142 | ## Data Science Ideas 143 | 144 | ### Human Resources 145 | - [Competency forecasting](http://onlinelibrary.wiley.com/doi/10.1111/j.1468-2389.1993.tb00083.x/full) 146 | - [Employee churn analytics](http://www.predictiveanalyticsworld.com/patimes/employee-churn-201-calculating-employee-value/3321/) 147 | - [Employee performance analytics](http://www.halogensoftware.com/blog/employee-performance-data-the-most-underused-data-set-in-hr) 148 | - [Network analytics on employee interactions](http://lcs.ios.ac.cn/~shil/paper/Smallblue_PIEEE.pdf) 149 | - [Resume matching, preselection and 
tagging](https://www.quora.com/What-is-the-best-algorithm-to-match-resumes-with-jobs) 150 | - [Workforce planning](https://www.slideshare.net/wayneraw/workforce-planning) 151 | 152 | ### Finance 153 | - Cost analytics 154 | - [Fraud detection](https://en.wikipedia.org/wiki/Data_analysis_techniques_for_fraud_detection) 155 | - Waste and abuse detection 156 | 157 | ### IT 158 | - [Component quality analytics](https://www.backblaze.com/blog/hard-drive-reliability-stats-q1-2016/) 159 | - [Cybercrime detection](http://www.sas.com/en_be/software/fraud-security-intelligence/cybersecurity-solutions.html) 160 | - [Server performance monitoring and alerting](http://www.coscale.com/) 161 | - Incident management tickets [automatic routing and reply](https://www.channele2e.com/2016/12/23/automate-trouble-ticketing-management-with-natural-language-processing/) or [clustering](https://link.springer.com/chapter/10.1007/978-3-319-46295-0_58) 162 | 163 | ### Marketing 164 | 165 | 166 | - [Churn/Customer attrition](https://en.wikipedia.org/wiki/Customer_attrition#prediction) 167 | - [Customer segmentation](https://ds4ci.files.wordpress.com/2013/09/user08_jimp_custseg_revnov08.pdf) 168 | - [Life Time Value](https://dataorigami.net/blogs/napkin-folding/18868411-lifetimes-measuring-customer-lifetime-value-in-python) 169 | - [Personalized advertising](http://ieeexplore.ieee.org/document/7273289/) 170 | - [Product recommendation engines](http://www.kdnuggets.com/2015/10/big-data-recommendation-systems-change-lives.html) using recommendation engines 171 | - [Marketing Optimization](http://www.marketingoptimizer.com/marketing-optimization/) 172 | - [Social Media Analytics](https://cran.r-project.org/web/packages/SocialMediaLab/) 173 | - [Text Analytics on customer complaints](https://dev.socrata.com/blog/2016/05/03/natural-language-with-sodapy-and-algorithmia.html) 174 | 175 | ### Sales 176 | 177 | - [Cross-sell 
opportunities](https://www.analyticsvidhya.com/blog/2015/08/learn-cross-selling-upselling/) using propensity models 178 | - [Lead scoring](http://marketingland.com/maximizing-lead-scoring-analytics-use-big-data-b2b-101956) 179 | - [Price elasticity](https://support.sas.com/rnd/app/ets/examples/simpelast/index.htm) 180 | - [Revenue forecasting](http://analytics.ncsu.edu/sesug/2007/PO10.pdf) or [Kaggle](https://www.kaggle.com/c/rossmann-store-sales) 181 | 182 | 183 | ### Supply chain 184 | - [Demand forecasting](https://www.slideshare.net/vishnuvsvn/demand-forecasting-in-supply-chain) 185 | - [Gas purchase optimization](http://pubsonline.informs.org/doi/pdf/10.1287/opre.40.3.446) 186 | - [Inventory forecasting](https://hbr.org/1971/07/how-to-choose-the-right-forecasting-technique) 187 | - [Optimal routes](http://www.sciencedirect.com/science/article/pii/S22125671163004780) 188 | - [Warehouse location optimization](https://en.wikipedia.org/wiki/Weber_problem) 189 | 190 | ### Insurance 191 | - [Fraud detection](http://www.jstor.org/stable/3182781) 192 | - [Litigation prediction](http://www.propertycasualty360.com/2014/08/22/using-predictive-analytics-in-litigation-managemen?slreturn=1483353120) 193 | - [Pricing using telematics](https://lirias.kuleuven.be/handle/123456789/552745) 194 | - Solvency II and ORSA compliance 195 | - [Risk analytics](https://en.wikipedia.org/wiki/Analytics#Risk_analytics) 196 | 197 | ### Life sciences 198 | - Design of experiments 199 | - [R&D portfolio optimization](http://www.athlycs.be/portfolio-insight) 200 | 201 | ### Manufacturing 202 | - [Predictive Asset Maintenance](http://www.genesissolutions.com/asset-management-to-be-a-key-in-internet-of-things-manufacturing-deployments/) 203 | - [Quality control](http://necsi.edu/affiliates/braha/IEEE-Cleaning_02.pdf) 204 | 205 | 206 | #### Finance and Tax 207 | - [Budget planning and simulation](https://www.edx.org/course/macroeconometric-forecasting-imfx-mfx-0) 208 | - [Customs fraud 
detection](http://ieeexplore.ieee.org/document/1167400/) 209 | - [Tax audit triage](https://the-modeling-agency.com/triage-for-tax-auditstm/) 210 | 211 | #### Public Safety 212 | - Crime Wave Detection 213 | - Patrolling Suggestions (Preventative Policing) 214 | - Crime Case Resolution Prediction 215 | - Crime Clustering 216 | - Complex/Organised Crime network detection 217 | - Terrorist Cell Identification 218 | - Alerting & Officer Safety 219 | - Criminal Evolution 220 | - Domestic Violence 221 | - Radicalisation prediction 222 | - [Mass scale surveillance](https://en.wikipedia.org/wiki/PRISM_(surveillance_program)) 223 | 224 | ## Data Sets 225 | 226 | * [Academic Torrents](http://academictorrents.com/) 227 | * [hadoopilluminated.com](http://hadoopilluminated.com/hadoop_illuminated/Public_Bigdata_Sets.html) 228 | * [data.gov](https://catalog.data.gov/dataset) - The home of the U.S. Government's open data 229 | * [United States Census Bureau](http://www.census.gov/) 230 | * [usgovxml.com](http://usgovxml.com/) 231 | * [enigma.com](http://enigma.com/) - Navigate the world of public data - Quickly search and analyze billions of public records published by governments, companies and organizations. 232 | * [datahub.io](https://datahub.io/) 233 | * [aws.amazon.com/datasets](https://aws.amazon.com/datasets/) 234 | * [databib.org](http://databib.org/) 235 | * [datacite.org](https://www.datacite.org) 236 | * [quandl.com](https://www.quandl.com/) - Get the data you need in the form you want; instant download, API or direct to your app. 
237 | * [figshare.com](https://figshare.com/) 238 | * [GeoLite Legacy Downloadable Databases](http://dev.maxmind.com/geoip/legacy/geolite/) 239 | * [Quora's Big Datasets Answer](https://www.quora.com/Where-can-I-find-large-datasets-open-to-the-public) 240 | * [Public Big Data Sets](http://hadoopilluminated.com/hadoop_illuminated/Public_Bigdata_Sets.html) 241 | * [Houston Data Portal](http://data.ohouston.org/) 242 | * [Kaggle Data Sources](https://www.kaggle.com/wiki/DataSources) 243 | * [Kaggle Datasets](https://www.kaggle.com/datasets) 244 | * [A Deep Catalog of Human Genetic Variation](http://www.internationalgenome.org/data) 245 | * [A community-curated database of well-known people, places, and things](https://developers.google.com/freebase/) 246 | * [Google Public Data](http://www.google.com/publicdata/directory) 247 | * [World Bank Data](http://data.worldbank.org/) 248 | * [NYC Taxi data](http://chriswhong.github.io/nyctaxi/) 249 | * [Open Data Philly](https://www.opendataphilly.org/) Connecting people with data for Philadelphia 250 | * [A list of useful sources](http://ahmetkurnaz.net/en/statistical-data-sources/) A blog post includes many data set databases 251 | * [grouplens.org](https://grouplens.org/datasets/) Sample movie (with ratings), book and wiki datasets 252 | * [UC Irvine Machine Learning Repository](http://archive.ics.uci.edu/ml/) - contains data sets good for machine learning 253 | * [research-quality data sets](http://web.archive.org/web/20150320022752/https://bitly.com/bundles/hmason/1) by [Hilary Mason](http://web.archive.org/web/20150501033715/https://bitly.com/u/hmason/bundles) 254 | * [National Climatic Data Center - NOAA](https://www.ncdc.noaa.gov/) 255 | * [ClimateData.us](http://www.climatedata.us/) (related: [U.S. 
Climate Resilience Toolkit](https://toolkit.climate.gov/)) 256 | * [r/datasets](https://www.reddit.com/r/datasets/) 257 | * [MapLight](http://maplight.org/data) - provides a variety of data free of charge for uses that are freely available to the general public. Click on a data set below to learn more 258 | * [GHDx](http://ghdx.healthdata.org/) - Institute for Health Metrics and Evaluation - a catalog of health and demographic datasets from around the world and including IHME results 259 | * [St. Louis Federal Reserve Economic Data - FRED](https://fred.stlouisfed.org/) 260 | * [New Zealand Institute of Economic Research – Data1850](https://data1850.nz/) 261 | * [Dept. of Politics @ New York University](http://www.nyu.edu/projects/politicsdatalab/datasupp_datasources.html) 262 | * [Open Data Sources](https://github.com/datasciencemasters/data) 263 | * [UNICEF Statistics and Monitoring](https://www.unicef.org/statistics/index_24287.html) 264 | * [UNICEF Data](https://data.unicef.org/) 265 | * [undata](http://data.un.org/) 266 | * [NASA SocioEconomic Data and Applications Center - SEDAC](http://sedac.ciesin.columbia.edu/) 267 | * [The GDELT Project](http://gdeltproject.org/) 268 | * [Sweden, Statistics](http://www.scb.se/en/) 269 | * [Github free data source list](http://www.datasciencecentral.com/profiles/blogs/great-github-list-of-public-data-sets) 270 | * [StackExchange Data Explorer](http://data.stackexchange.com) - an open source tool for running arbitrary queries against public data from the Stack Exchange network. 
271 | * [San Francisco Government Open Data](https://data.sfgov.org/) 272 | * [IBM blog about open data](http://www.datasciencecentral.com/profiles/blogs/the-free-big-data-sources-everyone-should-know) 273 | * [Open data Index](http://index.okfn.org/) 274 | * [Liver Tumor Segmentation Challenge Dataset](http://www.lits-challenge.com/) 275 | 276 | Reference 277 | https://github.com/bulutyazilim/awesome-datascience 278 | https://github.com/JosPolfliet/awesome-datascience-ideas 279 | https://github.com/siboehm/awesome-learn-datascience 280 | -------------------------------------------------------------------------------- /_config.yml: -------------------------------------------------------------------------------- 1 | theme: jekyll-theme-minimal -------------------------------------------------------------------------------- /cheatsheets/21616132_483510435358709_4877869234411042695_n.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/umer7/Data-Science-in-Python/da6203f959f1c7800823d2d37508b158c9809365/cheatsheets/21616132_483510435358709_4877869234411042695_n.jpg -------------------------------------------------------------------------------- /cheatsheets/Quandl-NumPy-SciPy-Pandas-Cheat-Sheet.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/umer7/Data-Science-in-Python/da6203f959f1c7800823d2d37508b158c9809365/cheatsheets/Quandl-NumPy-SciPy-Pandas-Cheat-Sheet.pdf -------------------------------------------------------------------------------- /cheatsheets/Scikit-Learn-Infographic.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/umer7/Data-Science-in-Python/da6203f959f1c7800823d2d37508b158c9809365/cheatsheets/Scikit-Learn-Infographic.pdf -------------------------------------------------------------------------------- /cheatsheets/bokeh(plot).pdf: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/umer7/Data-Science-in-Python/da6203f959f1c7800823d2d37508b158c9809365/cheatsheets/bokeh(plot).pdf -------------------------------------------------------------------------------- /cheatsheets/importing data.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/umer7/Data-Science-in-Python/da6203f959f1c7800823d2d37508b158c9809365/cheatsheets/importing data.pdf -------------------------------------------------------------------------------- /cheatsheets/keras.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/umer7/Data-Science-in-Python/da6203f959f1c7800823d2d37508b158c9809365/cheatsheets/keras.pdf -------------------------------------------------------------------------------- /cheatsheets/matplotlib.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/umer7/Data-Science-in-Python/da6203f959f1c7800823d2d37508b158c9809365/cheatsheets/matplotlib.pdf -------------------------------------------------------------------------------- /cheatsheets/ml_02.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/umer7/Data-Science-in-Python/da6203f959f1c7800823d2d37508b158c9809365/cheatsheets/ml_02.jpg -------------------------------------------------------------------------------- /cheatsheets/numpy.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/umer7/Data-Science-in-Python/da6203f959f1c7800823d2d37508b158c9809365/cheatsheets/numpy.pdf -------------------------------------------------------------------------------- /cheatsheets/pandas11.pdf: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/umer7/Data-Science-in-Python/da6203f959f1c7800823d2d37508b158c9809365/cheatsheets/pandas11.pdf -------------------------------------------------------------------------------- /cheatsheets/pyt.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/umer7/Data-Science-in-Python/da6203f959f1c7800823d2d37508b158c9809365/cheatsheets/pyt.pdf -------------------------------------------------------------------------------- /cheatsheets/scikitlearn.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/umer7/Data-Science-in-Python/da6203f959f1c7800823d2d37508b158c9809365/cheatsheets/scikitlearn.pdf -------------------------------------------------------------------------------- /cheatsheets/seeborn.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/umer7/Data-Science-in-Python/da6203f959f1c7800823d2d37508b158c9809365/cheatsheets/seeborn.pdf -------------------------------------------------------------------------------- /notebooks/02+Training+and+Testing+Data.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Scikit-learn Tutorial" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "Training and Testing Data\n", 15 | "=====================================\n", 16 | "\n", 17 | "To evaluate how well our supervised models generalize, we can split our data into a training and a test set:\n", 18 | "\n", 19 | "" 20 | ] 21 | }, 22 | { 23 | "cell_type": "code", 24 | "execution_count": 1, 25 | "metadata": { 26 | "collapsed": true 27 | }, 28 | "outputs": [], 29 | "source": [ 30 | "%matplotlib inline\n", 31 | "import matplotlib.pyplot as plt\n", 32 | "import numpy as np" 33 | ] 34 | 
}, 35 | { 36 | "cell_type": "code", 37 | "execution_count": 2, 38 | "metadata": { 39 | "collapsed": true 40 | }, 41 | "outputs": [], 42 | "source": [ 43 | "from sklearn.datasets import load_iris\n", 44 | "from sklearn.neighbors import KNeighborsClassifier\n", 45 | "\n", 46 | "iris = load_iris()\n", 47 | "X, y = iris.data, iris.target\n", 48 | "\n", 49 | "classifier = KNeighborsClassifier()" 50 | ] 51 | }, 52 | { 53 | "cell_type": "markdown", 54 | "metadata": {}, 55 | "source": [ 56 | "Thinking about how machine learning is normally performed, the idea of a train/test split makes sense. Real world systems train on the data they have, and as other data comes in (from customers, sensors, or other sources) the classifier that was trained must predict on fundamentally *new* data. We can simulate this during training using a train/test split - the test data is a simulation of \"future data\" which will come into the system during production. \n", 57 | "\n", 58 | "Specifically for iris, the 150 labels are sorted by class, which means that splitting the data proportionally without shuffling would result in fundamentally altered class distributions. For instance, if we performed a common 2/3 training data and 1/3 test data split, our training dataset would consist only of flower classes 0 and 1 (Setosa and Versicolor), and our test set would contain only samples with class label 2 (Virginica flowers).\n", 59 | "\n", 60 | "Under the assumption that all samples are independent of each other (in contrast to time series data), we want to **randomly shuffle the dataset before we split the dataset** as illustrated above." 
61 | ] 62 | }, 63 | { 64 | "cell_type": "code", 65 | "execution_count": 2, 66 | "metadata": {}, 67 | "outputs": [ 68 | { 69 | "data": { 70 | "text/plain": [ 71 | "array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,\n", 72 | " 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,\n", 73 | " 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,\n", 74 | " 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,\n", 75 | " 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,\n", 76 | " 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,\n", 77 | " 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])" 78 | ] 79 | }, 80 | "execution_count": 2, 81 | "metadata": {}, 82 | "output_type": "execute_result" 83 | } 84 | ], 85 | "source": [ 86 | "y" 87 | ] 88 | }, 89 | { 90 | "cell_type": "markdown", 91 | "metadata": {}, 92 | "source": [ 93 | "Now we need to split the data into training and testing. Luckily, this is a common pattern in machine learning and scikit-learn has a pre-built function to split data into training and testing sets for you. Here, we use 50% of the data as training, and 50% testing. 80% and 20% is another common split, but there are no hard and fast rules. The most important thing is to fairly evaluate your system on data it *has not* seen during training!" 
94 | ] 95 | }, 96 | { 97 | "cell_type": "code", 98 | "execution_count": 3, 99 | "metadata": {}, 100 | "outputs": [ 101 | { 102 | "name": "stdout", 103 | "output_type": "stream", 104 | "text": [ 105 | "Labels for training and testing data\n", 106 | "[1 1 0 2 2 0 0 1 1 2 0 0 1 0 1 2 0 2 0 0 1 0 0 1 2 1 1 1 0 0 1 2 0 0 1 1 1\n", 107 | " 2 1 1 1 2 0 0 1 2 2 2 2 0 1 0 1 1 0 1 2 1 2 2 0 1 0 2 2 1 1 2 2 1 0 1 1 2\n", 108 | " 2]\n", 109 | "[1 2 2 1 0 2 1 0 0 1 2 0 1 2 2 2 0 0 1 0 0 2 0 2 0 0 0 2 2 0 2 2 0 0 1 1 2\n", 110 | " 0 0 1 1 0 2 2 2 2 2 1 0 0 2 0 0 1 1 1 1 2 1 2 0 2 1 0 0 2 1 2 2 0 1 1 2 0\n", 111 | " 2]\n" 112 | ] 113 | } 114 | ], 115 | "source": [ 116 | "from sklearn.model_selection import train_test_split\n", 117 | "\n", 118 | "train_X, test_X, train_y, test_y = train_test_split(X, y, \n", 119 | " train_size=0.5, \n", 120 | " random_state=123)\n", 121 | "print(\"Labels for training and testing data\")\n", 122 | "print(train_y)\n", 123 | "print(test_y)" 124 | ] 125 | }, 126 | { 127 | "cell_type": "markdown", 128 | "metadata": {}, 129 | "source": [ 130 | "---" 131 | ] 132 | }, 133 | { 134 | "cell_type": "markdown", 135 | "metadata": {}, 136 | "source": [ 137 | "**Tip: Stratified Split**\n", 138 | "\n", 139 | "Especially for relatively small datasets, it's better to stratify the split. Stratification means that we maintain the original class proportion of the dataset in the test and training sets. For example, after we randomly split the dataset as shown in the previous code example, we have the following class proportions in percent:" 140 | ] 141 | }, 142 | { 143 | "cell_type": "code", 144 | "execution_count": 4, 145 | "metadata": {}, 146 | "outputs": [ 147 | { 148 | "name": "stdout", 149 | "output_type": "stream", 150 | "text": [ 151 | "All: [ 33.33333333 33.33333333 33.33333333]\n", 152 | "Training: [ 30.66666667 40. 29.33333333]\n", 153 | "Test: [ 36. 
26.66666667 37.33333333]\n" 154 | ] 155 | } 156 | ], 157 | "source": [ 158 | "print('All:', np.bincount(y) / float(len(y)) * 100.0)\n", 159 | "print('Training:', np.bincount(train_y) / float(len(train_y)) * 100.0)\n", 160 | "print('Test:', np.bincount(test_y) / float(len(test_y)) * 100.0)" 161 | ] 162 | }, 163 | { 164 | "cell_type": "markdown", 165 | "metadata": {}, 166 | "source": [ 167 | "So, in order to stratify the split, we can pass the label array as an additional option to the `train_test_split` function:" 168 | ] 169 | }, 170 | { 171 | "cell_type": "code", 172 | "execution_count": 5, 173 | "metadata": {}, 174 | "outputs": [ 175 | { 176 | "name": "stdout", 177 | "output_type": "stream", 178 | "text": [ 179 | "All: [ 33.33333333 33.33333333 33.33333333]\n", 180 | "Training: [ 33.33333333 33.33333333 33.33333333]\n", 181 | "Test: [ 33.33333333 33.33333333 33.33333333]\n" 182 | ] 183 | } 184 | ], 185 | "source": [ 186 | "train_X, test_X, train_y, test_y = train_test_split(X, y, \n", 187 | " train_size=0.5, \n", 188 | " random_state=123,\n", 189 | " stratify=y)\n", 190 | "\n", 191 | "print('All:', np.bincount(y) / float(len(y)) * 100.0)\n", 192 | "print('Training:', np.bincount(train_y) / float(len(train_y)) * 100.0)\n", 193 | "print('Test:', np.bincount(test_y) / float(len(test_y)) * 100.0)" 194 | ] 195 | }, 196 | { 197 | "cell_type": "markdown", 198 | "metadata": {}, 199 | "source": [ 200 | "---" 201 | ] 202 | }, 203 | { 204 | "cell_type": "markdown", 205 | "metadata": {}, 206 | "source": [ 207 | "## The scikit-learn estimator API\n", 208 | "" 209 | ] 210 | }, 211 | { 212 | "cell_type": "markdown", 213 | "metadata": {}, 214 | "source": [ 215 | "By evaluating our classifier performance on data that has been seen during training, we could get false confidence in the predictive power of our model. 
In the worst case, it may simply memorize the training samples but completely fail to classify new, similar samples -- we really don't want to put such a system into production!\n", 216 | "\n", 217 | "Instead of using the same dataset for training and testing (this is called \"resubstitution evaluation\"), it is much better to use a train/test split in order to estimate how well your trained model is doing on new data." 218 | ] 219 | }, 220 | { 221 | "cell_type": "code", 222 | "execution_count": 12, 223 | "metadata": {}, 224 | "outputs": [ 225 | { 226 | "name": "stdout", 227 | "output_type": "stream", 228 | "text": [ 229 | "0.96\n" 230 | ] 231 | } 232 | ], 233 | "source": [ 234 | "classifier.fit(train_X, train_y)\n", 235 | "pred_y = classifier.predict(test_X)\n", 236 | "pred_y\n", 237 | "#print(\"Fraction Correct [Accuracy]:\")\n", 238 | "print(np.sum(pred_y == test_y) / float(len(test_y)))\n" 239 | ] 240 | }, 241 | { 242 | "cell_type": "markdown", 243 | "metadata": {}, 244 | "source": [ 245 | "We can also visualize the correct and failed predictions:" 246 | ] 247 | }, 248 | { 249 | "cell_type": "code", 250 | "execution_count": 26, 251 | "metadata": {}, 252 | "outputs": [ 253 | { 254 | "name": "stdout", 255 | "output_type": "stream", 256 | "text": [ 257 | "Samples correctly classified:\n", 258 | "[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24\n", 259 | " 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 45 46 47 48 50 51\n", 260 | " 52 53 54 55 56 57 58 59 61 62 63 64 65 66 67 68 69 70 71 72 73 74]\n", 261 | "\n", 262 | "Samples incorrectly classified:\n", 263 | "[44 49 60]\n" 264 | ] 265 | } 266 | ], 267 | "source": [ 268 | "print('Samples correctly classified:')\n", 269 | "correct_idx = np.where(pred_y == test_y)[0]\n", 270 | "print(correct_idx)\n", 271 | "\n", 272 | "print('\\nSamples incorrectly classified:')\n", 273 | "incorrect_idx = np.where(pred_y != test_y)[0]\n", 274 | "print(incorrect_idx)\n", 275 | "\n", 276 | " " 277 | ] 
278 | }, 279 | { 280 | "cell_type": "code", 281 | "execution_count": 27, 282 | "metadata": {}, 283 | "outputs": [ 284 | { 285 | "data": { 286 | "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXwAAAEWCAYAAABliCz2AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3XucHGWZ9//PNwcNgQTIAQTCZBB2XZNwjhIRMOKjIqL7\n4gc8gpEFVjJIfuvCurKsZh9PSx5/iALqbnBHUQiM6wHlkZ+7HtAFAVnUBAlnUGEmJKhAAgkQOSRz\nPX9UDXQm3TM1XV19/L5fr37N9N3VVVdXKtdUX3XfdykiMDOz9jeu0QGYmVl9OOGbmXUIJ3wzsw7h\nhG9m1iGc8M3MOoQTvplZh3DCt0wkLZL04wLWe7qkW2q93pL1/0DSaSXPL5D0hKQ/SOqS9Iyk8QVs\n9xlJr671eutF0ickXd3oOKy2nPANAEn9kv5Hpdcjoi8i3lblut8u6SZJT0t6XNLPJL27+mizi4h3\nRMSVaRxdwN8DcyLiVRGxJiJ2ioitebYh6UZJZw7b7k4R8VCe9TYLSd2SQtKERsdi+Tjh26jy/EeX\ndCLwbWAFMAvYHfgY8K7aRDcmXcD6iHisAduuOSdgGysnfNtOWmb5uaRLJK0HPlFaelHiEkmPSdok\n6S5J88qsR8DFwD9HxFciYmNEDEbEzyJicYVtf17SI+l6V0k6suS110tamb72R0kXp+2TJF0tab2k\npyT9StLu6Ws3Sjoz/fZyPbBnWm65YviZq6Rpkr4m6VFJT0r6P2n7rpK+n347eTL9fVb62jLgSOBf\n0vX+S9oekvZLf99Z0or0/QOS/knSuJJ9fYukz6brfljSO0b4t+mXdL6kO4FnJU2QtKek76Trf1jS\n32bYZwslrS2z7nLf8m5Kfz6VfsY3SNov/aa2MS2RfbNSzNY8nPCtksOAh0jOyJcNe+1twFHAnwM7\nA/8TWF9mHa8B9gauGcN2fwUcBEwDvg58W9Kk9LXPA5+PiKnAvsC30vbT0jj2BqYDHwD+VLrSiPgJ\n8A7g0bTccnqZbV8FTAbmArsBl6Tt44CvAbNJviX8CfiXdL1LgZuBv0nX+zdl1vvFNL5XA28C/go4\no+T1w4AHgBnAZ4DL0z+WlZwCvBPYBRgE/n9gNbAX8BbgXElvT5ettM/G4qj05y7pZ/xv4J+BHwO7\nknxz+2IV67U6c8K3Sh6NiC9GxJaI+NOw114EpgB/ASgi7ouI35dZx/T0Z7nXyoqIqyNifbrdzwGv\nJPnDMbTd/STNiIhnIuK2kvbpwH4RsTUiVkXEpqzbBJC0B8kfhA9ExJMR8WJE/CyNaX1EfCciNkfE\n0yR/AN+Ucb3jgZOBj0TE0xHRD3wOOLVksYGI+HJ6LeFKYA+SP7SVfCEiHkn/XV4HzIyIT0XEC+l1\ngy+n24TK+yyvF0n+AO4ZEc9FRGEX3q12nPCtkkcqvRAR/0VyhvuvwGOSeiVNLbPo0Fn/Hlk3KunD\nku5LSwVPkZwZz0hffj/Jt4r707LNcWn7VcCPgG+k5ZjPSJqYdZupvYENEfFkmZgmS/q3tByziaTE\nsYuy9e6ZAUwEBkraBkjOxof8YeiXiNic/rrTCOss/beZTVKmemroAXyUl/9gVNpnef0DIOCXku6R\n9Nc1Wq8VyAnfKhlxGtWI+EJEHArMIUko55VZ7AGS5HRClg2m9fp/ICkR7RoRuwAbSRILEfGbiDiF\npNxyIXCNpB3Ts/FPRsQc4HDgOJKyyVg8AkyTtEuZ1/6e5FvGYWlpZKjEMVR2GWlfPcHLZ8NDuoB1\nY4yvVOn2HgEejohdSh5TIuJYqLzPgGdJylfJB0n+eM3MsD3S
9f4hIhZHxJ7AWcDyoWsW1ryc8G3M\nJL1O0mHpWfSzwHMkteRtRDL39oeA/yXpDElTJY2TdISk3jKrngJsAR4HJkj6GPDSNwdJ75M0MyIG\ngafS5kFJb5a0f5q0NpEk2O3iGUlakvoBSeLaVdJESUOJfQpJ3f4pSdOAjw97+x9J6vPl1ruVpG6+\nTNIUSbPTfVKrPu6/BJ5OL+TuIGm8pHmSXgeV9xnwIDBJ0jvTf8d/IimflfN4+p6XPqOkk4YuXANP\nkvxRGNM+t/pzwrdqTCWpEz9JUp5YD1xUbsGIuAZ4D/DXwKMkyfEC4HtlFv8R8EOSZDRA8oektHxx\nDHCPpGdILkaenNaxX0VyYXgTcB/wM5Iyz1idSvLH4n7gMeDctP1SYAeSs/Xb0hhLfR44Me1l84Uy\n6/0gyR/Gh4BbSC5Gf7WK+LaT/kE5juRC98NpjF8hKYVBhX0WERuBJemy69L41lJGWmZaBvw8LRst\nILl28It0vdcB57TLuIN2Jt8AxcysM/gM38ysQzjhm5l1CCd8M7MOUVjCl/QaSXeUPDZJOnf0d5qZ\nWRHqctE27S63jqQf80Cl5WbMmBHd3d2Fx2Nm1i5WrVr1RERUGkOxjXrNtvcW4HcjJXuA7u5uVq5c\nWaeQzMxan6QR82qpetXwTwb+vdwLknrS2fxWPv7443UKx8ys8xSe8CW9Ang3yZzo24mI3oiYHxHz\nZ87M9K3EzMyqUI8z/HcAt0fEH+uwLTMzq6AeCf8UKpRzzMysfgpN+OmsfG8FvlvkdszMbHSF9tKJ\niGd5+SYYZmbWQB5pa2bWIZzwrQp9QDfJ4dOdPjfLy8dV0eo18MraRh/QAwzdiW8gfQ6wqCERWTvw\ncVUPPsO3MVrKy/8ph2xO282q5eOqHpzwbYzWjLHdLAsfV/XghG9j1DXG9lpyjbd9NfK46hxO+DZG\ny4DJw9omp+1FGqrxDpDcL3uoxuuk3x4adVx1Fid8G6NFQC8wG1D6s5fiL6y5xtveGnVcdRb30rEq\nLKL+/xFd421/jTiuOovP8K1FuMZrlpcTvrUI13jN8nLCtxbhGq9ZXq7hWwtxjdcsD5/hm5l1CCd8\nM7MO4YRvdebRsmaN4hq+1ZFnRDRrJJ/hWx15tKxZIznhWx15tKxZIznhWx15tKxZIznhWx15tKxZ\nIznhWx15tKxZIznhWxXydK1cBPQDg+nPsSR7d+k0y8PdMm2MGtW10l06zfLyGb6NUaO6VrpLp1le\nhSZ8SbtIukbS/ZLuk/SGIrdn9dCorpWN7NKZp5TkMpQ1j6JLOp8HfhgRJ0p6Bdt30bCW00VSTinX\n3o7bzVNKchnKmkthZ/iSdgaOAi4HiIgXIuKporZn9dKorpWN2m6eUpLLUNZciizp7AM8DnxN0q8l\nfUXSjsMXktQjaaWklY8//niB4VhtNKprZaO2m6eU5JHF1lwUEcWsWJoP3Aa8MSJ+IenzwKaI+F+V\n3jN//vxYuXJlIfGYVaeb8qWk2STdSot6r1k2klZFxPwsyxZ5hr8WWBsRv0ifXwMcUuD2zApw7Bjb\nS3lksTWXwhJ+RPwBeETSa9KmtwD3FrU9s2L85xjbS3lksTWXovvhfxDok3QncBDwvwvenrW1JSQd\ny5T+XFKHbeatw+cZWWxWW4V2y4yIO4BMtSWzkS0BLit5vrXk+fICt9uo7qBmteeRttYiesfYXiuu\nw1v7cMK3FrF1jO214jq8tQ8nfGsR48fY3iw8tYI1Dyd8axE9Y2yvlaHpEQaA4OXpEbIk7jzvNas9\nJ3xrEcuBs3n5jH58+rzIC7bgqRWsnXg+fGshyyk+wQ/nqRWsffgM32xEeW687pu2W3NxwjcbUZ5u\nme7Sac3FCd9aSCNG2ubplukundZcXMO3FtGokbaQJOhqk3Se95rVls/wrUU0aqStWftwwrcW0aiR\ntmbtwwnfWkSrjrQ1ax5O
+NYiGjXS1qx9+KKttYihC7O9JGWc8STJvt4Dscxal8/wrYUsB7aQzEuz\nhfole0+AZu3BZ/hmIxqaAG1oTpyhCdDA3S2t1fgM32xEngDN2ocTvtmIPAGatQ8nfLMReQI0ax9O\n+GYj8gRo1j6c8M1G5AnQrH044XesPDNPupuiWStyt8yOlGfmyU7rpthpn9famSKi0TG8ZP78+bFy\n5cpGh9EBJlB+0rHxJAOaRtJNkvSGmw3054qqOXXTWZ/XWo2kVRExP8uyhZ7hS+oHnibJLluyBmVF\nyzPzZKd1U+y0z2vtrGINX9J1GR5XZNjGmyPioPok+1asLTci5jwzTzaum+Itn1zC2p0nMCixducJ\n3PLJetzxyt0yrX2MdIb/WuDMEV4X8K+1DSePVqy1NirmHrat4Ze2j2YZ28YM9eimeMsnl3DwBZex\nY1pxmrVpK7tecBm3AEd8vMg5dRrzec2KULGGL+l/RsS3RnzzKMtIehjYSFIr+LeIGPH2RPlq+N20\nXq21m8bFvITqZ57sI5laYA3Jme4yiv6junbnCczatH3Jae3U8czaONp1h7zq/3nNshpLDb/Qi7aS\n9oqIdZJ2A64HPhgRNw1bpof01LKrq+vQgYFyCTCLcSSzKG4XBTBY5TqL1ooxN8agVLb+OAiMa6KO\nB2b1NpaEP2o/fEnzJV0r6XZJd0q6S9KdWVYeEevSn48B1wKvL7NMb0TMj4j5M2fOzLLaClqv1vrM\n89PG1N7JnppR/lCt1G5m28vyv6UP+BpwAvAu4Lj054gk7ShpytDvwNuAu6sPdTStNwT+o9+FZ5/f\ntu3Z55N229bkz+5Q9p938md3aEg8Zq0oS8J/PCKui4iHI2Jg6JHhfbsDt0haDfwS+I+I+GGuaEfU\nekPg/+W/NrB4BfQ/AYOR/Fy8ImlvZn239dF9fjfjFo+j+/xu+m4rvmfRpNM2l/3nnXTa8KmLa68x\nvYPMam/UGr6ktwCnAD8FXjofjYian4d22sCr7vO7Gdiw/d/O2dNm039hf/0DyqDvtj56ruph8wsv\nJ9rJr5hM76m9LFpQ5B/XGcD6Mu3TgScK2+rw3kEAz06AX//T2QX3DjLLpqY1fOAM4CDgGJJSzlBZ\nx3JadvwyJr9i2zrF5FdMZtnxzVuGWnrt0m2SPcDmFzaz9Nr2vCFI98W92yR7gB23JO1mrSbLSNvX\nRcRrCo+kAw2dES+9dilrNqyha1oXy45fVvCZcj5rNpQfYVqpvXYqlbmKLX/tWaYr6EjtZs0syxn+\nrZLmFB5Jh1q0YBH9F/Yz+OVB+i/sb+pkD9A1rXyvp0rtw1Vf/8/XC6va7T46tfzo40rttdquWRGy\nJPwFwB2SHhhrt0xrP3nKUEP1/4ENAwTBwIYBeq7qyZgEq++FlWe7/R/q4dlh34OfnZC0F7ldsyJk\nuWg7u1x7xp46Y9JpF21bVd9tfVWVofJfpK5uxGve7d7yySV0X9zLnpu28ujU8fR/qCfTBdtWvChv\nrafWs2XuAdwTEU+nK59KMs9OzRO+tYZFCxZVVXrKW//vuw2WXgtrNkDXNFh2PCxaUPx2j/j4ckgT\n/Kz0kUXjrneYlZelpHMZ8EzJ82coP/OW2Yjy1P/zlEfyXneoVqO2a1ZJloSvKKn7RMQgvlOWVSFP\n/T9Pd9BGdX9txW631t6yJPyHJP2tpInp4xzgoaIDs/azaMEiek/tZfa02Qgxe9rszAO28pRH8mw3\nj0Zt16ySLBdtdwO+ABxNMrXjT4Fz0wnRairvRdtqLyY2VN8SWNoLa7ZC13hY1gOLPIJzuEZeAG3J\n48o6Rk0v2qaJ/eTcURVs+JD/oRov0Lz/OfuWQM9lJfc/2Zo8Byf9Yfbbbb+yCX+/3fYrdLsteVyZ\nVTDSDVB6RrthSZZlxiLPGX5LdoHrnpAk+eFmj4f+om/q0VomnDWBrYPb76vx48az5d+K21
cteVxZ\nR6nVGf4/ShppVioB55DMYdhwLdkFbk2F4fmV2jtYuWQ/Uvtw1ZZlWvK4MqtgpIT/M0af9/76GsaS\nS9e0rrJnYk3dBa5rfPkz/K5sw/Y7yTiNYzC2vwvYOI3e7yBPWaYljyuzCir+b4mIMzI8zq1nsCNp\nyS5wy3oqzBaQ5WbinWWHieVvdFKpvVQrduk0K0Lb3B+uJbvALVoOvWcnNXuR/Ow9u60v2C65egkT\nzpqAFosJZ01gydXZbiYyPGGP1l6qFbt0mhWh0JuYj5Xn0mlvS65ewmU/236Q9tlvOpvl7xv5j1ye\ni6e+8GrtrNY3QDGrid6by1/fr9ReKk9pxWUZs8So/fAlvZLkBubdpctHxKeKC8vaUZ6eNnluFtOK\nN5oxK0KWOXG+B2wEVlFyT1uzsRo/bnzFvvRFq3aGT7N2kiXhz4qIYwqPxNpez5E9ZWv4PUdmv5mI\nR7yaVS/rLQ73LzwSa3vL37ecs9909ktn9OPHjc90wRY67+bpZkUYaWqFu0gmS5sA/BnJDJnPk3Qg\njIg4oNbBuJdO/bTahGDjFo8j2P5YFWLwy9sPyDLrFLWaWuG4GsVjTaYVyyMe8WqW30gjbQfS+9Ze\nMPR7aVv9QrRaa8XyiLtWmuWXpYY/t/SJpPHAocWEY/XQihOCecSrWX4VE76kj0h6GjhA0qb08TTw\nGElXzUwkjZf0a0nfr0G8I+q7rY/u87sZt3gc3ed3Z7rfaSur9vO26r1WFy1YRP+F/Qx+eZD+C/ud\n7M3GaKSSzqcjYgpwUURMTR9TImJ6RHxkDNs4B7gvd6SjyHOT61aU5/O6PGLWmbKUdL4t6ZBhj30l\nZRmlOwt4J/CV3JGOohXr0nnk+bwuj5h1piwDr5YDhwB3knTJ3B+4G9hZ0tkR8eMR3nsp8A/AlEoL\nSOoBegC6uqovKbRiXRoad2MOjzw16zxZzvAfBQ6OiPkRcShwEEmf/LcCn6n0JknHAY9FxKqRVh4R\nvem658+cOXMMoW+rFevSecoyrfh5zayxsiT8P4+Ie4aeRMS9wF9ExEOjvO+NwLsl9QPfAI6WdHXV\nkY6iFevSvjGHmdVTloR/j6TLJL0pfSwH7k1n0Xyx0psi4iMRMSsiuoGTgf+KiPfVJuzttWJd2jfm\nMLN6GvUGKJJ2AJYAR6RNPyep6z8HTI6IZ0bdiLQQ+HBEjDh6t1WnVqi2Du8bc5hZXjW9AUpE/Cki\nPhcRx6ePz0bE5ogYzJLs03XcOFqyb1XuHmlmrWLUhC/pjZKul/SgpIeGHvUIrhW4e6SZtYosJZ37\ngb8juQHKS3eviIj1tQ6mFUs6nsXRzBqpVrNlDtkYET/IGVPb8iyOZtYqsvTSuUHSRZLeUDratvDI\nWoTr8GbWKrKc4R+W/iz9yhDA0bUPp/X4Btlj02o3XjFrJ6PW8OupFWv4lt3wG69A8m3IF6rNqlfT\nbpmSdpd0uaQfpM/nSHp/3iCt83TaBHdmzSZLDf8K4EfAnunzB4FziwrI2lerTnBn1i6yJPwZEfEt\nYBAgIrZQ0j3TLCtP+GbWWFkS/rOSppNcqEXSAmBjoVFZW3KPJrPGytJL50PAdcC+kn4OzAROLDQq\na0vu0WTWWJl66aR3t3oNyQ1QHoiIirNk5uFeOmZmY1OTkbaS/p8KL/25JCLiu1VFZ2ZmDTFSSedd\nI7wWgBO+mVkLqZjwI+KMegZiZmbFytJLpyP03dZH9/ndjFs8ju7zuzPNZ1+L95qZ1UuWXjptb/iQ\n/6GbmACj9iDJ814zs3ryGT75hvx7ugAzaxXV9NIBaKteOnmG/Hu6ADNrFe6lQ76bmPgGKGbWKtxL\nh2TIf7lpe7MM+c/zXjOzesp00VbSO4G5wKShtoj4VFFB1VueIf+eLsDMWkWWm5h/CZgMvBn4Csk8\nOr+MiJrPie+pFczMxqamN0ABDo+IvwKejIhPAm8A/j
xPgGZmVn9ZEv6f0p+bJe0JvAjsUVxIZmZW\nhCw1/O9L2gW4CLidpIfOV0Z7k6RJwE3AK9PtXBMRH88Rq5mZ5ZAl4X8mIp4HviPp+yQXbp/L8L7n\ngaMj4hlJE4FbJP0gIm7LEa+ZmVUpS0nnv4d+iYjnI2JjaVslkXgmfToxfYw++b6ZmRVipJG2rwL2\nAnaQdDDJzU8AppL02hmVpPHAKmA/4F8j4hdllukBegC6ujxYycysKCOVdN4OnA7MAi4uad8EfDTL\nyiNiK3BQeg3gWknzIuLuYcv0Ar2QdMvMHrqZmY3FSCNtrwSulHRCRHwnz0Yi4ilJNwDHAHePtryZ\nmdVelhr+zyVdLukHAJLmSBp10JWkmemZPZJ2AN4K3J8rWjMzq1qWhP814EfAnunzB4FzM7xvD+AG\nSXcCvwKuj4jvVxWlmZnllqVb5oyI+JakjwBExBZJW0d7U0TcCRycN0AzM6uNLGf4z0qaTtqlUtIC\nYGOhUZmZWc1lOcP/EHAdsK+knwMzSSZQMzOzFjJqwo+I2yW9CXgNSV/8ByLixcIjMzOzmho14adz\n4iwBjiAp69ws6UsRkWV6BTMzaxJZSjorgKeBL6bP3wtcBZxUVFBmZlZ7WRL+vIiYU/L8Bkn3FhWQ\nmZkVI0svndvTnjkASDoM8G2pzMxaTJYz/EOBWyWtSZ93AQ9IuotkUswDCovOzMxqJkvCP6bwKMzM\nrHBZumUO1CMQMzMrVpYavpmZtQEnfDOzDuGEb2bWIZzwzcw6hBO+mVmHcMI3M+sQTvhmZh3CCd/M\nrEM44ZuZdQgnfDOzDpFlLp2GevHFF1m7di3PPef7rZSaNGkSs2bNYuLEiY0OxcxaRNMn/LVr1zJl\nyhS6u7uR1OhwmkJEsH79etauXcs+++zT6HDMrEU0fUnnueeeY/r06U72JSQxffp0f+sxszFp+oQP\nONmX4X1iZmPVEgnfzMzyKyzhS9pb0g2S7pV0j6RzitpW0f7whz9w8skns++++3LooYdy7LHH8uCD\nD9Lf38+8efMK2ebzzz/Pe97zHvbbbz8OO+ww+vv7C9mOmXWOIs/wtwB/n94AfQHw/0qaM8p7cuvr\nu5fu7l7Gjfss3d299PXlu996RHD88cezcOFCfve737Fq1So+/elP88c//rFGEZd3+eWXs+uuu/Lb\n3/6Wv/u7v+P8888vdHtm1v4KS/gR8fuIuD39/WngPmCvorYHSbLv6fkxAwObiICBgU309Pw4V9K/\n4YYbmDhxIh/4wAdeajvwwAM58sgjt1muv7+fI488kkMOOYRDDjmEW2+9FYDf//73HHXUURx00EHM\nmzePm2++ma1bt3L66aczb9489t9/fy655JLttvu9732P0047DYATTzyRn/70p0RE1Z/DzKwu3TIl\ndQMHA78o81oP0APQ1dWVaztLl97C5s1btmnbvHkLS5fewqJF1X25uPvuuzn00ENHXW633Xbj+uuv\nZ9KkSfzmN7/hlFNOYeXKlXz961/n7W9/O0uXLmXr1q1s3ryZO+64g3Xr1nH33XcD8NRTT223vnXr\n1rH33nsDMGHCBHbeeWfWr1/PjBkzqvocZmaFX7SVtBPwHeDciNg0/PWI6I2I+RExf+bMmbm2tWbN\ndqsfsb2WXnzxRRYvXsz+++/PSSedxL33Jt8qXve61/G1r32NT3ziE9x1111MmTKFV7/61Tz00EN8\n8IMf5Ic//CFTp04tPD6zZlfrcqxtr9CEL2kiSbLvi4jvFrktgK6u8omzUnsWc+fOZdWqVaMud8kl\nl7D77ruzevVqVq5cyQsvvADAUUcdxU033cRee+3F6aefzooVK9h1111ZvXo1Cxcu5Etf+hJnnnnm\nduvba6+9eOSRRwDYsmULGzduZPr06VV/DrNmVkQ51rZXZC8dAZcD90XExUVtp9SyZUcwefK2VarJ\nkyewbNkRVa/z6K
OP5vnnn6e3t/eltjvvvJObb755m+U2btzIHnvswbhx47jqqqvYunUrAAMDA+y+\n++4sXryYM888k9tvv50nnniCwcFBTjjhBC644AJuv/327bb77ne/myuvvBKAa665hqOPPtp9761t\njVSOtdopsob/RuBU4C5Jd6RtH42I/yxqg0N1+qVLb2HNmk10dU1l2bIjqq7fQzLA6dprr+Xcc8/l\nwgsvZNKkSXR3d3PppZdus9ySJUs44YQTWLFiBccccww77rgjADfeeCMXXXQREydOZKeddmLFihWs\nW7eOM844g8HBQQA+/elPb7fd97///Zx66qnst99+TJs2jW984xtVfwazZtfIcmwnUTP1/Jg/f36s\nXLlym7b77ruP1772tQ2KqLl531i76O7uZWBg++Q+e/ZU+vt7GhBR65C0KiLmZ1nWI23NrOGKKMfa\n9pzwzQqUp+dJJ/VaWbRoDr29b2P27KlIyZl9b+/bcpVjbXtNPz2yWasa6nkydDFyqOcJMGoiy/Pe\nVrVo0Zy2/WzNwmf4ZgXJ0/PEvVasCE74ZgXJ0/PEvVasCE741jJaraadZyBg3kGErbavrD6c8DNo\nxPTIN910E4cccggTJkzgmmuuKWQbraQVR2Lm6XmS572tuK+sPtou4ffd1kf3+d2MWzyO7vO76but\nL9f6GjU9cldXF1dccQXvfe97C91Oq2jFmnaenid53tuK+8rqo60Sft9tffRc1cPAhgGCYGDDAD1X\n9eRK+o2aHrm7u5sDDjiAcePa6p+oanlr2nPnfhXpsy895s79ai3Dq2jRojn09/cwOPhh+vt76tIL\npZH1/07rhtpqMbdVt8yl1y5l8wubt2nb/MJmll67lEULFlW1zkZNj2zb6uqaWnYkZpaa9ty5X+Xe\nezds03bvvRuYO/er3HPPX9csxlrK0y0zz77Ko9O6obZizG11+rhmw5oxtdeSp0cuVp6a9vBkP1p7\nM8hTlmnUqNVO64baijG3VcLvmlb+BiqV2rNo1PTItq1OG4mZpyzTqH3Vad1QWzHmtkr4y45fxuRX\nTN6mbfIrJrPs+GVVr7NR0yNbZyvi3g5ZVVuXbmQ31EZoxa6zbZXwFy1YRO+pvcyeNhshZk+bTe+p\nvVXX7+Hl6ZF/8pOfsO+++zJ37lw+8pGP8KpXvWqb5ZYsWcKVV17JgQceyP3337/N9MgHHnggBx98\nMN/85jc555xzWLduHQsXLuSggw7ife97X9npkX/1q18xa9Ysvv3tb3PWWWcxd+7cqj9DO8jT1XDO\nnGljam8Gxx67z5jaS+XZV3ne26huqI3Sil1nPT1yC+ukfZN3+tzhF27nzJnWtBdsId/nbdR7IUlk\n1d6PIs97G6XamGs5HfRYpkd2wm9hnbRvxo37LOUOVQkGBz9c6LaXLLme3t472bo1GD9e9PQcwPLl\nby10m3m7IpfaAAALf0lEQVQ+b6Pe20it9seilvvZ8+Fb25k2bdKY2mtlyZLrueyy1Wzdmvzv3Lo1\nuOyy1SxZcn2h281TH86zr1qxlt6KI4sbtZ+d8M1G0Nt755jaa6VRNe1WrKW3YvfIRu1nJ3xrCRs2\nPDem9loZOrPP2l4rebpW5tlXrdj9tRW7RzZqP7fVSFtrX40aPTp+vMom9/HjVeh2ofobguTdV612\nI5K8n7dR9f9G7Gef4VtLaNRX4J6eA8bU3gxasSyTRyt2j2wUJ/wMGjE98sUXX8ycOXM44IADeMtb\n3sLAwEAh22kVjfoKvHz5Wzn77ANfOqMfP16cffaBhffSyaMVyzJ5eGbR7NqwW2YfsBRYA3QBy4Dq\nB15FBIcffjinnXbaSzNmrl69mk2bNrH33ntz3HHHvTQJWi3dcMMNHHbYYUyePJnLLruMG2+8kW9+\n85vbLNNJ3TLNitCq3VBLdXC3zD6gBxgAIv3Zk7ZXp1HTI7/5zW9m8uRkmogFCxaw
du3aqj+DmZXX\nit1Q82izi7ZLgc3D2jan7a07PfLll1/OO97xjqriN7PKli07YpspjqG9r3cUdoYv6auSHpNU+3pH\nRZWmQW7d6ZGvvvpqVq5cyXnnnVf4ZzCD1rupRx6ddr2jyJLOFcAxBa6/jErTILfm9Mg/+clPWLZs\nGddddx2vfOUrq/4MZll1Wq8VaMxdyRqlsIQfETcBdb7DxDJg8rC2yWl7dRo1PfKvf/1rzjrrLK67\n7jp22223quM3G4tO67XSaRpew5fUQ3Jlla6u6s/EE0N1+tr10hmaHvncc8/lwgsvZNKkSXR3d3Pp\npZdus9ySJUs44YQTWLFiBcccc8w20yNfdNFFTJw4kZ122okVK1awbt06zjjjDAYHBwHKTo983nnn\n8cwzz3DSSScByb657rrrqv4cZlm04qhVy67QbpmSuoHvR0SmzuqeLXNsvG+s1mo5ba/VRwd3yzSz\nPDptlG6nccI3s5d0Wq+VTlNYDV/SvwMLgRmS1gIfj4jLq1lXRCAVP1lVK2mmEdLWXlpt8jTLrrCE\nHxGn1GI9kyZNYv369UyfPt1JPxURrF+/nkmTir35h5m1l4b30hnNrFmzWLt2LY8//nijQ2kqkyZN\nYtasWY0Ow8xaSNMn/IkTJ7LPPvs0Ogwzs5bni7ZmZh3CCd/MrEM44ZuZdYimugGKpMdJJrEv0gzg\niYK3UY1mjKsZY4LmjKsZY4LmjKsZY4LmjCtLTLMjYmaWlTVVwq8HSSuzDkOup2aMqxljguaMqxlj\nguaMqxljguaMq9YxuaRjZtYhnPDNzDpEJyb83tEXaYhmjKsZY4LmjKsZY4LmjKsZY4LmjKumMXVc\nDd/MrFN14hm+mVlHcsI3M+sQbZPwJe0t6QZJ90q6R9I5ZZZZJOlOSXdJulXSgSWv9aftd0haOfy9\nBca0UNLGdLt3SPpYyWvHSHpA0m8l/WMtYhpDXOeVxHS3pK2SpqWvFbGvJkn6paTVaUyfLLOMJH0h\n3R93Sjqk5LWi9lWWuOp9XGWJqRHHVZa46npclWx3vKRfS/p+mdfqflxljKv2x1VEtMUD2AM4JP19\nCvAgMGfYMocDu6a/vwP4Rclr/cCMBsS0kOQ2kMPfOx74HfBq4BXA6uHvLTKuYcu/C/ivgveVgJ3S\n3ycCvwAWDFvmWOAH6bILhv79Ct5XWeKq93GVJaZGHFejxlXv46pk3R8Cvl5hn9T9uMoYV82Pq7Y5\nw4+I30fE7envTwP3AXsNW+bWiHgyfXobUOj8wlliGsHrgd9GxEMR8QLwDeAvGxTXKcC/12LbI8QU\nEfFM+nRi+hjeo+AvgRXpsrcBu0jag2L31ahxNeC4yrKvKmnovhqm8OMKQNIs4J3AVyosUvfjKktc\nRRxXbZPwSym5efrBJGcYlbyf5K/6kAB+ImmVpJrfrXmUmA5Pv7r9QNLctG0v4JGSZdaS/Y9FreJC\n0mTgGOA7Jc2F7Kv06+0dwGPA9RExPKZK+6TQfZUhrlJ1Oa4yxlT34yrrvqrncQVcCvwDMFjh9YYc\nVxniKlWT46rp58MfK0k7kRxE50bEpgrLvJlkB5bemfmIiFgnaTfgekn3R8RNdYjpdqArIp6RdCzw\nf4A/q8V2c8Y15F3AzyNiQ0lbIfsqIrYCB0naBbhW0ryIuDvveusVVz2PqwwxNeS4GsO/YV2OK0nH\nAY9FxCpJC/Osq5bGElctj6u2OsOXNJEkgfVFxHcrLHMAyVeov4yI9UPtEbEu/fkYcC3J17nCY4qI\nTUNfgyPiP4GJkmYA64C9SxadlbbVRJZ9lTqZYV+7i9pXJet/CriB5AywVKV9Uui+yhBX3Y+r0WJq\n1HE1Wlwl6nVcvRF4t6R+kpLM0ZKuHrZMI46rLHHV/rgaa9G/WR8kF1xWAJeOsEwX8Fvg8GHtOwJT\nSn6/FTimTjG9ipcHwL0eWJO+bwLwELAPL18w
mluvfZUutzOwAdixDvtqJrBL+vsOwM3AccOWeSfb\nXlz7Zdpe5L7KEle9j6ssMTXiuBo1rnofV8O2u5DyF0frflxljKvmx1U7lXTeCJwK3JXWEAE+SrLT\niIgvAR8DpgPLldwQfUskM9HtTvL1E5J/5K9HxA/rFNOJwNmStgB/Ak6O5F9yi6S/AX5E0lvgqxFx\nTw1iyhoXwPHAjyPi2ZL3FrWv9gCulDSe5JvntyLi+5I+UBLTf5L0qPgtsBk4I32tyH2VJa56H1dZ\nYmrEcZUlLqjvcVVWExxXWeKq+XHlqRXMzDpEW9XwzcysMid8M7MO4YRvZtYhnPDNzDqEE76ZWYdw\nwre2pmTWyO1mIszwvj0lXVPhtRslzU9//2hJe7ekUUcGS7pC0sNDXfDykPQeJTM5jvkzWudxwjcr\nIyIejYgTMyz60dEXKeu8kn7pVYuIbwJn5l2PdQYnfGsoSTtK+g8lc6jfLek9afuhkn6WTg71IyWz\nFw6dXX9eL8+n/vq0/fWS/lvJ3OK3SnrNKNv9j3TYOul7Ppb+/ilJi0vP1iXtIOkbku6TdC3JKFIk\n/X/ADmksfemqx0v6spL54H8saYcM+2B3Sdem+2C1pMPT7d+ffht4UFKfpP8h6eeSfjP0uc3Gwgnf\nGu0Y4NGIODAi5gE/TOf5+SJwYkQcCnwVWFbynskRcRCwJH0N4H7gyIg4mGSE4v8eZbs3A0dK2hnY\nQjL6GOBIYPgkVGcDmyPitcDHgUMBIuIfgT9FxEERsShd9s+Af42IucBTwAkZ9sEXgJ9FxIHAIcDQ\naM79gM8Bf5E+3ksygdaHqf6bhXWwdppawVrTXcDnJF1IMp/IzZLmAfNIZgGEZFj770ve8+8AEXGT\npKlKZmacQjKs/89Ipo6dOMp2bwb+FngY+A/grUqm7N0nIh5QMm30kKNIkjIRcaekO0dY78MRMTRd\nxSqge4RlhxwN/FW6/q3ARkm7puu6C0DSPcBPIyIk3ZVxvWbbcMK3hoqIB5XcUu5Y4AJJPyWZ/e+e\niHhDpbeVef7PwA0RcXyarG8cZdO/AuaTTI51PTADWEySpPN4vuT3raTlnxqsa7Dk+SD+v2tVcEnH\nGkrSniTlkquBi0hKGg8AMyW9IV1mol6+gQfAUJ3/CGBjRGwkmYFxaOra00fbbiR3MHoEOAn4b5Iz\n/g+zfTmHtO296TbnAQeUvPZiWoLK46ckZaOhG4jsnHN9ZmU54Vuj7Q/8Mp218+PABWkyPhG4UNJq\n4A6S+3sOeU7Sr4EvkdwYAuAzwKfT9qxnvzeT3ITiT+nvs9Kfw10G7CTpPuBTbPstoBe4s+SibTXO\nAd6clmpWAXNyrMusIs+WaS1F0o3AhyNiZaNjqZakK0iuV5Tt51/F+haS7JPjarE+a18+wzerv43A\nP9dq4BWwHHhytGXNfIZvZtYhfIZvZtYhnPDNzDqEE76ZWYdwwjcz6xBO+GZmHeL/Ah2j1j/77S2P\nAAAAAElFTkSuQmCC\n", 287 | "text/plain": [ 288 | "" 289 | ] 290 | }, 291 | "metadata": {}, 292 | "output_type": "display_data" 293 | } 294 | ], 295 | "source": [ 296 | "# Plot two dimensions\n", 297 | "\n", 298 | "colors = [\"darkblue\", \"darkgreen\", \"yellow\"]\n", 299 | "\n", 300 | "for n, color in enumerate(colors):\n", 301 | " idx = np.where(test_y == n)[0]\n", 302 | " plt.scatter(test_X[idx, 1], test_X[idx, 2], color=color, label=\"Class %s\" % str(n))\n", 303 | 
"\n", 304 | "plt.scatter(test_X[incorrect_idx, 1], test_X[incorrect_idx, 2], color=\"red\")\n", 305 | "\n", 306 | "plt.xlabel('sepal width [cm]')\n", 307 | "plt.ylabel('petal length [cm]')\n", 308 | "plt.legend(loc=3)\n", 309 | "plt.title(\"Iris Classification results\")\n", 310 | "plt.show()" 311 | ] 312 | }, 313 | { 314 | "cell_type": "markdown", 315 | "metadata": {}, 316 | "source": [ 317 | "We can see that the errors occur in the area where green (class 1) and yellow (class 2) overlap. This gives us insight about what features to add - any feature which helps separate class 1 and class 2 should improve classifier performance." 318 | ] 319 | }, 320 | { 321 | "cell_type": "markdown", 322 | "metadata": {}, 323 | "source": [ 324 | "# Exercise" 325 | ] 326 | }, 327 | { 328 | "cell_type": "markdown", 329 | "metadata": {}, 330 | "source": [ 331 | "Print the true labels of 3 wrong predictions and modify the scatterplot code, which we used above, to visualize and distinguish these three samples with different markers in the 2D scatterplot. Can you explain why our classifier made these wrong predictions?" 
332 | ] 333 | }, 334 | { 335 | "cell_type": "code", 336 | "execution_count": null, 337 | "metadata": { 338 | "collapsed": true 339 | }, 340 | "outputs": [], 341 | "source": [ 342 | "# %load solutions/04_wrong-predictions.py" 343 | ] 344 | } 345 | ], 346 | "metadata": { 347 | "anaconda-cloud": {}, 348 | "kernelspec": { 349 | "display_name": "Python 3", 350 | "language": "python", 351 | "name": "python3" 352 | }, 353 | "language_info": { 354 | "codemirror_mode": { 355 | "name": "ipython", 356 | "version": 3 357 | }, 358 | "file_extension": ".py", 359 | "mimetype": "text/x-python", 360 | "name": "python", 361 | "nbconvert_exporter": "python", 362 | "pygments_lexer": "ipython3", 363 | "version": "3.6.1" 364 | } 365 | }, 366 | "nbformat": 4, 367 | "nbformat_minor": 1 368 | } 369 | -------------------------------------------------------------------------------- /notebooks/08_cross_validation.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Cross-validation for parameter tuning, model selection, and feature selection\n" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "## Review of model evaluation procedures" 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": {}, 20 | "source": [ 21 | "**Motivation:** Need a way to choose between machine learning models\n", 22 | "\n", 23 | "- Goal is to estimate likely performance of a model on **out-of-sample data**\n", 24 | "\n", 25 | "**Initial idea:** Train and test on the same data\n", 26 | "\n", 27 | "- But, maximizing **training accuracy** rewards overly complex models which **overfit** the training data\n", 28 | "\n", 29 | "**Alternative idea:** Train/test split\n", 30 | "\n", 31 | "- Split the dataset into two pieces, so that the model can be trained and tested on **different data**\n", 32 | "- **Testing accuracy** is a better 
estimate than training accuracy of out-of-sample performance\n", 33 | "- But, it provides a **high variance** estimate since changing which observations happen to be in the testing set can significantly change testing accuracy" 34 | ] 35 | }, 36 | { 37 | "cell_type": "code", 38 | "execution_count": 8, 39 | "metadata": { 40 | "collapsed": true 41 | }, 42 | "outputs": [], 43 | "source": [ 44 | "from sklearn.datasets import load_iris\n", 45 | "#from sklearn.cross_validation import train_test_split\n", 46 | "from sklearn.model_selection import train_test_split\n", 47 | "from sklearn.neighbors import KNeighborsClassifier\n", 48 | "from sklearn import metrics" 49 | ] 50 | }, 51 | { 52 | "cell_type": "code", 53 | "execution_count": 9, 54 | "metadata": { 55 | "collapsed": true 56 | }, 57 | "outputs": [], 58 | "source": [ 59 | "# read in the iris data\n", 60 | "iris = load_iris()\n", 61 | "\n", 62 | "# create X (features) and y (response)\n", 63 | "X = iris.data\n", 64 | "y = iris.target" 65 | ] 66 | }, 67 | { 68 | "cell_type": "code", 69 | "execution_count": 10, 70 | "metadata": {}, 71 | "outputs": [ 72 | { 73 | "name": "stdout", 74 | "output_type": "stream", 75 | "text": [ 76 | "0.973684210526\n" 77 | ] 78 | } 79 | ], 80 | "source": [ 81 | "# use train/test split with different random_state values\n", 82 | "X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=4)\n", 83 | "\n", 84 | "# check classification accuracy of KNN with K=5\n", 85 | "knn = KNeighborsClassifier(n_neighbors=5)\n", 86 | "knn.fit(X_train, y_train)\n", 87 | "y_pred = knn.predict(X_test)\n", 88 | "print(metrics.accuracy_score(y_test, y_pred))" 89 | ] 90 | }, 91 | { 92 | "cell_type": "markdown", 93 | "metadata": {}, 94 | "source": [ 95 | "**Question:** What if we created a bunch of train/test splits, calculated the testing accuracy for each, and averaged the results together?\n", 96 | "\n", 97 | "**Answer:** That's the essence of cross-validation!" 
98 | ] 99 | }, 100 | { 101 | "cell_type": "markdown", 102 | "metadata": {}, 103 | "source": [ 104 | "## Steps for K-fold cross-validation" 105 | ] 106 | }, 107 | { 108 | "cell_type": "markdown", 109 | "metadata": {}, 110 | "source": [ 111 | "1. Split the dataset into K **equal** partitions (or \"folds\").\n", 112 | "2. Use fold 1 as the **testing set** and the union of the other folds as the **training set**.\n", 113 | "3. Calculate **testing accuracy**.\n", 114 | "4. Repeat steps 2 and 3 K times, using a **different fold** as the testing set each time.\n", 115 | "5. Use the **average testing accuracy** as the estimate of out-of-sample accuracy." 116 | ] 117 | }, 118 | { 119 | "cell_type": "markdown", 120 | "metadata": {}, 121 | "source": [ 122 | "Diagram of **5-fold cross-validation:**\n", 123 | "\n", 124 | "![5-fold cross-validation](images/07_cross_validation_diagram.png)" 125 | ] 126 | }, 127 | { 128 | "cell_type": "code", 129 | "execution_count": 20, 130 | "metadata": {}, 131 | "outputs": [ 132 | { 133 | "name": "stdout", 134 | "output_type": "stream", 135 | "text": [ 136 | "TRAIN: [ 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47\n", 137 | " 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65\n", 138 | " 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83\n", 139 | " 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101\n", 140 | " 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119\n", 141 | " 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137\n", 142 | " 138 139 140 141 142 143 144 145 146 147 148 149] TEST: [ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24\n", 143 | " 25 26 27 28 29]\n", 144 | "TRAIN: [ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17\n", 145 | " 18 19 20 21 22 23 24 25 26 27 28 29 60 61 62 63 64 65\n", 146 | " 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83\n", 147 | " 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101\n", 148 | " 102 103 104 105 106 107 108 109 
110 111 112 113 114 115 116 117 118 119\n", 149 | " 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137\n", 150 | " 138 139 140 141 142 143 144 145 146 147 148 149] TEST: [30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54\n", 151 | " 55 56 57 58 59]\n", 152 | "TRAIN: [ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17\n", 153 | " 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35\n", 154 | " 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53\n", 155 | " 54 55 56 57 58 59 90 91 92 93 94 95 96 97 98 99 100 101\n", 156 | " 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119\n", 157 | " 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137\n", 158 | " 138 139 140 141 142 143 144 145 146 147 148 149] TEST: [60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84\n", 159 | " 85 86 87 88 89]\n", 160 | "TRAIN: [ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17\n", 161 | " 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35\n", 162 | " 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53\n", 163 | " 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71\n", 164 | " 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89\n", 165 | " 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137\n", 166 | " 138 139 140 141 142 143 144 145 146 147 148 149] TEST: [ 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107\n", 167 | " 108 109 110 111 112 113 114 115 116 117 118 119]\n", 168 | "TRAIN: [ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17\n", 169 | " 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35\n", 170 | " 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53\n", 171 | " 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71\n", 172 | " 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89\n", 173 | " 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107\n", 174 | " 108 109 110 111 112 113 114 115 116 117 118 119] TEST: [120 121 122 123 124 125 126 
127 128 129 130 131 132 133 134 135 136 137\n", 175 | " 138 139 140 141 142 143 144 145 146 147 148 149]\n" 176 | ] 177 | } 178 | ], 179 | "source": [ 180 | "# split the 150 iris observations into 5 consecutive folds (no shuffling)\n", 181 | "#from sklearn.cross_validation import KFold\n", 182 | "from sklearn.model_selection import KFold\n", 183 | "#kf = KFold(25, n_folds=5, shuffle=False)\n", 184 | "kf = KFold(n_splits=5, shuffle=False)\n", 185 | "#kf = kf.get_n_splits(X,y)\n", 186 | "#and testing set\n", 187 | "#print('{} {:^61} {}'.format('Iteration', 'Training set observations', 'Testing set observations'))\n", 188 | "for train_index, test_index in kf.split(X):\n", 189 | " print(\"TRAIN:\", train_index, \"TEST:\", test_index)\n", 190 | " #Generate Training and Test datasets according to these indices\n", 191 | " X_train, X_test = X[train_index], X[test_index]\n", 192 | " y_train, y_test = y[train_index], y[test_index]\n", 193 | " # CODE FOR EVALUATION HERE\n", 194 | " \n", 195 | " \n", 196 | " \n", 197 | " \n", 198 | "#for iteration, data in enumerate(kf, start=1):\n", 199 | "# print('{:^9} {} {:^25}'.format(iteration, data[0], data[1]))" 200 | ] 201 | }, 202 | { 203 | "cell_type": "markdown", 204 | "metadata": {}, 205 | "source": [ 206 | "\n", 207 | "- 5-fold cross-validation, thus it runs for **5 iterations**\n", 208 | "- For each iteration, every observation is either in the training set or the testing set, **but not both**\n", 209 | "- Every observation is in the testing set **exactly once**" 210 | ] 211 | }, 212 | { 213 | "cell_type": "markdown", 214 | "metadata": {}, 215 | "source": [ 216 | "## Comparing cross-validation to train/test split" 217 | ] 218 | }, 219 | { 220 | "cell_type": "markdown", 221 | "metadata": {}, 222 | "source": [ 223 | "Advantages of **cross-validation:**\n", 224 | "\n", 225 | "- More accurate estimate of out-of-sample accuracy\n", 226 | "- More \"efficient\" use of data (every observation is used for both training and testing)\n", 227 | "\n", 
228 | "Advantages of **train/test split:**\n", 229 | "\n", 230 | "- Runs K times faster than K-fold cross-validation\n", 231 | "- Simpler to examine the detailed results of the testing process" 232 | ] 233 | }, 234 | { 235 | "cell_type": "markdown", 236 | "metadata": {}, 237 | "source": [ 238 | "## Cross-validation recommendations" 239 | ] 240 | }, 241 | { 242 | "cell_type": "markdown", 243 | "metadata": {}, 244 | "source": [ 245 | "1. K can be any integer from 2 up to the number of observations, but **K=10** is generally recommended\n", 246 | "2. For classification problems, **stratified sampling** is recommended for creating the folds\n", 247 | " - Each response class should be represented in roughly the same proportion in each of the K folds as in the full dataset\n", 248 | " - scikit-learn's `cross_val_score` function does this by default when the estimator is a classifier" 249 | ] 250 | }, 251 | { 252 | "cell_type": "markdown", 253 | "metadata": {}, 254 | "source": [ 255 | "## Cross-validation example: parameter tuning" 256 | ] 257 | }, 258 | { 259 | "cell_type": "markdown", 260 | "metadata": {}, 261 | "source": [ 262 | "**Goal:** Select the best tuning parameters (aka \"hyperparameters\") for KNN on the iris dataset" 263 | ] 264 | }, 265 | { 266 | "cell_type": "code", 267 | "execution_count": 23, 268 | "metadata": { 269 | "collapsed": true 270 | }, 271 | "outputs": [], 272 | "source": [ 273 | "from sklearn.model_selection import cross_val_score" 274 | ] 275 | }, 276 | { 277 | "cell_type": "code", 278 | "execution_count": 29, 279 | "metadata": {}, 280 | "outputs": [ 281 | { 282 | "name": "stdout", 283 | "output_type": "stream", 284 | "text": [ 285 | "[ 1. 0.93333333 1. 1. 0.86666667 0.93333333\n", 286 | " 0.93333333 1. 1. 1. 
]\n" 287 | ] 288 | } 289 | ], 290 | "source": [ 291 | "# 10-fold cross-validation with K=5 for KNN (the n_neighbors parameter)\n", 292 | "knn = KNeighborsClassifier(n_neighbors=5)\n", 293 | "scores = cross_val_score(knn, X, y, cv=10, scoring='accuracy')\n", 294 | "print(scores)\n" 295 | ] 296 | }, 297 | { 298 | "cell_type": "code", 299 | "execution_count": 25, 300 | "metadata": {}, 301 | "outputs": [ 302 | { 303 | "name": "stdout", 304 | "output_type": "stream", 305 | "text": [ 306 | "0.966666666667\n" 307 | ] 308 | } 309 | ], 310 | "source": [ 311 | "# use average accuracy as an estimate of out-of-sample accuracy\n", 312 | "print(scores.mean())" 313 | ] 314 | }, 315 | { 316 | "cell_type": "code", 317 | "execution_count": 30, 318 | "metadata": {}, 319 | "outputs": [ 320 | { 321 | "name": "stdout", 322 | "output_type": "stream", 323 | "text": [ 324 | "[0.95999999999999996, 0.95333333333333337, 0.96666666666666656, 0.96666666666666656, 0.96666666666666679, 0.96666666666666679, 0.96666666666666679, 0.96666666666666679, 0.97333333333333338, 0.96666666666666679, 0.96666666666666679, 0.97333333333333338, 0.98000000000000009, 0.97333333333333338, 0.97333333333333338, 0.97333333333333338, 0.97333333333333338, 0.98000000000000009, 0.97333333333333338, 0.98000000000000009, 0.96666666666666656, 0.96666666666666656, 0.97333333333333338, 0.95999999999999996, 0.96666666666666656, 0.95999999999999996, 0.96666666666666656, 0.95333333333333337, 0.95333333333333337, 0.95333333333333337]\n" 325 | ] 326 | } 327 | ], 328 | "source": [ 329 | "# search for an optimal value of K for KNN\n", 330 | "k_range = list(range(1, 31))\n", 331 | "k_scores = []\n", 332 | "for k in k_range:\n", 333 | " knn = KNeighborsClassifier(n_neighbors=k)\n", 334 | " scores = cross_val_score(knn, X, y, cv=10, scoring='accuracy')\n", 335 | " k_scores.append(scores.mean())\n", 336 | "print(k_scores)" 337 | ] 338 | }, 339 | { 340 | "cell_type": "code", 341 | "execution_count": 27, 342 | "metadata": {}, 343 | 
"outputs": [ 344 | { 345 | "data": { 346 | "text/plain": [ 347 | "" 348 | ] 349 | }, 350 | "execution_count": 27, 351 | "metadata": {}, 352 | "output_type": "execute_result" 353 | }, 354 | { 355 | "data": { 356 | "image/png": "[base64-encoded PNG omitted: line plot of cross-validated accuracy (y-axis) against the value of K for KNN (x-axis)]", 357 | "text/plain": [ 358 | "" 359 | ] 360 | }, 361 | "metadata": {}, 362 | "output_type": "display_data" 363 | } 364 | ], 365 | "source": [ 366 | "import matplotlib.pyplot as plt\n", 367 | "%matplotlib inline\n", 368 | "\n", 369 | "# plot the value of K for KNN (x-axis) versus the cross-validated accuracy (y-axis)\n", 370 | "plt.plot(k_range, k_scores)\n", 371 | "plt.xlabel('Value of K for KNN')\n", 372 | "plt.ylabel('Cross-Validated Accuracy')" 373 | ] 374 | }, 375 | { 376 | "cell_type": "markdown", 377 | "metadata": {}, 378 | "source": [ 379 | "## Cross-validation example: model selection" 380 | ] 381 | }, 382 | { 383 | "cell_type": "markdown", 384 | "metadata": {}, 385 | "source": [ 386 | "**Goal:** Compare the best KNN model with logistic regression on the iris dataset" 387 | ] 388 | }, 389 | { 390 | "cell_type": "code", 391 | "execution_count": 31, 392 | "metadata": {}, 393 | "outputs": [ 394 | { 395 | "name": "stdout", 396 | "output_type": "stream", 397 | "text": [ 398 | "0.98\n" 399 | ] 400 | } 401 | ], 402 | "source": [ 403 | "# 10-fold cross-validation with the best KNN model\n", 404 | "knn = KNeighborsClassifier(n_neighbors=20)\n", 405 | "print(cross_val_score(knn, X, y, cv=10, scoring='accuracy').mean())" 406 | ] 407 | }, 408 | { 409 | "cell_type": "code", 410 | "execution_count": 32, 411 | "metadata": {}, 
412 | "outputs": [ 413 | { 414 | "name": "stdout", 415 | "output_type": "stream", 416 | "text": [ 417 | "0.953333333333\n" 418 | ] 419 | } 420 | ], 421 | "source": [ 422 | "# 10-fold cross-validation with logistic regression\n", 423 | "from sklearn.linear_model import LogisticRegression\n", 424 | "logreg = LogisticRegression()\n", 425 | "print(cross_val_score(logreg, X, y, cv=10, scoring='accuracy').mean())" 426 | ] 427 | }, 428 | { 429 | "cell_type": "markdown", 430 | "metadata": {}, 431 | "source": [ 432 | "## Cross-validation example: feature selection" 433 | ] 434 | }, 435 | { 436 | "cell_type": "markdown", 437 | "metadata": {}, 438 | "source": [ 439 | "## Improvements to cross-validation" 440 | ] 441 | }, 442 | { 443 | "cell_type": "markdown", 444 | "metadata": {}, 445 | "source": [ 446 | "**Repeated cross-validation**\n", 447 | "\n", 448 | "- Repeat cross-validation multiple times (with **different random splits** of the data) and average the results\n", 449 | "- More reliable estimate of out-of-sample performance by **reducing the variance** associated with a single trial of cross-validation\n", 450 | "\n", 451 | "**Creating a hold-out set**\n", 452 | "\n", 453 | "- \"Hold out\" a portion of the data **before** beginning the model building process\n", 454 | "- Locate the best model using cross-validation on the remaining data, and test it **using the hold-out set**\n", 455 | "- More reliable estimate of out-of-sample performance since the hold-out set is **truly out-of-sample**\n" 456 | ] 457 | }, 458 | { 459 | "cell_type": "markdown", 460 | "metadata": { 461 | "collapsed": true 462 | }, 463 | "source": [ 464 | "## 1. Exercise\n", 465 | "\n", 466 | "Use cross-validation to choose the two best features for classification." 
467 | ] 468 | }, 469 | { 470 | "cell_type": "code", 471 | "execution_count": null, 472 | "metadata": { 473 | "collapsed": true 474 | }, 475 | "outputs": [], 476 | "source": [] 477 | } 478 | ], 479 | "metadata": { 480 | "kernelspec": { 481 | "display_name": "Python 3", 482 | "language": "python", 483 | "name": "python3" 484 | }, 485 | "language_info": { 486 | "codemirror_mode": { 487 | "name": "ipython", 488 | "version": 3 489 | }, 490 | "file_extension": ".py", 491 | "mimetype": "text/x-python", 492 | "name": "python", 493 | "nbconvert_exporter": "python", 494 | "pygments_lexer": "ipython3", 495 | "version": "3.6.1" 496 | } 497 | }, 498 | "nbformat": 4, 499 | "nbformat_minor": 1 500 | } 501 | -------------------------------------------------------------------------------- /notebooks/Decision+Trees+and+Random+Forest.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Decision Tree" 8 | ] 9 | }, 10 | { 11 | "cell_type": "code", 12 | "execution_count": 17, 13 | "metadata": {}, 14 | "outputs": [ 15 | { 16 | "data": { 17 | "text/plain": [ 18 | "0.94736842105263153" 19 | ] 20 | }, 21 | "execution_count": 17, 22 | "metadata": {}, 23 | "output_type": "execute_result" 24 | } 25 | ], 26 | "source": [ 27 | "# Import the classifier, the dataset loader, the splitting helper, and the metrics module\n", 28 | "# (import other libraries such as pandas or numpy as needed)\n", 29 | "from sklearn.tree import DecisionTreeClassifier\n", 30 | "from sklearn.datasets import load_iris\n", 31 | "from sklearn.model_selection import train_test_split\n", 32 | "from sklearn import metrics\n", 33 | "# Load the iris data: X holds the predictors, y the target\n", 34 | "iris_data = load_iris()\n", 35 | "X = iris_data.data\n", 36 | "y = iris_data.target\n", 37 | "X_train,X_test,y_train,y_test= train_test_split(X,y,test_size=0.25, random_state=153)\n", 38 | "# Create the tree object\n", 39 | "model = 
DecisionTreeClassifier(criterion='gini') # criterion may be 'gini' (the default) or 'entropy' (information gain)\n", 40 | "# For regression, use sklearn.tree.DecisionTreeRegressor instead\n", 41 | "# Train the model on the training set\n", 42 | "model.fit(X_train, y_train)\n", 43 | "\n", 44 | "# Predict labels for the test set\n", 45 | "predicted= model.predict(X_test)\n", 46 | "# Equivalent to model.score(X_test, y_test)\n", 47 | "metrics.accuracy_score(y_test,predicted)" 48 | ] 49 | }, 50 | { 51 | "cell_type": "code", 52 | "execution_count": 15, 53 | "metadata": {}, 54 | "outputs": [ 55 | { 56 | "data": { 57 | "text/plain": [ 58 | "array([ True, True, True, True, False, True, True, True, True,\n", 59 | " True, True, True, True, True, True, True, True, True,\n", 60 | " True, True, True, True, True, True, True, False, True,\n", 61 | " True, True, True, True, True, True, True, True, True,\n", 62 | " True, True, True, True, True, True, True, True, True,\n", 63 | " True, True, True, True, True, True, False, True, True,\n", 64 | " True, True, True, True, True, True, True, True, True,\n", 65 | " True, True, True, True, True, False, True, True, True,\n", 66 | " True, False, True], dtype=bool)" 67 | ] 68 | }, 69 | "execution_count": 15, 70 | "metadata": {}, 71 | "output_type": "execute_result" 72 | } 73 | ], 74 | "source": [ 75 | "predicted==y_test" 76 | ] 77 | }, 78 | { 79 | "cell_type": "code", 80 | "execution_count": 2, 81 | "metadata": {}, 82 | "outputs": [ 83 | { 84 | "data": { 85 | "text/plain": [ 86 | "array([0, 0, 2, 0, 2, 0, 2, 2, 2, 2, 2, 2, 1, 1, 0, 1, 2, 1, 0, 1, 0, 0, 0,\n", 87 | " 1, 2, 2, 0, 0, 1, 2, 1, 1, 1, 0, 2, 0, 0, 1, 1, 1, 2, 0, 2, 2, 2, 1,\n", 88 | " 2, 2, 1, 2, 1, 1, 1, 0, 0, 2, 1, 0, 0, 2, 2, 2, 0, 2, 1, 0, 0, 2, 2,\n", 89 | " 0, 2, 0, 1, 1, 1])" 90 | ] 91 | }, 92 | "execution_count": 2, 93 | "metadata": {}, 94 | "output_type": "execute_result" 95 | } 96 | ], 97 | "source": [ 98 | "predicted" 99 | ] 100 | }, 101 | { 102 | 
"cell_type": "markdown", 103 | "metadata": {}, 104 | "source": [ 105 | "# Random Forest" 106 | ] 107 | }, 108 | { 109 | "cell_type": "code", 110 | "execution_count": 61, 111 | "metadata": {}, 112 | "outputs": [ 113 | { 114 | "data": { 115 | "text/plain": [ 116 | "0.94666666666666666" 117 | ] 118 | }, 119 | "execution_count": 61, 120 | "metadata": {}, 121 | "output_type": "execute_result" 122 | } 123 | ], 124 | "source": [ 125 | "# Import the classifier\n", 126 | "from sklearn.ensemble import RandomForestClassifier # use RandomForestRegressor for regression problems\n", 127 | "# Reuses X_train, X_test, y_train, y_test and metrics from the decision tree cell above\n", 128 | "# Create the random forest object\n", 129 | "model= RandomForestClassifier(n_estimators=300,criterion=\"entropy\")\n", 130 | "# Train the model on the training set\n", 131 | "model.fit(X_train, y_train)\n", 132 | "# Predict labels for the test set and compute the accuracy\n", 133 | "predicted= model.predict(X_test)\n", 134 | "metrics.accuracy_score(y_test, predicted)" 135 | ] 136 | }, 137 | { 138 | "cell_type": "code", 139 | "execution_count": null, 140 | "metadata": { 141 | "collapsed": true 142 | }, 143 | "outputs": [], 144 | "source": [] 145 | }, 146 | { 147 | "cell_type": "code", 148 | "execution_count": null, 149 | "metadata": { 150 | "collapsed": true 151 | }, 152 | "outputs": [], 153 | "source": [] 154 | } 155 | ], 156 | "metadata": { 157 | "kernelspec": { 158 | "display_name": "Python 3", 159 | "language": "python", 160 | "name": "python3" 161 | }, 162 | "language_info": { 163 | "codemirror_mode": { 164 | "name": "ipython", 165 | "version": 3 166 | }, 167 | "file_extension": ".py", 168 | "mimetype": "text/x-python", 169 | "name": "python", 170 | "nbconvert_exporter": "python", 171 | "pygments_lexer": "ipython3", 172 | "version": "3.6.1" 173 | } 174 | }, 175 | "nbformat": 4, 176 | "nbformat_minor": 2 177 | } 178 | -------------------------------------------------------------------------------- 
/notebooks/Week4.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "---\n", 8 | "\n", 9 | "_You are currently looking at **version 1.0** of this notebook. To download notebooks and datafiles, as well as get help on Jupyter notebooks in the Coursera platform, visit the [Jupyter Notebook FAQ](https://www.coursera.org/learn/python-data-analysis/resources/0dhYG) course resource._\n", 10 | "\n", 11 | "---" 12 | ] 13 | }, 14 | { 15 | "cell_type": "markdown", 16 | "metadata": {}, 17 | "source": [ 18 | "# Distributions in Pandas" 19 | ] 20 | }, 21 | { 22 | "cell_type": "code", 23 | "execution_count": 1, 24 | "metadata": { 25 | "collapsed": false 26 | }, 27 | "outputs": [], 28 | "source": [ 29 | "import pandas as pd\n", 30 | "import numpy as np" 31 | ] 32 | }, 33 | { 34 | "cell_type": "code", 35 | "execution_count": 23, 36 | "metadata": { 37 | "collapsed": false 38 | }, 39 | "outputs": [ 40 | { 41 | "data": { 42 | "text/plain": [ 43 | "0" 44 | ] 45 | }, 46 | "execution_count": 23, 47 | "metadata": {}, 48 | "output_type": "execute_result" 49 | } 50 | ], 51 | "source": [ 52 | "np.random.binomial(1, 0.5)" 53 | ] 54 | }, 55 | { 56 | "cell_type": "code", 57 | "execution_count": 13, 58 | "metadata": { 59 | "collapsed": false 60 | }, 61 | "outputs": [ 62 | { 63 | "data": { 64 | "text/plain": [ 65 | "0.499456" 66 | ] 67 | }, 68 | "execution_count": 13, 69 | "metadata": {}, 70 | "output_type": "execute_result" 71 | } 72 | ], 73 | "source": [ 74 | "np.random.binomial(1000000, 0.5)/1000000" 75 | ] 76 | }, 77 | { 78 | "cell_type": "code", 79 | "execution_count": 27, 80 | "metadata": { 81 | "collapsed": false 82 | }, 83 | "outputs": [ 84 | { 85 | "data": { 86 | "text/plain": [ 87 | "9" 88 | ] 89 | }, 90 | "execution_count": 27, 91 | "metadata": {}, 92 | "output_type": "execute_result" 93 | } 94 | ], 95 | "source": [ 96 | "chance_of_tornado = 0.01/100\n", 
97 | "np.random.binomial(100000, chance_of_tornado)" 98 | ] 99 | }, 100 | { 101 | "cell_type": "code", 102 | "execution_count": 30, 103 | "metadata": { 104 | "collapsed": false 105 | }, 106 | "outputs": [ 107 | { 108 | "name": "stdout", 109 | "output_type": "stream", 110 | "text": [ 111 | "118 tornadoes back to back in 2739.72602739726 years\n" 112 | ] 113 | } 114 | ], 115 | "source": [ 116 | "chance_of_tornado = 0.01\n", 117 | "\n", 118 | "tornado_events = np.random.binomial(1, chance_of_tornado, 1000000)\n", 119 | " \n", 120 | "two_days_in_a_row = 0\n", 121 | "for j in range(1,len(tornado_events)-1):\n", 122 | " if tornado_events[j]==1 and tornado_events[j-1]==1:\n", 123 | " two_days_in_a_row+=1\n", 124 | "\n", 125 | "print('{} tornadoes back to back in {} years'.format(two_days_in_a_row, 1000000/365))" 126 | ] 127 | }, 128 | { 129 | "cell_type": "code", 130 | "execution_count": 55, 131 | "metadata": { 132 | "collapsed": false 133 | }, 134 | "outputs": [ 135 | { 136 | "data": { 137 | "text/plain": [ 138 | "0.6295388722617103" 139 | ] 140 | }, 141 | "execution_count": 55, 142 | "metadata": {}, 143 | "output_type": "execute_result" 144 | } 145 | ], 146 | "source": [ 147 | "np.random.uniform(0, 1)" 148 | ] 149 | }, 150 | { 151 | "cell_type": "code", 152 | "execution_count": 64, 153 | "metadata": { 154 | "collapsed": false 155 | }, 156 | "outputs": [ 157 | { 158 | "data": { 159 | "text/plain": [ 160 | "-1.2469247211946008" 161 | ] 162 | }, 163 | "execution_count": 64, 164 | "metadata": {}, 165 | "output_type": "execute_result" 166 | } 167 | ], 168 | "source": [ 169 | "np.random.normal(0.75)" 170 | ] 171 | }, 172 | { 173 | "cell_type": "markdown", 174 | "metadata": {}, 175 | "source": [ 176 | "Formula for standard deviation\n", 177 | "$$\\sqrt{\\frac{1}{N} \\sum_{i=1}^N (x_i - \\overline{x})^2}$$" 178 | ] 179 | }, 180 | { 181 | "cell_type": "code", 182 | "execution_count": 70, 183 | "metadata": { 184 | "collapsed": false 185 | }, 186 | "outputs": [ 187 | { 188 | 
"data": { 189 | "text/plain": [ 190 | "1.0065258567421238" 191 | ] 192 | }, 193 | "execution_count": 70, 194 | "metadata": {}, 195 | "output_type": "execute_result" 196 | } 197 | ], 198 | "source": [ 199 | "distribution = np.random.normal(0.75,size=10000)\n", 200 | "\n", 201 | "np.sqrt(np.sum((np.mean(distribution)-distribution)**2)/len(distribution))" 202 | ] 203 | }, 204 | { 205 | "cell_type": "code", 206 | "execution_count": 75, 207 | "metadata": { 208 | "collapsed": false, 209 | "scrolled": true 210 | }, 211 | "outputs": [ 212 | { 213 | "data": { 214 | "text/plain": [ 215 | "1.0065258567421238" 216 | ] 217 | }, 218 | "execution_count": 75, 219 | "metadata": {}, 220 | "output_type": "execute_result" 221 | } 222 | ], 223 | "source": [ 224 | "np.std(distribution)" 225 | ] 226 | }, 227 | { 228 | "cell_type": "code", 229 | "execution_count": 78, 230 | "metadata": { 231 | "collapsed": false 232 | }, 233 | "outputs": [ 234 | { 235 | "data": { 236 | "text/plain": [ 237 | "-0.013249564329728791" 238 | ] 239 | }, 240 | "execution_count": 78, 241 | "metadata": {}, 242 | "output_type": "execute_result" 243 | } 244 | ], 245 | "source": [ 246 | "import scipy.stats as stats\n", 247 | "stats.kurtosis(distribution)" 248 | ] 249 | }, 250 | { 251 | "cell_type": "code", 252 | "execution_count": 79, 253 | "metadata": { 254 | "collapsed": false 255 | }, 256 | "outputs": [ 257 | { 258 | "data": { 259 | "text/plain": [ 260 | "-0.0048870324877872545" 261 | ] 262 | }, 263 | "execution_count": 79, 264 | "metadata": {}, 265 | "output_type": "execute_result" 266 | } 267 | ], 268 | "source": [ 269 | "stats.skew(distribution)" 270 | ] 271 | }, 272 | { 273 | "cell_type": "code", 274 | "execution_count": 82, 275 | "metadata": { 276 | "collapsed": false 277 | }, 278 | "outputs": [ 279 | { 280 | "data": { 281 | "text/plain": [ 282 | "1.9889311393293112" 283 | ] 284 | }, 285 | "execution_count": 82, 286 | "metadata": {}, 287 | "output_type": "execute_result" 288 | } 289 | ], 290 | "source": [ 291 
| "chi_squared_df2 = np.random.chisquare(2, size=100000)\n", 292 | "stats.skew(chi_squared_df2)" 293 | ] 294 | }, 295 | { 296 | "cell_type": "code", 297 | "execution_count": 88, 298 | "metadata": { 299 | "collapsed": false 300 | }, 301 | "outputs": [ 302 | { 303 | "data": { 304 | "text/plain": [ 305 | "1.3322980185207178" 306 | ] 307 | }, 308 | "execution_count": 88, 309 | "metadata": {}, 310 | "output_type": "execute_result" 311 | } 312 | ], 313 | "source": [ 314 | "chi_squared_df5 = np.random.chisquare(5, size=10000)\n", 315 | "stats.skew(chi_squared_df5)" 316 | ] 317 | }, 318 | { 319 | "cell_type": "code", 320 | "execution_count": 85, 321 | "metadata": { 322 | "collapsed": false 323 | }, 324 | "outputs": [ 325 | { 326 | "data": { 327 | "text/plain": [ 328 | "" 329 | ] 330 | }, 331 | "execution_count": 85, 332 | "metadata": {}, 333 | "output_type": "execute_result" 334 | }, 335 | { 336 | "data": { 337 | "image/png": "[base64-encoded PNG omitted: step histogram (50 bins) of chi_squared_df2 and chi_squared_df5, legend '2 degrees of freedom' / '5 degrees of freedom']", 338 | "text/plain": [ 339 | "" 340 | ] 341 | }, 342 | "metadata": {}, 343 | "output_type": "display_data" 344 | } 345 | ], 346 | "source": [ 347 | "%matplotlib inline\n", 348 | "import matplotlib\n", 349 | "import matplotlib.pyplot as plt\n", 350 | "\n", 351 | "output = plt.hist([chi_squared_df2,chi_squared_df5], bins=50, histtype='step', \n", 352 | " label=['2 degrees of freedom','5 degrees of freedom'])\n", 353 | "plt.legend(loc='upper right')\n" 354 | ] 355 | }, 356 | { 357 | "cell_type": "markdown", 358 | "metadata": {}, 359 | "source": [ 360 | "# Hypothesis Testing" 361 | ] 362 | }, 363 | { 364 | "cell_type": "code", 365 | "execution_count": 40, 366 | "metadata": { 367 | "collapsed": false 368 | }, 369 | "outputs": [], 370 | "source": [ 371 | "df = pd.read_csv('grades.csv')" 372 | ] 373 | }, 374 | { 375 | "cell_type": "code", 376 | "execution_count": 41, 377 | "metadata": { 378 | "collapsed": false 379 | }, 380 | "outputs": [ 381 | { 382 | "data": { 383 | "text/html": [ 384 | "
\n", 385 | "
student_idassignment1_gradeassignment1_submissionassignment2_gradeassignment2_submissionassignment3_gradeassignment3_submissionassignment4_gradeassignment4_submissionassignment5_gradeassignment5_submissionassignment6_gradeassignment6_submission
0B73F2C11-70F0-E37D-8B10-1D20AFED50B192.7339462015-11-02 06:55:34.28200000083.0305522015-11-09 02:22:58.93800000067.1644412015-11-12 08:58:33.99800000053.0115532015-11-16 01:21:24.66300000047.7103982015-11-20 13:24:59.69200000038.1683182015-11-22 18:31:15.934000000
198A0FAE0-A19A-13D2-4BB5-CFBFD94031D186.7908212015-11-29 14:57:44.42900000086.2908212015-12-06 17:41:18.44900000069.7726572015-12-10 08:54:55.90400000055.0981252015-12-13 17:32:30.94100000049.5883132015-12-19 23:26:39.28500000044.6294822015-12-21 17:07:24.275000000
2D0F62040-CEB0-904C-F563-2F8620916C4E85.5125412016-01-09 05:36:02.38900000085.5125412016-01-09 06:39:44.41600000068.4100332016-01-15 20:22:45.88200000054.7280262016-01-11 12:41:50.74900000049.2552242016-01-11 17:31:12.48900000044.3297012016-01-17 16:24:42.765000000
3FFDF2B2C-F514-EF7F-6538-A6A53518E9DC86.0306652016-04-30 06:50:39.80100000068.8245322016-04-30 17:20:38.72700000061.9420792016-05-12 07:47:16.32600000049.5536632016-05-07 16:09:20.48500000049.5536632016-05-24 12:51:18.01600000044.5982972016-05-26 08:09:12.058000000
45ECBEEB6-F1CE-80AE-3164-E45E99473FB464.8138002015-12-13 17:06:10.75000000051.4910402015-12-14 12:25:12.05600000041.9328322015-12-29 14:25:22.59400000036.9295492015-12-28 01:29:55.90100000033.2365942015-12-29 14:46:06.62800000033.2365942016-01-05 01:06:59.546000000
\n", 487 | "
" 488 | ], 489 | "text/plain": [ 490 | " student_id assignment1_grade \\\n", 491 | "0 B73F2C11-70F0-E37D-8B10-1D20AFED50B1 92.733946 \n", 492 | "1 98A0FAE0-A19A-13D2-4BB5-CFBFD94031D1 86.790821 \n", 493 | "2 D0F62040-CEB0-904C-F563-2F8620916C4E 85.512541 \n", 494 | "3 FFDF2B2C-F514-EF7F-6538-A6A53518E9DC 86.030665 \n", 495 | "4 5ECBEEB6-F1CE-80AE-3164-E45E99473FB4 64.813800 \n", 496 | "\n", 497 | " assignment1_submission assignment2_grade \\\n", 498 | "0 2015-11-02 06:55:34.282000000 83.030552 \n", 499 | "1 2015-11-29 14:57:44.429000000 86.290821 \n", 500 | "2 2016-01-09 05:36:02.389000000 85.512541 \n", 501 | "3 2016-04-30 06:50:39.801000000 68.824532 \n", 502 | "4 2015-12-13 17:06:10.750000000 51.491040 \n", 503 | "\n", 504 | " assignment2_submission assignment3_grade \\\n", 505 | "0 2015-11-09 02:22:58.938000000 67.164441 \n", 506 | "1 2015-12-06 17:41:18.449000000 69.772657 \n", 507 | "2 2016-01-09 06:39:44.416000000 68.410033 \n", 508 | "3 2016-04-30 17:20:38.727000000 61.942079 \n", 509 | "4 2015-12-14 12:25:12.056000000 41.932832 \n", 510 | "\n", 511 | " assignment3_submission assignment4_grade \\\n", 512 | "0 2015-11-12 08:58:33.998000000 53.011553 \n", 513 | "1 2015-12-10 08:54:55.904000000 55.098125 \n", 514 | "2 2016-01-15 20:22:45.882000000 54.728026 \n", 515 | "3 2016-05-12 07:47:16.326000000 49.553663 \n", 516 | "4 2015-12-29 14:25:22.594000000 36.929549 \n", 517 | "\n", 518 | " assignment4_submission assignment5_grade \\\n", 519 | "0 2015-11-16 01:21:24.663000000 47.710398 \n", 520 | "1 2015-12-13 17:32:30.941000000 49.588313 \n", 521 | "2 2016-01-11 12:41:50.749000000 49.255224 \n", 522 | "3 2016-05-07 16:09:20.485000000 49.553663 \n", 523 | "4 2015-12-28 01:29:55.901000000 33.236594 \n", 524 | "\n", 525 | " assignment5_submission assignment6_grade \\\n", 526 | "0 2015-11-20 13:24:59.692000000 38.168318 \n", 527 | "1 2015-12-19 23:26:39.285000000 44.629482 \n", 528 | "2 2016-01-11 17:31:12.489000000 44.329701 \n", 529 | "3 2016-05-24 
12:51:18.016000000 44.598297 \n", 530 | "4 2015-12-29 14:46:06.628000000 33.236594 \n", 531 | "\n", 532 | " assignment6_submission \n", 533 | "0 2015-11-22 18:31:15.934000000 \n", 534 | "1 2015-12-21 17:07:24.275000000 \n", 535 | "2 2016-01-17 16:24:42.765000000 \n", 536 | "3 2016-05-26 08:09:12.058000000 \n", 537 | "4 2016-01-05 01:06:59.546000000 " 538 | ] 539 | }, 540 | "execution_count": 41, 541 | "metadata": {}, 542 | "output_type": "execute_result" 543 | } 544 | ], 545 | "source": [ 546 | "df.head()" 547 | ] 548 | }, 549 | { 550 | "cell_type": "code", 551 | "execution_count": 42, 552 | "metadata": { 553 | "collapsed": false 554 | }, 555 | "outputs": [ 556 | { 557 | "data": { 558 | "text/plain": [ 559 | "2315" 560 | ] 561 | }, 562 | "execution_count": 42, 563 | "metadata": {}, 564 | "output_type": "execute_result" 565 | } 566 | ], 567 | "source": [ 568 | "len(df)" 569 | ] 570 | }, 571 | { 572 | "cell_type": "code", 573 | "execution_count": 90, 574 | "metadata": { 575 | "collapsed": false 576 | }, 577 | "outputs": [], 578 | "source": [ 579 | "early = df[df['assignment1_submission'] <= '2015-12-31']\n", 580 | "late = df[df['assignment1_submission'] > '2015-12-31']" 581 | ] 582 | }, 583 | { 584 | "cell_type": "code", 585 | "execution_count": 91, 586 | "metadata": { 587 | "collapsed": false 588 | }, 589 | "outputs": [ 590 | { 591 | "data": { 592 | "text/plain": [ 593 | "assignment1_grade 74.972741\n", 594 | "assignment2_grade 67.252190\n", 595 | "assignment3_grade 61.129050\n", 596 | "assignment4_grade 54.157620\n", 597 | "assignment5_grade 48.634643\n", 598 | "assignment6_grade 43.838980\n", 599 | "dtype: float64" 600 | ] 601 | }, 602 | "execution_count": 91, 603 | "metadata": {}, 604 | "output_type": "execute_result" 605 | } 606 | ], 607 | "source": [ 608 | "early.mean()" 609 | ] 610 | }, 611 | { 612 | "cell_type": "code", 613 | "execution_count": 92, 614 | "metadata": { 615 | "collapsed": false 616 | }, 617 | "outputs": [ 618 | { 619 | "data": { 620 | 
"text/plain": [ 621 | "assignment1_grade 74.017429\n", 622 | "assignment2_grade 66.370822\n", 623 | "assignment3_grade 60.023244\n", 624 | "assignment4_grade 54.058138\n", 625 | "assignment5_grade 48.599402\n", 626 | "assignment6_grade 43.844384\n", 627 | "dtype: float64" 628 | ] 629 | }, 630 | "execution_count": 92, 631 | "metadata": {}, 632 | "output_type": "execute_result" 633 | } 634 | ], 635 | "source": [ 636 | "late.mean()" 637 | ] 638 | }, 639 | { 640 | "cell_type": "code", 641 | "execution_count": 93, 642 | "metadata": { 643 | "collapsed": false 644 | }, 645 | "outputs": [], 646 | "source": [ 647 | "from scipy import stats\n", 648 | "stats.ttest_ind?" 649 | ] 650 | }, 651 | { 652 | "cell_type": "code", 653 | "execution_count": 94, 654 | "metadata": { 655 | "collapsed": false 656 | }, 657 | "outputs": [ 658 | { 659 | "data": { 660 | "text/plain": [ 661 | "Ttest_indResult(statistic=1.400549944897566, pvalue=0.16148283016060577)" 662 | ] 663 | }, 664 | "execution_count": 94, 665 | "metadata": {}, 666 | "output_type": "execute_result" 667 | } 668 | ], 669 | "source": [ 670 | "stats.ttest_ind(early['assignment1_grade'], late['assignment1_grade'])" 671 | ] 672 | }, 673 | { 674 | "cell_type": "code", 675 | "execution_count": 95, 676 | "metadata": { 677 | "collapsed": false 678 | }, 679 | "outputs": [ 680 | { 681 | "data": { 682 | "text/plain": [ 683 | "Ttest_indResult(statistic=1.3239868220912567, pvalue=0.18563824610067967)" 684 | ] 685 | }, 686 | "execution_count": 95, 687 | "metadata": {}, 688 | "output_type": "execute_result" 689 | } 690 | ], 691 | "source": [ 692 | "stats.ttest_ind(early['assignment2_grade'], late['assignment2_grade'])" 693 | ] 694 | }, 695 | { 696 | "cell_type": "code", 697 | "execution_count": 98, 698 | "metadata": { 699 | "collapsed": false 700 | }, 701 | "outputs": [ 702 | { 703 | "data": { 704 | "text/plain": [ 705 | "Ttest_indResult(statistic=1.7116160037010733, pvalue=0.087101516341556676)" 706 | ] 707 | }, 708 | "execution_count": 98, 
709 | "metadata": {}, 710 | "output_type": "execute_result" 711 | } 712 | ], 713 | "source": [ 714 | "stats.ttest_ind(early['assignment3_grade'], late['assignment3_grade'])" 715 | ] 716 | }, 717 | { 718 | "cell_type": "code", 719 | "execution_count": null, 720 | "metadata": { 721 | "collapsed": true 722 | }, 723 | "outputs": [], 724 | "source": [] 725 | } 726 | ], 727 | "metadata": { 728 | "kernelspec": { 729 | "display_name": "Python 3", 730 | "language": "python", 731 | "name": "python3" 732 | }, 733 | "language_info": { 734 | "codemirror_mode": { 735 | "name": "ipython", 736 | "version": 3 737 | }, 738 | "file_extension": ".py", 739 | "mimetype": "text/x-python", 740 | "name": "python", 741 | "nbconvert_exporter": "python", 742 | "pygments_lexer": "ipython3", 743 | "version": "3.5.2" 744 | } 745 | }, 746 | "nbformat": 4, 747 | "nbformat_minor": 0 748 | } 749 | -------------------------------------------------------------------------------- /slides/Algorithms+in+DS.ppt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/umer7/Data-Science-in-Python/da6203f959f1c7800823d2d37508b158c9809365/slides/Algorithms+in+DS.ppt -------------------------------------------------------------------------------- /slides/Correlations.ppt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/umer7/Data-Science-in-Python/da6203f959f1c7800823d2d37508b158c9809365/slides/Correlations.ppt -------------------------------------------------------------------------------- /slides/Dimensionality+Reduction.ppt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/umer7/Data-Science-in-Python/da6203f959f1c7800823d2d37508b158c9809365/slides/Dimensionality+Reduction.ppt -------------------------------------------------------------------------------- /slides/Introduction+to+Regular+Expressions.ppt: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/umer7/Data-Science-in-Python/da6203f959f1c7800823d2d37508b158c9809365/slides/Introduction+to+Regular+Expressions.ppt -------------------------------------------------------------------------------- /slides/Kmeans+Naive+Bayes.ppt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/umer7/Data-Science-in-Python/da6203f959f1c7800823d2d37508b158c9809365/slides/Kmeans+Naive+Bayes.ppt -------------------------------------------------------------------------------- /slides/Python+Basics.ppt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/umer7/Data-Science-in-Python/da6203f959f1c7800823d2d37508b158c9809365/slides/Python+Basics.ppt -------------------------------------------------------------------------------- /slides/Simple+NLP+Tasks.ppt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/umer7/Data-Science-in-Python/da6203f959f1c7800823d2d37508b158c9809365/slides/Simple+NLP+Tasks.ppt -------------------------------------------------------------------------------- /slides/Statistical+Inference,+Exploratory+Data+Analysis,+and+the+Data+Science+Process.ppt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/umer7/Data-Science-in-Python/da6203f959f1c7800823d2d37508b158c9809365/slides/Statistical+Inference,+Exploratory+Data+Analysis,+and+the+Data+Science+Process.ppt -------------------------------------------------------------------------------- /slides/TF-IDF.ppt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/umer7/Data-Science-in-Python/da6203f959f1c7800823d2d37508b158c9809365/slides/TF-IDF.ppt 
-------------------------------------------------------------------------------- /slides/The+Data+Science+Process.ppt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/umer7/Data-Science-in-Python/da6203f959f1c7800823d2d37508b158c9809365/slides/The+Data+Science+Process.ppt --------------------------------------------------------------------------------