├── 1_XNhMJunrlY_Xfei8D8sPeg.png
├── 20_Pandas_Functions_for_80_of_your_Data_Science Tasks.ipynb
├── Best_Practices_To_Use_Pandas_Efficiently_As_A_Data_Scientist.ipynb
├── Data Exploration Becomes Easier & Better With Pandas Profiling.md
├── Datasets
    ├── Popular_Baby_Names.csv
    ├── Readme.md
    ├── baseball_stats.csv
    ├── pokemon.csv
    ├── poker_hand.csv
    └── restaurant_data.csv
├── How To Eliminate Loops From Your Python Code.ipynb
├── Make_Your_Pandas_Code_1000_Times_Faster_With_This Trick.ipynb
├── README.md
├── Selecting_&_Replacing_Values_In_Pandas_DataFrame_Effectively.ipynb
├── Stop_Looping_Through_Pandas_DataFrames_&_Do_This Instead.ipynb
├── Write Efficient Python Code [Defining and Measuring Code Efficiency].ipynb
└── Write_Efficient_Python_Code_(Optimizing_Your Code).ipynb


/1_XNhMJunrlY_Xfei8D8sPeg.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/youssefHosni/Efficient-Python-for-Data-Scientists-Book/2507c7f72e3cb08ef349047a4ca052bcc7307337/1_XNhMJunrlY_Xfei8D8sPeg.png


--------------------------------------------------------------------------------
/Data Exploration Becomes Easier & Better With Pandas Profiling.md:
--------------------------------------------------------------------------------
  1 | # Data Exploration Becomes Easier & Better With Pandas Profiling
  2 | 
  3 | Data exploration is a crucial step in any data analysis and data science project. It allows you to gain a deeper understanding of your data, identify patterns and relationships, and identify any potential issues or outliers. 
  4 | 
  5 | One of the most popular tools for data exploration is the Python library Pandas. The library provides a powerful set of tools for working with data, including data cleaning, transformation, and visualization. However, even with the powerful capabilities of Pandas, data exploration can still be a time-consuming and tedious task. That's where Pandas Profiling comes in. 
  6 | 
  7 | With Pandas Profiling, you can easily generate detailed reports of your data, including summary statistics, missing values, and correlations, making data exploration faster and more efficient. This article will explore how Pandas Profiling can help you improve your data exploration process and make it easier to understand your data.
  8 | 
  9 | ## Table of content:
 10 | 1. What is Pandas Profiling?
 11 | 2. Installation of Pandas Profiling
 12 | 3. Pandas Profiling in Action 
 13 | 4. Drawbacks of Pandas Profiling & How to Overcome it
 14 | 
 15 | ## 1. What is Pandas Profiling?
 16 | 
 17 | Pandas profiling is a Python library that generates a comprehensive report of a DataFrame, including information about the number of rows and columns, missing values, data types, and other statistics. It can be used to quickly identify potential issues or outliers in the data, and can also be used to generate summary statistics and visualizations of the data.
 18 | 
 19 | The report generated by the pandas profiling library typically includes a variety of information about the dataset, including:
 20 | 
 21 | * Overview: Summary statistics for all columns, including the number of rows, missing values, and data types.
 22 | * Variables: Information about each column, including the number of unique values, missing values, and the top frequent values.
 23 | * Correlations: Correlation matrix and heatmap, showing the relationship between different variables.
 24 | * Distribution: Histograms and kernel density plots for each column, show the distribution of values.
 25 | * Categorical Variables: Bar plots for categorical variables, showing the frequency of each category.
 26 | * Numerical Variables: Box plots for numerical variables, show the distribution of values and outliers.
 27 | * Text: Information about text columns, including the number of characters and words.
 28 | * File: Information about file columns, including the number of files, and the size of each file.
 29 | * High-Cardinality: Information about high-cardinality categorical variables, including their most frequent values.
 30 | * Sample: A sample of the data, with the first and last few rows displayed.
 31 | 
 32 | It is worth noting that the report is interactive and you can drill down on each section for more details.
 33 | 
 34 | ## 2. Installation of Pandas Profiling
 35 | To install pandas-profiling, you can use the following command:
 36 | 
 37 | 
 38 | ```python
 39 | import sys
 40 | 
 41 | !"{sys.executable}" -m pip install -U pandas-profiling[notebook]
 42 | !jupyter nbextension enable --py widgetsnbextension
 43 | ```
 44 | 
 45 |     Collecting pandas-profiling[notebook]
 46 |       Using cached pandas_profiling-3.6.3-py2.py3-none-any.whl (328 kB)
 47 |     Requirement already satisfied, skipping upgrade: numpy<1.24,>=1.16.0 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from pandas-profiling[notebook]) (1.19.2)
 48 |     Requirement already satisfied, skipping upgrade: requests<2.29,>=2.24.0 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from pandas-profiling[notebook]) (2.24.0)
 49 |     Requirement already satisfied, skipping upgrade: scipy<1.10,>=1.4.1 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from pandas-profiling[notebook]) (1.5.2)
 50 |     Collecting visions[type_image_path]==0.7.5
 51 |       Using cached visions-0.7.5-py3-none-any.whl (102 kB)
 52 |     Collecting typeguard<2.14,>=2.13.2
 53 | 
 54 |     ERROR: After October 2020 you may experience errors when installing or updating packages. This is because pip will change the way that it resolves dependency conflicts.
 55 |     
 56 |     We recommend you use --use-feature=2020-resolver to test your packages with the new resolver before it becomes the default.
 57 |     
 58 |     sktime 0.9.0 requires numpy>=1.19.3, but you'll have numpy 1.19.2 which is incompatible.
 59 |     sktime 0.9.0 requires statsmodels<=0.12.1, but you'll have statsmodels 0.13.5 which is incompatible.
 60 |     
 61 | 
 62 |     
 63 |       Using cached typeguard-2.13.3-py3-none-any.whl (17 kB)
 64 |     Requirement already satisfied, skipping upgrade: pandas!=1.4.0,<1.6,>1.1 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from pandas-profiling[notebook]) (1.1.3)
 65 |     Collecting phik<0.13,>=0.11.1
 66 |       Using cached phik-0.12.3-cp37-cp37m-win_amd64.whl (664 kB)
 67 |     Requirement already satisfied, skipping upgrade: tqdm<4.65,>=4.48.2 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from pandas-profiling[notebook]) (4.55.0)
 68 |     Requirement already satisfied, skipping upgrade: PyYAML<6.1,>=5.0.0 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from pandas-profiling[notebook]) (5.3.1)
 69 |     Collecting multimethod<1.10,>=1.4
 70 |       Using cached multimethod-1.9.1-py3-none-any.whl (10 kB)
 71 |     Requirement already satisfied, skipping upgrade: jinja2<3.2,>=2.11.1 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from pandas-profiling[notebook]) (2.11.2)
 72 |     Collecting statsmodels<0.14,>=0.13.2
 73 |       Using cached statsmodels-0.13.5-cp37-cp37m-win_amd64.whl (9.1 MB)
 74 |     Requirement already satisfied, skipping upgrade: seaborn<0.13,>=0.10.1 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from pandas-profiling[notebook]) (0.11.0)
 75 |     Requirement already satisfied, skipping upgrade: matplotlib<3.7,>=3.2 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from pandas-profiling[notebook]) (3.3.2)
 76 |     Collecting pydantic<1.11,>=1.8.1
 77 |       Using cached pydantic-1.10.4-cp37-cp37m-win_amd64.whl (2.1 MB)
 78 |     Processing c:\users\youss\appdata\local\pip\cache\wheels\70\e1\52\5b14d250ba868768823940c3229e9950d201a26d0bd3ee8655\htmlmin-0.1.12-py3-none-any.whl
 79 |     Requirement already satisfied, skipping upgrade: ipywidgets>=7.5.1; extra == "notebook" in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from pandas-profiling[notebook]) (8.0.2)
 80 |     Requirement already satisfied, skipping upgrade: jupyter-core>=4.6.3; extra == "notebook" in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from pandas-profiling[notebook]) (4.6.3)
 81 |     Requirement already satisfied, skipping upgrade: jupyter-client>=5.3.4; extra == "notebook" in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from pandas-profiling[notebook]) (6.1.7)
 82 |     Requirement already satisfied, skipping upgrade: chardet<4,>=3.0.2 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from requests<2.29,>=2.24.0->pandas-profiling[notebook]) (3.0.4)
 83 |     Requirement already satisfied, skipping upgrade: idna<3,>=2.5 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from requests<2.29,>=2.24.0->pandas-profiling[notebook]) (2.10)
 84 |     Requirement already satisfied, skipping upgrade: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from requests<2.29,>=2.24.0->pandas-profiling[notebook]) (1.25.11)
 85 |     Requirement already satisfied, skipping upgrade: certifi>=2017.4.17 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from requests<2.29,>=2.24.0->pandas-profiling[notebook]) (2022.9.24)
 86 |     Collecting networkx>=2.4
 87 |       Using cached networkx-2.6.3-py3-none-any.whl (1.9 MB)
 88 |     Requirement already satisfied, skipping upgrade: attrs>=19.3.0 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from visions[type_image_path]==0.7.5->pandas-profiling[notebook]) (20.2.0)
 89 |     Collecting tangled-up-in-unicode>=0.0.4
 90 |       Using cached tangled_up_in_unicode-0.2.0-py3-none-any.whl (4.7 MB)
 91 |     Collecting imagehash; extra == "type_image_path"
 92 |       Using cached ImageHash-4.3.1-py2.py3-none-any.whl (296 kB)
 93 |     Requirement already satisfied, skipping upgrade: Pillow; extra == "type_image_path" in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from visions[type_image_path]==0.7.5->pandas-profiling[notebook]) (8.0.1)
 94 |     Requirement already satisfied, skipping upgrade: python-dateutil>=2.7.3 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from pandas!=1.4.0,<1.6,>1.1->pandas-profiling[notebook]) (2.8.1)
 95 |     Requirement already satisfied, skipping upgrade: pytz>=2017.2 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from pandas!=1.4.0,<1.6,>1.1->pandas-profiling[notebook]) (2020.1)
 96 |     Requirement already satisfied, skipping upgrade: joblib>=0.14.1 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from phik<0.13,>=0.11.1->pandas-profiling[notebook]) (0.17.0)
 97 |     Requirement already satisfied, skipping upgrade: MarkupSafe>=0.23 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from jinja2<3.2,>=2.11.1->pandas-profiling[notebook]) (1.1.1)
 98 |     Collecting packaging>=21.3
 99 |       Using cached packaging-23.0-py3-none-any.whl (42 kB)
100 |     Requirement already satisfied, skipping upgrade: patsy>=0.5.2 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from statsmodels<0.14,>=0.13.2->pandas-profiling[notebook]) (0.5.2)
101 |     Requirement already satisfied, skipping upgrade: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.3 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from matplotlib<3.7,>=3.2->pandas-profiling[notebook]) (2.4.7)
102 |     Requirement already satisfied, skipping upgrade: kiwisolver>=1.0.1 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from matplotlib<3.7,>=3.2->pandas-profiling[notebook]) (1.3.0)
103 |     Requirement already satisfied, skipping upgrade: cycler>=0.10 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from matplotlib<3.7,>=3.2->pandas-profiling[notebook]) (0.10.0)
104 |     Requirement already satisfied, skipping upgrade: typing-extensions>=4.2.0 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from pydantic<1.11,>=1.8.1->pandas-profiling[notebook]) (4.3.0)
105 |     Requirement already satisfied, skipping upgrade: ipykernel>=4.5.1 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from ipywidgets>=7.5.1; extra == "notebook"->pandas-profiling[notebook]) (5.3.4)
106 |     Requirement already satisfied, skipping upgrade: traitlets>=4.3.1 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from ipywidgets>=7.5.1; extra == "notebook"->pandas-profiling[notebook]) (5.0.5)
107 |     Requirement already satisfied, skipping upgrade: ipython>=6.1.0 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from ipywidgets>=7.5.1; extra == "notebook"->pandas-profiling[notebook]) (7.18.1)
108 |     Requirement already satisfied, skipping upgrade: jupyterlab-widgets~=3.0 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from ipywidgets>=7.5.1; extra == "notebook"->pandas-profiling[notebook]) (3.0.3)
109 |     Requirement already satisfied, skipping upgrade: widgetsnbextension~=4.0 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from ipywidgets>=7.5.1; extra == "notebook"->pandas-profiling[notebook]) (4.0.3)
110 |     Requirement already satisfied, skipping upgrade: pywin32>=1.0; sys_platform == "win32" in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from jupyter-core>=4.6.3; extra == "notebook"->pandas-profiling[notebook]) (227)
111 |     Requirement already satisfied, skipping upgrade: pyzmq>=13 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from jupyter-client>=5.3.4; extra == "notebook"->pandas-profiling[notebook]) (19.0.2)
112 |     Requirement already satisfied, skipping upgrade: tornado>=4.1 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from jupyter-client>=5.3.4; extra == "notebook"->pandas-profiling[notebook]) (6.0.4)
113 |     Collecting PyWavelets
114 |       Downloading PyWavelets-1.3.0-cp37-cp37m-win_amd64.whl (4.2 MB)
115 |     Requirement already satisfied, skipping upgrade: six>=1.5 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from python-dateutil>=2.7.3->pandas!=1.4.0,<1.6,>1.1->pandas-profiling[notebook]) (1.15.0)
116 |     Requirement already satisfied, skipping upgrade: ipython-genutils in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from traitlets>=4.3.1->ipywidgets>=7.5.1; extra == "notebook"->pandas-profiling[notebook]) (0.2.0)
117 |     Requirement already satisfied, skipping upgrade: setuptools>=18.5 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from ipython>=6.1.0->ipywidgets>=7.5.1; extra == "notebook"->pandas-profiling[notebook]) (50.3.0.post20201006)
118 |     Requirement already satisfied, skipping upgrade: pickleshare in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from ipython>=6.1.0->ipywidgets>=7.5.1; extra == "notebook"->pandas-profiling[notebook]) (0.7.5)
119 |     Requirement already satisfied, skipping upgrade: pygments in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from ipython>=6.1.0->ipywidgets>=7.5.1; extra == "notebook"->pandas-profiling[notebook]) (2.7.2)
120 |     Requirement already satisfied, skipping upgrade: backcall in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from ipython>=6.1.0->ipywidgets>=7.5.1; extra == "notebook"->pandas-profiling[notebook]) (0.2.0)
121 |     Requirement already satisfied, skipping upgrade: jedi>=0.10 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from ipython>=6.1.0->ipywidgets>=7.5.1; extra == "notebook"->pandas-profiling[notebook]) (0.17.1)
122 |     Requirement already satisfied, skipping upgrade: decorator in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from ipython>=6.1.0->ipywidgets>=7.5.1; extra == "notebook"->pandas-profiling[notebook]) (4.4.2)
123 |     Requirement already satisfied, skipping upgrade: colorama; sys_platform == "win32" in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from ipython>=6.1.0->ipywidgets>=7.5.1; extra == "notebook"->pandas-profiling[notebook]) (0.4.4)
124 |     Requirement already satisfied, skipping upgrade: prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from ipython>=6.1.0->ipywidgets>=7.5.1; extra == "notebook"->pandas-profiling[notebook]) (3.0.8)
125 |     Requirement already satisfied, skipping upgrade: parso<0.8.0,>=0.7.0 in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from jedi>=0.10->ipython>=6.1.0->ipywidgets>=7.5.1; extra == "notebook"->pandas-profiling[notebook]) (0.7.0)
126 |     Requirement already satisfied, skipping upgrade: wcwidth in c:\users\youss\anaconda3\envs\new_enviroment\lib\site-packages (from prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0->ipython>=6.1.0->ipywidgets>=7.5.1; extra == "notebook"->pandas-profiling[notebook]) (0.2.5)
127 |     Installing collected packages: multimethod, networkx, tangled-up-in-unicode, PyWavelets, imagehash, visions, typeguard, phik, packaging, statsmodels, pydantic, htmlmin, pandas-profiling
128 |       Attempting uninstall: packaging
129 |         Found existing installation: packaging 20.4
130 |         Uninstalling packaging-20.4:
131 |           Successfully uninstalled packaging-20.4
132 |       Attempting uninstall: statsmodels
133 |         Found existing installation: statsmodels 0.12.1
134 |         Uninstalling statsmodels-0.12.1:
135 |           Successfully uninstalled statsmodels-0.12.1
136 |     Successfully installed PyWavelets-1.3.0 htmlmin-0.1.12 imagehash-4.3.1 multimethod-1.9.1 networkx-2.6.3 packaging-23.0 pandas-profiling-3.6.3 phik-0.12.3 pydantic-1.10.4 statsmodels-0.13.5 tangled-up-in-unicode-0.2.0 typeguard-2.13.3 visions-0.7.5
137 |     
138 | 
139 |     Enabling notebook extension jupyter-js-widgets/extension...
140 |           - Validating: ok
141 |     
142 | 
143 | 
144 | ```python
145 | import pandas as pd
146 | import pandas_profiling as pp
147 | ```
148 | 
149 | ## 3. Pandas Profiling in Action
150 | Let's put the pandas profiling into action and see how it works. We will use the popular baby names dataset.
151 | 
152 | 
153 | ```python
154 | Popular_baby_names_df = pd.read_csv('Popular_Baby_Names.csv')
155 | Popular_baby_names_df.head()
156 | ```
157 | 
158 | 
159 | 
160 | 
161 | <div>
162 | <style scoped>
163 |     .dataframe tbody tr th:only-of-type {
164 |         vertical-align: middle;
165 |     }
166 | 
167 |     .dataframe tbody tr th {
168 |         vertical-align: top;
169 |     }
170 | 
171 |     .dataframe thead th {
172 |         text-align: right;
173 |     }
174 | </style>
175 | <table border="1" class="dataframe">
176 |   <thead>
177 |     <tr style="text-align: right;">
178 |       <th></th>
179 |       <th>Year of Birth</th>
180 |       <th>Gender</th>
181 |       <th>Ethnicity</th>
182 |       <th>Child's First Name</th>
183 |       <th>Count</th>
184 |       <th>Rank</th>
185 |     </tr>
186 |   </thead>
187 |   <tbody>
188 |     <tr>
189 |       <th>0</th>
190 |       <td>2011</td>
191 |       <td>FEMALE</td>
192 |       <td>ASIAN AND PACIFIC ISLANDER</td>
193 |       <td>SOPHIA</td>
194 |       <td>119</td>
195 |       <td>1</td>
196 |     </tr>
197 |     <tr>
198 |       <th>1</th>
199 |       <td>2011</td>
200 |       <td>FEMALE</td>
201 |       <td>ASIAN AND PACIFIC ISLANDER</td>
202 |       <td>CHLOE</td>
203 |       <td>106</td>
204 |       <td>2</td>
205 |     </tr>
206 |     <tr>
207 |       <th>2</th>
208 |       <td>2011</td>
209 |       <td>FEMALE</td>
210 |       <td>ASIAN AND PACIFIC ISLANDER</td>
211 |       <td>EMILY</td>
212 |       <td>93</td>
213 |       <td>3</td>
214 |     </tr>
215 |     <tr>
216 |       <th>3</th>
217 |       <td>2011</td>
218 |       <td>FEMALE</td>
219 |       <td>ASIAN AND PACIFIC ISLANDER</td>
220 |       <td>OLIVIA</td>
221 |       <td>89</td>
222 |       <td>4</td>
223 |     </tr>
224 |     <tr>
225 |       <th>4</th>
226 |       <td>2011</td>
227 |       <td>FEMALE</td>
228 |       <td>ASIAN AND PACIFIC ISLANDER</td>
229 |       <td>EMMA</td>
230 |       <td>75</td>
231 |       <td>5</td>
232 |     </tr>
233 |   </tbody>
234 | </table>
235 | </div>
236 | 
237 | 
238 | 
239 | 
240 | ```python
241 | profile = pp.ProfileReport(Popular_baby_names_df, title='Pandas Profiling Report')
242 | 
243 | # display the report
244 | profile.to_widgets()
245 | ```
246 | 
247 | 
248 |     Summarize dataset:   0%|          | 0/5 [00:00<?, ?it/s]
249 | 
250 | 
251 | 
252 |     Generate report structure:   0%|          | 0/1 [00:00<?, ?it/s]
253 | 
254 | 
255 | 
256 |     Render widgets:   0%|          | 0/1 [00:00<?, ?it/s]
257 | 
258 | 
259 | 
260 |     VBox(children=(Tab(children=(Tab(children=(GridBox(children=(VBox(children=(GridspecLayout(children=(HTML(valu…
261 | 
262 | 
263 | ## 4. Drawbacks of Pandas Profiling & How to Overcome It
264 | 
265 | Pandas Profiling is a great tool for quickly generating detailed reports of your data, but it does have some drawbacks. One of the main drawbacks is that it can be memory intensive, especially for large datasets. This can cause the tool to run slowly or even crash if you don't have enough memory available.
266 | 
267 | Another drawback is that Pandas Profiling can only be used with Pandas DataFrames. This means that if you're working with data in a different format, such as a CSV file or a SQL database, you'll need to first convert it to a Pandas DataFrame before you can use Pandas Profiling.
268 | 
269 | Additionally, Pandas Profiling generates a lot of information and can be overwhelming to digest if you don't know what you're looking for. The report is also not interactive, and you'll have to export it to a file format like HTML, pdf, or excel to share or present it.
270 | 
271 | To overcome these limitations, you can try the following:
272 | * Use Pandas Profiling on a sample of your data rather than the entire dataset to reduce memory usage.
273 | * Use Pandas to convert your data to a DataFrame before using Pandas Profiling.
274 | * Use the options in Pandas Profiling to customize the report and only include the information that you need.
275 | * Use visualization libraries like Matplotlib, and Seaborn to make the report more interactive and easy to understand.
276 | * Use the report as a starting point for your data exploration, and then use other tools and techniques to dive deeper into your data.
277 | 
278 | 
279 | ```python
280 | profile.to_file("your_report.html")
281 | 
282 | ```
283 | 
284 | 
285 |     Render HTML:   0%|          | 0/1 [00:00<?, ?it/s]
286 | 
287 | 
288 | 
289 |     Export report to file:   0%|          | 0/1 [00:00<?, ?it/s]
290 | 
291 | 


--------------------------------------------------------------------------------
/Datasets/Readme.md:
--------------------------------------------------------------------------------
1 | The datasets used in the advanced python article based tutorials:
2 | 1. Pokemon dataset
3 | 2. Baseball stats dataset
4 | 


--------------------------------------------------------------------------------
/Datasets/pokemon.csv:
--------------------------------------------------------------------------------
  1 | #,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
  2 | 1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,False
  3 | 2,Ivysaur,Grass,Poison,405,60,62,63,80,80,60,1,False
  4 | 3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,False
  5 | 3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,False
  6 | 4,Charmander,Fire,,309,39,52,43,60,50,65,1,False
  7 | 5,Charmeleon,Fire,,405,58,64,58,80,65,80,1,False
  8 | 6,Charizard,Fire,Flying,534,78,84,78,109,85,100,1,False
  9 | 6,CharizardMega Charizard X,Fire,Dragon,634,78,130,111,130,85,100,1,False
 10 | 6,CharizardMega Charizard Y,Fire,Flying,634,78,104,78,159,115,100,1,False
 11 | 7,Squirtle,Water,,314,44,48,65,50,64,43,1,False
 12 | 8,Wartortle,Water,,405,59,63,80,65,80,58,1,False
 13 | 9,Blastoise,Water,,530,79,83,100,85,105,78,1,False
 14 | 9,BlastoiseMega Blastoise,Water,,630,79,103,120,135,115,78,1,False
 15 | 10,Caterpie,Bug,,195,45,30,35,20,20,45,1,False
 16 | 11,Metapod,Bug,,205,50,20,55,25,25,30,1,False
 17 | 12,Butterfree,Bug,Flying,395,60,45,50,90,80,70,1,False
 18 | 13,Weedle,Bug,Poison,195,40,35,30,20,20,50,1,False
 19 | 14,Kakuna,Bug,Poison,205,45,25,50,25,25,35,1,False
 20 | 15,Beedrill,Bug,Poison,395,65,90,40,45,80,75,1,False
 21 | 15,BeedrillMega Beedrill,Bug,Poison,495,65,150,40,15,80,145,1,False
 22 | 16,Pidgey,Normal,Flying,251,40,45,40,35,35,56,1,False
 23 | 17,Pidgeotto,Normal,Flying,349,63,60,55,50,50,71,1,False
 24 | 18,Pidgeot,Normal,Flying,479,83,80,75,70,70,101,1,False
 25 | 18,PidgeotMega Pidgeot,Normal,Flying,579,83,80,80,135,80,121,1,False
 26 | 19,Rattata,Normal,,253,30,56,35,25,35,72,1,False
 27 | 20,Raticate,Normal,,413,55,81,60,50,70,97,1,False
 28 | 21,Spearow,Normal,Flying,262,40,60,30,31,31,70,1,False
 29 | 22,Fearow,Normal,Flying,442,65,90,65,61,61,100,1,False
 30 | 23,Ekans,Poison,,288,35,60,44,40,54,55,1,False
 31 | 24,Arbok,Poison,,438,60,85,69,65,79,80,1,False
 32 | 25,Pikachu,Electric,,320,35,55,40,50,50,90,1,False
 33 | 26,Raichu,Electric,,485,60,90,55,90,80,110,1,False
 34 | 27,Sandshrew,Ground,,300,50,75,85,20,30,40,1,False
 35 | 28,Sandslash,Ground,,450,75,100,110,45,55,65,1,False
 36 | 29,Nidoran♀,Poison,,275,55,47,52,40,40,41,1,False
 37 | 30,Nidorina,Poison,,365,70,62,67,55,55,56,1,False
 38 | 31,Nidoqueen,Poison,Ground,505,90,92,87,75,85,76,1,False
 39 | 32,Nidoran♂,Poison,,273,46,57,40,40,40,50,1,False
 40 | 33,Nidorino,Poison,,365,61,72,57,55,55,65,1,False
 41 | 34,Nidoking,Poison,Ground,505,81,102,77,85,75,85,1,False
 42 | 35,Clefairy,Fairy,,323,70,45,48,60,65,35,1,False
 43 | 36,Clefable,Fairy,,483,95,70,73,95,90,60,1,False
 44 | 37,Vulpix,Fire,,299,38,41,40,50,65,65,1,False
 45 | 38,Ninetales,Fire,,505,73,76,75,81,100,100,1,False
 46 | 39,Jigglypuff,Normal,Fairy,270,115,45,20,45,25,20,1,False
 47 | 40,Wigglytuff,Normal,Fairy,435,140,70,45,85,50,45,1,False
 48 | 41,Zubat,Poison,Flying,245,40,45,35,30,40,55,1,False
 49 | 42,Golbat,Poison,Flying,455,75,80,70,65,75,90,1,False
 50 | 43,Oddish,Grass,Poison,320,45,50,55,75,65,30,1,False
 51 | 44,Gloom,Grass,Poison,395,60,65,70,85,75,40,1,False
 52 | 45,Vileplume,Grass,Poison,490,75,80,85,110,90,50,1,False
 53 | 46,Paras,Bug,Grass,285,35,70,55,45,55,25,1,False
 54 | 47,Parasect,Bug,Grass,405,60,95,80,60,80,30,1,False
 55 | 48,Venonat,Bug,Poison,305,60,55,50,40,55,45,1,False
 56 | 49,Venomoth,Bug,Poison,450,70,65,60,90,75,90,1,False
 57 | 50,Diglett,Ground,,265,10,55,25,35,45,95,1,False
 58 | 51,Dugtrio,Ground,,405,35,80,50,50,70,120,1,False
 59 | 52,Meowth,Normal,,290,40,45,35,40,40,90,1,False
 60 | 53,Persian,Normal,,440,65,70,60,65,65,115,1,False
 61 | 54,Psyduck,Water,,320,50,52,48,65,50,55,1,False
 62 | 55,Golduck,Water,,500,80,82,78,95,80,85,1,False
 63 | 56,Mankey,Fighting,,305,40,80,35,35,45,70,1,False
 64 | 57,Primeape,Fighting,,455,65,105,60,60,70,95,1,False
 65 | 58,Growlithe,Fire,,350,55,70,45,70,50,60,1,False
 66 | 59,Arcanine,Fire,,555,90,110,80,100,80,95,1,False
 67 | 60,Poliwag,Water,,300,40,50,40,40,40,90,1,False
 68 | 61,Poliwhirl,Water,,385,65,65,65,50,50,90,1,False
 69 | 62,Poliwrath,Water,Fighting,510,90,95,95,70,90,70,1,False
 70 | 63,Abra,Psychic,,310,25,20,15,105,55,90,1,False
 71 | 64,Kadabra,Psychic,,400,40,35,30,120,70,105,1,False
 72 | 65,Alakazam,Psychic,,500,55,50,45,135,95,120,1,False
 73 | 65,AlakazamMega Alakazam,Psychic,,590,55,50,65,175,95,150,1,False
 74 | 66,Machop,Fighting,,305,70,80,50,35,35,35,1,False
 75 | 67,Machoke,Fighting,,405,80,100,70,50,60,45,1,False
 76 | 68,Machamp,Fighting,,505,90,130,80,65,85,55,1,False
 77 | 69,Bellsprout,Grass,Poison,300,50,75,35,70,30,40,1,False
 78 | 70,Weepinbell,Grass,Poison,390,65,90,50,85,45,55,1,False
 79 | 71,Victreebel,Grass,Poison,490,80,105,65,100,70,70,1,False
 80 | 72,Tentacool,Water,Poison,335,40,40,35,50,100,70,1,False
 81 | 73,Tentacruel,Water,Poison,515,80,70,65,80,120,100,1,False
 82 | 74,Geodude,Rock,Ground,300,40,80,100,30,30,20,1,False
 83 | 75,Graveler,Rock,Ground,390,55,95,115,45,45,35,1,False
 84 | 76,Golem,Rock,Ground,495,80,120,130,55,65,45,1,False
 85 | 77,Ponyta,Fire,,410,50,85,55,65,65,90,1,False
 86 | 78,Rapidash,Fire,,500,65,100,70,80,80,105,1,False
 87 | 79,Slowpoke,Water,Psychic,315,90,65,65,40,40,15,1,False
 88 | 80,Slowbro,Water,Psychic,490,95,75,110,100,80,30,1,False
 89 | 80,SlowbroMega Slowbro,Water,Psychic,590,95,75,180,130,80,30,1,False
 90 | 81,Magnemite,Electric,Steel,325,25,35,70,95,55,45,1,False
 91 | 82,Magneton,Electric,Steel,465,50,60,95,120,70,70,1,False
 92 | 83,Farfetch'd,Normal,Flying,352,52,65,55,58,62,60,1,False
 93 | 84,Doduo,Normal,Flying,310,35,85,45,35,35,75,1,False
 94 | 85,Dodrio,Normal,Flying,460,60,110,70,60,60,100,1,False
 95 | 86,Seel,Water,,325,65,45,55,45,70,45,1,False
 96 | 87,Dewgong,Water,Ice,475,90,70,80,70,95,70,1,False
 97 | 88,Grimer,Poison,,325,80,80,50,40,50,25,1,False
 98 | 89,Muk,Poison,,500,105,105,75,65,100,50,1,False
 99 | 90,Shellder,Water,,305,30,65,100,45,25,40,1,False
100 | 91,Cloyster,Water,Ice,525,50,95,180,85,45,70,1,False
101 | 92,Gastly,Ghost,Poison,310,30,35,30,100,35,80,1,False
102 | 93,Haunter,Ghost,Poison,405,45,50,45,115,55,95,1,False
103 | 94,Gengar,Ghost,Poison,500,60,65,60,130,75,110,1,False
104 | 94,GengarMega Gengar,Ghost,Poison,600,60,65,80,170,95,130,1,False
105 | 95,Onix,Rock,Ground,385,35,45,160,30,45,70,1,False
106 | 96,Drowzee,Psychic,,328,60,48,45,43,90,42,1,False
107 | 97,Hypno,Psychic,,483,85,73,70,73,115,67,1,False
108 | 98,Krabby,Water,,325,30,105,90,25,25,50,1,False
109 | 99,Kingler,Water,,475,55,130,115,50,50,75,1,False
110 | 100,Voltorb,Electric,,330,40,30,50,55,55,100,1,False
111 | 101,Electrode,Electric,,480,60,50,70,80,80,140,1,False
112 | 102,Exeggcute,Grass,Psychic,325,60,40,80,60,45,40,1,False
113 | 103,Exeggutor,Grass,Psychic,520,95,95,85,125,65,55,1,False
114 | 104,Cubone,Ground,,320,50,50,95,40,50,35,1,False
115 | 105,Marowak,Ground,,425,60,80,110,50,80,45,1,False
116 | 106,Hitmonlee,Fighting,,455,50,120,53,35,110,87,1,False
117 | 107,Hitmonchan,Fighting,,455,50,105,79,35,110,76,1,False
118 | 108,Lickitung,Normal,,385,90,55,75,60,75,30,1,False
119 | 109,Koffing,Poison,,340,40,65,95,60,45,35,1,False
120 | 110,Weezing,Poison,,490,65,90,120,85,70,60,1,False
121 | 111,Rhyhorn,Ground,Rock,345,80,85,95,30,30,25,1,False
122 | 112,Rhydon,Ground,Rock,485,105,130,120,45,45,40,1,False
123 | 113,Chansey,Normal,,450,250,5,5,35,105,50,1,False
124 | 114,Tangela,Grass,,435,65,55,115,100,40,60,1,False
125 | 115,Kangaskhan,Normal,,490,105,95,80,40,80,90,1,False
126 | 115,KangaskhanMega Kangaskhan,Normal,,590,105,125,100,60,100,100,1,False
127 | 116,Horsea,Water,,295,30,40,70,70,25,60,1,False
128 | 117,Seadra,Water,,440,55,65,95,95,45,85,1,False
129 | 118,Goldeen,Water,,320,45,67,60,35,50,63,1,False
130 | 119,Seaking,Water,,450,80,92,65,65,80,68,1,False
131 | 120,Staryu,Water,,340,30,45,55,70,55,85,1,False
132 | 121,Starmie,Water,Psychic,520,60,75,85,100,85,115,1,False
133 | 122,Mr. Mime,Psychic,Fairy,460,40,45,65,100,120,90,1,False
134 | 123,Scyther,Bug,Flying,500,70,110,80,55,80,105,1,False
135 | 124,Jynx,Ice,Psychic,455,65,50,35,115,95,95,1,False
136 | 125,Electabuzz,Electric,,490,65,83,57,95,85,105,1,False
137 | 126,Magmar,Fire,,495,65,95,57,100,85,93,1,False
138 | 127,Pinsir,Bug,,500,65,125,100,55,70,85,1,False
139 | 127,PinsirMega Pinsir,Bug,Flying,600,65,155,120,65,90,105,1,False
140 | 128,Tauros,Normal,,490,75,100,95,40,70,110,1,False
141 | 129,Magikarp,Water,,200,20,10,55,15,20,80,1,False
142 | 130,Gyarados,Water,Flying,540,95,125,79,60,100,81,1,False
143 | 130,GyaradosMega Gyarados,Water,Dark,640,95,155,109,70,130,81,1,False
144 | 131,Lapras,Water,Ice,535,130,85,80,85,95,60,1,False
145 | 132,Ditto,Normal,,288,48,48,48,48,48,48,1,False
146 | 133,Eevee,Normal,,325,55,55,50,45,65,55,1,False
147 | 134,Vaporeon,Water,,525,130,65,60,110,95,65,1,False
148 | 135,Jolteon,Electric,,525,65,65,60,110,95,130,1,False
149 | 136,Flareon,Fire,,525,65,130,60,95,110,65,1,False
150 | 137,Porygon,Normal,,395,65,60,70,85,75,40,1,False
151 | 138,Omanyte,Rock,Water,355,35,40,100,90,55,35,1,False
152 | 139,Omastar,Rock,Water,495,70,60,125,115,70,55,1,False
153 | 140,Kabuto,Rock,Water,355,30,80,90,55,45,55,1,False
154 | 141,Kabutops,Rock,Water,495,60,115,105,65,70,80,1,False
155 | 142,Aerodactyl,Rock,Flying,515,80,105,65,60,75,130,1,False
156 | 142,AerodactylMega Aerodactyl,Rock,Flying,615,80,135,85,70,95,150,1,False
157 | 143,Snorlax,Normal,,540,160,110,65,65,110,30,1,False
158 | 144,Articuno,Ice,Flying,580,90,85,100,95,125,85,1,True
159 | 145,Zapdos,Electric,Flying,580,90,90,85,125,90,100,1,True
160 | 146,Moltres,Fire,Flying,580,90,100,90,125,85,90,1,True
161 | 147,Dratini,Dragon,,300,41,64,45,50,50,50,1,False
162 | 148,Dragonair,Dragon,,420,61,84,65,70,70,70,1,False
163 | 149,Dragonite,Dragon,Flying,600,91,134,95,100,100,80,1,False
164 | 150,Mewtwo,Psychic,,680,106,110,90,154,90,130,1,True
165 | 150,MewtwoMega Mewtwo X,Psychic,Fighting,780,106,190,100,154,100,130,1,True
166 | 150,MewtwoMega Mewtwo Y,Psychic,,780,106,150,70,194,120,140,1,True
167 | 151,Mew,Psychic,,600,100,100,100,100,100,100,1,False
168 | 152,Chikorita,Grass,,318,45,49,65,49,65,45,2,False
169 | 153,Bayleef,Grass,,405,60,62,80,63,80,60,2,False
170 | 154,Meganium,Grass,,525,80,82,100,83,100,80,2,False
171 | 155,Cyndaquil,Fire,,309,39,52,43,60,50,65,2,False
172 | 156,Quilava,Fire,,405,58,64,58,80,65,80,2,False
173 | 157,Typhlosion,Fire,,534,78,84,78,109,85,100,2,False
174 | 158,Totodile,Water,,314,50,65,64,44,48,43,2,False
175 | 159,Croconaw,Water,,405,65,80,80,59,63,58,2,False
176 | 160,Feraligatr,Water,,530,85,105,100,79,83,78,2,False
177 | 161,Sentret,Normal,,215,35,46,34,35,45,20,2,False
178 | 162,Furret,Normal,,415,85,76,64,45,55,90,2,False
179 | 163,Hoothoot,Normal,Flying,262,60,30,30,36,56,50,2,False
180 | 164,Noctowl,Normal,Flying,442,100,50,50,76,96,70,2,False
181 | 165,Ledyba,Bug,Flying,265,40,20,30,40,80,55,2,False
182 | 166,Ledian,Bug,Flying,390,55,35,50,55,110,85,2,False
183 | 167,Spinarak,Bug,Poison,250,40,60,40,40,40,30,2,False
184 | 168,Ariados,Bug,Poison,390,70,90,70,60,60,40,2,False
185 | 169,Crobat,Poison,Flying,535,85,90,80,70,80,130,2,False
186 | 170,Chinchou,Water,Electric,330,75,38,38,56,56,67,2,False
187 | 171,Lanturn,Water,Electric,460,125,58,58,76,76,67,2,False
188 | 172,Pichu,Electric,,205,20,40,15,35,35,60,2,False
189 | 173,Cleffa,Fairy,,218,50,25,28,45,55,15,2,False
190 | 174,Igglybuff,Normal,Fairy,210,90,30,15,40,20,15,2,False
191 | 175,Togepi,Fairy,,245,35,20,65,40,65,20,2,False
192 | 176,Togetic,Fairy,Flying,405,55,40,85,80,105,40,2,False
193 | 177,Natu,Psychic,Flying,320,40,50,45,70,45,70,2,False
194 | 178,Xatu,Psychic,Flying,470,65,75,70,95,70,95,2,False
195 | 179,Mareep,Electric,,280,55,40,40,65,45,35,2,False
196 | 180,Flaaffy,Electric,,365,70,55,55,80,60,45,2,False
197 | 181,Ampharos,Electric,,510,90,75,85,115,90,55,2,False
198 | 181,AmpharosMega Ampharos,Electric,Dragon,610,90,95,105,165,110,45,2,False
199 | 182,Bellossom,Grass,,490,75,80,95,90,100,50,2,False
200 | 183,Marill,Water,Fairy,250,70,20,50,20,50,40,2,False
201 | 184,Azumarill,Water,Fairy,420,100,50,80,60,80,50,2,False
202 | 185,Sudowoodo,Rock,,410,70,100,115,30,65,30,2,False
203 | 186,Politoed,Water,,500,90,75,75,90,100,70,2,False
204 | 187,Hoppip,Grass,Flying,250,35,35,40,35,55,50,2,False
205 | 188,Skiploom,Grass,Flying,340,55,45,50,45,65,80,2,False
206 | 189,Jumpluff,Grass,Flying,460,75,55,70,55,95,110,2,False
207 | 190,Aipom,Normal,,360,55,70,55,40,55,85,2,False
208 | 191,Sunkern,Grass,,180,30,30,30,30,30,30,2,False
209 | 192,Sunflora,Grass,,425,75,75,55,105,85,30,2,False
210 | 193,Yanma,Bug,Flying,390,65,65,45,75,45,95,2,False
211 | 194,Wooper,Water,Ground,210,55,45,45,25,25,15,2,False
212 | 195,Quagsire,Water,Ground,430,95,85,85,65,65,35,2,False
213 | 196,Espeon,Psychic,,525,65,65,60,130,95,110,2,False
214 | 197,Umbreon,Dark,,525,95,65,110,60,130,65,2,False
215 | 198,Murkrow,Dark,Flying,405,60,85,42,85,42,91,2,False
216 | 199,Slowking,Water,Psychic,490,95,75,80,100,110,30,2,False
217 | 200,Misdreavus,Ghost,,435,60,60,60,85,85,85,2,False
218 | 201,Unown,Psychic,,336,48,72,48,72,48,48,2,False
219 | 202,Wobbuffet,Psychic,,405,190,33,58,33,58,33,2,False
220 | 203,Girafarig,Normal,Psychic,455,70,80,65,90,65,85,2,False
221 | 204,Pineco,Bug,,290,50,65,90,35,35,15,2,False
222 | 205,Forretress,Bug,Steel,465,75,90,140,60,60,40,2,False
223 | 206,Dunsparce,Normal,,415,100,70,70,65,65,45,2,False
224 | 207,Gligar,Ground,Flying,430,65,75,105,35,65,85,2,False
225 | 208,Steelix,Steel,Ground,510,75,85,200,55,65,30,2,False
226 | 208,SteelixMega Steelix,Steel,Ground,610,75,125,230,55,95,30,2,False
227 | 209,Snubbull,Fairy,,300,60,80,50,40,40,30,2,False
228 | 210,Granbull,Fairy,,450,90,120,75,60,60,45,2,False
229 | 211,Qwilfish,Water,Poison,430,65,95,75,55,55,85,2,False
230 | 212,Scizor,Bug,Steel,500,70,130,100,55,80,65,2,False
231 | 212,ScizorMega Scizor,Bug,Steel,600,70,150,140,65,100,75,2,False
232 | 213,Shuckle,Bug,Rock,505,20,10,230,10,230,5,2,False
233 | 214,Heracross,Bug,Fighting,500,80,125,75,40,95,85,2,False
234 | 214,HeracrossMega Heracross,Bug,Fighting,600,80,185,115,40,105,75,2,False
235 | 215,Sneasel,Dark,Ice,430,55,95,55,35,75,115,2,False
236 | 216,Teddiursa,Normal,,330,60,80,50,50,50,40,2,False
237 | 217,Ursaring,Normal,,500,90,130,75,75,75,55,2,False
238 | 218,Slugma,Fire,,250,40,40,40,70,40,20,2,False
239 | 219,Magcargo,Fire,Rock,410,50,50,120,80,80,30,2,False
240 | 220,Swinub,Ice,Ground,250,50,50,40,30,30,50,2,False
241 | 221,Piloswine,Ice,Ground,450,100,100,80,60,60,50,2,False
242 | 222,Corsola,Water,Rock,380,55,55,85,65,85,35,2,False
243 | 223,Remoraid,Water,,300,35,65,35,65,35,65,2,False
244 | 224,Octillery,Water,,480,75,105,75,105,75,45,2,False
245 | 225,Delibird,Ice,Flying,330,45,55,45,65,45,75,2,False
246 | 226,Mantine,Water,Flying,465,65,40,70,80,140,70,2,False
247 | 227,Skarmory,Steel,Flying,465,65,80,140,40,70,70,2,False
248 | 228,Houndour,Dark,Fire,330,45,60,30,80,50,65,2,False
249 | 229,Houndoom,Dark,Fire,500,75,90,50,110,80,95,2,False
250 | 229,HoundoomMega Houndoom,Dark,Fire,600,75,90,90,140,90,115,2,False
251 | 230,Kingdra,Water,Dragon,540,75,95,95,95,95,85,2,False
252 | 231,Phanpy,Ground,,330,90,60,60,40,40,40,2,False
253 | 232,Donphan,Ground,,500,90,120,120,60,60,50,2,False
254 | 233,Porygon2,Normal,,515,85,80,90,105,95,60,2,False
255 | 234,Stantler,Normal,,465,73,95,62,85,65,85,2,False
256 | 235,Smeargle,Normal,,250,55,20,35,20,45,75,2,False
257 | 236,Tyrogue,Fighting,,210,35,35,35,35,35,35,2,False
258 | 237,Hitmontop,Fighting,,455,50,95,95,35,110,70,2,False
259 | 238,Smoochum,Ice,Psychic,305,45,30,15,85,65,65,2,False
260 | 239,Elekid,Electric,,360,45,63,37,65,55,95,2,False
261 | 240,Magby,Fire,,365,45,75,37,70,55,83,2,False
262 | 241,Miltank,Normal,,490,95,80,105,40,70,100,2,False
263 | 242,Blissey,Normal,,540,255,10,10,75,135,55,2,False
264 | 243,Raikou,Electric,,580,90,85,75,115,100,115,2,True
265 | 244,Entei,Fire,,580,115,115,85,90,75,100,2,True
266 | 245,Suicune,Water,,580,100,75,115,90,115,85,2,True
267 | 246,Larvitar,Rock,Ground,300,50,64,50,45,50,41,2,False
268 | 247,Pupitar,Rock,Ground,410,70,84,70,65,70,51,2,False
269 | 248,Tyranitar,Rock,Dark,600,100,134,110,95,100,61,2,False
270 | 248,TyranitarMega Tyranitar,Rock,Dark,700,100,164,150,95,120,71,2,False
271 | 249,Lugia,Psychic,Flying,680,106,90,130,90,154,110,2,True
272 | 250,Ho-oh,Fire,Flying,680,106,130,90,110,154,90,2,True
273 | 251,Celebi,Psychic,Grass,600,100,100,100,100,100,100,2,False
274 | 252,Treecko,Grass,,310,40,45,35,65,55,70,3,False
275 | 253,Grovyle,Grass,,405,50,65,45,85,65,95,3,False
276 | 254,Sceptile,Grass,,530,70,85,65,105,85,120,3,False
277 | 254,SceptileMega Sceptile,Grass,Dragon,630,70,110,75,145,85,145,3,False
278 | 255,Torchic,Fire,,310,45,60,40,70,50,45,3,False
279 | 256,Combusken,Fire,Fighting,405,60,85,60,85,60,55,3,False
280 | 257,Blaziken,Fire,Fighting,530,80,120,70,110,70,80,3,False
281 | 257,BlazikenMega Blaziken,Fire,Fighting,630,80,160,80,130,80,100,3,False
282 | 258,Mudkip,Water,,310,50,70,50,50,50,40,3,False
283 | 259,Marshtomp,Water,Ground,405,70,85,70,60,70,50,3,False
284 | 260,Swampert,Water,Ground,535,100,110,90,85,90,60,3,False
285 | 260,SwampertMega Swampert,Water,Ground,635,100,150,110,95,110,70,3,False
286 | 261,Poochyena,Dark,,220,35,55,35,30,30,35,3,False
287 | 262,Mightyena,Dark,,420,70,90,70,60,60,70,3,False
288 | 263,Zigzagoon,Normal,,240,38,30,41,30,41,60,3,False
289 | 264,Linoone,Normal,,420,78,70,61,50,61,100,3,False
290 | 265,Wurmple,Bug,,195,45,45,35,20,30,20,3,False
291 | 266,Silcoon,Bug,,205,50,35,55,25,25,15,3,False
292 | 267,Beautifly,Bug,Flying,395,60,70,50,100,50,65,3,False
293 | 268,Cascoon,Bug,,205,50,35,55,25,25,15,3,False
294 | 269,Dustox,Bug,Poison,385,60,50,70,50,90,65,3,False
295 | 270,Lotad,Water,Grass,220,40,30,30,40,50,30,3,False
296 | 271,Lombre,Water,Grass,340,60,50,50,60,70,50,3,False
297 | 272,Ludicolo,Water,Grass,480,80,70,70,90,100,70,3,False
298 | 273,Seedot,Grass,,220,40,40,50,30,30,30,3,False
299 | 274,Nuzleaf,Grass,Dark,340,70,70,40,60,40,60,3,False
300 | 275,Shiftry,Grass,Dark,480,90,100,60,90,60,80,3,False
301 | 276,Taillow,Normal,Flying,270,40,55,30,30,30,85,3,False
302 | 277,Swellow,Normal,Flying,430,60,85,60,50,50,125,3,False
303 | 278,Wingull,Water,Flying,270,40,30,30,55,30,85,3,False
304 | 279,Pelipper,Water,Flying,430,60,50,100,85,70,65,3,False
305 | 280,Ralts,Psychic,Fairy,198,28,25,25,45,35,40,3,False
306 | 281,Kirlia,Psychic,Fairy,278,38,35,35,65,55,50,3,False
307 | 282,Gardevoir,Psychic,Fairy,518,68,65,65,125,115,80,3,False
308 | 282,GardevoirMega Gardevoir,Psychic,Fairy,618,68,85,65,165,135,100,3,False
309 | 283,Surskit,Bug,Water,269,40,30,32,50,52,65,3,False
310 | 284,Masquerain,Bug,Flying,414,70,60,62,80,82,60,3,False
311 | 285,Shroomish,Grass,,295,60,40,60,40,60,35,3,False
312 | 286,Breloom,Grass,Fighting,460,60,130,80,60,60,70,3,False
313 | 287,Slakoth,Normal,,280,60,60,60,35,35,30,3,False
314 | 288,Vigoroth,Normal,,440,80,80,80,55,55,90,3,False
315 | 289,Slaking,Normal,,670,150,160,100,95,65,100,3,False
316 | 290,Nincada,Bug,Ground,266,31,45,90,30,30,40,3,False
317 | 291,Ninjask,Bug,Flying,456,61,90,45,50,50,160,3,False
318 | 292,Shedinja,Bug,Ghost,236,1,90,45,30,30,40,3,False
319 | 293,Whismur,Normal,,240,64,51,23,51,23,28,3,False
320 | 294,Loudred,Normal,,360,84,71,43,71,43,48,3,False
321 | 295,Exploud,Normal,,490,104,91,63,91,73,68,3,False
322 | 296,Makuhita,Fighting,,237,72,60,30,20,30,25,3,False
323 | 297,Hariyama,Fighting,,474,144,120,60,40,60,50,3,False
324 | 298,Azurill,Normal,Fairy,190,50,20,40,20,40,20,3,False
325 | 299,Nosepass,Rock,,375,30,45,135,45,90,30,3,False
326 | 300,Skitty,Normal,,260,50,45,45,35,35,50,3,False
327 | 301,Delcatty,Normal,,380,70,65,65,55,55,70,3,False
328 | 302,Sableye,Dark,Ghost,380,50,75,75,65,65,50,3,False
329 | 302,SableyeMega Sableye,Dark,Ghost,480,50,85,125,85,115,20,3,False
330 | 303,Mawile,Steel,Fairy,380,50,85,85,55,55,50,3,False
331 | 303,MawileMega Mawile,Steel,Fairy,480,50,105,125,55,95,50,3,False
332 | 304,Aron,Steel,Rock,330,50,70,100,40,40,30,3,False
333 | 305,Lairon,Steel,Rock,430,60,90,140,50,50,40,3,False
334 | 306,Aggron,Steel,Rock,530,70,110,180,60,60,50,3,False
335 | 306,AggronMega Aggron,Steel,,630,70,140,230,60,80,50,3,False
336 | 307,Meditite,Fighting,Psychic,280,30,40,55,40,55,60,3,False
337 | 308,Medicham,Fighting,Psychic,410,60,60,75,60,75,80,3,False
338 | 308,MedichamMega Medicham,Fighting,Psychic,510,60,100,85,80,85,100,3,False
339 | 309,Electrike,Electric,,295,40,45,40,65,40,65,3,False
340 | 310,Manectric,Electric,,475,70,75,60,105,60,105,3,False
341 | 310,ManectricMega Manectric,Electric,,575,70,75,80,135,80,135,3,False
342 | 311,Plusle,Electric,,405,60,50,40,85,75,95,3,False
343 | 312,Minun,Electric,,405,60,40,50,75,85,95,3,False
344 | 313,Volbeat,Bug,,400,65,73,55,47,75,85,3,False
345 | 314,Illumise,Bug,,400,65,47,55,73,75,85,3,False
346 | 315,Roselia,Grass,Poison,400,50,60,45,100,80,65,3,False
347 | 316,Gulpin,Poison,,302,70,43,53,43,53,40,3,False
348 | 317,Swalot,Poison,,467,100,73,83,73,83,55,3,False
349 | 318,Carvanha,Water,Dark,305,45,90,20,65,20,65,3,False
350 | 319,Sharpedo,Water,Dark,460,70,120,40,95,40,95,3,False
351 | 319,SharpedoMega Sharpedo,Water,Dark,560,70,140,70,110,65,105,3,False
352 | 320,Wailmer,Water,,400,130,70,35,70,35,60,3,False
353 | 321,Wailord,Water,,500,170,90,45,90,45,60,3,False
354 | 322,Numel,Fire,Ground,305,60,60,40,65,45,35,3,False
355 | 323,Camerupt,Fire,Ground,460,70,100,70,105,75,40,3,False
356 | 323,CameruptMega Camerupt,Fire,Ground,560,70,120,100,145,105,20,3,False
357 | 324,Torkoal,Fire,,470,70,85,140,85,70,20,3,False
358 | 325,Spoink,Psychic,,330,60,25,35,70,80,60,3,False
359 | 326,Grumpig,Psychic,,470,80,45,65,90,110,80,3,False
360 | 327,Spinda,Normal,,360,60,60,60,60,60,60,3,False
361 | 328,Trapinch,Ground,,290,45,100,45,45,45,10,3,False
362 | 329,Vibrava,Ground,Dragon,340,50,70,50,50,50,70,3,False
363 | 330,Flygon,Ground,Dragon,520,80,100,80,80,80,100,3,False
364 | 331,Cacnea,Grass,,335,50,85,40,85,40,35,3,False
365 | 332,Cacturne,Grass,Dark,475,70,115,60,115,60,55,3,False
366 | 333,Swablu,Normal,Flying,310,45,40,60,40,75,50,3,False
367 | 334,Altaria,Dragon,Flying,490,75,70,90,70,105,80,3,False
368 | 334,AltariaMega Altaria,Dragon,Fairy,590,75,110,110,110,105,80,3,False
369 | 335,Zangoose,Normal,,458,73,115,60,60,60,90,3,False
370 | 336,Seviper,Poison,,458,73,100,60,100,60,65,3,False
371 | 337,Lunatone,Rock,Psychic,440,70,55,65,95,85,70,3,False
372 | 338,Solrock,Rock,Psychic,440,70,95,85,55,65,70,3,False
373 | 339,Barboach,Water,Ground,288,50,48,43,46,41,60,3,False
374 | 340,Whiscash,Water,Ground,468,110,78,73,76,71,60,3,False
375 | 341,Corphish,Water,,308,43,80,65,50,35,35,3,False
376 | 342,Crawdaunt,Water,Dark,468,63,120,85,90,55,55,3,False
377 | 343,Baltoy,Ground,Psychic,300,40,40,55,40,70,55,3,False
378 | 344,Claydol,Ground,Psychic,500,60,70,105,70,120,75,3,False
379 | 345,Lileep,Rock,Grass,355,66,41,77,61,87,23,3,False
380 | 346,Cradily,Rock,Grass,495,86,81,97,81,107,43,3,False
381 | 347,Anorith,Rock,Bug,355,45,95,50,40,50,75,3,False
382 | 348,Armaldo,Rock,Bug,495,75,125,100,70,80,45,3,False
383 | 349,Feebas,Water,,200,20,15,20,10,55,80,3,False
384 | 350,Milotic,Water,,540,95,60,79,100,125,81,3,False
385 | 351,Castform,Normal,,420,70,70,70,70,70,70,3,False
386 | 352,Kecleon,Normal,,440,60,90,70,60,120,40,3,False
387 | 353,Shuppet,Ghost,,295,44,75,35,63,33,45,3,False
388 | 354,Banette,Ghost,,455,64,115,65,83,63,65,3,False
389 | 354,BanetteMega Banette,Ghost,,555,64,165,75,93,83,75,3,False
390 | 355,Duskull,Ghost,,295,20,40,90,30,90,25,3,False
391 | 356,Dusclops,Ghost,,455,40,70,130,60,130,25,3,False
392 | 357,Tropius,Grass,Flying,460,99,68,83,72,87,51,3,False
393 | 358,Chimecho,Psychic,,425,65,50,70,95,80,65,3,False
394 | 359,Absol,Dark,,465,65,130,60,75,60,75,3,False
395 | 359,AbsolMega Absol,Dark,,565,65,150,60,115,60,115,3,False
396 | 360,Wynaut,Psychic,,260,95,23,48,23,48,23,3,False
397 | 361,Snorunt,Ice,,300,50,50,50,50,50,50,3,False
398 | 362,Glalie,Ice,,480,80,80,80,80,80,80,3,False
399 | 362,GlalieMega Glalie,Ice,,580,80,120,80,120,80,100,3,False
400 | 363,Spheal,Ice,Water,290,70,40,50,55,50,25,3,False
401 | 364,Sealeo,Ice,Water,410,90,60,70,75,70,45,3,False
402 | 365,Walrein,Ice,Water,530,110,80,90,95,90,65,3,False
403 | 366,Clamperl,Water,,345,35,64,85,74,55,32,3,False
404 | 367,Huntail,Water,,485,55,104,105,94,75,52,3,False
405 | 368,Gorebyss,Water,,485,55,84,105,114,75,52,3,False
406 | 369,Relicanth,Water,Rock,485,100,90,130,45,65,55,3,False
407 | 370,Luvdisc,Water,,330,43,30,55,40,65,97,3,False
408 | 371,Bagon,Dragon,,300,45,75,60,40,30,50,3,False
409 | 372,Shelgon,Dragon,,420,65,95,100,60,50,50,3,False
410 | 373,Salamence,Dragon,Flying,600,95,135,80,110,80,100,3,False
411 | 373,SalamenceMega Salamence,Dragon,Flying,700,95,145,130,120,90,120,3,False
412 | 374,Beldum,Steel,Psychic,300,40,55,80,35,60,30,3,False
413 | 375,Metang,Steel,Psychic,420,60,75,100,55,80,50,3,False
414 | 376,Metagross,Steel,Psychic,600,80,135,130,95,90,70,3,False
415 | 376,MetagrossMega Metagross,Steel,Psychic,700,80,145,150,105,110,110,3,False
416 | 377,Regirock,Rock,,580,80,100,200,50,100,50,3,True
417 | 378,Regice,Ice,,580,80,50,100,100,200,50,3,True
418 | 379,Registeel,Steel,,580,80,75,150,75,150,50,3,True
419 | 380,Latias,Dragon,Psychic,600,80,80,90,110,130,110,3,True
420 | 380,LatiasMega Latias,Dragon,Psychic,700,80,100,120,140,150,110,3,True
421 | 381,Latios,Dragon,Psychic,600,80,90,80,130,110,110,3,True
422 | 381,LatiosMega Latios,Dragon,Psychic,700,80,130,100,160,120,110,3,True
423 | 382,Kyogre,Water,,670,100,100,90,150,140,90,3,True
424 | 382,KyogrePrimal Kyogre,Water,,770,100,150,90,180,160,90,3,True
425 | 383,Groudon,Ground,,670,100,150,140,100,90,90,3,True
426 | 383,GroudonPrimal Groudon,Ground,Fire,770,100,180,160,150,90,90,3,True
427 | 384,Rayquaza,Dragon,Flying,680,105,150,90,150,90,95,3,True
428 | 384,RayquazaMega Rayquaza,Dragon,Flying,780,105,180,100,180,100,115,3,True
429 | 385,Jirachi,Steel,Psychic,600,100,100,100,100,100,100,3,True
430 | 386,DeoxysNormal Forme,Psychic,,600,50,150,50,150,50,150,3,True
431 | 386,DeoxysAttack Forme,Psychic,,600,50,180,20,180,20,150,3,True
432 | 386,DeoxysDefense Forme,Psychic,,600,50,70,160,70,160,90,3,True
433 | 386,DeoxysSpeed Forme,Psychic,,600,50,95,90,95,90,180,3,True
434 | 387,Turtwig,Grass,,318,55,68,64,45,55,31,4,False
435 | 388,Grotle,Grass,,405,75,89,85,55,65,36,4,False
436 | 389,Torterra,Grass,Ground,525,95,109,105,75,85,56,4,False
437 | 390,Chimchar,Fire,,309,44,58,44,58,44,61,4,False
438 | 391,Monferno,Fire,Fighting,405,64,78,52,78,52,81,4,False
439 | 392,Infernape,Fire,Fighting,534,76,104,71,104,71,108,4,False
440 | 393,Piplup,Water,,314,53,51,53,61,56,40,4,False
441 | 394,Prinplup,Water,,405,64,66,68,81,76,50,4,False
442 | 395,Empoleon,Water,Steel,530,84,86,88,111,101,60,4,False
443 | 396,Starly,Normal,Flying,245,40,55,30,30,30,60,4,False
444 | 397,Staravia,Normal,Flying,340,55,75,50,40,40,80,4,False
445 | 398,Staraptor,Normal,Flying,485,85,120,70,50,60,100,4,False
446 | 399,Bidoof,Normal,,250,59,45,40,35,40,31,4,False
447 | 400,Bibarel,Normal,Water,410,79,85,60,55,60,71,4,False
448 | 401,Kricketot,Bug,,194,37,25,41,25,41,25,4,False
449 | 402,Kricketune,Bug,,384,77,85,51,55,51,65,4,False
450 | 403,Shinx,Electric,,263,45,65,34,40,34,45,4,False
451 | 404,Luxio,Electric,,363,60,85,49,60,49,60,4,False
452 | 405,Luxray,Electric,,523,80,120,79,95,79,70,4,False
453 | 406,Budew,Grass,Poison,280,40,30,35,50,70,55,4,False
454 | 407,Roserade,Grass,Poison,515,60,70,65,125,105,90,4,False
455 | 408,Cranidos,Rock,,350,67,125,40,30,30,58,4,False
456 | 409,Rampardos,Rock,,495,97,165,60,65,50,58,4,False
457 | 410,Shieldon,Rock,Steel,350,30,42,118,42,88,30,4,False
458 | 411,Bastiodon,Rock,Steel,495,60,52,168,47,138,30,4,False
459 | 412,Burmy,Bug,,224,40,29,45,29,45,36,4,False
460 | 413,WormadamPlant Cloak,Bug,Grass,424,60,59,85,79,105,36,4,False
461 | 413,WormadamSandy Cloak,Bug,Ground,424,60,79,105,59,85,36,4,False
462 | 413,WormadamTrash Cloak,Bug,Steel,424,60,69,95,69,95,36,4,False
463 | 414,Mothim,Bug,Flying,424,70,94,50,94,50,66,4,False
464 | 415,Combee,Bug,Flying,244,30,30,42,30,42,70,4,False
465 | 416,Vespiquen,Bug,Flying,474,70,80,102,80,102,40,4,False
466 | 417,Pachirisu,Electric,,405,60,45,70,45,90,95,4,False
467 | 418,Buizel,Water,,330,55,65,35,60,30,85,4,False
468 | 419,Floatzel,Water,,495,85,105,55,85,50,115,4,False
469 | 420,Cherubi,Grass,,275,45,35,45,62,53,35,4,False
470 | 421,Cherrim,Grass,,450,70,60,70,87,78,85,4,False
471 | 422,Shellos,Water,,325,76,48,48,57,62,34,4,False
472 | 423,Gastrodon,Water,Ground,475,111,83,68,92,82,39,4,False
473 | 424,Ambipom,Normal,,482,75,100,66,60,66,115,4,False
474 | 425,Drifloon,Ghost,Flying,348,90,50,34,60,44,70,4,False
475 | 426,Drifblim,Ghost,Flying,498,150,80,44,90,54,80,4,False
476 | 427,Buneary,Normal,,350,55,66,44,44,56,85,4,False
477 | 428,Lopunny,Normal,,480,65,76,84,54,96,105,4,False
478 | 428,LopunnyMega Lopunny,Normal,Fighting,580,65,136,94,54,96,135,4,False
479 | 429,Mismagius,Ghost,,495,60,60,60,105,105,105,4,False
480 | 430,Honchkrow,Dark,Flying,505,100,125,52,105,52,71,4,False
481 | 431,Glameow,Normal,,310,49,55,42,42,37,85,4,False
482 | 432,Purugly,Normal,,452,71,82,64,64,59,112,4,False
483 | 433,Chingling,Psychic,,285,45,30,50,65,50,45,4,False
484 | 434,Stunky,Poison,Dark,329,63,63,47,41,41,74,4,False
485 | 435,Skuntank,Poison,Dark,479,103,93,67,71,61,84,4,False
486 | 436,Bronzor,Steel,Psychic,300,57,24,86,24,86,23,4,False
487 | 437,Bronzong,Steel,Psychic,500,67,89,116,79,116,33,4,False
488 | 438,Bonsly,Rock,,290,50,80,95,10,45,10,4,False
489 | 439,Mime Jr.,Psychic,Fairy,310,20,25,45,70,90,60,4,False
490 | 440,Happiny,Normal,,220,100,5,5,15,65,30,4,False
491 | 441,Chatot,Normal,Flying,411,76,65,45,92,42,91,4,False
492 | 442,Spiritomb,Ghost,Dark,485,50,92,108,92,108,35,4,False
493 | 443,Gible,Dragon,Ground,300,58,70,45,40,45,42,4,False
494 | 444,Gabite,Dragon,Ground,410,68,90,65,50,55,82,4,False
495 | 445,Garchomp,Dragon,Ground,600,108,130,95,80,85,102,4,False
496 | 445,GarchompMega Garchomp,Dragon,Ground,700,108,170,115,120,95,92,4,False
497 | 446,Munchlax,Normal,,390,135,85,40,40,85,5,4,False
498 | 447,Riolu,Fighting,,285,40,70,40,35,40,60,4,False
499 | 448,Lucario,Fighting,Steel,525,70,110,70,115,70,90,4,False
500 | 448,LucarioMega Lucario,Fighting,Steel,625,70,145,88,140,70,112,4,False
501 | 449,Hippopotas,Ground,,330,68,72,78,38,42,32,4,False
502 | 450,Hippowdon,Ground,,525,108,112,118,68,72,47,4,False
503 | 451,Skorupi,Poison,Bug,330,40,50,90,30,55,65,4,False
504 | 452,Drapion,Poison,Dark,500,70,90,110,60,75,95,4,False
505 | 453,Croagunk,Poison,Fighting,300,48,61,40,61,40,50,4,False
506 | 454,Toxicroak,Poison,Fighting,490,83,106,65,86,65,85,4,False
507 | 455,Carnivine,Grass,,454,74,100,72,90,72,46,4,False
508 | 456,Finneon,Water,,330,49,49,56,49,61,66,4,False
509 | 457,Lumineon,Water,,460,69,69,76,69,86,91,4,False
510 | 458,Mantyke,Water,Flying,345,45,20,50,60,120,50,4,False
511 | 459,Snover,Grass,Ice,334,60,62,50,62,60,40,4,False
512 | 460,Abomasnow,Grass,Ice,494,90,92,75,92,85,60,4,False
513 | 460,AbomasnowMega Abomasnow,Grass,Ice,594,90,132,105,132,105,30,4,False
514 | 461,Weavile,Dark,Ice,510,70,120,65,45,85,125,4,False
515 | 462,Magnezone,Electric,Steel,535,70,70,115,130,90,60,4,False
516 | 463,Lickilicky,Normal,,515,110,85,95,80,95,50,4,False
517 | 464,Rhyperior,Ground,Rock,535,115,140,130,55,55,40,4,False
518 | 465,Tangrowth,Grass,,535,100,100,125,110,50,50,4,False
519 | 466,Electivire,Electric,,540,75,123,67,95,85,95,4,False
520 | 467,Magmortar,Fire,,540,75,95,67,125,95,83,4,False
521 | 468,Togekiss,Fairy,Flying,545,85,50,95,120,115,80,4,False
522 | 469,Yanmega,Bug,Flying,515,86,76,86,116,56,95,4,False
523 | 470,Leafeon,Grass,,525,65,110,130,60,65,95,4,False
524 | 471,Glaceon,Ice,,525,65,60,110,130,95,65,4,False
525 | 472,Gliscor,Ground,Flying,510,75,95,125,45,75,95,4,False
526 | 473,Mamoswine,Ice,Ground,530,110,130,80,70,60,80,4,False
527 | 474,Porygon-Z,Normal,,535,85,80,70,135,75,90,4,False
528 | 475,Gallade,Psychic,Fighting,518,68,125,65,65,115,80,4,False
529 | 475,GalladeMega Gallade,Psychic,Fighting,618,68,165,95,65,115,110,4,False
530 | 476,Probopass,Rock,Steel,525,60,55,145,75,150,40,4,False
531 | 477,Dusknoir,Ghost,,525,45,100,135,65,135,45,4,False
532 | 478,Froslass,Ice,Ghost,480,70,80,70,80,70,110,4,False
533 | 479,Rotom,Electric,Ghost,440,50,50,77,95,77,91,4,False
534 | 479,RotomHeat Rotom,Electric,Fire,520,50,65,107,105,107,86,4,False
535 | 479,RotomWash Rotom,Electric,Water,520,50,65,107,105,107,86,4,False
536 | 479,RotomFrost Rotom,Electric,Ice,520,50,65,107,105,107,86,4,False
537 | 479,RotomFan Rotom,Electric,Flying,520,50,65,107,105,107,86,4,False
538 | 479,RotomMow Rotom,Electric,Grass,520,50,65,107,105,107,86,4,False
539 | 480,Uxie,Psychic,,580,75,75,130,75,130,95,4,True
540 | 481,Mesprit,Psychic,,580,80,105,105,105,105,80,4,True
541 | 482,Azelf,Psychic,,580,75,125,70,125,70,115,4,True
542 | 483,Dialga,Steel,Dragon,680,100,120,120,150,100,90,4,True
543 | 484,Palkia,Water,Dragon,680,90,120,100,150,120,100,4,True
544 | 485,Heatran,Fire,Steel,600,91,90,106,130,106,77,4,True
545 | 486,Regigigas,Normal,,670,110,160,110,80,110,100,4,True
546 | 487,GiratinaAltered Forme,Ghost,Dragon,680,150,100,120,100,120,90,4,True
547 | 487,GiratinaOrigin Forme,Ghost,Dragon,680,150,120,100,120,100,90,4,True
548 | 488,Cresselia,Psychic,,600,120,70,120,75,130,85,4,False
549 | 489,Phione,Water,,480,80,80,80,80,80,80,4,False
550 | 490,Manaphy,Water,,600,100,100,100,100,100,100,4,False
551 | 491,Darkrai,Dark,,600,70,90,90,135,90,125,4,True
552 | 492,ShayminLand Forme,Grass,,600,100,100,100,100,100,100,4,True
553 | 492,ShayminSky Forme,Grass,Flying,600,100,103,75,120,75,127,4,True
554 | 493,Arceus,Normal,,720,120,120,120,120,120,120,4,True
555 | 494,Victini,Psychic,Fire,600,100,100,100,100,100,100,5,True
556 | 495,Snivy,Grass,,308,45,45,55,45,55,63,5,False
557 | 496,Servine,Grass,,413,60,60,75,60,75,83,5,False
558 | 497,Serperior,Grass,,528,75,75,95,75,95,113,5,False
559 | 498,Tepig,Fire,,308,65,63,45,45,45,45,5,False
560 | 499,Pignite,Fire,Fighting,418,90,93,55,70,55,55,5,False
561 | 500,Emboar,Fire,Fighting,528,110,123,65,100,65,65,5,False
562 | 501,Oshawott,Water,,308,55,55,45,63,45,45,5,False
563 | 502,Dewott,Water,,413,75,75,60,83,60,60,5,False
564 | 503,Samurott,Water,,528,95,100,85,108,70,70,5,False
565 | 504,Patrat,Normal,,255,45,55,39,35,39,42,5,False
566 | 505,Watchog,Normal,,420,60,85,69,60,69,77,5,False
567 | 506,Lillipup,Normal,,275,45,60,45,25,45,55,5,False
568 | 507,Herdier,Normal,,370,65,80,65,35,65,60,5,False
569 | 508,Stoutland,Normal,,500,85,110,90,45,90,80,5,False
570 | 509,Purrloin,Dark,,281,41,50,37,50,37,66,5,False
571 | 510,Liepard,Dark,,446,64,88,50,88,50,106,5,False
572 | 511,Pansage,Grass,,316,50,53,48,53,48,64,5,False
573 | 512,Simisage,Grass,,498,75,98,63,98,63,101,5,False
574 | 513,Pansear,Fire,,316,50,53,48,53,48,64,5,False
575 | 514,Simisear,Fire,,498,75,98,63,98,63,101,5,False
576 | 515,Panpour,Water,,316,50,53,48,53,48,64,5,False
577 | 516,Simipour,Water,,498,75,98,63,98,63,101,5,False
578 | 517,Munna,Psychic,,292,76,25,45,67,55,24,5,False
579 | 518,Musharna,Psychic,,487,116,55,85,107,95,29,5,False
580 | 519,Pidove,Normal,Flying,264,50,55,50,36,30,43,5,False
581 | 520,Tranquill,Normal,Flying,358,62,77,62,50,42,65,5,False
582 | 521,Unfezant,Normal,Flying,488,80,115,80,65,55,93,5,False
583 | 522,Blitzle,Electric,,295,45,60,32,50,32,76,5,False
584 | 523,Zebstrika,Electric,,497,75,100,63,80,63,116,5,False
585 | 524,Roggenrola,Rock,,280,55,75,85,25,25,15,5,False
586 | 525,Boldore,Rock,,390,70,105,105,50,40,20,5,False
587 | 526,Gigalith,Rock,,515,85,135,130,60,80,25,5,False
588 | 527,Woobat,Psychic,Flying,313,55,45,43,55,43,72,5,False
589 | 528,Swoobat,Psychic,Flying,425,67,57,55,77,55,114,5,False
590 | 529,Drilbur,Ground,,328,60,85,40,30,45,68,5,False
591 | 530,Excadrill,Ground,Steel,508,110,135,60,50,65,88,5,False
592 | 531,Audino,Normal,,445,103,60,86,60,86,50,5,False
593 | 531,AudinoMega Audino,Normal,Fairy,545,103,60,126,80,126,50,5,False
594 | 532,Timburr,Fighting,,305,75,80,55,25,35,35,5,False
595 | 533,Gurdurr,Fighting,,405,85,105,85,40,50,40,5,False
596 | 534,Conkeldurr,Fighting,,505,105,140,95,55,65,45,5,False
597 | 535,Tympole,Water,,294,50,50,40,50,40,64,5,False
598 | 536,Palpitoad,Water,Ground,384,75,65,55,65,55,69,5,False
599 | 537,Seismitoad,Water,Ground,509,105,95,75,85,75,74,5,False
600 | 538,Throh,Fighting,,465,120,100,85,30,85,45,5,False
601 | 539,Sawk,Fighting,,465,75,125,75,30,75,85,5,False
602 | 540,Sewaddle,Bug,Grass,310,45,53,70,40,60,42,5,False
603 | 541,Swadloon,Bug,Grass,380,55,63,90,50,80,42,5,False
604 | 542,Leavanny,Bug,Grass,500,75,103,80,70,80,92,5,False
605 | 543,Venipede,Bug,Poison,260,30,45,59,30,39,57,5,False
606 | 544,Whirlipede,Bug,Poison,360,40,55,99,40,79,47,5,False
607 | 545,Scolipede,Bug,Poison,485,60,100,89,55,69,112,5,False
608 | 546,Cottonee,Grass,Fairy,280,40,27,60,37,50,66,5,False
609 | 547,Whimsicott,Grass,Fairy,480,60,67,85,77,75,116,5,False
610 | 548,Petilil,Grass,,280,45,35,50,70,50,30,5,False
611 | 549,Lilligant,Grass,,480,70,60,75,110,75,90,5,False
612 | 550,Basculin,Water,,460,70,92,65,80,55,98,5,False
613 | 551,Sandile,Ground,Dark,292,50,72,35,35,35,65,5,False
614 | 552,Krokorok,Ground,Dark,351,60,82,45,45,45,74,5,False
615 | 553,Krookodile,Ground,Dark,519,95,117,80,65,70,92,5,False
616 | 554,Darumaka,Fire,,315,70,90,45,15,45,50,5,False
617 | 555,DarmanitanStandard Mode,Fire,,480,105,140,55,30,55,95,5,False
618 | 555,DarmanitanZen Mode,Fire,Psychic,540,105,30,105,140,105,55,5,False
619 | 556,Maractus,Grass,,461,75,86,67,106,67,60,5,False
620 | 557,Dwebble,Bug,Rock,325,50,65,85,35,35,55,5,False
621 | 558,Crustle,Bug,Rock,475,70,95,125,65,75,45,5,False
622 | 559,Scraggy,Dark,Fighting,348,50,75,70,35,70,48,5,False
623 | 560,Scrafty,Dark,Fighting,488,65,90,115,45,115,58,5,False
624 | 561,Sigilyph,Psychic,Flying,490,72,58,80,103,80,97,5,False
625 | 562,Yamask,Ghost,,303,38,30,85,55,65,30,5,False
626 | 563,Cofagrigus,Ghost,,483,58,50,145,95,105,30,5,False
627 | 564,Tirtouga,Water,Rock,355,54,78,103,53,45,22,5,False
628 | 565,Carracosta,Water,Rock,495,74,108,133,83,65,32,5,False
629 | 566,Archen,Rock,Flying,401,55,112,45,74,45,70,5,False
630 | 567,Archeops,Rock,Flying,567,75,140,65,112,65,110,5,False
631 | 568,Trubbish,Poison,,329,50,50,62,40,62,65,5,False
632 | 569,Garbodor,Poison,,474,80,95,82,60,82,75,5,False
633 | 570,Zorua,Dark,,330,40,65,40,80,40,65,5,False
634 | 571,Zoroark,Dark,,510,60,105,60,120,60,105,5,False
635 | 572,Minccino,Normal,,300,55,50,40,40,40,75,5,False
636 | 573,Cinccino,Normal,,470,75,95,60,65,60,115,5,False
637 | 574,Gothita,Psychic,,290,45,30,50,55,65,45,5,False
638 | 575,Gothorita,Psychic,,390,60,45,70,75,85,55,5,False
639 | 576,Gothitelle,Psychic,,490,70,55,95,95,110,65,5,False
640 | 577,Solosis,Psychic,,290,45,30,40,105,50,20,5,False
641 | 578,Duosion,Psychic,,370,65,40,50,125,60,30,5,False
642 | 579,Reuniclus,Psychic,,490,110,65,75,125,85,30,5,False
643 | 580,Ducklett,Water,Flying,305,62,44,50,44,50,55,5,False
644 | 581,Swanna,Water,Flying,473,75,87,63,87,63,98,5,False
645 | 582,Vanillite,Ice,,305,36,50,50,65,60,44,5,False
646 | 583,Vanillish,Ice,,395,51,65,65,80,75,59,5,False
647 | 584,Vanilluxe,Ice,,535,71,95,85,110,95,79,5,False
648 | 585,Deerling,Normal,Grass,335,60,60,50,40,50,75,5,False
649 | 586,Sawsbuck,Normal,Grass,475,80,100,70,60,70,95,5,False
650 | 587,Emolga,Electric,Flying,428,55,75,60,75,60,103,5,False
651 | 588,Karrablast,Bug,,315,50,75,45,40,45,60,5,False
652 | 589,Escavalier,Bug,Steel,495,70,135,105,60,105,20,5,False
653 | 590,Foongus,Grass,Poison,294,69,55,45,55,55,15,5,False
654 | 591,Amoonguss,Grass,Poison,464,114,85,70,85,80,30,5,False
655 | 592,Frillish,Water,Ghost,335,55,40,50,65,85,40,5,False
656 | 593,Jellicent,Water,Ghost,480,100,60,70,85,105,60,5,False
657 | 594,Alomomola,Water,,470,165,75,80,40,45,65,5,False
658 | 595,Joltik,Bug,Electric,319,50,47,50,57,50,65,5,False
659 | 596,Galvantula,Bug,Electric,472,70,77,60,97,60,108,5,False
660 | 597,Ferroseed,Grass,Steel,305,44,50,91,24,86,10,5,False
661 | 598,Ferrothorn,Grass,Steel,489,74,94,131,54,116,20,5,False
662 | 599,Klink,Steel,,300,40,55,70,45,60,30,5,False
663 | 600,Klang,Steel,,440,60,80,95,70,85,50,5,False
664 | 601,Klinklang,Steel,,520,60,100,115,70,85,90,5,False
665 | 602,Tynamo,Electric,,275,35,55,40,45,40,60,5,False
666 | 603,Eelektrik,Electric,,405,65,85,70,75,70,40,5,False
667 | 604,Eelektross,Electric,,515,85,115,80,105,80,50,5,False
668 | 605,Elgyem,Psychic,,335,55,55,55,85,55,30,5,False
669 | 606,Beheeyem,Psychic,,485,75,75,75,125,95,40,5,False
670 | 607,Litwick,Ghost,Fire,275,50,30,55,65,55,20,5,False
671 | 608,Lampent,Ghost,Fire,370,60,40,60,95,60,55,5,False
672 | 609,Chandelure,Ghost,Fire,520,60,55,90,145,90,80,5,False
673 | 610,Axew,Dragon,,320,46,87,60,30,40,57,5,False
674 | 611,Fraxure,Dragon,,410,66,117,70,40,50,67,5,False
675 | 612,Haxorus,Dragon,,540,76,147,90,60,70,97,5,False
676 | 613,Cubchoo,Ice,,305,55,70,40,60,40,40,5,False
677 | 614,Beartic,Ice,,485,95,110,80,70,80,50,5,False
678 | 615,Cryogonal,Ice,,485,70,50,30,95,135,105,5,False
679 | 616,Shelmet,Bug,,305,50,40,85,40,65,25,5,False
680 | 617,Accelgor,Bug,,495,80,70,40,100,60,145,5,False
681 | 618,Stunfisk,Ground,Electric,471,109,66,84,81,99,32,5,False
682 | 619,Mienfoo,Fighting,,350,45,85,50,55,50,65,5,False
683 | 620,Mienshao,Fighting,,510,65,125,60,95,60,105,5,False
684 | 621,Druddigon,Dragon,,485,77,120,90,60,90,48,5,False
685 | 622,Golett,Ground,Ghost,303,59,74,50,35,50,35,5,False
686 | 623,Golurk,Ground,Ghost,483,89,124,80,55,80,55,5,False
687 | 624,Pawniard,Dark,Steel,340,45,85,70,40,40,60,5,False
688 | 625,Bisharp,Dark,Steel,490,65,125,100,60,70,70,5,False
689 | 626,Bouffalant,Normal,,490,95,110,95,40,95,55,5,False
690 | 627,Rufflet,Normal,Flying,350,70,83,50,37,50,60,5,False
691 | 628,Braviary,Normal,Flying,510,100,123,75,57,75,80,5,False
692 | 629,Vullaby,Dark,Flying,370,70,55,75,45,65,60,5,False
693 | 630,Mandibuzz,Dark,Flying,510,110,65,105,55,95,80,5,False
694 | 631,Heatmor,Fire,,484,85,97,66,105,66,65,5,False
695 | 632,Durant,Bug,Steel,484,58,109,112,48,48,109,5,False
696 | 633,Deino,Dark,Dragon,300,52,65,50,45,50,38,5,False
697 | 634,Zweilous,Dark,Dragon,420,72,85,70,65,70,58,5,False
698 | 635,Hydreigon,Dark,Dragon,600,92,105,90,125,90,98,5,False
699 | 636,Larvesta,Bug,Fire,360,55,85,55,50,55,60,5,False
700 | 637,Volcarona,Bug,Fire,550,85,60,65,135,105,100,5,False
701 | 638,Cobalion,Steel,Fighting,580,91,90,129,90,72,108,5,True
702 | 639,Terrakion,Rock,Fighting,580,91,129,90,72,90,108,5,True
703 | 640,Virizion,Grass,Fighting,580,91,90,72,90,129,108,5,True
704 | 641,TornadusIncarnate Forme,Flying,,580,79,115,70,125,80,111,5,True
705 | 641,TornadusTherian Forme,Flying,,580,79,100,80,110,90,121,5,True
706 | 642,ThundurusIncarnate Forme,Electric,Flying,580,79,115,70,125,80,111,5,True
707 | 642,ThundurusTherian Forme,Electric,Flying,580,79,105,70,145,80,101,5,True
708 | 643,Reshiram,Dragon,Fire,680,100,120,100,150,120,90,5,True
709 | 644,Zekrom,Dragon,Electric,680,100,150,120,120,100,90,5,True
710 | 645,LandorusIncarnate Forme,Ground,Flying,600,89,125,90,115,80,101,5,True
711 | 645,LandorusTherian Forme,Ground,Flying,600,89,145,90,105,80,91,5,True
712 | 646,Kyurem,Dragon,Ice,660,125,130,90,130,90,95,5,True
713 | 646,KyuremBlack Kyurem,Dragon,Ice,700,125,170,100,120,90,95,5,True
714 | 646,KyuremWhite Kyurem,Dragon,Ice,700,125,120,90,170,100,95,5,True
715 | 647,KeldeoOrdinary Forme,Water,Fighting,580,91,72,90,129,90,108,5,False
716 | 647,KeldeoResolute Forme,Water,Fighting,580,91,72,90,129,90,108,5,False
717 | 648,MeloettaAria Forme,Normal,Psychic,600,100,77,77,128,128,90,5,False
718 | 648,MeloettaPirouette Forme,Normal,Fighting,600,100,128,90,77,77,128,5,False
719 | 649,Genesect,Bug,Steel,600,71,120,95,120,95,99,5,False
720 | 650,Chespin,Grass,,313,56,61,65,48,45,38,6,False
721 | 651,Quilladin,Grass,,405,61,78,95,56,58,57,6,False
722 | 652,Chesnaught,Grass,Fighting,530,88,107,122,74,75,64,6,False
723 | 653,Fennekin,Fire,,307,40,45,40,62,60,60,6,False
724 | 654,Braixen,Fire,,409,59,59,58,90,70,73,6,False
725 | 655,Delphox,Fire,Psychic,534,75,69,72,114,100,104,6,False
726 | 656,Froakie,Water,,314,41,56,40,62,44,71,6,False
727 | 657,Frogadier,Water,,405,54,63,52,83,56,97,6,False
728 | 658,Greninja,Water,Dark,530,72,95,67,103,71,122,6,False
729 | 659,Bunnelby,Normal,,237,38,36,38,32,36,57,6,False
730 | 660,Diggersby,Normal,Ground,423,85,56,77,50,77,78,6,False
731 | 661,Fletchling,Normal,Flying,278,45,50,43,40,38,62,6,False
732 | 662,Fletchinder,Fire,Flying,382,62,73,55,56,52,84,6,False
733 | 663,Talonflame,Fire,Flying,499,78,81,71,74,69,126,6,False
734 | 664,Scatterbug,Bug,,200,38,35,40,27,25,35,6,False
735 | 665,Spewpa,Bug,,213,45,22,60,27,30,29,6,False
736 | 666,Vivillon,Bug,Flying,411,80,52,50,90,50,89,6,False
737 | 667,Litleo,Fire,Normal,369,62,50,58,73,54,72,6,False
738 | 668,Pyroar,Fire,Normal,507,86,68,72,109,66,106,6,False
739 | 669,Flabébé,Fairy,,303,44,38,39,61,79,42,6,False
740 | 670,Floette,Fairy,,371,54,45,47,75,98,52,6,False
741 | 671,Florges,Fairy,,552,78,65,68,112,154,75,6,False
742 | 672,Skiddo,Grass,,350,66,65,48,62,57,52,6,False
743 | 673,Gogoat,Grass,,531,123,100,62,97,81,68,6,False
744 | 674,Pancham,Fighting,,348,67,82,62,46,48,43,6,False
745 | 675,Pangoro,Fighting,Dark,495,95,124,78,69,71,58,6,False
746 | 676,Furfrou,Normal,,472,75,80,60,65,90,102,6,False
747 | 677,Espurr,Psychic,,355,62,48,54,63,60,68,6,False
748 | 678,MeowsticMale,Psychic,,466,74,48,76,83,81,104,6,False
749 | 678,MeowsticFemale,Psychic,,466,74,48,76,83,81,104,6,False
750 | 679,Honedge,Steel,Ghost,325,45,80,100,35,37,28,6,False
751 | 680,Doublade,Steel,Ghost,448,59,110,150,45,49,35,6,False
752 | 681,AegislashBlade Forme,Steel,Ghost,520,60,150,50,150,50,60,6,False
753 | 681,AegislashShield Forme,Steel,Ghost,520,60,50,150,50,150,60,6,False
754 | 682,Spritzee,Fairy,,341,78,52,60,63,65,23,6,False
755 | 683,Aromatisse,Fairy,,462,101,72,72,99,89,29,6,False
756 | 684,Swirlix,Fairy,,341,62,48,66,59,57,49,6,False
757 | 685,Slurpuff,Fairy,,480,82,80,86,85,75,72,6,False
758 | 686,Inkay,Dark,Psychic,288,53,54,53,37,46,45,6,False
759 | 687,Malamar,Dark,Psychic,482,86,92,88,68,75,73,6,False
760 | 688,Binacle,Rock,Water,306,42,52,67,39,56,50,6,False
761 | 689,Barbaracle,Rock,Water,500,72,105,115,54,86,68,6,False
762 | 690,Skrelp,Poison,Water,320,50,60,60,60,60,30,6,False
763 | 691,Dragalge,Poison,Dragon,494,65,75,90,97,123,44,6,False
764 | 692,Clauncher,Water,,330,50,53,62,58,63,44,6,False
765 | 693,Clawitzer,Water,,500,71,73,88,120,89,59,6,False
766 | 694,Helioptile,Electric,Normal,289,44,38,33,61,43,70,6,False
767 | 695,Heliolisk,Electric,Normal,481,62,55,52,109,94,109,6,False
768 | 696,Tyrunt,Rock,Dragon,362,58,89,77,45,45,48,6,False
769 | 697,Tyrantrum,Rock,Dragon,521,82,121,119,69,59,71,6,False
770 | 698,Amaura,Rock,Ice,362,77,59,50,67,63,46,6,False
771 | 699,Aurorus,Rock,Ice,521,123,77,72,99,92,58,6,False
772 | 700,Sylveon,Fairy,,525,95,65,65,110,130,60,6,False
773 | 701,Hawlucha,Fighting,Flying,500,78,92,75,74,63,118,6,False
774 | 702,Dedenne,Electric,Fairy,431,67,58,57,81,67,101,6,False
775 | 703,Carbink,Rock,Fairy,500,50,50,150,50,150,50,6,False
776 | 704,Goomy,Dragon,,300,45,50,35,55,75,40,6,False
777 | 705,Sliggoo,Dragon,,452,68,75,53,83,113,60,6,False
778 | 706,Goodra,Dragon,,600,90,100,70,110,150,80,6,False
779 | 707,Klefki,Steel,Fairy,470,57,80,91,80,87,75,6,False
780 | 708,Phantump,Ghost,Grass,309,43,70,48,50,60,38,6,False
781 | 709,Trevenant,Ghost,Grass,474,85,110,76,65,82,56,6,False
782 | 710,PumpkabooAverage Size,Ghost,Grass,335,49,66,70,44,55,51,6,False
783 | 710,PumpkabooSmall Size,Ghost,Grass,335,44,66,70,44,55,56,6,False
784 | 710,PumpkabooLarge Size,Ghost,Grass,335,54,66,70,44,55,46,6,False
785 | 710,PumpkabooSuper Size,Ghost,Grass,335,59,66,70,44,55,41,6,False
786 | 711,GourgeistAverage Size,Ghost,Grass,494,65,90,122,58,75,84,6,False
787 | 711,GourgeistSmall Size,Ghost,Grass,494,55,85,122,58,75,99,6,False
788 | 711,GourgeistLarge Size,Ghost,Grass,494,75,95,122,58,75,69,6,False
789 | 711,GourgeistSuper Size,Ghost,Grass,494,85,100,122,58,75,54,6,False
790 | 712,Bergmite,Ice,,304,55,69,85,32,35,28,6,False
791 | 713,Avalugg,Ice,,514,95,117,184,44,46,28,6,False
792 | 714,Noibat,Flying,Dragon,245,40,30,35,45,40,55,6,False
793 | 715,Noivern,Flying,Dragon,535,85,70,80,97,80,123,6,False
794 | 716,Xerneas,Fairy,,680,126,131,95,131,98,99,6,True
795 | 717,Yveltal,Dark,Flying,680,126,131,95,131,98,99,6,True
796 | 718,Zygarde50% Forme,Dragon,Ground,600,108,100,121,81,95,95,6,True
797 | 719,Diancie,Rock,Fairy,600,50,100,150,100,150,50,6,True
798 | 719,DiancieMega Diancie,Rock,Fairy,700,50,160,110,160,110,110,6,True
799 | 720,HoopaHoopa Confined,Psychic,Ghost,600,80,110,60,150,130,70,6,True
800 | 720,HoopaHoopa Unbound,Psychic,Dark,680,80,160,60,170,130,80,6,True
801 | 721,Volcanion,Fire,Water,600,80,110,120,130,90,70,6,True
802 | 


--------------------------------------------------------------------------------
/Datasets/restaurant_data.csv:
--------------------------------------------------------------------------------
  1 | "total_bill","tip","sex","smoker","day","time","size"
  2 | 16.99,1.01,"Female","No","Sun","Dinner",2
  3 | 10.34,1.66,"Male","No","Sun","Dinner",3
  4 | 21.01,3.5,"Male","No","Sun","Dinner",3
  5 | 23.68,3.31,"Male","No","Sun","Dinner",2
  6 | 24.59,3.61,"Female","No","Sun","Dinner",4
  7 | 25.29,4.71,"Male","No","Sun","Dinner",4
  8 | 8.77,2,"Male","No","Sun","Dinner",2
  9 | 26.88,3.12,"Male","No","Sun","Dinner",4
 10 | 15.04,1.96,"Male","No","Sun","Dinner",2
 11 | 14.78,3.23,"Male","No","Sun","Dinner",2
 12 | 10.27,1.71,"Male","No","Sun","Dinner",2
 13 | 35.26,5,"Female","No","Sun","Dinner",4
 14 | 15.42,1.57,"Male","No","Sun","Dinner",2
 15 | 18.43,3,"Male","No","Sun","Dinner",4
 16 | 14.83,3.02,"Female","No","Sun","Dinner",2
 17 | 21.58,3.92,"Male","No","Sun","Dinner",2
 18 | 10.33,1.67,"Female","No","Sun","Dinner",3
 19 | 16.29,3.71,"Male","No","Sun","Dinner",3
 20 | 16.97,3.5,"Female","No","Sun","Dinner",3
 21 | 20.65,3.35,"Male","No","Sat","Dinner",3
 22 | 17.92,4.08,"Male","No","Sat","Dinner",2
 23 | 20.29,2.75,"Female","No","Sat","Dinner",2
 24 | 15.77,2.23,"Female","No","Sat","Dinner",2
 25 | 39.42,7.58,"Male","No","Sat","Dinner",4
 26 | 19.82,3.18,"Male","No","Sat","Dinner",2
 27 | 17.81,2.34,"Male","No","Sat","Dinner",4
 28 | 13.37,2,"Male","No","Sat","Dinner",2
 29 | 12.69,2,"Male","No","Sat","Dinner",2
 30 | 21.7,4.3,"Male","No","Sat","Dinner",2
 31 | 19.65,3,"Female","No","Sat","Dinner",2
 32 | 9.55,1.45,"Male","No","Sat","Dinner",2
 33 | 18.35,2.5,"Male","No","Sat","Dinner",4
 34 | 15.06,3,"Female","No","Sat","Dinner",2
 35 | 20.69,2.45,"Female","No","Sat","Dinner",4
 36 | 17.78,3.27,"Male","No","Sat","Dinner",2
 37 | 24.06,3.6,"Male","No","Sat","Dinner",3
 38 | 16.31,2,"Male","No","Sat","Dinner",3
 39 | 16.93,3.07,"Female","No","Sat","Dinner",3
 40 | 18.69,2.31,"Male","No","Sat","Dinner",3
 41 | 31.27,5,"Male","No","Sat","Dinner",3
 42 | 16.04,2.24,"Male","No","Sat","Dinner",3
 43 | 17.46,2.54,"Male","No","Sun","Dinner",2
 44 | 13.94,3.06,"Male","No","Sun","Dinner",2
 45 | 9.68,1.32,"Male","No","Sun","Dinner",2
 46 | 30.4,5.6,"Male","No","Sun","Dinner",4
 47 | 18.29,3,"Male","No","Sun","Dinner",2
 48 | 22.23,5,"Male","No","Sun","Dinner",2
 49 | 32.4,6,"Male","No","Sun","Dinner",4
 50 | 28.55,2.05,"Male","No","Sun","Dinner",3
 51 | 18.04,3,"Male","No","Sun","Dinner",2
 52 | 12.54,2.5,"Male","No","Sun","Dinner",2
 53 | 10.29,2.6,"Female","No","Sun","Dinner",2
 54 | 34.81,5.2,"Female","No","Sun","Dinner",4
 55 | 9.94,1.56,"Male","No","Sun","Dinner",2
 56 | 25.56,4.34,"Male","No","Sun","Dinner",4
 57 | 19.49,3.51,"Male","No","Sun","Dinner",2
 58 | 38.01,3,"Male","Yes","Sat","Dinner",4
 59 | 26.41,1.5,"Female","No","Sat","Dinner",2
 60 | 11.24,1.76,"Male","Yes","Sat","Dinner",2
 61 | 48.27,6.73,"Male","No","Sat","Dinner",4
 62 | 20.29,3.21,"Male","Yes","Sat","Dinner",2
 63 | 13.81,2,"Male","Yes","Sat","Dinner",2
 64 | 11.02,1.98,"Male","Yes","Sat","Dinner",2
 65 | 18.29,3.76,"Male","Yes","Sat","Dinner",4
 66 | 17.59,2.64,"Male","No","Sat","Dinner",3
 67 | 20.08,3.15,"Male","No","Sat","Dinner",3
 68 | 16.45,2.47,"Female","No","Sat","Dinner",2
 69 | 3.07,1,"Female","Yes","Sat","Dinner",1
 70 | 20.23,2.01,"Male","No","Sat","Dinner",2
 71 | 15.01,2.09,"Male","Yes","Sat","Dinner",2
 72 | 12.02,1.97,"Male","No","Sat","Dinner",2
 73 | 17.07,3,"Female","No","Sat","Dinner",3
 74 | 26.86,3.14,"Female","Yes","Sat","Dinner",2
 75 | 25.28,5,"Female","Yes","Sat","Dinner",2
 76 | 14.73,2.2,"Female","No","Sat","Dinner",2
 77 | 10.51,1.25,"Male","No","Sat","Dinner",2
 78 | 17.92,3.08,"Male","Yes","Sat","Dinner",2
 79 | 27.2,4,"Male","No","Thur","Lunch",4
 80 | 22.76,3,"Male","No","Thur","Lunch",2
 81 | 17.29,2.71,"Male","No","Thur","Lunch",2
 82 | 19.44,3,"Male","Yes","Thur","Lunch",2
 83 | 16.66,3.4,"Male","No","Thur","Lunch",2
 84 | 10.07,1.83,"Female","No","Thur","Lunch",1
 85 | 32.68,5,"Male","Yes","Thur","Lunch",2
 86 | 15.98,2.03,"Male","No","Thur","Lunch",2
 87 | 34.83,5.17,"Female","No","Thur","Lunch",4
 88 | 13.03,2,"Male","No","Thur","Lunch",2
 89 | 18.28,4,"Male","No","Thur","Lunch",2
 90 | 24.71,5.85,"Male","No","Thur","Lunch",2
 91 | 21.16,3,"Male","No","Thur","Lunch",2
 92 | 28.97,3,"Male","Yes","Fri","Dinner",2
 93 | 22.49,3.5,"Male","No","Fri","Dinner",2
 94 | 5.75,1,"Female","Yes","Fri","Dinner",2
 95 | 16.32,4.3,"Female","Yes","Fri","Dinner",2
 96 | 22.75,3.25,"Female","No","Fri","Dinner",2
 97 | 40.17,4.73,"Male","Yes","Fri","Dinner",4
 98 | 27.28,4,"Male","Yes","Fri","Dinner",2
 99 | 12.03,1.5,"Male","Yes","Fri","Dinner",2
100 | 21.01,3,"Male","Yes","Fri","Dinner",2
101 | 12.46,1.5,"Male","No","Fri","Dinner",2
102 | 11.35,2.5,"Female","Yes","Fri","Dinner",2
103 | 15.38,3,"Female","Yes","Fri","Dinner",2
104 | 44.3,2.5,"Female","Yes","Sat","Dinner",3
105 | 22.42,3.48,"Female","Yes","Sat","Dinner",2
106 | 20.92,4.08,"Female","No","Sat","Dinner",2
107 | 15.36,1.64,"Male","Yes","Sat","Dinner",2
108 | 20.49,4.06,"Male","Yes","Sat","Dinner",2
109 | 25.21,4.29,"Male","Yes","Sat","Dinner",2
110 | 18.24,3.76,"Male","No","Sat","Dinner",2
111 | 14.31,4,"Female","Yes","Sat","Dinner",2
112 | 14,3,"Male","No","Sat","Dinner",2
113 | 7.25,1,"Female","No","Sat","Dinner",1
114 | 38.07,4,"Male","No","Sun","Dinner",3
115 | 23.95,2.55,"Male","No","Sun","Dinner",2
116 | 25.71,4,"Female","No","Sun","Dinner",3
117 | 17.31,3.5,"Female","No","Sun","Dinner",2
118 | 29.93,5.07,"Male","No","Sun","Dinner",4
119 | 10.65,1.5,"Female","No","Thur","Lunch",2
120 | 12.43,1.8,"Female","No","Thur","Lunch",2
121 | 24.08,2.92,"Female","No","Thur","Lunch",4
122 | 11.69,2.31,"Male","No","Thur","Lunch",2
123 | 13.42,1.68,"Female","No","Thur","Lunch",2
124 | 14.26,2.5,"Male","No","Thur","Lunch",2
125 | 15.95,2,"Male","No","Thur","Lunch",2
126 | 12.48,2.52,"Female","No","Thur","Lunch",2
127 | 29.8,4.2,"Female","No","Thur","Lunch",6
128 | 8.52,1.48,"Male","No","Thur","Lunch",2
129 | 14.52,2,"Female","No","Thur","Lunch",2
130 | 11.38,2,"Female","No","Thur","Lunch",2
131 | 22.82,2.18,"Male","No","Thur","Lunch",3
132 | 19.08,1.5,"Male","No","Thur","Lunch",2
133 | 20.27,2.83,"Female","No","Thur","Lunch",2
134 | 11.17,1.5,"Female","No","Thur","Lunch",2
135 | 12.26,2,"Female","No","Thur","Lunch",2
136 | 18.26,3.25,"Female","No","Thur","Lunch",2
137 | 8.51,1.25,"Female","No","Thur","Lunch",2
138 | 10.33,2,"Female","No","Thur","Lunch",2
139 | 14.15,2,"Female","No","Thur","Lunch",2
140 | 16,2,"Male","Yes","Thur","Lunch",2
141 | 13.16,2.75,"Female","No","Thur","Lunch",2
142 | 17.47,3.5,"Female","No","Thur","Lunch",2
143 | 34.3,6.7,"Male","No","Thur","Lunch",6
144 | 41.19,5,"Male","No","Thur","Lunch",5
145 | 27.05,5,"Female","No","Thur","Lunch",6
146 | 16.43,2.3,"Female","No","Thur","Lunch",2
147 | 8.35,1.5,"Female","No","Thur","Lunch",2
148 | 18.64,1.36,"Female","No","Thur","Lunch",3
149 | 11.87,1.63,"Female","No","Thur","Lunch",2
150 | 9.78,1.73,"Male","No","Thur","Lunch",2
151 | 7.51,2,"Male","No","Thur","Lunch",2
152 | 14.07,2.5,"Male","No","Sun","Dinner",2
153 | 13.13,2,"Male","No","Sun","Dinner",2
154 | 17.26,2.74,"Male","No","Sun","Dinner",3
155 | 24.55,2,"Male","No","Sun","Dinner",4
156 | 19.77,2,"Male","No","Sun","Dinner",4
157 | 29.85,5.14,"Female","No","Sun","Dinner",5
158 | 48.17,5,"Male","No","Sun","Dinner",6
159 | 25,3.75,"Female","No","Sun","Dinner",4
160 | 13.39,2.61,"Female","No","Sun","Dinner",2
161 | 16.49,2,"Male","No","Sun","Dinner",4
162 | 21.5,3.5,"Male","No","Sun","Dinner",4
163 | 12.66,2.5,"Male","No","Sun","Dinner",2
164 | 16.21,2,"Female","No","Sun","Dinner",3
165 | 13.81,2,"Male","No","Sun","Dinner",2
166 | 17.51,3,"Female","Yes","Sun","Dinner",2
167 | 24.52,3.48,"Male","No","Sun","Dinner",3
168 | 20.76,2.24,"Male","No","Sun","Dinner",2
169 | 31.71,4.5,"Male","No","Sun","Dinner",4
170 | 10.59,1.61,"Female","Yes","Sat","Dinner",2
171 | 10.63,2,"Female","Yes","Sat","Dinner",2
172 | 50.81,10,"Male","Yes","Sat","Dinner",3
173 | 15.81,3.16,"Male","Yes","Sat","Dinner",2
174 | 7.25,5.15,"Male","Yes","Sun","Dinner",2
175 | 31.85,3.18,"Male","Yes","Sun","Dinner",2
176 | 16.82,4,"Male","Yes","Sun","Dinner",2
177 | 32.9,3.11,"Male","Yes","Sun","Dinner",2
178 | 17.89,2,"Male","Yes","Sun","Dinner",2
179 | 14.48,2,"Male","Yes","Sun","Dinner",2
180 | 9.6,4,"Female","Yes","Sun","Dinner",2
181 | 34.63,3.55,"Male","Yes","Sun","Dinner",2
182 | 34.65,3.68,"Male","Yes","Sun","Dinner",4
183 | 23.33,5.65,"Male","Yes","Sun","Dinner",2
184 | 45.35,3.5,"Male","Yes","Sun","Dinner",3
185 | 23.17,6.5,"Male","Yes","Sun","Dinner",4
186 | 40.55,3,"Male","Yes","Sun","Dinner",2
187 | 20.69,5,"Male","No","Sun","Dinner",5
188 | 20.9,3.5,"Female","Yes","Sun","Dinner",3
189 | 30.46,2,"Male","Yes","Sun","Dinner",5
190 | 18.15,3.5,"Female","Yes","Sun","Dinner",3
191 | 23.1,4,"Male","Yes","Sun","Dinner",3
192 | 15.69,1.5,"Male","Yes","Sun","Dinner",2
193 | 19.81,4.19,"Female","Yes","Thur","Lunch",2
194 | 28.44,2.56,"Male","Yes","Thur","Lunch",2
195 | 15.48,2.02,"Male","Yes","Thur","Lunch",2
196 | 16.58,4,"Male","Yes","Thur","Lunch",2
197 | 7.56,1.44,"Male","No","Thur","Lunch",2
198 | 10.34,2,"Male","Yes","Thur","Lunch",2
199 | 43.11,5,"Female","Yes","Thur","Lunch",4
200 | 13,2,"Female","Yes","Thur","Lunch",2
201 | 13.51,2,"Male","Yes","Thur","Lunch",2
202 | 18.71,4,"Male","Yes","Thur","Lunch",3
203 | 12.74,2.01,"Female","Yes","Thur","Lunch",2
204 | 13,2,"Female","Yes","Thur","Lunch",2
205 | 16.4,2.5,"Female","Yes","Thur","Lunch",2
206 | 20.53,4,"Male","Yes","Thur","Lunch",4
207 | 16.47,3.23,"Female","Yes","Thur","Lunch",3
208 | 26.59,3.41,"Male","Yes","Sat","Dinner",3
209 | 38.73,3,"Male","Yes","Sat","Dinner",4
210 | 24.27,2.03,"Male","Yes","Sat","Dinner",2
211 | 12.76,2.23,"Female","Yes","Sat","Dinner",2
212 | 30.06,2,"Male","Yes","Sat","Dinner",3
213 | 25.89,5.16,"Male","Yes","Sat","Dinner",4
214 | 48.33,9,"Male","No","Sat","Dinner",4
215 | 13.27,2.5,"Female","Yes","Sat","Dinner",2
216 | 28.17,6.5,"Female","Yes","Sat","Dinner",3
217 | 12.9,1.1,"Female","Yes","Sat","Dinner",2
218 | 28.15,3,"Male","Yes","Sat","Dinner",5
219 | 11.59,1.5,"Male","Yes","Sat","Dinner",2
220 | 7.74,1.44,"Male","Yes","Sat","Dinner",2
221 | 30.14,3.09,"Female","Yes","Sat","Dinner",4
222 | 12.16,2.2,"Male","Yes","Fri","Lunch",2
223 | 13.42,3.48,"Female","Yes","Fri","Lunch",2
224 | 8.58,1.92,"Male","Yes","Fri","Lunch",1
225 | 15.98,3,"Female","No","Fri","Lunch",3
226 | 13.42,1.58,"Male","Yes","Fri","Lunch",2
227 | 16.27,2.5,"Female","Yes","Fri","Lunch",2
228 | 10.09,2,"Female","Yes","Fri","Lunch",2
229 | 20.45,3,"Male","No","Sat","Dinner",4
230 | 13.28,2.72,"Male","No","Sat","Dinner",2
231 | 22.12,2.88,"Female","Yes","Sat","Dinner",2
232 | 24.01,2,"Male","Yes","Sat","Dinner",4
233 | 15.69,3,"Male","Yes","Sat","Dinner",3
234 | 11.61,3.39,"Male","No","Sat","Dinner",2
235 | 10.77,1.47,"Male","No","Sat","Dinner",2
236 | 15.53,3,"Male","Yes","Sat","Dinner",2
237 | 10.07,1.25,"Male","No","Sat","Dinner",2
238 | 12.6,1,"Male","Yes","Sat","Dinner",2
239 | 32.83,1.17,"Male","Yes","Sat","Dinner",2
240 | 35.83,4.67,"Female","No","Sat","Dinner",3
241 | 29.03,5.92,"Male","No","Sat","Dinner",3
242 | 27.18,2,"Female","Yes","Sat","Dinner",2
243 | 22.67,2,"Male","Yes","Sat","Dinner",2
244 | 17.82,1.75,"Male","No","Sat","Dinner",2
245 | 18.78,3,"Female","No","Thur","Dinner",2
246 | 


--------------------------------------------------------------------------------
/Make_Your_Pandas_Code_1000_Times_Faster_With_This Trick.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |   "cells": [
  3 |     {
  4 |       "cell_type": "markdown",
  5 |       "metadata": {
  6 |         "id": "view-in-github",
  7 |         "colab_type": "text"
  8 |       },
  9 |       "source": [
 10 |         "<a href=\"https://colab.research.google.com/github/youssefHosni/Efficient-Python-for-Data-Scientists/blob/main/Make_Your_Pandas_Code_1000_Times_Faster_With_This%C2%A0Trick.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
 11 |       ]
 12 |     },
 13 |     {
 14 |       "cell_type": "markdown",
 15 |       "metadata": {
 16 |         "id": "Zr3L0MDvNBUc"
 17 |       },
 18 |       "source": [
 19 |         "# Make Your Pandas Code 1000 Times Faster With This Trick"
 20 |       ]
 21 |     },
 22 |     {
 23 |       "cell_type": "markdown",
 24 |       "metadata": {
 25 |         "id": "e5MbQd8SNBUg"
 26 |       },
 27 |       "source": [
 28 |         "Pandas is a popular and widely used library in Python for data manipulation and analysis. While it is powerful and flexible, its performance can sometimes become a bottleneck in large datasets. In this article, we will explore a trick to make your Pandas code run much faster, increasing its efficiency by up to 1000 times. \n",
 29 |         "\n",
 30 |         "Whether you are a beginner or an experienced Pandas user, this article will provide you with valuable insights and practical tips for speeding up your code. So, if you want to boost the performance of your Pandas code, read on!"
 31 |       ]
 32 |     },
 33 |     {
 34 |       "cell_type": "code",
 35 |       "execution_count": null,
 36 |       "metadata": {
 37 |         "id": "fCRVIA8qNBUi"
 38 |       },
 39 |       "outputs": [],
 40 |       "source": [
 41 |         "Table of Content:\n",
 42 |         "Create Dataset & Problem Statment\n",
 43 |         "Level 1: Loops\n",
 44 |         "Level 2: Apply Function\n",
 45 |         "Level 3: Vectorization\n",
 46 |         "Measuring the Difference"
 47 |       ]
 48 |     },
 49 |     {
 50 |       "cell_type": "code",
 51 |       "execution_count": null,
 52 |       "metadata": {
 53 |         "id": "JmhQZWVsNBUl"
 54 |       },
 55 |       "outputs": [],
 56 |       "source": [
 57 |         "import pandas as pd\n",
 58 |         "import numpy as np"
 59 |       ]
 60 |     },
 61 |     {
 62 |       "cell_type": "markdown",
 63 |       "metadata": {
 64 |         "id": "yPa83A8wNBUm"
 65 |       },
 66 |       "source": [
 67 |         "# 1. Create Dataset & Problem Statment \n",
 68 |         "First, let's create the data we will use throughout this article and compare different methods. The data we will make will have different ages, time in bed and percentage of sleeping, the favorite food, and the least favorite food. \n",
 69 |         "Let's build a function to get the data given the size:# create Dataset"
 70 |       ]
 71 |     },
 72 |     {
 73 |       "cell_type": "code",
 74 |       "execution_count": null,
 75 |       "metadata": {
 76 |         "id": "eK3UfJtzNBUn"
 77 |       },
 78 |       "outputs": [],
 79 |       "source": [
 80 |         "def get_data(size= 10000):\n",
 81 |         "    df = pd.DataFrame()\n",
 82 |         "    size = 10000\n",
 83 |         "    df['age'] = np.random.randint(0,100,size)\n",
 84 |         "    df['time_in_bed'] = np.random.randint(0,9,size)\n",
 85 |         "    df['pct_sleeping'] = np.random.randint(size)\n",
 86 |         "    df['favorite_food'] = np.random.choice(['pizza','ice-cream','burger','rice'], size)\n",
 87 |         "    df['hate_food'] = np.random.choice(['milk','vegetables','eggs'])\n",
 88 |         "    return df"
 89 |       ]
 90 |     },
 91 |     {
 92 |       "cell_type": "code",
 93 |       "execution_count": null,
 94 |       "metadata": {
 95 |         "id": "ZrEgXr-wNBUp",
 96 |         "outputId": "1c1a3e5d-561a-4628-9944-3997c0397d4d"
 97 |       },
 98 |       "outputs": [
 99 |         {
100 |           "data": {
101 |             "text/html": [
102 |               "<div>\n",
103 |               "<style scoped>\n",
104 |               "    .dataframe tbody tr th:only-of-type {\n",
105 |               "        vertical-align: middle;\n",
106 |               "    }\n",
107 |               "\n",
108 |               "    .dataframe tbody tr th {\n",
109 |               "        vertical-align: top;\n",
110 |               "    }\n",
111 |               "\n",
112 |               "    .dataframe thead th {\n",
113 |               "        text-align: right;\n",
114 |               "    }\n",
115 |               "</style>\n",
116 |               "<table border=\"1\" class=\"dataframe\">\n",
117 |               "  <thead>\n",
118 |               "    <tr style=\"text-align: right;\">\n",
119 |               "      <th></th>\n",
120 |               "      <th>age</th>\n",
121 |               "      <th>time_in_bed</th>\n",
122 |               "      <th>pct_sleeping</th>\n",
123 |               "      <th>favorite_food</th>\n",
124 |               "      <th>hate_food</th>\n",
125 |               "    </tr>\n",
126 |               "  </thead>\n",
127 |               "  <tbody>\n",
128 |               "    <tr>\n",
129 |               "      <th>0</th>\n",
130 |               "      <td>39</td>\n",
131 |               "      <td>2</td>\n",
132 |               "      <td>2114</td>\n",
133 |               "      <td>ice-cream</td>\n",
134 |               "      <td>eggs</td>\n",
135 |               "    </tr>\n",
136 |               "    <tr>\n",
137 |               "      <th>1</th>\n",
138 |               "      <td>92</td>\n",
139 |               "      <td>1</td>\n",
140 |               "      <td>2114</td>\n",
141 |               "      <td>pizza</td>\n",
142 |               "      <td>eggs</td>\n",
143 |               "    </tr>\n",
144 |               "    <tr>\n",
145 |               "      <th>2</th>\n",
146 |               "      <td>59</td>\n",
147 |               "      <td>7</td>\n",
148 |               "      <td>2114</td>\n",
149 |               "      <td>ice-cream</td>\n",
150 |               "      <td>eggs</td>\n",
151 |               "    </tr>\n",
152 |               "    <tr>\n",
153 |               "      <th>3</th>\n",
154 |               "      <td>26</td>\n",
155 |               "      <td>3</td>\n",
156 |               "      <td>2114</td>\n",
157 |               "      <td>burger</td>\n",
158 |               "      <td>eggs</td>\n",
159 |               "    </tr>\n",
160 |               "    <tr>\n",
161 |               "      <th>4</th>\n",
162 |               "      <td>14</td>\n",
163 |               "      <td>3</td>\n",
164 |               "      <td>2114</td>\n",
165 |               "      <td>ice-cream</td>\n",
166 |               "      <td>eggs</td>\n",
167 |               "    </tr>\n",
168 |               "  </tbody>\n",
169 |               "</table>\n",
170 |               "</div>"
171 |             ],
172 |             "text/plain": [
173 |               "   age  time_in_bed  pct_sleeping favorite_food hate_food\n",
174 |               "0   39            2          2114     ice-cream      eggs\n",
175 |               "1   92            1          2114         pizza      eggs\n",
176 |               "2   59            7          2114     ice-cream      eggs\n",
177 |               "3   26            3          2114        burger      eggs\n",
178 |               "4   14            3          2114     ice-cream      eggs"
179 |             ]
180 |           },
181 |           "execution_count": 36,
182 |           "metadata": {},
183 |           "output_type": "execute_result"
184 |         }
185 |       ],
186 |       "source": [
187 |         "df = get_data()\n",
188 |         "df.head()"
189 |       ]
190 |     },
191 |     {
192 |       "cell_type": "markdown",
193 |       "metadata": {
194 |         "id": "kVJDig7CNBUr"
195 |       },
196 |       "source": [
197 |         "Reward calculation:\n",
198 |         "* If they were in bed for more than 5 hours and they were sleeping more than 50% we will give their favourite food.\n",
199 |         "* Otherwise we give them their hate food\n",
200 |         "* If they are over 90 years old give their favorite food regardless\n",
201 |         "\n",
202 |         "This can be represented using the following equation"
203 |       ]
204 |     },
205 |     {
206 |       "cell_type": "code",
207 |       "execution_count": null,
208 |       "metadata": {
209 |         "id": "VPGpFG4iNBUs"
210 |       },
211 |       "outputs": [],
212 |       "source": [
213 |         "def reward_cal(row):\n",
214 |         "    if row['age'] >=90:\n",
215 |         "        return row['favorite_food'] \n",
216 |         "    if (row['time_in_bed'] > 5) & (row['pct_sleeping']>0.5):\n",
217 |         "        return row['favorite_food']\n",
218 |         "    return row['hate_food']"
219 |       ]
220 |     },
221 |     {
222 |       "cell_type": "markdown",
223 |       "metadata": {
224 |         "id": "kfnQjHp1NBUt"
225 |       },
226 |       "source": [
227 |         "## Level 1: Loops"
228 |       ]
229 |     },
230 |     {
231 |       "cell_type": "markdown",
232 |       "metadata": {
233 |         "id": "GvJ_eTu4NBUu"
234 |       },
235 |       "source": [
236 |         "The first and straightforward approach is to use for loops to iterate over each row of the data frame. The code below"
237 |       ]
238 |     },
239 |     {
240 |       "cell_type": "code",
241 |       "execution_count": null,
242 |       "metadata": {
243 |         "id": "XWv3MZGoNBUu"
244 |       },
245 |       "outputs": [],
246 |       "source": [
247 |         "df = get_data()"
248 |       ]
249 |     },
250 |     {
251 |       "cell_type": "code",
252 |       "execution_count": null,
253 |       "metadata": {
254 |         "id": "yRnyMEnSNBUv",
255 |         "outputId": "11337aac-de38-4d05-b8f4-5c0a3ece60f7"
256 |       },
257 |       "outputs": [
258 |         {
259 |           "name": "stdout",
260 |           "output_type": "stream",
261 |           "text": [
262 |             "15.8 s ± 1.61 s per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
263 |           ]
264 |         }
265 |       ],
266 |       "source": [
267 |         "%%timeit\n",
268 |         "\n",
269 |         "for index, row in df.iterrows():\n",
270 |         "   df.loc[index,'reward'] = reward_cal(row)"
271 |       ]
272 |     },
273 |     {
274 |       "cell_type": "markdown",
275 |       "metadata": {
276 |         "id": "IyXhpDXRNBUw"
277 |       },
278 |       "source": [
279 |         "As we can see the computation time used to iterate through every row of the data frame is 15.8 s. Given that the data has only 10000 rows which are considered small. So if the data have millions of rows so it will take hours to do only one task. Therefore this is not the most efficient way to iterate through a data frame. So let's discuss the second method which will improve the time complexity."
280 |       ]
281 |     },
282 |     {
283 |       "cell_type": "markdown",
284 |       "metadata": {
285 |         "id": "ZnlAfHz6NBUx"
286 |       },
287 |       "source": [
288 |         "## Level 2: Apply"
289 |       ]
290 |     },
291 |     {
292 |       "cell_type": "markdown",
293 |       "metadata": {
294 |         "id": "V73_y_JANBUy"
295 |       },
296 |       "source": [
297 |         "The .apply() method in pandas is used to apply a function to each element in a pandas dataframe. It can be used to apply a custom function to each element in a specific column or to apply a function along either axis (row-wise or column-wise) of the dataframe. Let's use it to apply the reward calculation function to each row of the data frame and then calculate the computational time:"
298 |       ]
299 |     },
300 |     {
301 |       "cell_type": "code",
302 |       "execution_count": null,
303 |       "metadata": {
304 |         "id": "zSe7u6rnNBUy"
305 |       },
306 |       "outputs": [],
307 |       "source": [
308 |         "df = get_data()"
309 |       ]
310 |     },
311 |     {
312 |       "cell_type": "code",
313 |       "execution_count": null,
314 |       "metadata": {
315 |         "id": "MDJDpDSuNBUz",
316 |         "outputId": "3313ea0e-3875-4850-984e-38b16e96fe96"
317 |       },
318 |       "outputs": [
319 |         {
320 |           "name": "stdout",
321 |           "output_type": "stream",
322 |           "text": [
323 |             "535 ms ± 38.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
324 |           ]
325 |         }
326 |       ],
327 |       "source": [
328 |         "%%timeit\n",
329 |         "df['reward'] = df.apply(reward_cal, axis = 1)"
330 |       ]
331 |     },
332 |     {
333 |       "cell_type": "markdown",
334 |       "metadata": {
335 |         "id": "5BHqk2-JNBUz"
336 |       },
337 |       "source": [
338 |         "The average time to apply the function to the 10000 rows of the data frame is only 535 ms which is 0.535 seconds. This is around 15 times faster than using the loops. However, we are still not done. We can still improve the speed and make it 1000 times faster. Let's see how!"
339 |       ]
340 |     },
341 |     {
342 |       "cell_type": "markdown",
343 |       "metadata": {
344 |         "id": "6KNNOQxnNBU0"
345 |       },
346 |       "source": [
347 |         "## Level 3: Vectorized "
348 |       ]
349 |     },
350 |     {
351 |       "cell_type": "markdown",
352 |       "metadata": {
353 |         "id": "fARqNQQwNBU0"
354 |       },
355 |       "source": [
356 |         "Vectorization in pandas refers to the process of applying operations to entire arrays or sequences of data, as opposed to applying them to individual elements one by one. This is done for performance reasons, as vectorized operations are usually much faster than non-vectorized operations, especially in large datasets.\n",
357 |         "\n",
358 |         "Let's apply this to the data using the conditions stated above:"
359 |       ]
360 |     },
361 |     {
362 |       "cell_type": "code",
363 |       "execution_count": null,
364 |       "metadata": {
365 |         "id": "KU-2MCGQNBU1"
366 |       },
367 |       "outputs": [],
368 |       "source": [
369 |         "df = get_data()\n"
370 |       ]
371 |     },
372 |     {
373 |       "cell_type": "code",
374 |       "execution_count": null,
375 |       "metadata": {
376 |         "id": "aoWodteHNBU1",
377 |         "outputId": "e4e61ca6-49e5-4dd2-e230-5f6ac7ad77b8"
378 |       },
379 |       "outputs": [
380 |         {
381 |           "name": "stdout",
382 |           "output_type": "stream",
383 |           "text": [
384 |             "5.8 ms ± 831 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n"
385 |           ]
386 |         }
387 |       ],
388 |       "source": [
389 |         "%%timeit\n",
390 |         "\n",
391 |         "df['reward'] = df['hate_food']\n",
392 |         "df.loc[((df['pct_sleeping']>0.5) &(df['time_in_bed']>5))| (df['age']>90), 'reward'] = df['favorite_food']\n"
393 |       ]
394 |     },
395 |     {
396 |       "cell_type": "markdown",
397 |       "metadata": {
398 |         "id": "tawGLcpuNBU2"
399 |       },
400 |       "source": [
401 |         "We can see now a tremendous decrease in the computation time compared to the previous two methods. The computation time has at least decreased by 1000. Let's have a look at the differences in a bar plot."
402 |       ]
403 |     },
404 |     {
405 |       "cell_type": "markdown",
406 |       "metadata": {
407 |         "id": "Lb-T8mcRNBU2"
408 |       },
409 |       "source": [
410 |         "## Plot differences\n",
411 |         "\n"
412 |       ]
413 |     },
414 |     {
415 |       "cell_type": "markdown",
416 |       "metadata": {
417 |         "id": "8O00vigtNBU3"
418 |       },
419 |       "source": [
420 |         "Finally to have a better intuition of the difference between these different three methods. We will plot a bar plot diagram using the code below:"
421 |       ]
422 |     },
423 |     {
424 |       "cell_type": "code",
425 |       "execution_count": null,
426 |       "metadata": {
427 |         "id": "0p40CWMbNBU3"
428 |       },
429 |       "outputs": [],
430 |       "source": [
431 |         "results = pd.DataFrame(\n",
432 |         "    [\n",
433 |         "        [\"Loop\", 15800],\n",
434 |         "        [\"apply\", 535],\n",
435 |         "        ['vectorized', 5.8]\n",
436 |         "        \n",
437 |         "    ],\n",
438 |         "    columns = ['method', 'run_time']\n",
439 |         "\n",
440 |         ")"
441 |       ]
442 |     },
443 |     {
444 |       "cell_type": "code",
445 |       "execution_count": null,
446 |       "metadata": {
447 |         "id": "3FTQJc7oNBU4",
448 |         "outputId": "607fe70f-dc20-4b3a-b0da-58e32b3640b7"
449 |       },
450 |       "outputs": [
451 |         {
452 |           "data": {
453 |             "text/plain": [
454 |               "<AxesSubplot:xlabel='method'>"
455 |             ]
456 |           },
457 |           "execution_count": 31,
458 |           "metadata": {},
459 |           "output_type": "execute_result"
460 |         },
461 |         {
462 |           "data": {
463 |             "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYQAAAEwCAYAAAC35gawAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAAAaLUlEQVR4nO3dfbRddX3n8fenSUVQwmNATLBJa6QDLHwgxRSro8UZMj4QHEHDKpJlGdNS6sM4HQfGWaXtrMzCqVMU18CUAhLUAVLUkhkXVoooWhG8PGgImCEVB1KiuQpCaiVO8Dt/nF/K4ebk3pt7w9033vdrrbPO3t+9f/t+z7qQz91PZ6eqkCTpF7puQJI0PRgIkiTAQJAkNQaCJAkwECRJzeyuG5ioQw89tBYsWNB1G5K0V7nzzjt/UFVzBy3bawNhwYIFDA0Ndd2GJO1VkvzfXS3zkJEkCTAQJEmNgSBJAsYRCEmuTLIlyb0j6u9OsiHJ+iT/ta9+fpKNbdnJffXjk6xryy5OklbfJ8l1rX57kgV78PNJksZpPHsIVwFL+wtJXgcsA46rqmOAD7f60cBy4Jg25pIks9qwS4GVwKL22rHNs4HHqurFwEXAhybxeSRJEzRmIFTVrcCjI8rnABdW1ba2zpZWXwZcW1XbqupBYCNwQpIjgDlVdVv1vk3vauDUvjGr2/T1wEk79h4kSVNnoucQXgK8uh3i+XKSX2v1ecDDfettarV5bXpk/Rljqmo78DhwyKAfmmRlkqEkQ8PDwxNsXZI0yEQDYTZwELAE+PfAmvZX/aC/7GuUOmMse2ax6rKqWlxVi+fOHXhfhSRpgiYaCJuAz1TPHcDPgENb/ci+9eYDj7T6/AF1+sckmQ0cwM6HqCRJz7KJ3qn8V8BvAl9K8hLgOcAPgLXA/0zyZ8AL6Z08vqOqnkqyNckS4HbgLOBjbVtrgRXAbcBpwBdrGj61Z8F5n+u6hWfVdy98Y9ctSOrYmIGQ5BrgtcChSTYBFwBXAle2S1F/Cqxo/4ivT7IGuA/YDpxbVU+1TZ1D74qlfYEb2wvgCuATSTbS2zNYvmc+miRpd4wZCFV1xi4WnbmL9VcBqwbUh4BjB9SfBE4fqw9J0rPLO5UlSYCBIElqDARJEmAgSJIaA0GSBBgIkqTGQJAkAQaCJKkxECRJgIEgSWoMBEkSYCBIkhoDQZIEGAiSpMZAkCQBBoIkqTEQJEnAOAIhyZVJtrTHZY5c9gdJKsmhfbXzk2xMsiHJyX3145Osa8suTpJW3yfJda1+e5IFe+izSZJ2w3j2EK4Clo4sJjkS+BfAQ321o+k9E/mYNuaSJLPa4kuBlcCi9tqxzbOBx6rqxcBFwIcm8kEkSZMzZiBU1a3AowMWXQR8AKi+2jLg2qraVlUPAhuBE5IcAcypqtuqqoCrgVP7xqxu09cDJ+3Ye5AkTZ0JnUNIcgrw91X1zRGL5gEP981varV5bXpk/Rljqmo78DhwyC5+7sokQ0mGhoeHJ9K6JGkXdjsQkuwHfBD4w0GLB9RqlPpoY3YuVl1WVYuravHcuXPH064kaZwmsofwK8BC4JtJvgvMB+5K8gJ6f/kf2bfufOCRVp8/oE7/mCSzgQMYfIhKkvQs2u1AqKp1VXVYVS2oqgX0/kF/RVV9D1gLLG9XDi2kd/L4jqraDGxNsqSdHzgLuKFtci2wok2fBnyxnWeQJE2h8Vx2eg1wG3BUkk1Jzt7VulW1HlgD3Ad8Hji3qp5qi88BLqd3ovnvgBtb/QrgkCQbgfcD503ws0iSJmH2WCtU1RljLF8wYn4VsGrAekPAsQPqTwKnj9WHJOnZ5Z3KkiTAQJAkNQaCJAkwECRJjYEgSQIMBElSYyBIkgADQZLUGAiSJMBAkCQ1BoIkCTAQJEmNgSBJAgwESVJjIEiSAANBktQYCJIkYHyP0LwyyZYk9/bV/jTJt5N8K8lnkxzYt+z8JBuTbEhycl/9+CTr2rKL27OVac9fvq7Vb0+yYM9+REnSeIxnD+EqYOmI2k3AsVV1HPB/gPMBkhwNLAeOaWMuSTKrjbkUWAksaq8d2zwbeKyqXgxcBHxooh9GkjRxYwZCVd0KPDqi9oWq2t5mvw7Mb9PLgGuraltVPQhsBE5IcgQwp6puq6oCrgZO7Ruzuk1fD5y0Y+9BkjR19sQ5hN8GbmzT84CH+5ZtarV5bXpk/RljWsg8DhyyB/qSJO2GSQVCkg8C24FP7SgNWK1GqY82ZtDPW5lkKMnQ8PDw7rYrSRrFhAMhyQrgTcBvtcNA0PvL/8i+1eYDj7T6/AH1Z4xJMhs4gBGHqHaoqsuqanFVLZ47d+5EW5ckDTChQEiyFPgPwClV9Y99i9YCy9uVQwvpnTy+o6o2A1uTLGnnB84Cbugbs6JNnwZ8sS9gJElTZPZYKyS5BngtcGiSTcAF9K4q2ge4qZ3//XpV/W5VrU+yBriP3qGkc6vqqbapc+hdsbQvvXMOO847XAF8IslGensGy/fMR5Mk7Y4xA6GqzhhQvmKU9VcBqwbUh4BjB9SfBE4fqw9J0rPLO5UlSYCBIElqDARJEmAgSJIaA0GSBBgIkqTGQJAkAQaCJKkxECRJgIEgSWoMBEkSYCBIkhoDQZIEGAiSpMZAkCQBBoIkqTEQJEmAgSBJasYMhCRXJtmS5N6+2sFJbkryQHs/qG/Z+Uk2JtmQ5OS++vFJ1rVlF6c9jDnJPkmua/XbkyzYw59RkjQO49lDuApYOqJ2HnBzVS0Cbm7zJDkaWA4c08ZckmRWG3MpsBJY1F47tnk28FhVvRi4CPjQRD+MJGnixgyEqroVeHREeRmwuk2vBk7tq19bVduq6kFgI3BCkiOAOVV1W1UVcPWIMTu2dT1w0o69B0nS1JnoOYTDq2ozQHs/rNXnAQ/3rbep1ea16ZH1Z4ypqu3A48Ahg35okpVJhpIMDQ8PT7B1SdIge/qk8qC/7GuU+mhjdi5WXVZVi6tq8dy5cyfYoiRpkIkGwvfbYSDa+5ZW3wQc2bfefOCRVp8/oP6MMUlmAwew8yEqSdKzbKKBsBZY0aZXADf01Ze3K4cW0jt5fEc7rLQ1yZJ2fuCsEWN2bOs04IvtPIMkaQrNHmuFJNcArwUOTbIJuAC4EFiT5GzgIeB0gKpan2QNcB+wHTi3qp5qmzqH3hVL+wI3thfAFcAnkmykt2ewfI98MknSbhkzEKrqjF0sOmkX668CVg2oDwHHDqg/SQsUSVJ3vFNZkgQYCJKkxkCQJAEGgiSpMRAkSYCBIElqDARJEmAgSJIaA0GSBBgIkqTGQJAkAQaCJKkxECRJgIEgSWoMBEkSYCBIkhoDQZIETDIQkvzbJOuT3JvkmiTPTXJwkpuSPNDeD+pb//wkG5NsSHJyX/34JOvasovbc5clSVNowoGQZB7wHmBxVR0LzKL3POTzgJurahFwc5snydFt+THAUuCSJLPa5i4FVgKL2mvpRPuSJE3MZA8ZzQb2TTIb2A94BFgGrG7LVwOntullwLVVta2qHgQ2AickOQKYU1W3VVUBV/eNkSRNkQkHQlX9PfBh4CFgM/B4VX0BOLyqNrd1NgOHtSHzgIf7NrGp1ea16ZH1nSRZmWQoydDw8PBEW5ckDTCZQ0YH0furfyHwQuB5Sc4cbciAWo1S37lYdVlVLa6qxXPnzt3dliVJo5jMIaPXAw9W1XBV/T/gM8CJwPfbYSDa+5a2/ibgyL7x8+kdYtrUpkfWJUlTaDKB8BCwJMl+7aqgk4D7gbXAirbOCuCGNr0WWJ5knyQL6Z08vqMdVtqaZEnbzll9YyRJU2T2RAdW1e1JrgfuArYDdwOXAc8H1iQ5m15onN7WX59kDXBfW//cqnqqbe4c4CpgX+DG9pIkTaEJBwJAVV0AXDCivI3e3sKg9VcBqwbUh4BjJ9OLJGlyvFNZkgQYCJKkxkCQJAEGgiSpMRAkSYCBIElqDARJEmAgSJIaA0GSBBgIkqTGQJAkAQaCJKkxECRJgIEgSWoMBEkSYCBIkhoDQZIETDIQkhyY5Pok305yf5JfT3JwkpuSPNDeD+pb//wkG5NsSHJyX/34JOvasovbs5UlSVNosnsIHwU+X1W/CrwUuB84D7i5qhYBN7d5khwNLAeOAZYClySZ1bZzKbASWNReSyfZlyRpN004EJLMAV4DXAFQVT+tqh8By4DVbbXVwKltehlwbVVtq6oHgY3ACUmOAOZU1W1VVcDVfWMkSVNkMnsIvwwMAx9PcneSy5M8Dzi8qjYDtPfD2vrzgIf7xm9qtXltemR9J0lWJhlKMjQ8PDyJ1iVJI00mEGYDrwAuraqXAz+mHR7ahUHnBWqU+s7FqsuqanFVLZ47d+7u9itJGsVkAmETsKmqbm/z19MLiO+3w0C09y196x/ZN34+8Eirzx9QlyRNoQkHQlV9D3g4yVGtdBJwH7AWWNFqK4Ab2vRaYHmSfZIspHfy+I52WGlrkiXt6qKz+sZIkqbI7EmOfzfwqSTPAb4DvJNeyKxJcjbwEHA6QFWtT7KGXmhsB86tqqfads4BrgL2BW5sL0nSFJpUIFTVPcDiAYtO2sX6q4BVA+pDwLGT6UWSNDneqSxJAgwESVJjIEiSAANBktQYCJIkwECQJDUGgiQJMBAkSY2BIEkCDARJUmMgSJIAA0GS1BgIkiTAQJAkNQaCJAkwECRJjYEgSQL2QCAkmZXk7iT/u80fnOSmJA+094P61j0/ycYkG5Kc3Fc/Psm6tuzi9mxlSdIU2hN7CO8F7u+bPw+4uaoWATe3eZIcDSwHjgGWApckmdXGXAqsBBa119I90JckaTdMKhCSzAfeCFzeV14GrG7Tq4FT++rXVtW2qnoQ2AickOQIYE5V3VZVBVzdN0aSNEUmu4fwEeADwM/6aodX1WaA9n5Yq88DHu5bb1OrzWvTI+s7SbIyyVCSoeHh4Um2LknqN+FASPImYEtV3TneIQNqNUp952LVZVW1uKoWz507d5w/VpI0HrMnMfZVwClJ3gA8F5iT5JPA95McUVWb2+GgLW39TcCRfePnA4+0+vwBdUnSFJrwHkJVnV9V86tqAb2TxV+sqjOBtcCKttoK4IY2vRZYnmSfJAvpnTy+ox1W2ppkSbu66Ky+MZKkKTKZPYRduRBYk+Rs4CHgdICqWp9kDXAfsB04t6qeamPOAa4C9gVubC9J0hTaI4FQVV8CvtSmfwictIv1VgGrBtSHgGP3RC+SpInxTmVJEmAgSJIaA0GSBBgIkqTGQJAkAQaCJKkxECRJgIEgSWoMBEkSYCBIkhoDQZIEGAiSpMZAkCQBBoIkqTEQJEmAgSBJagwESRIwiUBIcmSSW5Lcn2R9kve2+sFJbkryQHs/qG/M+Uk2JtmQ5OS++vFJ1rVlF7dnK0uSptBk9hC2A/+uqv4ZsAQ4N8nRwHnAzVW1CLi5zdOWLQeOAZYClySZ1bZ1KbASWNReSyfRlyRpAiYcCFW1uaruatNbgfuBecAyYHVbbTVwapteBlxbVduq6kFgI3BCkiOAOVV1W1UVcHXfGEnSFNkj5xCSLABeDtwOHF5Vm6EXGsBhbbV5wMN9wza12rw2PbIuSZpCkw6EJM8HPg28r6qeGG3VAbUapT7oZ61MMpRkaHh4ePeblSTt0qQCIckv0guDT1XVZ1r5++0wEO19S6tvAo7sGz4feKTV5w+o76SqLquqxVW1eO7cuZNpXZI0wmSuMgpwBXB/Vf1Z36K1wIo2vQK4oa++PMk+SRbSO3l8RzustDXJkrbNs/rGSJKmyOxJjH0V8A5gXZJ7Wu0/AhcCa5KcDTwEnA5QVeuTrAHuo3eF0rlV9VQbdw5wFbAvcGN7SZKm0IQDoaq+yuDj/wAn7WLMKmDVgPoQcOxEe5EkTZ53KkuSAANBktQYCJIkwECQJDUGgiQJMBAkSY2BIEkCDARJUmMgSJIAA0GS1BgIkiTAQJAkNQaCJAkwECRJzWSehyDtNRac97muW3jWfPfCN3bdgn5OuIcgSQIMBElSYyBIkoBpFAhJlibZkGRjkvO67keSZpppEQhJZgH/HfhXwNHAGUmO7rYrSZpZpkUgACcAG6vqO1X1U+BaYFnHPUnSjDJdLjudBzzcN78JeOXIlZKsBFa22X9IsmEKeuvKocAPpuqH5UNT9ZNmBH93e7cp/f114Jd2tWC6BEIG1GqnQtVlwGXPfjvdSzJUVYu77kO7z9/d3m0m//6myyGjTcCRffPzgUc66kWSZqTpEgjfABYlWZjkOcByYG3HPUnSjDItDhlV1fYkvw/8NTALuLKq1nfcVtdmxKGxn1P+7vZuM/b3l6qdDtVLkmag6XLISJLUMQNBkgQYCJKkxkCYhpLMSbJ/131o/NrXr0h7NU8qTyNJFgMfB/and7Pej4Dfrqo7u+xLY0vyIHA98PGquq/rfjS2JAePtryqHp2qXqYLA2EaSfIt4Nyq+kqb/w3gkqo6rtvONJa2R7cceCe9Pe8rgWur6olOG9MutRAven98vQh4rE0fCDxUVQu7664bHjKaXrbuCAOAqvoqsLXDfjROVbW1qv6iqk4EPgBcAGxOsjrJiztuTwNU1cKq+mV69z+9uaoOrapDgDcBn+m2u264hzCNJLkI2A+4ht5fLm+n91fLpwGq6q7uutNo2jmEN9LbQ1gAfAL4FPBq4L9U1Uu6606jSXJnVR0/ojYjv89oWtyprH/ysvZ+wYj6ifQC4jentBvtjgeAW4A/raqv9dWvT/KajnrS+PwgyX8CPknv/7MzgR9221I33EOQ9oAkz6+qf+i6D+2+dnL5AuA19ALhVuBPPKmsTiU5gKf/wwT4Mr3/MB/vriuNJsnHGPBV7TtU1XumsB1NgqHuIaPp5krgXuBtbf4d9C5D/deddaSxDHXdgCYnyYnA5cDzgRcleSnwO1X1e912NvXcQ5hGktxTVS8bq6bpK8kcoKrKq8P2EkluB04D1lbVy1vt3qo6ttvOpp6XnU4vP2n3HgCQ5FXATzrsR+OUZHGSdcC3gHuTfDPJ8WON0/RQVQ+PKD3VSSMd85DR9HIOsLqdSwjwKLCi25Y0TlcCvzfipsKPA95UOP093A4bVXtA13uA+zvuqRMeMpqG2mEHvMt175Hkb6vqVWPVNP0kORT4KPB6en+IfQF4b1XNuEtPDYRpxKuM9l7eVLj3SnLkyENGSV5QVd/rqqeuGAjTSJJP07vKaHUrvQN4aVV5ldE0l+SWURZXVXlT4TSVZDvwl/S+SPInrXZXVb2i286mnucQppdfqaq39s3/cZJ7umpG41dVr+u6B03YOuArwFeTvK2q/o7eoaMZx6uMphevMtpLJTkkycVJ7kpyZ5KPJjmk6740LlVVl9A7mfy/kryZUW42/HnmIaNppN0QczVwQCs9Bqyoqm9115XGI8lN9L7y4JOt9FvAa6vq9d11pfFIcnff/QdHANcBi6tqv247m3oGwjTUf5VRkvdV1Uc6bklj8Bsz915JjqiqzX3zs4ETq+rWDtvqhOcQpqERl5u+H/hIR61o/G5JshxY0+ZPAz7XYT8aQ5Izq+qTwBnJwFMGBoKmnRl5cmsv9Dv0wvsTbX4W8OMk76d3jHpOZ51pV57X3n1+eWMgTH8e09sLVNX+7WuUFwHP7at/ubuuNJqq+vP2YKMnquqirvuZDjyHMA0k2crgf/gD7FtVBvc0l+TfAO8F5gP3AEuAr1XVSV32pbElucXLhnsMBGkPaF9s92vA16vqZUl+Ffjjqnp7x61pDElW0buy7zrgxzvqM/Hucv/ylPaMJ6vqySQk2aeqvp3kqK6b0ric2N7/pK82Ix9ZayBIe8amJAcCfwXclOQx4JFOO9K4eLjoaR4ykvawJP+c3iGIz1fVT7vuR6PzSyWfZiBImtH8UsmnGQiSZjQfXfs0v9xO0kznl0o27iFImtH8UsmneZWRpJnuiap66YgvlVzYdVNd8JCRpJlux2NOn+j7YsnrO+ynM+4hSJqR2t3kxwAHJOm/omgOfd9HNZMYCJJmqqOANwEHAm/uq28F3tVFQ13zpLKkGS3Jr1fVbV33MR14DkHSTPe77WtHAEhyUJIrO+ynMwaCpJnuuKr60Y6ZqnoMeHl37XTHQJA00/1CkoN2zLQHHc3I86sz8kNLUp//BnwtyfX0vvb6bcCqblvqhieVJc14SY6m9/yDADdX1X0dt9QJDxlJEhwM/LiqPgYMz9Q7ld1DkDSjJbkAWAwcVVUvSfJC4C+r6lUdtzbl3EOQNNO9BTiF9jzlqnoE2L/TjjpiIEia6X5avUMlBZDkeR330xkDQdJMtybJnwMHJnkX8DfAX3TcUye87FTSTPcz4CvAE8BLgD+sqpu6bakbBoKkmW5/4GzgUeBaYMY9GGcHrzKSJCDJccDbgbcCm6rq9R23NOU8hyBJPVuA7wE/BA7ruJdOGAiSZrQk5yT5EnAzcCjwrqo6rtuuuuE5BEkz3S8B76uqe7pupGueQ5AkAR4ykiQ1BoIkCTAQpD0iycuSvKFv/o+S/MEktjep8dJEGAjSnvEy4A1jrSRNZwaC1CRZkOTbSS5Pcm+STyV5fZK/TfJAkhOSPC/JlUm+keTuJMuSPAf4E+DtSe5J8va2yaOTfCnJd5K8p+/nvL9t/94k7+urfzDJhiR/Axw1pR9ewquMpH+SZAGwkd4D1tcD3wC+Se9rDU4B3gncB9xXVZ9MciBwR1v/dGBxVf1+29YfAf8SeB29r0bYALwAOA64ClhC7+lctwNn0vvj7CrglfQuB78L+B9V9eFn8zNL/bwPQXqmB6tqHUCS9fQep1hJ1gELgPnAKX3H958LvGgX2/pcVW0DtiXZAhwO/Abw2ar6cfsZnwFeTS8QPltV/9jqa5+VTyeNwkCQnmlb3/TP+uZ/Ru//l6eAt1bVhv5BSV45xraeauMzys92d12d8hyCtHv+Gnh3kgAkeXmrb2V8T9m6FTg1yX7tQSxvoffVy7cCb0myb5L9gTfv+dal0RkI0u75z8AvAt9Kcm+bB7iF3knk/pPKO6mqu+idK7iD3vmDy6vq7la/DrgH+DS9kJCmlCeVJUmAewiSpMZAkCQBBoIkqTEQJEmAgSBJagwESRJgIEiSmv8Pl7ME0L3WsPEAAAAASUVORK5CYII=\n",
464 |             "text/plain": [
465 |               "<Figure size 432x288 with 1 Axes>"
466 |             ]
467 |           },
468 |           "metadata": {
469 |             "needs_background": "light"
470 |           },
471 |           "output_type": "display_data"
472 |         }
473 |       ],
474 |       "source": [
475 |         "results.set_index('method')['run_time'].plot(kind='bar')"
476 |       ]
477 |     },
478 |     {
479 |       "cell_type": "markdown",
480 |       "metadata": {
481 |         "id": "XwuNEJ00NBU4"
482 |       },
483 |       "source": [
484 |         "Looking at the bar plot we can get a better intuition of the huge difference between the different computational times of the different methods used in this article."
485 |       ]
486 |     }
487 |   ],
488 |   "metadata": {
489 |     "kernelspec": {
490 |       "display_name": "Python 3",
491 |       "language": "python",
492 |       "name": "python3"
493 |     },
494 |     "language_info": {
495 |       "codemirror_mode": {
496 |         "name": "ipython",
497 |         "version": 3
498 |       },
499 |       "file_extension": ".py",
500 |       "mimetype": "text/x-python",
501 |       "name": "python",
502 |       "nbconvert_exporter": "python",
503 |       "pygments_lexer": "ipython3",
504 |       "version": "3.7.9"
505 |     },
506 |     "colab": {
507 |       "provenance": [],
508 |       "include_colab_link": true
509 |     }
510 |   },
511 |   "nbformat": 4,
512 |   "nbformat_minor": 0
513 | }


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | ## Efficient Python for Data Scientists Book ##
 2 | 
 3 | Learn how to write efficient Python code as a data scientist. You can buy from [here](https://youssefhosni.gumroad.com/l/cbousj)
 4 | 
 5 | [![Substack](https://img.shields.io/badge/Substack-%23006f5c.svg?style=for-the-badge&logo=substack&logoColor=FF6719)](https://youssefh.substack.com/)
 6 | [![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://medium.com/@yousefhosni)
 7 | [![Kaggle](https://img.shields.io/badge/Kaggle-035a7d?style=for-the-badge&logo=kaggle&logoColor=white)](https://www.kaggle.com/youssef19)
 8 | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/channel/UCeEcSgRzYFuVt-2Yk1ULdhQ)
 9 | 
10 | [![GitHub watchers](https://img.shields.io/github/watchers/youssefHosni/Efficient-Python-for-Data-Scientists.svg?style=social&label=Watch)](https://GitHub.com/youssefHosni/Awesome-ML-GitHub-Repos/watchers/)
11 | [![GitHub forks](https://img.shields.io/github/forks/youssefHosni/Efficient-Python-for-Data-Scientists.svg?style=social&label=Fork)](https://GitHub.com/youssefHosni/Efficient-Python-for-Data-Scientists/network/)
12 | [![GitHub stars](https://img.shields.io/github/stars/youssefHosni/Efficient-Python-for-Data-Scientists.svg?style=social&label=Star)](https://GitHub.com/youssefHosni/Efficient-Python-for-Data-Scientists/stargazers/)
13 | 
14 | 
15 | ## Efficient Python ##
16 | 
17 | |Topic |Blog|Kaggle Notebook| Youtube Video |
18 | |-----|--------|----------|----------|
19 | |**How To Write Python Clean Code** |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://levelup.gitconnected.com/learn-how-to-write-python-clean-code-using-these-3-principles-ed046978e39a?sk=84b51a685bc2981d85ac5f7346eeb4bf) | [![Kaggle](https://img.shields.io/badge/Kaggle-035a7d?style=for-the-badge&logo=kaggle&logoColor=white)]()| [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)]() |
20 | |**Write Efficient Python Code: Defining & Measuring Code Efficiency** |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://medium.com/geekculture/write-efficient-python-code-defining-measuring-code-efficiency-e33a5bd9f7ca?sk=39ca91a495d591e785427aa870081c68) | [![Kaggle](https://img.shields.io/badge/Kaggle-035a7d?style=for-the-badge&logo=kaggle&logoColor=white)](https://www.kaggle.com/code/youssef19/measuring-python-code-efficiency)| [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)]() |
21 | |**Write Efficient Python Code: Optimizing Your Code** |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://levelup.gitconnected.com/write-efficient-python-code-for-data-scientists-optimizing-your-code-2dbb717f610e?sk=33c661faf85862b61e52343b90d35045) | [![Kaggle](https://img.shields.io/badge/Kaggle-035a7d?style=for-the-badge&logo=kaggle&logoColor=white)](https://www.kaggle.com/code/youssef19/python-code-optimization-for-data-scientists)| [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)]() |
22 | |**How To Eliminate Loops From Your Python Code** |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://levelup.gitconnected.com/how-to-eliminate-loops-from-your-python-code-6dfb7c3578fa?sk=c7fc6bb617dd2e07dd20410ad7ff96e9) | [![Kaggle](https://img.shields.io/badge/Kaggle-035a7d?style=for-the-badge&logo=kaggle&logoColor=white)](https://www.kaggle.com/code/youssef19/how-to-eliminate-loops-from-your-python-code)| [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)]() |
23 | | **5 Tips to Write Efficient Python Functions** |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://medium.com/gitconnected/5-tips-to-write-efficient-python-functions-d9befdfa7778?sk=3f621008e6e44bf8fcf9e90da19edef1) | [![Kaggle](https://img.shields.io/badge/Kaggle-035a7d?style=for-the-badge&logo=kaggle&logoColor=white)](https://www.kaggle.com/code/youssef19/5-tips-to-write-efficient-python-functions)| [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)]() |
24 | | **How to Use Caching to Speed Up Your Python Code & LLM Application** |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://medium.com/gitconnected/how-to-use-caching-to-speed-up-your-python-code-llm-application-f385a5a10a0f) | [![Kaggle](https://img.shields.io/badge/Kaggle-035a7d?style=for-the-badge&logo=kaggle&logoColor=white)](https://www.kaggle.com/code/youssef19/python-caching-to-speed-up-your-code-llm-app)| [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)]() |
25 | 
26 | 
27 | 
28 | 
29 | ## Efficient Pandas ##  
30 | |Topic |Blog|Kaggle Notebook| Youtube Video |
31 | |-----|--------|----------|----------|
32 | |Best Practices To Use Pandas Efficiently As A Data Scientist |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://levelup.gitconnected.com/best-practices-to-use-pandas-efficiently-as-a-data-scientist-9198b3e7bb6d?sk=88e07bfb5fba1b3a208a1e7e01a5981c) | [![Kaggle](https://img.shields.io/badge/Kaggle-035a7d?style=for-the-badge&logo=kaggle&logoColor=white)]()| [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)]() |
33 | |Stop Looping Through Pandas DataFrames & Do This Instead |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://levelup.gitconnected.com/stop-looping-through-pandas-dataframes-do-this-instead-ddcb6009cbc1?sk=a81ea280c77aeb28afdbdacbef380c43) | [![Kaggle](https://img.shields.io/badge/Kaggle-035a7d?style=for-the-badge&logo=kaggle&logoColor=white)](https://www.kaggle.com/code/youssef19/stop-looping-through-pandas-dataframes)| [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)]() |
34 | |Selecting & Replacing Values In Pandas DataFrame  Effectively |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://levelup.gitconnected.com/selecting-replacing-values-in-pandas-dataframe-effectively-69c5cee9f526?sk=1abbd5bff836d0ddb445fd5c8bf6ea74) | [![Kaggle](https://img.shields.io/badge/Kaggle-035a7d?style=for-the-badge&logo=kaggle&logoColor=white)](https://www.kaggle.com/code/youssef19/selecting-replacing-values-in-pandas-effectively)| [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)]() |
35 | |How To Use .groupby() Effectively For Data Transformation As A Data Scientist |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://levelup.gitconnected.com/how-to-use-groupby-effectively-as-a-data-scientist-9e1d931e1619?sk=569f074940cf15c63837bd2dae0b714b) | [![Kaggle](https://img.shields.io/badge/Kaggle-035a7d?style=for-the-badge&logo=kaggle&logoColor=white)](https://www.kaggle.com/code/youssef19/how-to-use-groupby-effectively-as-a-data-scient)| [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)]() |
36 | |20 Pandas Functions for 80% of Your Data Science Tasks |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://levelup.gitconnected.com/20-pandas-functions-for-80-of-your-data-science-tasks-b610c8bfe63c?sk=3f73cdb37d52db86caaa0e7d52852c64) | [![Kaggle](https://img.shields.io/badge/Kaggle-035a7d?style=for-the-badge&logo=kaggle&logoColor=white)](https://www.kaggle.com/code/youssef19/20-pandas-functions-for-80-data-science-tasks)| [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)]() |
37 | |Data Exploration Becomes Easier & Better With Pandas Profiling |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://levelup.gitconnected.com/data-exploration-becomes-easier-better-with-pandas-profiling-2d527a612bef?sk=e052c35b0a34de49b18a28526513f754) | [![Kaggle](https://img.shields.io/badge/Kaggle-035a7d?style=for-the-badge&logo=kaggle&logoColor=white)]()| [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)]() |
38 | |Make Your Pandas Code 1000 Times Faster With This Trick |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://levelup.gitconnected.com/make-your-pandas-code-1000-times-faster-with-this-trick-5b3a1438598a?sk=8d2d6cd70914e6e509ae6f6ab0791212) | [![Kaggle](https://img.shields.io/badge/Kaggle-035a7d?style=for-the-badge&logo=kaggle&logoColor=white)](https://www.kaggle.com/code/youssef19/make-your-pandas-code-1000-times-faster)| [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)]() |
39 | |Maximizing Pandas Efficiency: Top 10 Mistakes to Steer Clear of in Your Code |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://medium.com/gitconnected/maximizing-pandas-efficiency-top-10-mistakes-to-steer-clear-of-in-your-code-8623aff053cd?sk=3c2d020271318a0a3b8bfe945ed49b4c) | [![Kaggle](https://img.shields.io/badge/Kaggle-035a7d?style=for-the-badge&logo=kaggle&logoColor=white)](https://www.kaggle.com/code/youssef19/top-10-pandas-mistakes-to-avoid)| [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)]() |
40 | 
41 | 
42 |   
43 | 
44 | 


--------------------------------------------------------------------------------
/Stop_Looping_Through_Pandas_DataFrames_&_Do_This Instead.ipynb:
--------------------------------------------------------------------------------
   1 | {
   2 |   "nbformat": 4,
   3 |   "nbformat_minor": 0,
   4 |   "metadata": {
   5 |     "colab": {
   6 |       "provenance": [],
   7 |       "authorship_tag": "ABX9TyPG/0UgH4QPGUoZduEMRpWs",
   8 |       "include_colab_link": true
   9 |     },
  10 |     "kernelspec": {
  11 |       "name": "python3",
  12 |       "display_name": "Python 3"
  13 |     },
  14 |     "language_info": {
  15 |       "name": "python"
  16 |     }
  17 |   },
  18 |   "cells": [
  19 |     {
  20 |       "cell_type": "markdown",
  21 |       "metadata": {
  22 |         "id": "view-in-github",
  23 |         "colab_type": "text"
  24 |       },
  25 |       "source": [
  26 |         "<a href=\"https://colab.research.google.com/github/youssefHosni/Efficient-Python-for-Data-Scientists/blob/main/Stop_Looping_Through_Pandas_DataFrames_%26_Do_This%C2%A0Instead.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
  27 |       ]
  28 |     },
  29 |     {
  30 |       "cell_type": "markdown",
  31 |       "source": [
  32 |         "# Stop Looping Through Pandas DataFrames & Do This Instead\n"
  33 |       ],
  34 |       "metadata": {
  35 |         "id": "Tki3VuqnrCTo"
  36 |       }
  37 |     },
  38 |     {
  39 |       "cell_type": "markdown",
  40 |       "source": [
  41 |         "Working with Pandas DataFrames can be a tedious and time-consuming task, especially when it comes to looping through them. If you're like most Python developers, you've probably spent a significant amount of time trying to figure out the most efficient way to loop through DataFrames. But what if there was an easier way? In this article, we'll show you better ways to work with Pandas DataFrames that don't require looping. We'll discuss the advantages of these approaches, as well as provide some practical examples to help you get started. So if you're ready to stop looping through Pandas DataFrames and do something better, read on!"
  42 |       ],
  43 |       "metadata": {
  44 |         "id": "fZn7EvAMrel6"
  45 |       }
  46 |     },
  47 |     {
  48 |       "cell_type": "markdown",
  49 |       "source": [
  50 |         "## Table of Contents:\n",
  51 |         "1. Why Do We Need Efficient Coding?\n",
  52 |         "2. Looping effectively Using the .iterrows()\n",
  53 |         "3. Looping Effectively Using .apply()\n",
  54 |         "4. Looping Effectively Using vectorization\n",
  55 |         "5. Summary of Best Practices"
  56 |       ],
  57 |       "metadata": {
  58 |         "id": "eAAS5oG7ri3e"
  59 |       }
  60 |     },
  61 |     {
  62 |       "cell_type": "markdown",
  63 |       "source": [
  64 |         "Throughout this article, we will use the Poker card game dataset. First let's load and explore the data:"
  65 |       ],
  66 |       "metadata": {
  67 |         "id": "jqQzluwOrqs1"
  68 |       }
  69 |     },
  70 |     {
  71 |       "cell_type": "code",
  72 |       "source": [
  73 |         "import pandas as pd\n",
  74 |         "import time\n",
  75 |         "poker_data = pd.read_csv('poker_hand.csv')\n",
  76 |         "poker_data.head()"
  77 |       ],
  78 |       "metadata": {
  79 |         "colab": {
  80 |           "base_uri": "https://localhost:8080/",
  81 |           "height": 206
  82 |         },
  83 |         "id": "JkSjVUtIrruF",
  84 |         "outputId": "a839ba09-6b8d-4039-b030-db72c1d7ab4c"
  85 |       },
  86 |       "execution_count": 1,
  87 |       "outputs": [
  88 |         {
  89 |           "output_type": "execute_result",
  90 |           "data": {
  91 |             "text/plain": [
  92 |               "   S1  R1  S2  R2  S3  R3  S4  R4  S5  R5  Class\n",
  93 |               "0   1  10   1  11   1  13   1  12   1   1      9\n",
  94 |               "1   2  11   2  13   2  10   2  12   2   1      9\n",
  95 |               "2   3  12   3  11   3  13   3  10   3   1      9\n",
  96 |               "3   4  10   4  11   4   1   4  13   4  12      9\n",
  97 |               "4   4   1   4  13   4  12   4  11   4  10      9"
  98 |             ],
  99 |             "text/html": [
 100 |               "\n",
 101 |               "  <div id=\"df-5b9e2534-2ca2-48fc-82e7-53629aecbb13\">\n",
 102 |               "    <div class=\"colab-df-container\">\n",
 103 |               "      <div>\n",
 104 |               "<style scoped>\n",
 105 |               "    .dataframe tbody tr th:only-of-type {\n",
 106 |               "        vertical-align: middle;\n",
 107 |               "    }\n",
 108 |               "\n",
 109 |               "    .dataframe tbody tr th {\n",
 110 |               "        vertical-align: top;\n",
 111 |               "    }\n",
 112 |               "\n",
 113 |               "    .dataframe thead th {\n",
 114 |               "        text-align: right;\n",
 115 |               "    }\n",
 116 |               "</style>\n",
 117 |               "<table border=\"1\" class=\"dataframe\">\n",
 118 |               "  <thead>\n",
 119 |               "    <tr style=\"text-align: right;\">\n",
 120 |               "      <th></th>\n",
 121 |               "      <th>S1</th>\n",
 122 |               "      <th>R1</th>\n",
 123 |               "      <th>S2</th>\n",
 124 |               "      <th>R2</th>\n",
 125 |               "      <th>S3</th>\n",
 126 |               "      <th>R3</th>\n",
 127 |               "      <th>S4</th>\n",
 128 |               "      <th>R4</th>\n",
 129 |               "      <th>S5</th>\n",
 130 |               "      <th>R5</th>\n",
 131 |               "      <th>Class</th>\n",
 132 |               "    </tr>\n",
 133 |               "  </thead>\n",
 134 |               "  <tbody>\n",
 135 |               "    <tr>\n",
 136 |               "      <th>0</th>\n",
 137 |               "      <td>1</td>\n",
 138 |               "      <td>10</td>\n",
 139 |               "      <td>1</td>\n",
 140 |               "      <td>11</td>\n",
 141 |               "      <td>1</td>\n",
 142 |               "      <td>13</td>\n",
 143 |               "      <td>1</td>\n",
 144 |               "      <td>12</td>\n",
 145 |               "      <td>1</td>\n",
 146 |               "      <td>1</td>\n",
 147 |               "      <td>9</td>\n",
 148 |               "    </tr>\n",
 149 |               "    <tr>\n",
 150 |               "      <th>1</th>\n",
 151 |               "      <td>2</td>\n",
 152 |               "      <td>11</td>\n",
 153 |               "      <td>2</td>\n",
 154 |               "      <td>13</td>\n",
 155 |               "      <td>2</td>\n",
 156 |               "      <td>10</td>\n",
 157 |               "      <td>2</td>\n",
 158 |               "      <td>12</td>\n",
 159 |               "      <td>2</td>\n",
 160 |               "      <td>1</td>\n",
 161 |               "      <td>9</td>\n",
 162 |               "    </tr>\n",
 163 |               "    <tr>\n",
 164 |               "      <th>2</th>\n",
 165 |               "      <td>3</td>\n",
 166 |               "      <td>12</td>\n",
 167 |               "      <td>3</td>\n",
 168 |               "      <td>11</td>\n",
 169 |               "      <td>3</td>\n",
 170 |               "      <td>13</td>\n",
 171 |               "      <td>3</td>\n",
 172 |               "      <td>10</td>\n",
 173 |               "      <td>3</td>\n",
 174 |               "      <td>1</td>\n",
 175 |               "      <td>9</td>\n",
 176 |               "    </tr>\n",
 177 |               "    <tr>\n",
 178 |               "      <th>3</th>\n",
 179 |               "      <td>4</td>\n",
 180 |               "      <td>10</td>\n",
 181 |               "      <td>4</td>\n",
 182 |               "      <td>11</td>\n",
 183 |               "      <td>4</td>\n",
 184 |               "      <td>1</td>\n",
 185 |               "      <td>4</td>\n",
 186 |               "      <td>13</td>\n",
 187 |               "      <td>4</td>\n",
 188 |               "      <td>12</td>\n",
 189 |               "      <td>9</td>\n",
 190 |               "    </tr>\n",
 191 |               "    <tr>\n",
 192 |               "      <th>4</th>\n",
 193 |               "      <td>4</td>\n",
 194 |               "      <td>1</td>\n",
 195 |               "      <td>4</td>\n",
 196 |               "      <td>13</td>\n",
 197 |               "      <td>4</td>\n",
 198 |               "      <td>12</td>\n",
 199 |               "      <td>4</td>\n",
 200 |               "      <td>11</td>\n",
 201 |               "      <td>4</td>\n",
 202 |               "      <td>10</td>\n",
 203 |               "      <td>9</td>\n",
 204 |               "    </tr>\n",
 205 |               "  </tbody>\n",
 206 |               "</table>\n",
 207 |               "</div>\n",
 208 |               "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-5b9e2534-2ca2-48fc-82e7-53629aecbb13')\"\n",
 209 |               "              title=\"Convert this dataframe to an interactive table.\"\n",
 210 |               "              style=\"display:none;\">\n",
 211 |               "        \n",
 212 |               "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
 213 |               "       width=\"24px\">\n",
 214 |               "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
 215 |               "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
 216 |               "  </svg>\n",
 217 |               "      </button>\n",
 218 |               "      \n",
 219 |               "  <style>\n",
 220 |               "    .colab-df-container {\n",
 221 |               "      display:flex;\n",
 222 |               "      flex-wrap:wrap;\n",
 223 |               "      gap: 12px;\n",
 224 |               "    }\n",
 225 |               "\n",
 226 |               "    .colab-df-convert {\n",
 227 |               "      background-color: #E8F0FE;\n",
 228 |               "      border: none;\n",
 229 |               "      border-radius: 50%;\n",
 230 |               "      cursor: pointer;\n",
 231 |               "      display: none;\n",
 232 |               "      fill: #1967D2;\n",
 233 |               "      height: 32px;\n",
 234 |               "      padding: 0 0 0 0;\n",
 235 |               "      width: 32px;\n",
 236 |               "    }\n",
 237 |               "\n",
 238 |               "    .colab-df-convert:hover {\n",
 239 |               "      background-color: #E2EBFA;\n",
 240 |               "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
 241 |               "      fill: #174EA6;\n",
 242 |               "    }\n",
 243 |               "\n",
 244 |               "    [theme=dark] .colab-df-convert {\n",
 245 |               "      background-color: #3B4455;\n",
 246 |               "      fill: #D2E3FC;\n",
 247 |               "    }\n",
 248 |               "\n",
 249 |               "    [theme=dark] .colab-df-convert:hover {\n",
 250 |               "      background-color: #434B5C;\n",
 251 |               "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
 252 |               "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
 253 |               "      fill: #FFFFFF;\n",
 254 |               "    }\n",
 255 |               "  </style>\n",
 256 |               "\n",
 257 |               "      <script>\n",
 258 |               "        const buttonEl =\n",
 259 |               "          document.querySelector('#df-5b9e2534-2ca2-48fc-82e7-53629aecbb13 button.colab-df-convert');\n",
 260 |               "        buttonEl.style.display =\n",
 261 |               "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
 262 |               "\n",
 263 |               "        async function convertToInteractive(key) {\n",
 264 |               "          const element = document.querySelector('#df-5b9e2534-2ca2-48fc-82e7-53629aecbb13');\n",
 265 |               "          const dataTable =\n",
 266 |               "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
 267 |               "                                                     [key], {});\n",
 268 |               "          if (!dataTable) return;\n",
 269 |               "\n",
 270 |               "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
 271 |               "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
 272 |               "            + ' to learn more about interactive tables.';\n",
 273 |               "          element.innerHTML = '';\n",
 274 |               "          dataTable['output_type'] = 'display_data';\n",
 275 |               "          await google.colab.output.renderOutput(dataTable, element);\n",
 276 |               "          const docLink = document.createElement('div');\n",
 277 |               "          docLink.innerHTML = docLinkHtml;\n",
 278 |               "          element.appendChild(docLink);\n",
 279 |               "        }\n",
 280 |               "      </script>\n",
 281 |               "    </div>\n",
 282 |               "  </div>\n",
 283 |               "  "
 284 |             ]
 285 |           },
 286 |           "metadata": {},
 287 |           "execution_count": 1
 288 |         }
 289 |       ]
 290 |     },
 291 |     {
 292 |       "cell_type": "markdown",
 293 |       "source": [
 294 |         "In each poker round, each player has five cards in hand, each one characterized by its symbol, which can be either hearts, diamonds, clubs, or spades, and its rank, which ranges from 1 to 13. The dataset consists of every possible combination of five cards one person can possess.\n",
 295 |         "* Sn: symbol of the n-th card where: 1 (Hearts), 2 (Diamonds), 3 (Clubs), 4 (Spades)\n",
 296 |         "* Rn: rank of the n-th card where: 1 (Ace), 2–10, 11 (Jack), 12 (Queen), 13 (King)"
 297 |       ],
 298 |       "metadata": {
 299 |         "id": "vFyaahTCrz4n"
 300 |       }
 301 |     },
 302 |     {
 303 |       "cell_type": "markdown",
 304 |       "source": [
 305 |         "## 1. Why do We need Efficient Coding?\n"
 306 |       ],
 307 |       "metadata": {
 308 |         "id": "YLPUOB6xyxdv"
 309 |       }
 310 |     },
 311 |     {
 312 |       "cell_type": "markdown",
 313 |       "source": [
 314 |         "Efficient code is code that executes faster and with lower computational meomery. In this article, we will use the time() function to measure the computational time. \n",
 315 |         "\n",
 316 |         "This function measures the current time so we will assign it to a variable before the code execution and after and then calculate the difference to know the computational time of the code. A simple example is shown in the code below:\n",
 317 |         "\n"
 318 |       ],
 319 |       "metadata": {
 320 |         "id": "PBLJceGKy1FR"
 321 |       }
 322 |     },
 323 |     {
 324 |       "cell_type": "code",
 325 |       "execution_count": null,
 326 |       "metadata": {
 327 |         "colab": {
 328 |           "base_uri": "https://localhost:8080/"
 329 |         },
 330 |         "id": "0ronfYdopjMo",
 331 |         "outputId": "7d2f0a38-b742-40ab-826f-6e8c9276cd41"
 332 |       },
 333 |       "outputs": [
 334 |         {
 335 |           "output_type": "stream",
 336 |           "name": "stdout",
 337 |           "text": [
 338 |             "Result calculated in 0.0001010894775390625 sec\n"
 339 |           ]
 340 |         }
 341 |       ],
 342 |       "source": [
 343 |         "# record time before execution\n",
 344 |         "start_time = time.time()\n",
 345 |         "# execute operation\n",
 346 |         "result = 5 + 2\n",
 347 |         "# record time after execution\n",
 348 |         "end_time = time.time()\n",
 349 |         "print(\"Result calculated in {} sec\".format(end_time - start_time))"
 350 |       ]
 351 |     },
 352 |     {
 353 |       "cell_type": "markdown",
 354 |       "source": [
 355 |         "Let's see some examples of how applying efficient code methods will improve the code runtime and decrease the computational time complexity: We will calculate the square of each number from zero, up to a million. At first, we will use a list comprehension to execute this operation, and then repeat the same procedure using a for-loop.\n",
 356 |         "\n",
 357 |         "First using list comprehension:\n",
 358 |         "\n",
 359 |         "\n"
 360 |       ],
 361 |       "metadata": {
 362 |         "id": "PMjqHZUYy6Im"
 363 |       }
 364 |     },
 365 |     {
 366 |       "cell_type": "code",
 367 |       "source": [
 368 |         "#using List comprehension \n",
 369 |         "list_comp_start_time = time.time()\n",
 370 |         "result = [i*i for i in range(0,1000000)]\n",
 371 |         "list_comp_end_time = time.time()\n",
 372 |         "print(\"Time using the list_comprehension: {} sec\".format(list_comp_end_time -\n",
 373 |         "list_comp_start_time))"
 374 |       ],
 375 |       "metadata": {
 376 |         "colab": {
 377 |           "base_uri": "https://localhost:8080/"
 378 |         },
 379 |         "id": "OeXxZOMRy4Y-",
 380 |         "outputId": "4f25b0ce-dff4-440c-f2d4-f7d703cc7229"
 381 |       },
 382 |       "execution_count": null,
 383 |       "outputs": [
 384 |         {
 385 |           "output_type": "stream",
 386 |           "name": "stdout",
 387 |           "text": [
 388 |             "Time using the list_comprehension: 0.12260246276855469 sec\n"
 389 |           ]
 390 |         }
 391 |       ]
 392 |     },
 393 |     {
 394 |       "cell_type": "markdown",
 395 |       "source": [
 396 |         "Now we will use for loop to execute the same operation:\n",
 397 |         "\n"
 398 |       ],
 399 |       "metadata": {
 400 |         "id": "pqoqyCQbzBTt"
 401 |       }
 402 |     },
 403 |     {
 404 |       "cell_type": "code",
 405 |       "source": [
 406 |         "# Using For loop\n",
 407 |         "for_loop_start_time= time.time()\n",
 408 |         "result=[]\n",
 409 |         "for i in range(0,1000000):\n",
 410 |         "  result.append(i*i)\n",
 411 |         "for_loop_end_time= time.time()\n",
 412 |         "print(\"Time using the for loop: {} sec\".format(for_loop_end_time - for_loop_start_time))"
 413 |       ],
 414 |       "metadata": {
 415 |         "colab": {
 416 |           "base_uri": "https://localhost:8080/"
 417 |         },
 418 |         "id": "pVJcr30ZzH9z",
 419 |         "outputId": "9a409a1b-3e26-454c-af52-e50dff54c165"
 420 |       },
 421 |       "execution_count": null,
 422 |       "outputs": [
 423 |         {
 424 |           "output_type": "stream",
 425 |           "name": "stdout",
 426 |           "text": [
 427 |             "Time using the for loop: 0.37175941467285156 sec\n"
 428 |           ]
 429 |         }
 430 |       ]
 431 |     },
 432 |     {
 433 |       "cell_type": "markdown",
 434 |       "source": [
 435 |         "We can see that there is a big difference between them, we can calculate the difference between them in percentage:\n",
 436 |         "\n"
 437 |       ],
 438 |       "metadata": {
 439 |         "id": "gGKbSvTiznVq"
 440 |       }
 441 |     },
 442 |     {
 443 |       "cell_type": "code",
 444 |       "source": [
 445 |         "list_comp_time = list_comp_end_time - list_comp_start_time\n",
 446 |         "for_loop_time = for_loop_end_time - for_loop_start_time\n",
 447 |         "print(\"Difference in time: {} %\".format((for_loop_time - list_comp_time)/\n",
 448 |         "list_comp_time*100))"
 449 |       ],
 450 |       "metadata": {
 451 |         "colab": {
 452 |           "base_uri": "https://localhost:8080/"
 453 |         },
 454 |         "id": "Ka7TiiU7zO4j",
 455 |         "outputId": "e36554c8-0761-4261-d906-5debaa21be6b"
 456 |       },
 457 |       "execution_count": null,
 458 |       "outputs": [
 459 |         {
 460 |           "output_type": "stream",
 461 |           "name": "stdout",
 462 |           "text": [
 463 |             "Difference in time: 203.22344778232394 %\n"
 464 |           ]
 465 |         }
 466 |       ]
 467 |     },
 468 |     {
 469 |       "cell_type": "markdown",
 470 |       "source": [
 471 |         "Here is another example to show the effect of writing efficient code. We would like to calculate the sum of all consecutive numbers from 1 to 1 million. There are two ways the first is to use brute force in which we will add one by one to a million.\n",
 472 |         "\n"
 473 |       ],
 474 |       "metadata": {
 475 |         "id": "8yWk4HrKzrjF"
 476 |       }
 477 |     },
 478 |     {
 479 |       "cell_type": "code",
 480 |       "source": [
 481 |         "def sum_formula(N):\n",
 482 |         "  return N*(N+1)/2\n",
 483 |         "  \n",
 484 |         "# Using the formula\n",
 485 |         "formula_start_time = time.time()\n",
 486 |         "formula_result = sum_formula(1000000)\n",
 487 |         "formula_end_time = time.time()\n",
 488 |         "\n",
 489 |         "print(\"Time using the formula: {} sec\".format(formula_end_time - formula_start_time))"
 490 |       ],
 491 |       "metadata": {
 492 |         "colab": {
 493 |           "base_uri": "https://localhost:8080/"
 494 |         },
 495 |         "id": "zglHUf7t0OPK",
 496 |         "outputId": "fc6007e5-5a75-4791-b9f6-0b9c6bc179b4"
 497 |       },
 498 |       "execution_count": null,
 499 |       "outputs": [
 500 |         {
 501 |           "output_type": "stream",
 502 |           "name": "stdout",
 503 |           "text": [
 504 |             "Time using the formula: 5.8650970458984375e-05 sec\n"
 505 |           ]
 506 |         }
 507 |       ]
 508 |     },
 509 |     {
 510 |       "cell_type": "markdown",
 511 |       "source": [
 512 |         "Another more efficient method is to use a formula to calculate it. When we want to calculate the sum of all the integer numbers from 1 up to a number, let’s say N, we can multiply N by N+1, and then divide by 2, and this will give us the result we want. This problem was actually given to some students back in Germany in the 19th century, and a bright student called Carl-Friedrich Gauss devised this formula to solve the problem in seconds."
 513 |       ],
 514 |       "metadata": {
 515 |         "id": "Iek6ZUDuzvJe"
 516 |       }
 517 |     },
 518 |     {
 519 |       "cell_type": "code",
 520 |       "source": [
 521 |         "def sum_brute_force(N):\n",
 522 |         "  res = 0\n",
 523 |         "  for i in range(1,N+1):\n",
 524 |         "    res+=i\n",
 525 |         "  return res\n",
 526 |         "\n",
 527 |         "# Using brute force\n",
 528 |         "bf_start_time = time.time()\n",
 529 |         "bf_result = sum_brute_force(1000000)\n",
 530 |         "bf_end_time = time.time()\n",
 531 |         "\n",
 532 |         "print(\"Time using brute force: {} sec\".format(bf_end_time - bf_start_time))"
 533 |       ],
 534 |       "metadata": {
 535 |         "colab": {
 536 |           "base_uri": "https://localhost:8080/"
 537 |         },
 538 |         "id": "kLHOBkEG012K",
 539 |         "outputId": "8c61e2ef-2335-49bb-eb3e-e599af4927ad"
 540 |       },
 541 |       "execution_count": null,
 542 |       "outputs": [
 543 |         {
 544 |           "output_type": "stream",
 545 |           "name": "stdout",
 546 |           "text": [
 547 |             "Time using brute force: 0.06304192543029785 sec\n"
 548 |           ]
 549 |         }
 550 |       ]
 551 |     },
 552 |     {
 553 |       "cell_type": "markdown",
 554 |       "source": [
 555 |         "After running both methods, we achieve a massive improvement with a magnitude of over 160,000%, which clearly demonstrates why we need efficient and optimized code, even for simple tasks.\n",
 556 |         "\n"
 557 |       ],
 558 |       "metadata": {
 559 |         "id": "pD2YNf5Lzxg_"
 560 |       }
 561 |     },
 562 |     {
 563 |       "cell_type": "markdown",
 564 |       "source": [
 565 |         "One of the most inefficient methods to write python code is to have many loops in your code, especially if you have large data. Since as a data scientist, you will need to iterate through your dataframe extensively, especially in the data preparation and exploration phase, it is important to be able to do this efficiently, as it will save you much time and give space for more important work. We will walk through three methods to make your loops much faster and more efficient:\n",
 566 |         "\n",
 567 |         "* Looping using the .iterrows() function\n",
 568 |         "* Looping using the .apply() function\n",
 569 |         "* Vectorization\n"
 570 |       ],
 571 |       "metadata": {
 572 |         "id": "T2oxGNcW2SNq"
 573 |       }
 574 |     },
 575 |     {
 576 |       "cell_type": "markdown",
 577 |       "source": [
 578 |         "## 2. Looping effectively using .iterrows()\n",
 579 |         "Before we talk about how to use the .iterrows() function to improve the looping process, let’s refresh the notion of a generator function.\n",
 580 |         "\n",
 581 |         "Generators are a simple tool to create iterators. Inside the body of a generator, instead of return statements, you will find only yield() statements. There can be just one, or several yield() statements. Here, we can see a generator, city_name_generator(), that produces four city names. We assign the generator to the variable city_names for simplicity.\n",
 582 |         "\n"
 583 |       ],
 584 |       "metadata": {
 585 |         "id": "SQjPBhCL2aDU"
 586 |       }
 587 |     },
 588 |     {
 589 |       "cell_type": "code",
 590 |       "source": [
 591 |         "def city_name_generator():\n",
 592 |         "  yield('New York')\n",
 593 |         "  yield('London')\n",
 594 |         "  yield('Tokyo')\n",
 595 |         "  yield('Sao Paolo')\n",
 596 |         "\n",
 597 |         "city_names = city_name_generator()\n"
 598 |       ],
 599 |       "metadata": {
 600 |         "id": "Gs6UTJjqqLT-"
 601 |       },
 602 |       "execution_count": null,
 603 |       "outputs": []
 604 |     },
 605 |     {
 606 |       "cell_type": "markdown",
 607 |       "source": [
 608 |         "To access the elements that the generator yields we can use Python’s next() function. Each time the next() command is used, the generator will produce the next value to yield, until there are no more values to yield. We here have 4 cities. Let’s run the next command four times and see what it returns:\n",
 609 |         "\n"
 610 |       ],
 611 |       "metadata": {
 612 |         "id": "x7-XgpHX2fH7"
 613 |       }
 614 |     },
 615 |     {
 616 |       "cell_type": "code",
 617 |       "source": [
 618 |         "next(city_names)"
 619 |       ],
 620 |       "metadata": {
 621 |         "colab": {
 622 |           "base_uri": "https://localhost:8080/",
 623 |           "height": 35
 624 |         },
 625 |         "id": "BdS9iCA2ESQu",
 626 |         "outputId": "8a25c76a-6fa2-47db-ddf9-977d7b57932f"
 627 |       },
 628 |       "execution_count": null,
 629 |       "outputs": [
 630 |         {
 631 |           "output_type": "execute_result",
 632 |           "data": {
 633 |             "text/plain": [
 634 |               "'New York'"
 635 |             ],
 636 |             "application/vnd.google.colaboratory.intrinsic+json": {
 637 |               "type": "string"
 638 |             }
 639 |           },
 640 |           "metadata": {},
 641 |           "execution_count": 2
 642 |         }
 643 |       ]
 644 |     },
 645 |     {
 646 |       "cell_type": "code",
 647 |       "source": [
 648 |         "next(city_names)"
 649 |       ],
 650 |       "metadata": {
 651 |         "colab": {
 652 |           "base_uri": "https://localhost:8080/",
 653 |           "height": 35
 654 |         },
 655 |         "id": "vRT4Eaz1GgUI",
 656 |         "outputId": "14fae336-6331-48e8-fb83-317d168014eb"
 657 |       },
 658 |       "execution_count": null,
 659 |       "outputs": [
 660 |         {
 661 |           "output_type": "execute_result",
 662 |           "data": {
 663 |             "text/plain": [
 664 |               "'London'"
 665 |             ],
 666 |             "application/vnd.google.colaboratory.intrinsic+json": {
 667 |               "type": "string"
 668 |             }
 669 |           },
 670 |           "metadata": {},
 671 |           "execution_count": 3
 672 |         }
 673 |       ]
 674 |     },
 675 |     {
 676 |       "cell_type": "code",
 677 |       "source": [
 678 |         "next(city_names)"
 679 |       ],
 680 |       "metadata": {
 681 |         "colab": {
 682 |           "base_uri": "https://localhost:8080/",
 683 |           "height": 35
 684 |         },
 685 |         "id": "hA6SCwAUGh79",
 686 |         "outputId": "0eaf8af0-f56f-4534-ffad-ce3c10590f4e"
 687 |       },
 688 |       "execution_count": null,
 689 |       "outputs": [
 690 |         {
 691 |           "output_type": "execute_result",
 692 |           "data": {
 693 |             "text/plain": [
 694 |               "'Tokyo'"
 695 |             ],
 696 |             "application/vnd.google.colaboratory.intrinsic+json": {
 697 |               "type": "string"
 698 |             }
 699 |           },
 700 |           "metadata": {},
 701 |           "execution_count": 4
 702 |         }
 703 |       ]
 704 |     },
 705 |     {
 706 |       "cell_type": "code",
 707 |       "source": [
 708 |         "next(city_names)"
 709 |       ],
 710 |       "metadata": {
 711 |         "colab": {
 712 |           "base_uri": "https://localhost:8080/",
 713 |           "height": 35
 714 |         },
 715 |         "id": "rKjG6CzhGjob",
 716 |         "outputId": "741af17b-186c-4ab0-d475-fe18e49cd5c9"
 717 |       },
 718 |       "execution_count": null,
 719 |       "outputs": [
 720 |         {
 721 |           "output_type": "execute_result",
 722 |           "data": {
 723 |             "text/plain": [
 724 |               "'Sao Paolo'"
 725 |             ],
 726 |             "application/vnd.google.colaboratory.intrinsic+json": {
 727 |               "type": "string"
 728 |             }
 729 |           },
 730 |           "metadata": {},
 731 |           "execution_count": 5
 732 |         }
 733 |       ]
 734 |     },
 735 |     {
 736 |       "cell_type": "markdown",
 737 |       "source": [
 738 |         "As we can see that every time we run the next() function it will print a new city name.\n",
 739 |         "\n"
 740 |       ],
 741 |       "metadata": {
 742 |         "id": "5FFZfu_K2oP8"
 743 |       }
 744 |     },
 745 |     {
 746 |       "cell_type": "markdown",
 747 |       "source": [
 748 |         "Let's go back to the .iterrows() function. The .iterrows() function is a property of every pandas DataFrame. When called, it produces a list with two elements. We will use this generator to iterate through each line of our poker DataFrame. The first element is the index of the row, while the second element contains a pandas Series of each feature of the row: the Symbol and the Rank for each of the five cards. It is very similar to the notion of the enumerate() function, which when applied to a list, returns each element along with its index.\n",
 749 |         "\n",
 750 |         "The most intuitive way to iterate through a Pandas DataFrame is to use the range() function, which is often called crude looping. This is shown with the code below:\n",
 751 |         "\n"
 752 |       ],
 753 |       "metadata": {
 754 |         "id": "Va18cE2T3Rvw"
 755 |       }
 756 |     },
 757 |     {
 758 |       "cell_type": "code",
 759 |       "source": [
 760 |         "start_time = time.time()\n",
 761 |         "for index in range(poker_data.shape[0]):\n",
 762 |         "  next\n",
 763 |         "print(\"Time using range(): {} sec\".format(time.time() - start_time))"
 764 |       ],
 765 |       "metadata": {
 766 |         "colab": {
 767 |           "base_uri": "https://localhost:8080/"
 768 |         },
 769 |         "id": "r55m_fxePy5N",
 770 |         "outputId": "2432444e-a802-4e92-e7d1-fd0146b578c3"
 771 |       },
 772 |       "execution_count": null,
 773 |       "outputs": [
 774 |         {
 775 |           "output_type": "stream",
 776 |           "name": "stdout",
 777 |           "text": [
 778 |             "Time using range(): 0.0036385059356689453 sec\n"
 779 |           ]
 780 |         }
 781 |       ]
 782 |     },
 783 |     {
 784 |       "cell_type": "markdown",
 785 |       "source": [
 786 |         "One smarter way to iterate through a pandas DataFrame is to use the **.iterrows()** function, which is optimized for this task. We simply define the **‘for’** loop with two iterators, one for the number of each row and the other for all the values.\n",
 787 |         "\n",
 788 |         "Inside the loop, the **next()** command indicates that the loop moves to the next value of the iterator, without actually doing something."
 789 |       ],
 790 |       "metadata": {
 791 |         "id": "gFaF1Z-13Way"
 792 |       }
 793 |     },
 794 |     {
 795 |       "cell_type": "code",
 796 |       "source": [
 797 |         "data_generator = poker_data.iterrows()\n",
 798 |         "start_time = time.time()\n",
 799 |         "for index, values in data_generator:\n",
 800 |         "  next\n",
 801 |         "print(\"Time using .iterrows(): {} sec\".format(time.time() - start_time))"
 802 |       ],
 803 |       "metadata": {
 804 |         "colab": {
 805 |           "base_uri": "https://localhost:8080/"
 806 |         },
 807 |         "id": "saqXt53lNPDQ",
 808 |         "outputId": "2ac4354c-25d1-4032-a9eb-df7d81e8a7d2"
 809 |       },
 810 |       "execution_count": null,
 811 |       "outputs": [
 812 |         {
 813 |           "output_type": "stream",
 814 |           "name": "stdout",
 815 |           "text": [
 816 |             "Time using .iterrows(): 1.2583379745483398 sec\n"
 817 |           ]
 818 |         }
 819 |       ]
 820 |     },
 821 |     {
 822 |       "cell_type": "markdown",
 823 |       "source": [
 824 |         "Comparing the two computational times we can also notice that the use of .iterrows() does not improve the speed of iterating through pandas DataFrame. It is very useful though when we need a cleaner way to use the values of each row while iterating through the dataset.\n"
 825 |       ],
 826 |       "metadata": {
 827 |         "id": "mD6O-NjL3hah"
 828 |       }
 829 |     },
 830 |     {
 831 |       "cell_type": "markdown",
 832 |       "source": [
 833 |         "## 3.Looping Effectively Using .apply()"
 834 |       ],
 835 |       "metadata": {
 836 |         "id": "0fqYrl6F3q-_"
 837 |       }
 838 |     },
 839 |     {
 840 |       "cell_type": "markdown",
 841 |       "source": [
 842 |         "Now we will use the **.apply()** function to be able to perform a specific task while iterating through a pandas DataFrame. The **.apply()** function does exactly what it says; it applies another function to the whole DataFrame.\n",
 843 |         "\n",
 844 |         "The syntax of the **.apply()** function is simple: we create a mapping, using a lambda function in this case, and then declare the function we want to apply to every cell. Here, we’re applying the square root function to every cell of the DataFrame. In terms of speed, it matches the speed of just using the NumPy sqrt() function over the whole DataFrame.\n"
 845 |       ],
 846 |       "metadata": {
 847 |         "id": "x6Y4ACXY3pDX"
 848 |       }
 849 |     },
 850 |     {
 851 |       "cell_type": "code",
 852 |       "source": [
 853 |         "data_sqrt = poker_data.apply(lambda x: np.sqrt(x), axis =0 )\n",
 854 |         "data_sqrt.head()"
 855 |       ],
 856 |       "metadata": {
 857 |         "colab": {
 858 |           "base_uri": "https://localhost:8080/",
 859 |           "height": 206
 860 |         },
 861 |         "id": "GVixY9NAP7JI",
 862 |         "outputId": "447e5a58-91e1-42d1-cfa4-21582df6b580"
 863 |       },
 864 |       "execution_count": null,
 865 |       "outputs": [
 866 |         {
 867 |           "output_type": "execute_result",
 868 |           "data": {
 869 |             "text/plain": [
 870 |               "         S1        R1        S2        R2        S3        R3        S4  \\\n",
 871 |               "0  1.000000  3.162278  1.000000  3.316625  1.000000  3.605551  1.000000   \n",
 872 |               "1  1.414214  3.316625  1.414214  3.605551  1.414214  3.162278  1.414214   \n",
 873 |               "2  1.732051  3.464102  1.732051  3.316625  1.732051  3.605551  1.732051   \n",
 874 |               "3  2.000000  3.162278  2.000000  3.316625  2.000000  1.000000  2.000000   \n",
 875 |               "4  2.000000  1.000000  2.000000  3.605551  2.000000  3.464102  2.000000   \n",
 876 |               "\n",
 877 |               "         R4        S5        R5  Class  \n",
 878 |               "0  3.464102  1.000000  1.000000    3.0  \n",
 879 |               "1  3.464102  1.414214  1.000000    3.0  \n",
 880 |               "2  3.162278  1.732051  1.000000    3.0  \n",
 881 |               "3  3.605551  2.000000  3.464102    3.0  \n",
 882 |               "4  3.316625  2.000000  3.162278    3.0  "
 883 |             ],
 884 |             "text/html": [
 885 |               "\n",
 886 |               "  <div id=\"df-68d3ecd1-faa7-407d-825a-d187213248a6\">\n",
 887 |               "    <div class=\"colab-df-container\">\n",
 888 |               "      <div>\n",
 889 |               "<style scoped>\n",
 890 |               "    .dataframe tbody tr th:only-of-type {\n",
 891 |               "        vertical-align: middle;\n",
 892 |               "    }\n",
 893 |               "\n",
 894 |               "    .dataframe tbody tr th {\n",
 895 |               "        vertical-align: top;\n",
 896 |               "    }\n",
 897 |               "\n",
 898 |               "    .dataframe thead th {\n",
 899 |               "        text-align: right;\n",
 900 |               "    }\n",
 901 |               "</style>\n",
 902 |               "<table border=\"1\" class=\"dataframe\">\n",
 903 |               "  <thead>\n",
 904 |               "    <tr style=\"text-align: right;\">\n",
 905 |               "      <th></th>\n",
 906 |               "      <th>S1</th>\n",
 907 |               "      <th>R1</th>\n",
 908 |               "      <th>S2</th>\n",
 909 |               "      <th>R2</th>\n",
 910 |               "      <th>S3</th>\n",
 911 |               "      <th>R3</th>\n",
 912 |               "      <th>S4</th>\n",
 913 |               "      <th>R4</th>\n",
 914 |               "      <th>S5</th>\n",
 915 |               "      <th>R5</th>\n",
 916 |               "      <th>Class</th>\n",
 917 |               "    </tr>\n",
 918 |               "  </thead>\n",
 919 |               "  <tbody>\n",
 920 |               "    <tr>\n",
 921 |               "      <th>0</th>\n",
 922 |               "      <td>1.000000</td>\n",
 923 |               "      <td>3.162278</td>\n",
 924 |               "      <td>1.000000</td>\n",
 925 |               "      <td>3.316625</td>\n",
 926 |               "      <td>1.000000</td>\n",
 927 |               "      <td>3.605551</td>\n",
 928 |               "      <td>1.000000</td>\n",
 929 |               "      <td>3.464102</td>\n",
 930 |               "      <td>1.000000</td>\n",
 931 |               "      <td>1.000000</td>\n",
 932 |               "      <td>3.0</td>\n",
 933 |               "    </tr>\n",
 934 |               "    <tr>\n",
 935 |               "      <th>1</th>\n",
 936 |               "      <td>1.414214</td>\n",
 937 |               "      <td>3.316625</td>\n",
 938 |               "      <td>1.414214</td>\n",
 939 |               "      <td>3.605551</td>\n",
 940 |               "      <td>1.414214</td>\n",
 941 |               "      <td>3.162278</td>\n",
 942 |               "      <td>1.414214</td>\n",
 943 |               "      <td>3.464102</td>\n",
 944 |               "      <td>1.414214</td>\n",
 945 |               "      <td>1.000000</td>\n",
 946 |               "      <td>3.0</td>\n",
 947 |               "    </tr>\n",
 948 |               "    <tr>\n",
 949 |               "      <th>2</th>\n",
 950 |               "      <td>1.732051</td>\n",
 951 |               "      <td>3.464102</td>\n",
 952 |               "      <td>1.732051</td>\n",
 953 |               "      <td>3.316625</td>\n",
 954 |               "      <td>1.732051</td>\n",
 955 |               "      <td>3.605551</td>\n",
 956 |               "      <td>1.732051</td>\n",
 957 |               "      <td>3.162278</td>\n",
 958 |               "      <td>1.732051</td>\n",
 959 |               "      <td>1.000000</td>\n",
 960 |               "      <td>3.0</td>\n",
 961 |               "    </tr>\n",
 962 |               "    <tr>\n",
 963 |               "      <th>3</th>\n",
 964 |               "      <td>2.000000</td>\n",
 965 |               "      <td>3.162278</td>\n",
 966 |               "      <td>2.000000</td>\n",
 967 |               "      <td>3.316625</td>\n",
 968 |               "      <td>2.000000</td>\n",
 969 |               "      <td>1.000000</td>\n",
 970 |               "      <td>2.000000</td>\n",
 971 |               "      <td>3.605551</td>\n",
 972 |               "      <td>2.000000</td>\n",
 973 |               "      <td>3.464102</td>\n",
 974 |               "      <td>3.0</td>\n",
 975 |               "    </tr>\n",
 976 |               "    <tr>\n",
 977 |               "      <th>4</th>\n",
 978 |               "      <td>2.000000</td>\n",
 979 |               "      <td>1.000000</td>\n",
 980 |               "      <td>2.000000</td>\n",
 981 |               "      <td>3.605551</td>\n",
 982 |               "      <td>2.000000</td>\n",
 983 |               "      <td>3.464102</td>\n",
 984 |               "      <td>2.000000</td>\n",
 985 |               "      <td>3.316625</td>\n",
 986 |               "      <td>2.000000</td>\n",
 987 |               "      <td>3.162278</td>\n",
 988 |               "      <td>3.0</td>\n",
 989 |               "    </tr>\n",
 990 |               "  </tbody>\n",
 991 |               "</table>\n",
 992 |               "</div>\n",
 993 |               "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-68d3ecd1-faa7-407d-825a-d187213248a6')\"\n",
 994 |               "              title=\"Convert this dataframe to an interactive table.\"\n",
 995 |               "              style=\"display:none;\">\n",
 996 |               "        \n",
 997 |               "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
 998 |               "       width=\"24px\">\n",
 999 |               "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
1000 |               "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
1001 |               "  </svg>\n",
1002 |               "      </button>\n",
1003 |               "      \n",
1004 |               "  <style>\n",
1005 |               "    .colab-df-container {\n",
1006 |               "      display:flex;\n",
1007 |               "      flex-wrap:wrap;\n",
1008 |               "      gap: 12px;\n",
1009 |               "    }\n",
1010 |               "\n",
1011 |               "    .colab-df-convert {\n",
1012 |               "      background-color: #E8F0FE;\n",
1013 |               "      border: none;\n",
1014 |               "      border-radius: 50%;\n",
1015 |               "      cursor: pointer;\n",
1016 |               "      display: none;\n",
1017 |               "      fill: #1967D2;\n",
1018 |               "      height: 32px;\n",
1019 |               "      padding: 0 0 0 0;\n",
1020 |               "      width: 32px;\n",
1021 |               "    }\n",
1022 |               "\n",
1023 |               "    .colab-df-convert:hover {\n",
1024 |               "      background-color: #E2EBFA;\n",
1025 |               "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
1026 |               "      fill: #174EA6;\n",
1027 |               "    }\n",
1028 |               "\n",
1029 |               "    [theme=dark] .colab-df-convert {\n",
1030 |               "      background-color: #3B4455;\n",
1031 |               "      fill: #D2E3FC;\n",
1032 |               "    }\n",
1033 |               "\n",
1034 |               "    [theme=dark] .colab-df-convert:hover {\n",
1035 |               "      background-color: #434B5C;\n",
1036 |               "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
1037 |               "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
1038 |               "      fill: #FFFFFF;\n",
1039 |               "    }\n",
1040 |               "  </style>\n",
1041 |               "\n",
1042 |               "      <script>\n",
1043 |               "        const buttonEl =\n",
1044 |               "          document.querySelector('#df-68d3ecd1-faa7-407d-825a-d187213248a6 button.colab-df-convert');\n",
1045 |               "        buttonEl.style.display =\n",
1046 |               "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
1047 |               "\n",
1048 |               "        async function convertToInteractive(key) {\n",
1049 |               "          const element = document.querySelector('#df-68d3ecd1-faa7-407d-825a-d187213248a6');\n",
1050 |               "          const dataTable =\n",
1051 |               "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
1052 |               "                                                     [key], {});\n",
1053 |               "          if (!dataTable) return;\n",
1054 |               "\n",
1055 |               "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
1056 |               "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
1057 |               "            + ' to learn more about interactive tables.';\n",
1058 |               "          element.innerHTML = '';\n",
1059 |               "          dataTable['output_type'] = 'display_data';\n",
1060 |               "          await google.colab.output.renderOutput(dataTable, element);\n",
1061 |               "          const docLink = document.createElement('div');\n",
1062 |               "          docLink.innerHTML = docLinkHtml;\n",
1063 |               "          element.appendChild(docLink);\n",
1064 |               "        }\n",
1065 |               "      </script>\n",
1066 |               "    </div>\n",
1067 |               "  </div>\n",
1068 |               "  "
1069 |             ]
1070 |           },
1071 |           "metadata": {},
1072 |           "execution_count": 3
1073 |         }
1074 |       ]
1075 |     },
1076 |     {
1077 |       "cell_type": "markdown",
1078 |       "source": [
1079 |         "This is a simple example since we would like to apply this function to the dataframe.\n",
1080 |         "\n",
1081 |         "But what happens when the function of interest is taking more than one cell as an input? For example, what if we want to calculate the sum of the rank of all the cards in each hand? In this case, we will use the .apply() function the same way as we did before, but we need to add ‘axis=1’ at the end of the line to specify we’re applying the function to each row.\n",
1082 |         "\n"
1083 |       ],
1084 |       "metadata": {
1085 |         "id": "dMg4d5SV4IL-"
1086 |       }
1087 |     },
1088 |     {
1089 |       "cell_type": "code",
1090 |       "source": [
1091 |         "apply_start_time = time.time()\n",
1092 |         "poker_data[['R1', 'R2', 'R3', 'R4', 'R5']].apply(lambda x: sum(x), axis=1)\n",
1093 |         "apply_end_time = time.time()\n",
1094 |         "apply_time = apply_end_time - apply_start_time\n",
1095 |         "print(\"Time using .apply(): {} sec\".format(apply_time))"
1096 |       ],
1097 |       "metadata": {
1098 |         "colab": {
1099 |           "base_uri": "https://localhost:8080/"
1100 |         },
1101 |         "id": "6dzHsYQdjtfa",
1102 |         "outputId": "4e4351c3-174a-4961-83da-1627f5db1e89"
1103 |       },
1104 |       "execution_count": null,
1105 |       "outputs": [
1106 |         {
1107 |           "output_type": "stream",
1108 |           "name": "stdout",
1109 |           "text": [
1110 |             "Time using .apply(): 0.2000577449798584 sec\n"
1111 |           ]
1112 |         }
1113 |       ]
1114 |     },
1115 |     {
1116 |       "cell_type": "markdown",
1117 |       "source": [
1118 |         "Then, we will use the .iterrows() function we saw previously, and compare their efficiency.\n",
1119 |         "\n"
1120 |       ],
1121 |       "metadata": {
1122 |         "id": "6Jg-EIJV4McK"
1123 |       }
1124 |     },
1125 |     {
1126 |       "cell_type": "code",
1127 |       "source": [
1128 |         "for_loop_start_time = time.time()\n",
1129 |         "for ind, value in poker_data.iterrows():\n",
1130 |         "  sum([value[1], value[3], value[5], value[7], value[9]])\n",
1131 |         "for_loop_end_time = time.time()\n",
1132 |         "\n",
1133 |         "for_loop_time = for_loop_end_time - for_loop_start_time\n",
1134 |         "print(\"Time using .iterrows(): {} sec\".format(for_loop_time))"
1135 |       ],
1136 |       "metadata": {
1137 |         "colab": {
1138 |           "base_uri": "https://localhost:8080/"
1139 |         },
1140 |         "id": "sOkXe8gxrXOX",
1141 |         "outputId": "2b901612-ffb1-438d-952c-04b9ce221c62"
1142 |       },
1143 |       "execution_count": null,
1144 |       "outputs": [
1145 |         {
1146 |           "output_type": "stream",
1147 |           "name": "stdout",
1148 |           "text": [
1149 |             "Time using .iterrows(): 1.1545953750610352 sec\n"
1150 |           ]
1151 |         }
1152 |       ]
1153 |     },
1154 |     {
1155 |       "cell_type": "markdown",
1156 |       "source": [
1157 |         "Using the .apply() function is significantly faster than the .iterrows() function, with a magnitude of around 400 percent, which is a massive improvement!\n",
1158 |         "\n"
1159 |       ],
1160 |       "metadata": {
1161 |         "id": "V3FUIUts4PBG"
1162 |       }
1163 |     },
1164 |     {
1165 |       "cell_type": "code",
1166 |       "source": [
1167 |         "print('The differnce: {} %'.format((for_loop_time - apply_time) / apply_time * 100))"
1168 |       ],
1169 |       "metadata": {
1170 |         "colab": {
1171 |           "base_uri": "https://localhost:8080/"
1172 |         },
1173 |         "id": "JGFa3H3ysxmE",
1174 |         "outputId": "566e5293-2da4-4f63-9b0a-8e065689326f"
1175 |       },
1176 |       "execution_count": null,
1177 |       "outputs": [
1178 |         {
1179 |           "output_type": "stream",
1180 |           "name": "stdout",
1181 |           "text": [
1182 |             "The differnce: 477.1310554246618 %\n"
1183 |           ]
1184 |         }
1185 |       ]
1186 |     },
1187 |     {
1188 |       "cell_type": "markdown",
1189 |       "source": [
1190 |         "As we did with rows, we can do exactly the same thing for the columns; apply one function to each column. By replacing the axis=1 with axis=0, we can apply the sum function on every column.\n",
1191 |         "\n"
1192 |       ],
1193 |       "metadata": {
1194 |         "id": "7FE-SES04VZc"
1195 |       }
1196 |     },
1197 |     {
1198 |       "cell_type": "code",
1199 |       "source": [
1200 |         "apply_start_time = time.time()\n",
1201 |         "poker_data[['R1', 'R2', 'R3', 'R4', 'R5']].apply(lambda x: sum(x), axis=0)\n",
1202 |         "apply_end_time = time.time()\n",
1203 |         "apply_time = apply_end_time - apply_start_time\n",
1204 |         "print(\"Time using .apply(): {} sec\".format(apply_time))"
1205 |       ],
1206 |       "metadata": {
1207 |         "id": "TqSssrebte0e",
1208 |         "colab": {
1209 |           "base_uri": "https://localhost:8080/"
1210 |         },
1211 |         "outputId": "9aa644fc-cb77-4e73-823b-fd749617722e"
1212 |       },
1213 |       "execution_count": null,
1214 |       "outputs": [
1215 |         {
1216 |           "output_type": "stream",
1217 |           "name": "stdout",
1218 |           "text": [
1219 |             "Time using .apply(): 0.021090030670166016 sec\n"
1220 |           ]
1221 |         }
1222 |       ]
1223 |     },
1224 |     {
1225 |       "cell_type": "markdown",
1226 |       "source": [
1227 |         "By comparing the **.apply()** function with the native panda's function for summing over rows, we can see that pandas’ native .sum() functions perform the same operation faster.\n",
1228 |         "\n"
1229 |       ],
1230 |       "metadata": {
1231 |         "id": "KybwcTHL4YNl"
1232 |       }
1233 |     },
1234 |     {
1235 |       "cell_type": "code",
1236 |       "source": [
1237 |         "pandas_start_time = time.time()\n",
1238 |         "poker_data[['R1', 'R1', 'R3', 'R4', 'R5']].sum(axis=0)\n",
1239 |         "pandas_end_time = time.time()\n",
1240 |         "pandas_time = pandas_end_time - pandas_start_time\n",
1241 |         "print(\"Time using pandas: {} sec\".format(pandas_time))"
1242 |       ],
1243 |       "metadata": {
1244 |         "colab": {
1245 |           "base_uri": "https://localhost:8080/"
1246 |         },
1247 |         "id": "c9N3WsQDP_kF",
1248 |         "outputId": "7a395b28-fb3f-4e58-8189-d24e9714fc62"
1249 |       },
1250 |       "execution_count": null,
1251 |       "outputs": [
1252 |         {
1253 |           "output_type": "stream",
1254 |           "name": "stdout",
1255 |           "text": [
1256 |             "Time using pandas: 0.0039751529693603516 sec\n"
1257 |           ]
1258 |         }
1259 |       ]
1260 |     },
1261 |     {
1262 |       "cell_type": "code",
1263 |       "source": [
1264 |         "print('The differnce: {} %'.format((apply_time - pandas_time) / pandas_time * 100))"
1265 |       ],
1266 |       "metadata": {
1267 |         "colab": {
1268 |           "base_uri": "https://localhost:8080/"
1269 |         },
1270 |         "id": "89czJRWaQC83",
1271 |         "outputId": "1e30f4ea-2d55-4035-9def-e2d587b4296d"
1272 |       },
1273 |       "execution_count": null,
1274 |       "outputs": [
1275 |         {
1276 |           "output_type": "stream",
1277 |           "name": "stdout",
1278 |           "text": [
1279 |             "The differnce: 430.54639237089907 %\n"
1280 |           ]
1281 |         }
1282 |       ]
1283 |     },
1284 |     {
1285 |       "cell_type": "markdown",
1286 |       "source": [
1287 |         "In conclusion, we observe that the .apply() function performs faster when we want to iterate through all the rows of a pandas DataFrame, but is slower when we perform the same operation through a column.\n",
1288 |         "\n"
1289 |       ],
1290 |       "metadata": {
1291 |         "id": "ul9vEJjD40cf"
1292 |       }
1293 |     },
1294 |     {
1295 |       "cell_type": "markdown",
1296 |       "source": [
1297 |         "## 4.Looping effectively using vectorization"
1298 |       ],
1299 |       "metadata": {
1300 |         "id": "5RjLUtkmUuhy"
1301 |       }
1302 |     },
1303 |     {
1304 |       "cell_type": "markdown",
1305 |       "source": [
1306 |         "To understand how we can reduce the amount of iteration performed by the function, recall that the fundamental units of Pandas, DataFrames, and Series, are both based on arrays. Pandas perform more efficiently when an operation is performed to a whole array than to each value separately or sequentially. This can be achieved through vectorization. Vectorization is the process of executing operations on entire arrays.\n",
1307 |         "\n",
1308 |         "In the code below we want to calculate the sum of the ranks of all the cards in each hand. In order to do that, we slice the poker dataset keeping only the columns that contain the ranks of each card. Then, we call the built-in .sum() property of the DataFrame, using the parameter axis = 1 to denote that we want the sum for each row. In the end, we print the sum of the first five rows of the data.\n",
1309 |         "\n"
1310 |       ],
1311 |       "metadata": {
1312 |         "id": "k1sJKs2d49tX"
1313 |       }
1314 |     },
1315 |     {
1316 |       "cell_type": "code",
1317 |       "source": [
1318 |         "start_time_vectorization = time.time()\n",
1319 |         "\n",
1320 |         "poker_data[['R1', 'R2', 'R3', 'R4', 'R5']].sum(axis=1)\n",
1321 |         "end_time_vectorization = time.time()\n",
1322 |         "\n",
1323 |         "vectorization_time = end_time_vectorization  - start_time_vectorization\n",
1324 |         "print(\"Time using pandas vectorization: {} sec\".format(vectorization_time))"
1325 |       ],
1326 |       "metadata": {
1327 |         "colab": {
1328 |           "base_uri": "https://localhost:8080/"
1329 |         },
1330 |         "id": "QsTG45mKQ-Mi",
1331 |         "outputId": "1acedf98-4a2a-4f9b-d04c-f4dc062baf52"
1332 |       },
1333 |       "execution_count": null,
1334 |       "outputs": [
1335 |         {
1336 |           "output_type": "stream",
1337 |           "name": "stdout",
1338 |           "text": [
1339 |             "Time using pandas vectorization: 0.009327411651611328 sec\n"
1340 |           ]
1341 |         }
1342 |       ]
1343 |     },
1344 |     {
1345 |       "cell_type": "markdown",
1346 |       "source": [
1347 |         "We saw previously various methods that perform functions applied to a DataFrame faster than simply iterating through all the rows of the DataFrame. Our goal is to find the most efficient method to perform this task.\n",
1348 |         "\n"
1349 |       ],
1350 |       "metadata": {
1351 |         "id": "ljv_-ySq7qRL"
1352 |       }
1353 |     },
1354 |     {
1355 |       "cell_type": "markdown",
1356 |       "source": [
1357 |         "Using .iterrows() to loop through the DataFrame:\n"
1358 |       ],
1359 |       "metadata": {
1360 |         "id": "3H338N5h7wgS"
1361 |       }
1362 |     },
1363 |     {
1364 |       "cell_type": "code",
1365 |       "source": [
1366 |         "data_generator = poker_data.iterrows()\n",
1367 |         "\n",
1368 |         "start_time_iterrows = time.time()\n",
1369 |         "\n",
1370 |         "for index, value in data_generator:\n",
1371 |         "  sum([value[1], value[3], value[5], value[7]])\n",
1372 |         "\n",
1373 |         "end_time_iterrows = time.time()\n",
1374 |         "iterrows_time = end_time_iterrows - start_time_iterrows\n",
1375 |         "print(\"Time using .iterrows() {} seconds \" .format(iterrows_time))"
1376 |       ],
1377 |       "metadata": {
1378 |         "colab": {
1379 |           "base_uri": "https://localhost:8080/"
1380 |         },
1381 |         "id": "kyivEyu_V1OO",
1382 |         "outputId": "816b1050-4228-42e2-a81c-164b75cab419"
1383 |       },
1384 |       "execution_count": null,
1385 |       "outputs": [
1386 |         {
1387 |           "output_type": "stream",
1388 |           "name": "stdout",
1389 |           "text": [
1390 |             "Time using .iterrows() 1.1502439975738525 seconds \n"
1391 |           ]
1392 |         }
1393 |       ]
1394 |     },
1395 |     {
1396 |       "cell_type": "markdown",
1397 |       "source": [
1398 |         "Using the .apply() mehtod\n"
1399 |       ],
1400 |       "metadata": {
1401 |         "id": "0FURW1ta71Mj"
1402 |       }
1403 |     },
1404 |     {
1405 |       "cell_type": "code",
1406 |       "source": [
1407 |         "start_time_apply = time.time()\n",
1408 |         "poker_data[['R1', 'R2', 'R3', 'R4', 'R5']].apply(lambda x: sum(x),axis=1)\n",
1409 |         "end_time_apply = time.time()\n",
1410 |         "\n",
1411 |         "apply_time = end_time_apply - start_time_apply\n",
1412 |         "\n",
1413 |         "print(\"Time using apply() {} seconds\"  .format(apply_time))"
1414 |       ],
1415 |       "metadata": {
1416 |         "colab": {
1417 |           "base_uri": "https://localhost:8080/"
1418 |         },
1419 |         "id": "M4rz9TNIWI1m",
1420 |         "outputId": "8434927d-9217-4c07-c08d-4ec2f97ba4db"
1421 |       },
1422 |       "execution_count": null,
1423 |       "outputs": [
1424 |         {
1425 |           "output_type": "stream",
1426 |           "name": "stdout",
1427 |           "text": [
1428 |             "Time using apply() 0.3497791290283203 seconds\n"
1429 |           ]
1430 |         }
1431 |       ]
1432 |     },
1433 |     {
1434 |       "cell_type": "markdown",
1435 |       "source": [
1436 |         "Comparing the time it takes to sum the ranks of all the cards in each hand using vectorization, the .iterrows() function, and the .apply() function, we can see that the vectorization method performs much better.\n",
1437 |         "\n",
1438 |         "We can also use another vectorization method to effectively iterate through the DataFrame which is using Numpy arrays to vectorize the DataFrame.\n",
1439 |         "\n",
1440 |         "The NumPy library, which defines itself as a “fundamental package for scientific computing in Python”, performs operations under the hood in optimized, pre-compiled C code. Similar to pandas working with arrays, NumPy operates on arrays called ndarrays. A major difference between Series and ndarrays is that ndarrays leave out many operations such as indexing, data type checking, etc. As a result, operations on NumPy arrays can be significantly faster than operations on pandas Series. NumPy arrays can be used in place of the pandas Series when the additional functionality offered by the pandas Series isn’t critical.\n",
1441 |         "\n",
1442 |         "For the problems we explore in this article, we could use NumPy ndarrays instead of the pandas series. The question at stake is whether this would be more efficient or not.\n",
1443 |         "\n",
1444 |         "Again, we will calculate the sum of the ranks of all the cards in each hand. We convert our rank arrays from pandas Series to NumPy arrays simply by using the .values method of pandas Series, which returns a pandas Series as a NumPy ndarray. As with vectorization on the series, passing the NumPy array directly into the function will lead pandas to apply the function to the entire vector.\n",
1445 |         "\n"
1446 |       ],
1447 |       "metadata": {
1448 |         "id": "WO9COAX876tv"
1449 |       }
1450 |     },
1451 |     {
1452 |       "cell_type": "code",
1453 |       "source": [
1454 |         "start_time = time.time()\n",
1455 |         "\n",
1456 |         "poker_data[['R1', 'R2', 'R3', 'R4', 'R5']].values.sum(axis=1)\n",
1457 |         "\n",
1458 |         "print(\"Time using NumPy vectorization: {} sec\" .format(time.time() - start_time))"
1459 |       ],
1460 |       "metadata": {
1461 |         "colab": {
1462 |           "base_uri": "https://localhost:8080/"
1463 |         },
1464 |         "id": "wJaJBbiI5Q5M",
1465 |         "outputId": "27ca2ac9-9a87-4a3d-8b0d-2f1ec772a7d4"
1466 |       },
1467 |       "execution_count": null,
1468 |       "outputs": [
1469 |         {
1470 |           "output_type": "stream",
1471 |           "name": "stdout",
1472 |           "text": [
1473 |             "Time using NumPy vectorization: 0.001745462417602539 sec\n"
1474 |           ]
1475 |         }
1476 |       ]
1477 |     },
1478 |     {
1479 |       "cell_type": "code",
1480 |       "source": [
1481 |         "start_time = time.time()\n",
1482 |         "poker_data[['R1', 'R2', 'R3', 'R4', 'R5']].sum(axis=1)\n",
1483 |         "print(\"Time using the pandas vectorization %s seconds\" % (time.time() - start_time))"
1484 |       ],
1485 |       "metadata": {
1486 |         "colab": {
1487 |           "base_uri": "https://localhost:8080/"
1488 |         },
1489 |         "id": "OVGS_FGb5eeV",
1490 |         "outputId": "e8635e7b-ef4a-42a5-b4b2-ddca0778cd56"
1491 |       },
1492 |       "execution_count": null,
1493 |       "outputs": [
1494 |         {
1495 |           "output_type": "stream",
1496 |           "name": "stdout",
1497 |           "text": [
1498 |             "Time using the  0.003729104995727539 seconds\n"
1499 |           ]
1500 |         }
1501 |       ]
1502 |     },
1503 |     {
1504 |       "cell_type": "markdown",
1505 |       "source": [
1506 |         "## 5. Summary of best practices for looping through DataFrame\n",
1507 |         "* Using **.iterrows()** does not improve the speed of iterating through the DataFrame but it is more efficient.\n",
1508 |         "* The **.apply()** function performs faster when we want to iterate through all the rows of a pandas DataFrame, but is slower when we perform the same operation through a column.\n",
1509 |         "* Vectorizing over the pandas series achieves the overwhelming majority of optimization needs for everyday calculations. However, if speed is of the highest priority, we can call in reinforcements in the form of the NumPy Python library."
1510 |       ],
1511 |       "metadata": {
1512 |         "id": "lXeC6MGF8MfY"
1513 |       }
1514 |     }
1515 |   ]
1516 | }


--------------------------------------------------------------------------------
/Write Efficient Python Code [Defining and Measuring Code Efficiency].ipynb:
--------------------------------------------------------------------------------
   1 | {
   2 |  "cells": [
   3 |   {
   4 |    "cell_type": "markdown",
   5 |    "metadata": {},
   6 |    "source": [
   7 |     "# Write Efficient Python Code: Defining & Measuring Code Efficiency"
   8 |    ]
   9 |   },
  10 |   {
  11 |    "cell_type": "markdown",
  12 |    "metadata": {},
  13 |    "source": [
  14 |     "As a data scientist, you should spend most of your time working on gaining insights from data not waiting for your code to finish running. Writing efficient Python code can help reduce runtime and save computational resources, ultimately freeing you up to do the things that have more impact."
  15 |    ]
  16 |   },
  17 |   {
  18 |    "cell_type": "markdown",
  19 |    "metadata": {},
  20 |    "source": [
  21 |     "Table of Content: \n",
  22 |     "1. Define Efficiency\n",
  23 |     "   \n",
  24 |     "   1.1. What is meant by efficient code?\n",
  25 |     "   \n",
  26 |     "   1.2. Python Standard Libraries\n",
  27 |     "   \n",
  28 |     "\n",
  29 |     "2. Python Code Timing and Profiling\n",
  30 |     "   \n",
  31 |     "   2.1. Python Runtime Investigation\n",
  32 |     "   \n",
  33 |     "   2.2. Code profiling for runtime\n",
  34 |     "   \n",
  35 |     "   2.3. Code profiling for memory use"
  36 |    ]
  37 |   },
  38 |   {
  39 |    "cell_type": "markdown",
  40 |    "metadata": {},
  41 |    "source": [
  42 |     "## 1. Define Efficiency "
  43 |    ]
  44 |   },
  45 |   {
  46 |    "cell_type": "markdown",
  47 |    "metadata": {},
  48 |    "source": [
  49 |     "### 1.1. What is meant by efficient code?"
  50 |    ]
  51 |   },
  52 |   {
  53 |    "cell_type": "markdown",
  54 |    "metadata": {},
  55 |    "source": [
  56 |     "Efficient refers to code that satisfies two key concepts. First, efficient code is fast and has a small latency between execution and returning a result. Second, efficient code allocates resources skillfully and isn't subjected to unnecessary overhead. Although in general your definition of fast runtime and small memory usage may differ depending on the task at hand, the goal of writing efficient code is still to reduce both latency and overhead."
  57 |    ]
  58 |   },
  59 |   {
  60 |    "cell_type": "markdown",
  61 |    "metadata": {},
  62 |    "source": [
  63 |     "Python is a language that prides itself on code readability, and thus, it comes with its own set of idioms and best practices. Writing Python code the way it was intended is often referred to as Pythonic code. This means the code that you write follows the best practices and guiding principles of Python. Pythonic code tends to be less verbose and easier to interpret. Although Python supports code that doesn't follow its guiding principles, this type of code tends to run slower."
  64 |    ]
  65 |   },
  66 |   {
  67 |    "cell_type": "markdown",
  68 |    "metadata": {},
  69 |    "source": [
  70 |     "As an example, look at the non-Pythonic below. Not only is this code more verbose than the Pythonic version, but it also takes longer to run. We'll take a closer look at why this is the case later on in the course, but for now, the main takeaway here is that Pythonic code is efficient code!"
  71 |    ]
  72 |   },
  73 |   {
  74 |    "cell_type": "code",
  75 |    "execution_count": 1,
  76 |    "metadata": {},
  77 |    "outputs": [],
  78 |    "source": [
  79 |     "numbers = [1,2,3,4,5]\n",
  80 |     "\n",
  81 |     "# Non-Pythonic \n",
  82 |     "doubled_numbers = []\n",
  83 |     "\n",
  84 |     "for i in range(len(numbers)):\n",
  85 |     "    doubled_numbers.append(numbers[i]*2)\n",
  86 |     "    \n",
  87 |     "# pythonic \n",
  88 |     "\n",
  89 |     "double_numbers = [x*2 for x in numbers ]"
  90 |    ]
  91 |   },
  92 |   {
  93 |    "cell_type": "markdown",
  94 |    "metadata": {},
  95 |    "source": [
  96 |     "### 1.2. Python Standard Libraries"
  97 |    ]
  98 |   },
  99 |   {
 100 |    "cell_type": "markdown",
 101 |    "metadata": {},
 102 |    "source": [
 103 |     "Python Standard Libraries are the Built-in components and libraries of python. These libraries come with every Python installation and are commonly cited as one of Python's greatest strengths. Python has a number of built-in types."
 104 |    ]
 105 |   },
 106 |   {
 107 |    "cell_type": "markdown",
 108 |    "metadata": {},
 109 |    "source": [
 110 |     "Let's start exploring some of the mentioned functions:\n",
 111 |     "* range(): This is a handy tool whenever we want to create a sequence of numbers. Suppose we wanted to create a list of integers from zero to ten. We could explicitly type out each integer, but that is not very efficient. Instead, we can use range to accomplish this task. We can provide a range with a start and stop value to create this sequence. Or, we can provide just a stop value assuming that we want our sequence to start at zero. Notice that the stop value is exclusive, or up to but not including this value. Also note the range function returns a range object, which we can convert into a list and print. The range function can also accept a start, stop, and step value (in that order)."
 112 |    ]
 113 |   },
 114 |   {
 115 |    "cell_type": "code",
 116 |    "execution_count": 2,
 117 |    "metadata": {},
 118 |    "outputs": [
 119 |     {
 120 |      "name": "stdout",
 121 |      "output_type": "stream",
 122 |      "text": [
 123 |       "[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]\n",
 124 |       "[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]\n",
 125 |       "[2, 4, 6, 8, 10]\n"
 126 |      ]
 127 |     }
 128 |    ],
 129 |    "source": [
 130 |     "# range(start,stop)\n",
 131 |     "nums = range(0,11)\n",
 132 |     "nums_list = list(nums)\n",
 133 |     "print(nums_list)\n",
 134 |     "\n",
 135 |     "# range(stop)\n",
 136 |     "nums = range(11)\n",
 137 |     "nums_list = list(nums)\n",
 138 |     "print(nums_list)\n",
 139 |     "\n",
 140 |     "# Using range() with a step value\n",
 141 |     "\n",
 142 |     "even_nums = range(2, 11, 2)\n",
 143 |     "even_nums_list = list(even_nums)\n",
 144 |     "print(even_nums_list)"
 145 |    ]
 146 |   },
 147 |   {
 148 |    "cell_type": "markdown",
 149 |    "metadata": {},
 150 |    "source": [
 151 |     "**enumerate():** Another useful built-in function is enumerate. enumerate creates an index item pair for each item in the object provided. For example, calling enumerate on the list letters produces a sequence of indexed values. Similar to the range function, enumerate returns an enumerate object, which can also be converted into a list and printed."
 152 |    ]
 153 |   },
 154 |   {
 155 |    "cell_type": "code",
 156 |    "execution_count": 2,
 157 |    "metadata": {
 158 |     "scrolled": true
 159 |    },
 160 |    "outputs": [
 161 |     {
 162 |      "name": "stdout",
 163 |      "output_type": "stream",
 164 |      "text": [
 165 |       "[(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd')]\n"
 166 |      ]
 167 |     }
 168 |    ],
 169 |    "source": [
 170 |     "# Creates an indexed list of objects using enumerate\n",
 171 |     "letters = ['a', 'b', 'c', 'd' ]\n",
 172 |     "indexed_letters = enumerate(letters)\n",
 173 |     "indexed_letters_list = list(indexed_letters)\n",
 174 |     "# print(indexed_letters_list)"
 175 |    ]
 176 |   },
 177 |   {
 178 |    "cell_type": "markdown",
 179 |    "metadata": {},
 180 |    "source": [
 181 |     "We can also specify the starting index of enumerate with the keyword argument start. Here, we tell enumerate to start the index at five by passing start equals five into the function call."
 182 |    ]
 183 |   },
 184 |   {
 185 |    "cell_type": "code",
 186 |    "execution_count": 3,
 187 |    "metadata": {},
 188 |    "outputs": [
 189 |     {
 190 |      "name": "stdout",
 191 |      "output_type": "stream",
 192 |      "text": [
 193 |       "[(5, 'a'), (6, 'b'), (7, 'c'), (8, 'd')]\n"
 194 |      ]
 195 |     }
 196 |    ],
 197 |    "source": [
 198 |     "#specify a start value\n",
 199 |     "letters = ['a', 'b', 'c', 'd' ]\n",
 200 |     "indexed_letters2 = enumerate(letters, start=5)\n",
 201 |     "indexed_letters2_list = list(indexed_letters2)\n",
 202 |     "print(indexed_letters2_list)"
 203 |    ]
 204 |   },
 205 |   {
 206 |    "cell_type": "markdown",
 207 |    "metadata": {},
 208 |    "source": [
 209 |     "**map():** The last notable built-in function we'll cover is the map() function. map applies a function to each element in an object. Notice that the map function takes two arguments; first, the function you'd like to apply, and second, the object you'd like to apply that function on. Here, we use a map to apply the built-in function round to each element of the nums list."
 210 |    ]
 211 |   },
 212 |   {
 213 |    "cell_type": "code",
 214 |    "execution_count": 4,
 215 |    "metadata": {},
 216 |    "outputs": [
 217 |     {
 218 |      "name": "stdout",
 219 |      "output_type": "stream",
 220 |      "text": [
 221 |       "[2, 2, 3, 5, 5]\n"
 222 |      ]
 223 |     }
 224 |    ],
 225 |    "source": [
 226 |     "nums = [1.5, 2.3, 3.4, 4.6, 5.0]\n",
 227 |     "rnd_nums = map(round, nums)\n",
 228 |     "print(list(rnd_nums))"
 229 |    ]
 230 |   },
 231 |   {
 232 |    "cell_type": "markdown",
 233 |    "metadata": {},
 234 |    "source": [
 235 |     "The map function can also be used with a lambda, or, an anonymous function. Notice here, that we can use the map function and a lambda expression to apply a function, which we've defined on the fly, to our original list nums. The map function provides a quick and clean way to apply a function to an object iteratively without writing a for loop."
 236 |    ]
 237 |   },
 238 |   {
 239 |    "cell_type": "code",
 240 |    "execution_count": 6,
 241 |    "metadata": {},
 242 |    "outputs": [
 243 |     {
 244 |      "name": "stdout",
 245 |      "output_type": "stream",
 246 |      "text": [
 247 |       "[1, 4, 9, 16, 25]\n"
 248 |      ]
 249 |     }
 250 |    ],
 251 |    "source": [
 252 |     "# map() with lambda \n",
 253 |     "nums = [1, 2, 3, 4, 5]\n",
 254 |     "sqrd_nums = map(lambda x: x ** 2, nums)\n",
 255 |     "print(list(sqrd_nums))"
 256 |    ]
 257 |   },
 258 |   {
 259 |    "cell_type": "markdown",
 260 |    "metadata": {},
 261 |    "source": [
 262 |     "NumPy, or Numerical Python, is an invaluable Python package for Data Scientists. It is the fundamental package for scientific computing in Python and provides a number of benefits for writing efficient code.\n",
 263 |     "\n",
 264 |     "**NumPy arrays** provide a fast and memory-efficient alternative to Python lists. Typically, we import NumPy as np and use np dot array to create a NumPy array."
 265 |    ]
 266 |   },
 267 |   {
 268 |    "cell_type": "code",
 269 |    "execution_count": 7,
 270 |    "metadata": {},
 271 |    "outputs": [
 272 |     {
 273 |      "name": "stdout",
 274 |      "output_type": "stream",
 275 |      "text": [
 276 |       "[0, 1, 2, 3, 4]\n",
 277 |       "[0 1 2 3 4]\n"
 278 |      ]
 279 |     }
 280 |    ],
 281 |    "source": [
 282 |     "# python list \n",
 283 |     "nums_list = list(range(5))\n",
 284 |     "print(nums_list)\n",
 285 |     "\n",
 286 |     "# using numpyu alternative to python lists\n",
 287 |     "import numpy as np\n",
 288 |     "nums_np = np.array(range(5))\n",
 289 |     "print(nums_np)"
 290 |    ]
 291 |   },
 292 |   {
 293 |    "cell_type": "markdown",
 294 |    "metadata": {},
 295 |    "source": [
 296 |     "NumPy arrays are homogeneous, which means that they must contain elements of the same type. We can see the type of each element using the .dtype method. Suppose we created an array using a mixture of types. Here, we create the array nums_np_floats using the integers [1,3] and a float [2.5]. Can you spot the difference in the output? The integers now have a proceeding dot in the array. That's because NumPy converted the integers to floats to retain that array's homogeneous nature. Using .dtype, we can verify that the elements in this array are floats."
 297 |    ]
 298 |   },
 299 |   {
 300 |    "cell_type": "code",
 301 |    "execution_count": 9,
 302 |    "metadata": {},
 303 |    "outputs": [
 304 |     {
 305 |      "name": "stdout",
 306 |      "output_type": "stream",
 307 |      "text": [
 308 |       "integer numpy array [1 2 3]\n",
 309 |       "int32\n",
 310 |       "float numpy array [1.  2.5 3. ]\n",
 311 |       "float64\n"
 312 |      ]
 313 |     }
 314 |    ],
 315 |    "source": [
 316 |     "# NumPy array homogeneity\n",
 317 |     "nums_np_ints = np.array([1, 2, 3]) \n",
 318 |     "print('integer numpy array',nums_np_ints)\n",
 319 |     "print(nums_np_ints.dtype)\n",
 320 |     "\n",
 321 |     "nums_np_floats = np.array([1, 2.5, 3])\n",
 322 |     "print('float numpy array',nums_np_floats)\n",
 323 |     "print(nums_np_floats.dtype)"
 324 |    ]
 325 |   },
 326 |   {
 327 |    "cell_type": "markdown",
 328 |    "metadata": {},
 329 |    "source": [
 330 |     "When analyzing data, you'll often want to perform operations over entire collections of values quickly. Say, for example, you'd like to square each number within a list of numbers. It'd be nice if we could simply square the list, and get a list of squared values returned. Unfortunately, Python lists don't support these types of calculations."
 331 |    ]
 332 |   },
 333 |   {
 334 |    "cell_type": "code",
 335 |    "execution_count": 10,
 336 |    "metadata": {},
 337 |    "outputs": [
 338 |     {
 339 |      "ename": "TypeError",
 340 |      "evalue": "unsupported operand type(s) for ** or pow(): 'list' and 'int'",
 341 |      "output_type": "error",
 342 |      "traceback": [
 343 |       "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
 344 |       "\u001b[1;31mTypeError\u001b[0m                                 Traceback (most recent call last)",
 345 |       "\u001b[1;32m<ipython-input-10-14d223ec0fdb>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m\u001b[0m\n\u001b[0;32m      1\u001b[0m \u001b[1;31m# Python lists don't support broadcasting\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m      2\u001b[0m \u001b[0mnums\u001b[0m \u001b[1;33m=\u001b[0m \u001b[1;33m[\u001b[0m\u001b[1;33m-\u001b[0m\u001b[1;36m2\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;33m-\u001b[0m\u001b[1;36m1\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;36m0\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;36m1\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;36m2\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 3\u001b[1;33m \u001b[0mnums\u001b[0m \u001b[1;33m**\u001b[0m \u001b[1;36m2\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m",
 346 |       "\u001b[1;31mTypeError\u001b[0m: unsupported operand type(s) for ** or pow(): 'list' and 'int'"
 347 |      ]
 348 |     }
 349 |    ],
 350 |    "source": [
 351 |     "# Python lists don't support broadcasting\n",
 352 |     "nums = [-2, -1, 0, 1, 2]\n",
 353 |     "nums ** 2"
 354 |    ]
 355 |   },
 356 |   {
 357 |    "cell_type": "markdown",
 358 |    "metadata": {},
 359 |    "source": [
 360 |     "We could square the values using a list by writing a for loop or using a list comprehension as shown in the code below. But neither of these approaches is the most efficient way of doing this. Here lies the second advantage of NumPy arrays - their broadcasting functionality. NumPy arrays vectorize operations, so they are performed on all elements of an object at once. This allows us to efficiently perform calculations over entire arrays. Let's compare the computational time using these three approaches in the following code:"
 361 |    ]
 362 |   },
 363 |   {
 364 |    "cell_type": "code",
 365 |    "execution_count": 24,
 366 |    "metadata": {},
 367 |    "outputs": [
 368 |     {
 369 |      "name": "stdout",
 370 |      "output_type": "stream",
 371 |      "text": [
 372 |       "Execution time using for loops over list: 0.0020008087158203125 seconds\n",
 373 |       "Execution time using list comprehension: 0.0020008087158203125 seconds\n",
 374 |       "Execution time using numpy array broadcasting: 0.0010006427764892578 seconds\n"
 375 |      ]
 376 |     }
 377 |    ],
 378 |    "source": [
 379 |     "import time\n",
 380 |     "\n",
 381 |     "# define numerical list \n",
 382 |     "nums = range(0,1000)\n",
 383 |     "nums = list(nums)\n",
 384 |     "\n",
 385 |     "# For loop (inefficient option)\n",
 386 |     "# get the start time\n",
 387 |     "st = time.time()\n",
 388 |     "\n",
 389 |     "sqrd_nums = []\n",
 390 |     "for num in nums:\n",
 391 |     "    sqrd_nums.append(num ** 2)\n",
 392 |     "#print(sqrd_nums)\n",
 393 |     "\n",
 394 |     "# get the end time\n",
 395 |     "et = time.time()\n",
 396 |     "\n",
 397 |     "# get the execution time\n",
 398 |     "elapsed_time = et - st\n",
 399 |     "print('Execution time using for loops over list:', elapsed_time, 'seconds')\n",
 400 |     "\n",
 401 |     "\n",
 402 |     "# List comprehension (better option but not best)\n",
 403 |     "# get the start time\n",
 404 |     "st = time.time()\n",
 405 |     "\n",
 406 |     "sqrd_nums = [num ** 2 for num in nums]\n",
 407 |     "#print(sqrd_nums)\n",
 408 |     "\n",
 409 |     "# get the end time\n",
 410 |     "et = time.time()\n",
 411 |     "print('Execution time using list comprehension:', elapsed_time, 'seconds')\n",
 412 |     "\n",
 413 |     "\n",
 414 |     "# using numpy array broadcasting\n",
 415 |     "\n",
 416 |     "# define the numpy array \n",
 417 |     "nums_np = np.arange(0,1000)\n",
 418 |     "# get the start time\n",
 419 |     "st = time.time()\n",
 420 |     "\n",
 421 |     "nums_np ** 2\n",
 422 |     "\n",
 423 |     "# get the end time\n",
 424 |     "et = time.time()\n",
 425 |     "\n",
 426 |     "# get the execution time\n",
 427 |     "elapsed_time = et - st\n",
 428 |     "print('Execution time using numpy array broadcasting:', elapsed_time, 'seconds')\n"
 429 |    ]
 430 |   },
 431 |   {
 432 |    "cell_type": "markdown",
 433 |    "metadata": {},
 434 |    "source": [
 435 |     "We can see that the first two approaches have the same time complexity while using NumPy in the third approach has decreased the computational time to half."
 436 |    ]
 437 |   },
 438 |   {
 439 |    "cell_type": "markdown",
 440 |    "metadata": {},
 441 |    "source": [
 442 |     "Another advantage of NumPy arrays is their indexing capabilities. When comparing basic indexing between a one-dimensional array and a list, the capabilities are identical. When using two-dimensional arrays and lists, the advantages of arrays are clear. To return the second item of the first row in our two-dimensional object, the array syntax is [0,1]. The analogous list syntax is a bit more verbose as you have to surround both the zero and one with square brackets [0][1]. To return the first column values in the 2-d object, the array syntax is [:,0]. Lists don't support this type of syntax, so we must use a list comprehension to return columns."
 443 |    ]
 444 |   },
 445 |   {
 446 |    "cell_type": "code",
 447 |    "execution_count": 27,
 448 |    "metadata": {},
 449 |    "outputs": [
 450 |     {
 451 |      "name": "stdout",
 452 |      "output_type": "stream",
 453 |      "text": [
 454 |       "2\n",
 455 |       "2\n",
 456 |       "[1, 4]\n",
 457 |       "[1 4]\n"
 458 |      ]
 459 |     }
 460 |    ],
 461 |    "source": [
 462 |     "#2-D list\n",
 463 |     "nums2 = [ [1, 2, 3],\n",
 464 |     "        [4, 5, 6] ]\n",
 465 |     "\n",
 466 |     "\n",
 467 |     "# 2-D array\n",
 468 |     "nums2_np = np.array(nums2)\n",
 469 |     "\n",
 470 |     "# printing the second item of the first row \n",
 471 |     "print(nums2[0][1])\n",
 472 |     "print(nums2_np[0,1])\n",
 473 |     "\n",
 474 |     "# printing the first row values \n",
 475 |     "\n",
 476 |     "print([row[0] for row in nums2])\n",
 477 |     "print(nums2_np[:,0])"
 478 |    ]
 479 |   },
 480 |   {
 481 |    "cell_type": "markdown",
 482 |    "metadata": {},
 483 |    "source": [
 484 |     "NumPy arrays also have a special technique called boolean indexing. Suppose we wanted to gather only positive numbers from the sequence listed here. With an array, we can create a boolean mask using a simple inequality. Indexing the array is as simple as enclosing this inequality in square brackets. However, to do this using a list, we need to write a for loop to filter the list or use a list comprehension. In either case, using a NumPy array to the index is less verbose and has a faster runtime."
 485 |    ]
 486 |   },
 487 |   {
 488 |    "cell_type": "code",
 489 |    "execution_count": 46,
 490 |    "metadata": {},
 491 |    "outputs": [
 492 |     {
 493 |      "name": "stdout",
 494 |      "output_type": "stream",
 495 |      "text": [
 496 |       "[1 2]\n",
 497 |       "[1, 2]\n",
 498 |       "[1, 2]\n"
 499 |      ]
 500 |     }
 501 |    ],
 502 |    "source": [
 503 |     "nums = [-2, -1, 0, 1, 2]\n",
 504 |     "nums_np = np.array(nums)\n",
 505 |     "\n",
 506 |     "# Boolean indexing\n",
 507 |     "print(nums_np[nums_np > 0])\n",
 508 |     "\n",
 509 |     "# No boolean indexing for lists\n",
 510 |     "# For loop (inefficient option)\n",
 511 |     "\n",
 512 |     "\n",
 513 |     "pos = []\n",
 514 |     "for num in nums:\n",
 515 |     "    if num > 0:\n",
 516 |     "        pos.append(num)\n",
 517 |     "print(pos)\n",
 518 |     "\n",
 519 |     "\n",
 520 |     "# List comprehension (better option but not best)\n",
 521 |     "pos = [num for num in nums if num > 0]\n",
 522 |     "print(pos)"
 523 |    ]
 524 |   },
 525 |   {
 526 |    "cell_type": "markdown",
 527 |    "metadata": {},
 528 |    "source": [
 529 |     "## 2.Python Code Timing and Profiling"
 530 |    ]
 531 |   },
 532 |   {
 533 |    "cell_type": "markdown",
 534 |    "metadata": {},
 535 |    "source": [
 536 |     "In the second section of the article, you will learn how to gather and compare runtimes between different coding approaches. You'll practice using the line_profiler and memory_profiler packages to profile your code base and spot bottlenecks. Then, you'll put your learnings to practice by replacing these bottlenecks with efficient Python code."
 537 |    ]
 538 |   },
 539 |   {
 540 |    "cell_type": "markdown",
 541 |    "metadata": {},
 542 |    "source": [
 543 |     "### 2.1. Python Runtime Investigation "
 544 |    ]
 545 |   },
 546 |   {
 547 |    "cell_type": "markdown",
 548 |    "metadata": {},
 549 |    "source": [
 550 |     "As mentioned in the previous chapter code efficiency code means fast code. To be able to measure how fast our code is we need to be able to measure the code runtime. Comparing runtimes between two code bases, that effectively do the same thing, allows us to pick the code with the optimal performance. By gathering and analyzing runtimes, we can be sure to implement the code that is fastest and thus more efficient.\n",
 551 |     "\n",
 552 |     "To compare runtimes, we need to be able to compute the runtime for a line or multiple lines of code. IPython comes with some handy built-in magic commands we can use to time our code. Magic commands are enhancements that have been added on top of the normal Python syntax. These commands are prefixed with the percentage sign. If you aren't familiar with magic commands take a moment to review the documentation. \n",
 553 |     "\n",
 554 |     "\n",
 555 |     "Let's start with this example: we want to inspect the runtime for selecting 1,000 random numbers between zero and one using NumPy's **random.rand()** function. Using %timeit just requires adding the magic command before the line of code we want to analyze. That's it! One simple command to gather runtimes."
 556 |    ]
 557 |   },
 558 |   {
 559 |    "cell_type": "code",
 560 |    "execution_count": 2,
 561 |    "metadata": {},
 562 |    "outputs": [
 563 |     {
 564 |      "name": "stdout",
 565 |      "output_type": "stream",
 566 |      "text": [
 567 |       "26.6 µs ± 4.32 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)\n"
 568 |      ]
 569 |     }
 570 |    ],
 571 |    "source": [
 572 |     "%timeit rand_nums = np.random.rand(1000)"
 573 |    ]
 574 |   },
 575 |   {
 576 |    "cell_type": "markdown",
 577 |    "metadata": {},
 578 |    "source": [
 579 |     "As we can see %timeit provides an average of timing statistics. This is one of the advantages of using %timeit. We also see that multiple runs and loops were generated. %timeit runs through the provided code multiple times to estimate the code's average execution time. This provides a more accurate representation of the actual runtime rather than relying on just one iteration to calculate the runtime. The mean and standard deviation displayed in the output is a summary of the runtime considering each of the multiple runs."
 580 |    ]
 581 |   },
 582 |   {
 583 |    "cell_type": "markdown",
 584 |    "metadata": {},
 585 |    "source": [
 586 |     "#### Specifying number of runs/loops"
 587 |    ]
 588 |   },
 589 |   {
 590 |    "cell_type": "markdown",
 591 |    "metadata": {},
 592 |    "source": [
 593 |     "The number of runs represents how many iterations you'd like to use to estimate the runtime. The number of loops represents how many times you'd like the code to be executed per run. We can specify the number of runs, using the -r flag, and the number of loops, using the -n flag. Here, we use -r2, to set the number of runs to two and -n10, to set the number of loops to ten. In this example, %timeit would execute our random number selection 20 times in order to estimate runtime (2 runs each with 10 executions)."
 594 |    ]
 595 |   },
 596 |   {
 597 |    "cell_type": "code",
 598 |    "execution_count": 3,
 599 |    "metadata": {},
 600 |    "outputs": [
 601 |     {
 602 |      "name": "stdout",
 603 |      "output_type": "stream",
 604 |      "text": [
 605 |       "90 µs ± 2.64 µs per loop (mean ± std. dev. of 2 runs, 10 loops each)\n"
 606 |      ]
 607 |     }
 608 |    ],
 609 |    "source": [
 610 |     "# Set number of runs to 2 (-r2)\n",
 611 |     "# Set number of loops to 10 (-n10)\n",
 612 |     "%timeit -r2 -n10 rand_nums = np.random.rand(1000)"
 613 |    ]
 614 |   },
 615 |   {
 616 |    "cell_type": "markdown",
 617 |    "metadata": {},
 618 |    "source": [
 619 |     "Another cool feature of %timeit is its ability to run on single or multiple lines of code. When using %timeit in line magic mode, or with a single line of code, one percentage sign is used and we can run %timeit in cell magic mode (or provide multiple lines of code) by using two percentage signs."
 620 |    ]
 621 |   },
 622 |   {
 623 |    "cell_type": "code",
 624 |    "execution_count": 10,
 625 |    "metadata": {},
 626 |    "outputs": [
 627 |     {
 628 |      "name": "stdout",
 629 |      "output_type": "stream",
 630 |      "text": [
 631 |       "1.77 µs ± 62.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)\n"
 632 |      ]
 633 |     }
 634 |    ],
 635 |    "source": [
 636 |     "# Single line of code\n",
 637 |     "%timeit nums = [x for x in range(10)]"
 638 |    ]
 639 |   },
 640 |   {
 641 |    "cell_type": "code",
 642 |    "execution_count": 13,
 643 |    "metadata": {},
 644 |    "outputs": [
 645 |     {
 646 |      "name": "stdout",
 647 |      "output_type": "stream",
 648 |      "text": [
 649 |       "2.32 µs ± 101 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n"
 650 |      ]
 651 |     }
 652 |    ],
 653 |    "source": [
 654 |     "%%timeit\n",
 655 |     "# Multiple lines of code\n",
 656 |     "nums = []\n",
 657 |     "for x in range(10):\n",
 658 |     "    nums.append(x)"
 659 |    ]
 660 |   },
 661 |   {
 662 |    "cell_type": "markdown",
 663 |    "metadata": {},
 664 |    "source": [
 665 |     "We can also save the output of %timeit into a variable using the -o flag. This allows us to dig deeper into the output and see things like the time for each run, the best time for all runs, and the worst time for all runs."
 666 |    ]
 667 |   },
 668 |   {
 669 |    "cell_type": "code",
 670 |    "execution_count": 16,
 671 |    "metadata": {},
 672 |    "outputs": [
 673 |     {
 674 |      "name": "stdout",
 675 |      "output_type": "stream",
 676 |      "text": [
 677 |       "21.6 µs ± 991 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)\n",
 678 |       "The timings for all the 7 runs [1.9878090000020164e-05, 2.1997419999934208e-05, 2.3160939999979746e-05, 2.157510999995793e-05, 2.0585660000051577e-05, 2.2036539999953675e-05, 2.1905299999980344e-05]\n",
 679 |       "The best timing is 1.9878090000020164e-05\n",
 680 |       "The worst timeing is 2.3160939999979746e-05\n"
 681 |      ]
 682 |     }
 683 |    ],
 684 |    "source": [
 685 |     "# Saving the output to a variable and exploring them\n",
 686 |     "\n",
 687 |     "times = %timeit -o rand_nums = np.random.rand(1000)\n",
 688 |     "print('The timings for all the 7 runs',times.timings)\n",
 689 |     "print('The best timing is',times.best)\n",
 690 |     "print('The worst timeing is',times.worst)"
 691 |    ]
 692 |   },
 693 |   {
 694 |    "cell_type": "markdown",
 695 |    "metadata": {},
 696 |    "source": [
 697 |     "### 2.2. Code profiling for runtime"
 698 |    ]
 699 |   },
 700 |   {
 701 |    "cell_type": "markdown",
 702 |    "metadata": {},
 703 |    "source": [
 704 |     "We've covered how to time the  code using the magic command %timeit, which works well with bite-sized code. But, what if we wanted to time a large code base or see the line-by-line runtimes within a function? In this section, we'll cover a concept called code profiling that allows us to analyze code more efficiently.\n",
 705 |     "\n",
 706 |     "Code profiling is a technique used to describe how long, and how often, various parts of a program are executed. The beauty of a code profiler is its ability to gather summary statistics on individual pieces of our code without using magic commands like %timeit. We'll focus on the line_profiler package to profile a function's runtime line-by-line. \n",
 707 |     "\n",
 708 |     "Since this package isn't a part of Python's Standard Library, we need to install it separately. This can easily be done with a pip install command as shown in the code below."
 709 |    ]
 710 |   },
 711 |   {
 712 |    "cell_type": "code",
 713 |    "execution_count": 17,
 714 |    "metadata": {},
 715 |    "outputs": [
 716 |     {
 717 |      "name": "stdout",
 718 |      "output_type": "stream",
 719 |      "text": [
 720 |       "Collecting line_profiler\n",
 721 |       "  Downloading line_profiler-3.5.1-cp37-cp37m-win_amd64.whl (52 kB)\n",
 722 |       "Installing collected packages: line-profiler\n",
 723 |       "Successfully installed line-profiler-3.5.1\n"
 724 |      ]
 725 |     }
 726 |    ],
 727 |    "source": [
 728 |     "!pip install line_profiler"
 729 |    ]
 730 |   },
 731 |   {
 732 |    "cell_type": "markdown",
 733 |    "metadata": {},
 734 |    "source": [
 735 |     "Let's explore using line_profiler with an example. Suppose we have a list of names along with each someones height (in centimeters) and weight (in kilograms) loaded as NumPy arrays."
 736 |    ]
 737 |   },
 738 |   {
 739 |    "cell_type": "code",
 740 |    "execution_count": 21,
 741 |    "metadata": {},
 742 |    "outputs": [],
 743 |    "source": [
 744 |     "names = ['Ahmed', 'Mohammed', 'Youssef']\n",
 745 |     "hts = np.array([188.0, 191.0, 185.0])\n",
 746 |     "wts = np.array([ 95.0, 100.0, 75.0])"
 747 |    ]
 748 |   },
 749 |   {
 750 |    "cell_type": "markdown",
 751 |    "metadata": {},
 752 |    "source": [
 753 |     "We will then develop a function called convert_units that converts each person's height from centimeters to inches and weight from kilograms to pounds."
 754 |    ]
 755 |   },
 756 |   {
 757 |    "cell_type": "code",
 758 |    "execution_count": 24,
 759 |    "metadata": {},
 760 |    "outputs": [
 761 |     {
 762 |      "data": {
 763 |       "text/plain": [
 764 |        "{'Ahmed': (74.01559999999999, 209.4389),\n",
 765 |        " 'Mohammed': (75.19669999999999, 220.462),\n",
 766 |        " 'Youssef': (72.8345, 165.3465)}"
 767 |       ]
 768 |      },
 769 |      "execution_count": 24,
 770 |      "metadata": {},
 771 |      "output_type": "execute_result"
 772 |     }
 773 |    ],
 774 |    "source": [
 775 |     "def convert_units(names, heights, weights):\n",
 776 |     "    new_hts = [ht * 0.39370 for ht in heights]\n",
 777 |     "    new_wts = [wt * 2.20462 for wt in weights]\n",
 778 |     "    data = {}\n",
 779 |     "    for i,name in enumerate(names):\n",
 780 |     "        data[name] = (new_hts[i], new_wts[i])\n",
 781 |     "    return data\n",
 782 |     "convert_units(names, hts, wts)"
 783 |    ]
 784 |   },
 785 |   {
 786 |    "cell_type": "markdown",
 787 |    "metadata": {},
 788 |    "source": [
 789 |     "If we wanted to get an estimated runtime of this function, we could use %timeit. But, this will only give us the total execution time. What if we wanted to see how long each line within the function took to run? One solution is to use %timeit on each individual line of our convert_units function. But, that's a lot of manual work and not very efficient."
 790 |    ]
 791 |   },
 792 |   {
 793 |    "cell_type": "code",
 794 |    "execution_count": 26,
 795 |    "metadata": {},
 796 |    "outputs": [
 797 |     {
 798 |      "name": "stdout",
 799 |      "output_type": "stream",
 800 |      "text": [
 801 |       "13.1 µs ± 787 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n"
 802 |      ]
 803 |     }
 804 |    ],
 805 |    "source": [
 806 |     "%timeit convert_units(names, hts, wts)"
 807 |    ]
 808 |   },
 809 |   {
 810 |    "cell_type": "markdown",
 811 |    "metadata": {},
 812 |    "source": [
 813 |     "Instead, we can profile our function with the line_profiler package. To use this package, we first need to load it into our session. We can do this using the command %load_ext followed by line_profiler."
 814 |    ]
 815 |   },
 816 |   {
 817 |    "cell_type": "markdown",
 818 |    "metadata": {},
 819 |    "source": [
 820 |     "Now, we can use the magic command %lprun, from line_profiler, to gather runtimes for individual lines of code within the convert_units function. %lprun uses a special syntax. First, we use the -f flag to indicate we'd like to profile a function. Then, we specify the name of the function we'd like to profile. Note, the name of the function is passed without any parentheses. Finally, we provide the exact function call we'd like to profile by including any arguments that are needed. This is shown in the code below:"
 821 |    ]
 822 |   },
 823 |   {
 824 |    "cell_type": "code",
 825 |    "execution_count": 25,
 826 |    "metadata": {
 827 |     "scrolled": true
 828 |    },
 829 |    "outputs": [
 830 |     {
 831 |      "name": "stdout",
 832 |      "output_type": "stream",
 833 |      "text": [
 834 |       "The line_profiler extension is already loaded. To reload it, use:\n",
 835 |       "  %reload_ext line_profiler\n"
 836 |      ]
 837 |     }
 838 |    ],
 839 |    "source": [
 840 |     "%load_ext line_profiler\n",
 841 |     "%lprun -f convert_units convert_units(names, hts, wts)"
 842 |    ]
 843 |   },
 844 |   {
 845 |    "cell_type": "markdown",
 846 |    "metadata": {},
 847 |    "source": [
 848 |     "### 2.3. Code profiling for memory use"
 849 |    ]
 850 |   },
 851 |   {
 852 |    "cell_type": "markdown",
 853 |    "metadata": {},
 854 |    "source": [
 855 |     "We've defined efficient code as code that has a minimal runtime and a small memory footprint. So far, we've only covered how to inspect the runtime of our code. In this section, we'll cover a few techniques on how to evaluate our code's memory usage."
 856 |    ]
 857 |   },
 858 |   {
 859 |    "cell_type": "markdown",
 860 |    "metadata": {},
 861 |    "source": [
 862 |     "One basic approach for inspecting memory consumption is using Python's built-in module sys. This module contains system-specific functions and contains one nice method .getsizeof which returns the size of an object in bytes. sys.getsizeof() is a quick way to see the size of an object."
 863 |    ]
 864 |   },
 865 |   {
 866 |    "cell_type": "code",
 867 |    "execution_count": 27,
 868 |    "metadata": {},
 869 |    "outputs": [
 870 |     {
 871 |      "data": {
 872 |       "text/plain": [
 873 |        "9112"
 874 |       ]
 875 |      },
 876 |      "execution_count": 27,
 877 |      "metadata": {},
 878 |      "output_type": "execute_result"
 879 |     }
 880 |    ],
 881 |    "source": [
 882 |     "import sys\n",
 883 |     "nums_list = [*range(1000)]\n",
 884 |     "sys.getsizeof(nums_list)"
 885 |    ]
 886 |   },
 887 |   {
 888 |    "cell_type": "code",
 889 |    "execution_count": 28,
 890 |    "metadata": {},
 891 |    "outputs": [
 892 |     {
 893 |      "data": {
 894 |       "text/plain": [
 895 |        "4096"
 896 |       ]
 897 |      },
 898 |      "execution_count": 28,
 899 |      "metadata": {},
 900 |      "output_type": "execute_result"
 901 |     }
 902 |    ],
 903 |    "source": [
 904 |     "nums_np = np.array(range(1000))\n",
 905 |     "sys.getsizeof(nums_np)"
 906 |    ]
 907 |   },
 908 |   {
 909 |    "cell_type": "markdown",
 910 |    "metadata": {},
 911 |    "source": [
 912 |     "We can see that the memory allocation of a list is almost double that of a NumPy array. However, this method only gives us the size of an individual object. However, if we wanted to inspect the line-by-line memory size of our code. As the runtime profile, we could use a code profiler. Just like we've used code profiling to gather detailed stats on runtimes, we can also use code profiling to analyze the memory allocation for each line of code in our code base. We'll use the memory_profiler package which is very similar to the line_profiler package. It can be downloaded via pip and comes with a handy magic command (%mprun) that uses the same syntax as %lprun."
 913 |    ]
 914 |   },
 915 |   {
 916 |    "cell_type": "markdown",
 917 |    "metadata": {},
 918 |    "source": [
 919 |     "first lets download the meomery profiler package"
 920 |    ]
 921 |   },
 922 |   {
 923 |    "cell_type": "code",
 924 |    "execution_count": 29,
 925 |    "metadata": {
 926 |     "scrolled": true
 927 |    },
 928 |    "outputs": [
 929 |     {
 930 |      "name": "stdout",
 931 |      "output_type": "stream",
 932 |      "text": [
 933 |       "Collecting memory_profiler\n",
 934 |       "  Downloading memory_profiler-0.60.0.tar.gz (38 kB)\n",
 935 |       "Requirement already satisfied: psutil in c:\\users\\youss\\anaconda3\\envs\\new_enviroment\\lib\\site-packages (from memory_profiler) (5.7.2)\n",
 936 |       "Building wheels for collected packages: memory-profiler\n",
 937 |       "  Building wheel for memory-profiler (setup.py): started\n",
 938 |       "  Building wheel for memory-profiler (setup.py): finished with status 'done'\n",
 939 |       "  Created wheel for memory-profiler: filename=memory_profiler-0.60.0-py3-none-any.whl size=31279 sha256=288154cda37cbe0ab88effb1aaa56beba4a185ce47eb84b745a1603b06b8294b\n",
 940 |       "  Stored in directory: c:\\users\\youss\\appdata\\local\\pip\\cache\\wheels\\67\\2b\\fb\\326e30d638c538e69a5eb0aa47f4223d979f502bbdb403950f\n",
 941 |       "Successfully built memory-profiler\n",
 942 |       "Installing collected packages: memory-profiler\n",
 943 |       "Successfully installed memory-profiler-0.60.0\n"
 944 |      ]
 945 |     }
 946 |    ],
 947 |    "source": [
 948 |     "!pip install memory_profiler"
 949 |    ]
 950 |   },
 951 |   {
 952 |    "cell_type": "markdown",
 953 |    "metadata": {},
 954 |    "source": [
 955 |     "To be able to apply %mprun to a function and calculate the meomery allocation, this function should be loaded from a separate physical file and not in the IPython console so first we will create a utils_funcs.py file and define the convert_units function in it, and then we will load this function from the file and apply %mprun to it."
 956 |    ]
 957 |   },
 958 |   {
 959 |    "cell_type": "code",
 960 |    "execution_count": 38,
 961 |    "metadata": {},
 962 |    "outputs": [
 963 |     {
 964 |      "name": "stdout",
 965 |      "output_type": "stream",
 966 |      "text": [
 967 |       "The memory_profiler extension is already loaded. To reload it, use:\n",
 968 |       "  %reload_ext memory_profiler\n",
 969 |       "\n"
 970 |      ]
 971 |     }
 972 |    ],
 973 |    "source": [
 974 |     "from utils_funcs import convert_units\n",
 975 |     "\n",
 976 |     "%load_ext memory_profiler\n",
 977 |     "%mprun -f convert_units convert_units(names, hts, wts)"
 978 |    ]
 979 |   }
 980 |  ],
 981 |  "metadata": {
 982 |   "kernelspec": {
 983 |    "display_name": "Python 3",
 984 |    "language": "python",
 985 |    "name": "python3"
 986 |   },
 987 |   "language_info": {
 988 |    "codemirror_mode": {
 989 |     "name": "ipython",
 990 |     "version": 3
 991 |    },
 992 |    "file_extension": ".py",
 993 |    "mimetype": "text/x-python",
 994 |    "name": "python",
 995 |    "nbconvert_exporter": "python",
 996 |    "pygments_lexer": "ipython3",
 997 |    "version": "3.7.9"
 998 |   }
 999 |  },
1000 |  "nbformat": 4,
1001 |  "nbformat_minor": 4
1002 | }
1003 | 


--------------------------------------------------------------------------------