├── .gitignore ├── README.md ├── data └── interviews.csv └── just-pandas-things.ipynb /.gitignore: -------------------------------------------------------------------------------- 1 | .ipynb_checkpoints 2 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # just-pandas-things 2 | This repo contains a few peculiar things I've learned about pandas that have made my life easier and my code faster. This post isn't a friendly tutorial for beginners, but a friendly introduction to pandas weirdness. 3 | 4 | ## What's in this repo? 5 | 6 | 1. pandas is column-major, which is why row-based operations are slow 7 | 2. `SettingWithCopyWarning`, or why we can't have nice things 8 | 3. Indexing and slicing 9 | 4. Accessors 10 | 5. Data exploration 11 | 6. Common pitfalls 12 | 13 | I'll continue updating this repo as I have more time. As I'm still learning pandas quirks, feedback is much appreciated. 14 | 15 | Thanks [Luke Metz](https://twitter.com/luke_metz), [Vikram Tiwari](https://twitter.com/Vikram_Tiwari), and [Karson Elmgren](https://karsonelmgren.com/) for reviewing! 16 | -------------------------------------------------------------------------------- /just-pandas-things.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Just pandas things\n", 8 | "\n", 9 | "It's possible that Python wouldn't have become [the lingua franca of data science if it wasn't for pandas](https://stackoverflow.blog/2017/09/14/python-growing-quickly/). The package's exponential growth on Stack Overflow means two things:\n", 10 | "1. It's getting increasingly popular.\n", 11 | "2. 
It can be frustrating to use sometimes (hence the high number of questions).\n", 12 | "\n", 13 | "This repo contains a few peculiar things I've learned about pandas that have made my life easier and my code faster. This post isn't a friendly tutorial for beginners, but a friendly introduction to pandas weirdness.\n", 14 | "\n", 15 | "To demonstrate the use of pandas, we'll be using the interview reviews scraped from Glassdoor in 2019. The data is stored in the folder `data` under the name `interviews.csv`.\n", 16 | "\n", 17 | "I'll continue updating this repo as I have more time. As I'm still learning pandas quirks, feedback is much appreciated!\n", 18 | "\n", 19 | "![](https://i.pinimg.com/originals/9e/7c/78/9e7c7816c30327890dc94ba16e5dac1b.jpg)" 20 | ] 21 | }, 22 | { 23 | "cell_type": "code", 24 | "execution_count": 1, 25 | "metadata": {}, 26 | "outputs": [ 27 | { 28 | "name": "stdout", 29 | "output_type": "stream", 30 | "text": [ 31 | "(17654, 10)\n" 32 | ] 33 | }, 34 | { 35 | "data": { 36 | "text/html": [ 37 | "
\n", 38 | "\n", 51 | "\n", 52 | " \n", 53 | " \n", 54 | " \n", 55 | " \n", 56 | " \n", 57 | " \n", 58 | " \n", 59 | " \n", 60 | " \n", 61 | " \n", 62 | " \n", 63 | " \n", 64 | " \n", 65 | " \n", 66 | " \n", 67 | " \n", 68 | " \n", 69 | " \n", 70 | " \n", 71 | " \n", 72 | " \n", 73 | " \n", 74 | " \n", 75 | " \n", 76 | " \n", 77 | " \n", 78 | " \n", 79 | " \n", 80 | " \n", 81 | " \n", 82 | " \n", 83 | " \n", 84 | " \n", 85 | " \n", 86 | " \n", 87 | " \n", 88 | " \n", 89 | " \n", 90 | " \n", 91 | " \n", 92 | " \n", 93 | " \n", 94 | " \n", 95 | " \n", 96 | " \n", 97 | " \n", 98 | " \n", 99 | " \n", 100 | " \n", 101 | " \n", 102 | " \n", 103 | " \n", 104 | " \n", 105 | " \n", 106 | " \n", 107 | " \n", 108 | " \n", 109 | " \n", 110 | " \n", 111 | " \n", 112 | " \n", 113 | " \n", 114 | " \n", 115 | " \n", 116 | " \n", 117 | " \n", 118 | " \n", 119 | " \n", 120 | " \n", 121 | " \n", 122 | " \n", 123 | " \n", 124 | " \n", 125 | " \n", 126 | " \n", 127 | " \n", 128 | " \n", 129 | " \n", 130 | " \n", 131 | " \n", 132 | " \n", 133 | " \n", 134 | "
\n", 135 | "
" 136 | ], 137 | "text/plain": [ 138 | " Company Title Job Level Date \\\n", 139 | "0 Apple Software Engineer Software Engineer Engineer Aug 7, 2019 \n", 140 | "1 Apple Software Engineer Software Engineer Engineer Aug 8, 2019 \n", 141 | "2 Apple Software Engineer Software Engineer Engineer NaN \n", 142 | "3 Apple Software Engineer Software Engineer Engineer NaN \n", 143 | "4 Apple Software Engineer Software Engineer Engineer May 29, 2009 \n", 144 | "\n", 145 | " Upvotes Offer Experience Difficulty \\\n", 146 | "0 0 No offer 0.0 Medium \n", 147 | "1 0 Accepted offer 1.0 Hard \n", 148 | "2 0 Declined offer 0.0 Medium \n", 149 | "3 9 Declined offer -1.0 Medium \n", 150 | "4 2 No offer 0.0 Medium \n", 151 | "\n", 152 | " Review \n", 153 | "0 Application I applied through a staffing agen... \n", 154 | "1 Application I applied online. The process too... \n", 155 | "2 Application The process took 4 weeks. I inter... \n", 156 | "3 Application The process took a week. I interv... \n", 157 | "4 Application I applied through an employee ref... " 158 | ] 159 | }, 160 | "execution_count": 1, 161 | "metadata": {}, 162 | "output_type": "execute_result" 163 | } 164 | ], 165 | "source": [ 166 | "import pandas as pd\n", 167 | "df = pd.read_csv(\"data/interviews.csv\")\n", 168 | "\n", 169 | "print(df.shape)\n", 170 | "df.head()" 171 | ] 172 | }, 173 | { 174 | "cell_type": "markdown", 175 | "metadata": {}, 176 | "source": [ 177 | "## 1. pandas is column-major\n", 178 | "Pandas is built around `DataFrame`, a concept inspired by R's Data Frame, which is, in turn, similar to tables in relational databases. A `DataFrame` is a two-dimensional table with rows and columns.\n", 179 | "\n", 180 | "One important thing to know about pandas is that it's column-major, which explains many of its quirks.\n", 181 | "\n", 182 | "Column-major means consecutive elements in a column are stored next to each other in memory. Row-major means the same but for elements in a row. 
Because modern computers process sequential data more efficiently than nonsequential data, if a table is row-major, accessing its rows will be much faster than accessing its columns.\n", 183 | "\n", 184 | "In NumPy, the major order can be specified. When an `ndarray` is created, it's row-major by default if you don't specify the order.\n", 185 | "\n", 186 | "Like R's Data Frame, pandas' `DataFrame` is column-major. People coming to pandas from NumPy tend to treat `DataFrame` the way they would `ndarray`, e.g. trying to access data by rows, and find `DataFrame` slow.\n", 187 | "\n", 188 | "**Note**: A column in a `DataFrame` is a `Series`. You can think of a `DataFrame` as a bunch of `Series` being stored next to each other in memory.\n", 189 | "\n", 190 | "**For our dataset, accessing a row takes about 80x longer than accessing a column in our `DataFrame`.**" 191 | ] 192 | }, 193 | { 194 | "cell_type": "code", 195 | "execution_count": 2, 196 | "metadata": {}, 197 | "outputs": [ 198 | { 199 | "name": "stdout", 200 | "output_type": "stream", 201 | "text": [ 202 | "1.78 µs ± 167 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n", 203 | "145 µs ± 9.41 µs per loop (mean ± std. dev. 
of 7 runs, 1000 loops each)\n" 204 | ] 205 | } 206 | ], 207 | "source": [ 208 | "# Get the column `Date`, 1000 loops\n", 209 | "%timeit -n1000 df[\"Date\"]\n", 210 | "\n", 211 | "# Get the first row, 1000 loops\n", 212 | "%timeit -n1000 df.iloc[0]" 213 | ] 214 | }, 215 | { 216 | "cell_type": "markdown", 217 | "metadata": {}, 218 | "source": [ 219 | "### 1.1 Iterating over rows\n", 220 | "#### 1.1.1 `.apply()`\n", 221 | "The pandas documentation has [a warning box](https://pandas.pydata.org/pandas-docs/stable/user_guide/basics.html#iteration) that basically tells you not to iterate over rows because it's slow.\n", 222 | "\n", 223 | "Before iterating over rows, think about what you want to do with each row, pack that into a function and use methods like `.apply()` to apply the function to all rows.\n", 224 | "\n", 225 | "For example, to scale the \"Experience\" column by the number of \"Upvotes\" each review has, one way is to iterate over rows and multiply the \"Upvotes\" value by the \"Experience\" value of that row. But you can also use `.apply()` with a `lambda` function." 226 | ] 227 | }, 228 | { 229 | "cell_type": "code", 230 | "execution_count": 3, 231 | "metadata": {}, 232 | "outputs": [ 233 | { 234 | "name": "stdout", 235 | "output_type": "stream", 236 | "text": [ 237 | "180 ms ± 16.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n" 238 | ] 239 | } 240 | ], 241 | "source": [ 242 | "%timeit -n1 df.apply(lambda x: x[\"Experience\"] * x[\"Upvotes\"], axis=1)" 243 | ] 244 | }, 245 | { 246 | "cell_type": "markdown", 247 | "metadata": {}, 248 | "source": [ 249 | "#### 1.1.2 `.iterrows()` and `.itertuples()`\n", 250 | "If you really want to iterate over rows, one naive way is to use `.iterrows()`. It returns a generator that yields one row at a time, and it's very slow." 
251 | ] 252 | }, 253 | { 254 | "cell_type": "code", 255 | "execution_count": 4, 256 | "metadata": {}, 257 | "outputs": [ 258 | { 259 | "name": "stdout", 260 | "output_type": "stream", 261 | "text": [ 262 | "1.42 s ± 107 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n" 263 | ] 264 | } 265 | ], 266 | "source": [ 267 | "%timeit -n1 [row for index, row in df.iterrows()]" 268 | ] 269 | }, 270 | { 271 | "cell_type": "code", 272 | "execution_count": 5, 273 | "metadata": {}, 274 | "outputs": [ 275 | { 276 | "name": "stdout", 277 | "output_type": "stream", 278 | "text": [ 279 | "Company Apple\n", 280 | "Title Software Engineer\n", 281 | "Job Software Engineer\n", 282 | "Level Engineer\n", 283 | "Date Aug 7, 2019\n", 284 | "Upvotes 0\n", 285 | "Offer No offer\n", 286 | "Experience 0\n", 287 | "Difficulty Medium\n", 288 | "Review Application I applied through a staffing agen...\n", 289 | "Name: 0, dtype: object\n" 290 | ] 291 | } 292 | ], 293 | "source": [ 294 | "# This is what a row looks like as a pandas `Series`.\n", 295 | "for index, row in df.iterrows():\n", 296 | " print(row)\n", 297 | " break" 298 | ] 299 | }, 300 | { 301 | "cell_type": "markdown", 302 | "metadata": {}, 303 | "source": [ 304 | "`.itertuples()` yields each row as a `namedtuple`. It still lets you access each row, and in the timings here it's about 60x faster than `.iterrows()`." 305 | ] 306 | }, 307 | { 308 | "cell_type": "code", 309 | "execution_count": 6, 310 | "metadata": {}, 311 | "outputs": [ 312 | { 313 | "name": "stdout", 314 | "output_type": "stream", 315 | "text": [ 316 | "24.2 ms ± 709 µs per loop (mean ± std. dev. 
of 7 runs, 1 loop each)\n" 317 | ] 318 | } 319 | ], 320 | "source": [ 321 | "%timeit -n1 [row for row in df.itertuples()]" 322 | ] 323 | }, 324 | { 325 | "cell_type": "code", 326 | "execution_count": 7, 327 | "metadata": {}, 328 | "outputs": [ 329 | { 330 | "name": "stdout", 331 | "output_type": "stream", 332 | "text": [ 333 | "Pandas(Index=0, Company='Apple', Title='Software Engineer', Job='Software Engineer', Level='Engineer', Date='Aug 7, 2019', Upvotes=0, Offer='No offer', Experience=0.0, Difficulty='Medium', Review='Application I applied through a staffing agency. I interviewed at Apple (Sunnyvale, CA) in March 2019. Interview The interviewer asked me about my background. Asked few questions from the resume. Asked about my proficiency on data structures. Asked me how do you sort hashmap keys based on values. Interview Questions Write a program that uses two threads to print the numbers from 1 to n.')\n" 334 | ] 335 | } 336 | ], 337 | "source": [ 338 | "# This is what a row looks like as a namedtuple.\n", 339 | "for row in df.itertuples():\n", 340 | " print(row)\n", 341 | " break" 342 | ] 343 | }, 344 | { 345 | "cell_type": "markdown", 346 | "metadata": {}, 347 | "source": [ 348 | "#### 1.1.3 Converting DataFrame to row-major order\n", 349 | "If you need to do a lot of row operations, you might want to convert your `DataFrame` to a row-major NumPy `ndarray`, then iterate through its rows." 350 | ] 351 | }, 352 | { 353 | "cell_type": "code", 354 | "execution_count": 8, 355 | "metadata": {}, 356 | "outputs": [ 357 | { 358 | "name": "stdout", 359 | "output_type": "stream", 360 | "text": [ 361 | "4.55 ms ± 280 µs per loop (mean ± std. dev. 
of 7 runs, 1 loop each)\n" 362 | ] 363 | } 364 | ], 365 | "source": [ 366 | "# Now, iterating through our data is about 300x faster than with `.iterrows()`.\n", 367 | "%timeit -n1 df_np = df.to_numpy(); rows = [row for row in df_np]" 368 | ] 369 | }, 370 | { 371 | "cell_type": "markdown", 372 | "metadata": {}, 373 | "source": [ 374 | "Accessing a row or a column of our `ndarray` takes nanoseconds instead of microseconds." 375 | ] 376 | }, 377 | { 378 | "cell_type": "code", 379 | "execution_count": 9, 380 | "metadata": {}, 381 | "outputs": [ 382 | { 383 | "name": "stdout", 384 | "output_type": "stream", 385 | "text": [ 386 | "147 ns ± 1.54 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n", 387 | "204 ns ± 0.678 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n" 388 | ] 389 | } 390 | ], 391 | "source": [ 392 | "df_np = df.to_numpy()\n", 393 | "%timeit -n1000 df_np[0]\n", 394 | "%timeit -n1000 df_np[:,0]" 395 | ] 396 | }, 397 | { 398 | "cell_type": "markdown", 399 | "metadata": {}, 400 | "source": [ 401 | "### 1.2 Ordering slicing operations\n", 402 | "Because pandas is column-major, if you want to do multiple slicing operations, always do the column-based slicing operations first.\n", 403 | "\n", 404 | "For example, if you want to get the review from the first row of the data, there are two slicing operations:\n", 405 | "- get row (row-based operation)\n", 406 | "- get review (column-based operation)\n", 407 | "\n", 408 | "Get row -> get review is about 25x slower than get review -> get row.\n", 409 | "\n", 410 | "**Note**: You can also just use `df.loc[0, \"Review\"]` to retrieve the item in a single lookup. Its performance is comparable to getting the review first, then the row." 411 | ] 412 | }, 413 | { 414 | "cell_type": "code", 415 | "execution_count": 10, 416 | "metadata": {}, 417 | "outputs": [ 418 | { 419 | "name": "stdout", 420 | "output_type": "stream", 421 | "text": [ 422 | "5.55 µs ± 1.05 µs per loop (mean ± std. dev. 
of 7 runs, 1000 loops each)\n", 423 | "136 µs ± 2.57 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n", 424 | "6.23 µs ± 264 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n" 425 | ] 426 | } 427 | ], 428 | "source": [ 429 | "%timeit -n1000 df[\"Review\"][0]\n", 430 | "%timeit -n1000 df.iloc[0][\"Review\"]\n", 431 | "%timeit -n1000 df.loc[0, \"Review\"]" 432 | ] 433 | }, 434 | { 435 | "cell_type": "markdown", 436 | "metadata": {}, 437 | "source": [ 438 | "## 2. SettingWithCopyWarning\n", 439 | "Sometimes, when you try to assign values to a subset of data in a DataFrame, you get `SettingWithCopyWarning`. Don't ignore the warning because it means sometimes, the assignment works (example 1), but sometimes, it doesn't (example 2)." 440 | ] 441 | }, 442 | { 443 | "cell_type": "code", 444 | "execution_count": 11, 445 | "metadata": {}, 446 | "outputs": [ 447 | { 448 | "name": "stderr", 449 | "output_type": "stream", 450 | "text": [ 451 | "/Users/chip/miniconda3/envs/py3/lib/python3.6/site-packages/ipykernel_launcher.py:2: SettingWithCopyWarning: \n", 452 | "A value is trying to be set on a copy of a slice from a DataFrame\n", 453 | "\n", 454 | "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n", 455 | " \n" 456 | ] 457 | }, 458 | { 459 | "data": { 460 | "text/html": [ 461 | "
\n", 462 | "\n", 475 | "\n", 476 | " \n", 477 | " \n", 478 | " \n", 479 | " \n", 480 | " \n", 481 | " \n", 482 | " \n", 483 | " \n", 484 | " \n", 485 | " \n", 486 | " \n", 487 | " \n", 488 | " \n", 489 | " \n", 490 | " \n", 491 | " \n", 492 | " \n", 493 | " \n", 494 | " \n", 495 | " \n", 496 | " \n", 497 | " \n", 498 | " \n", 499 | " \n", 500 | " \n", 501 | " \n", 502 | " \n", 503 | " \n", 504 | " \n", 505 | " \n", 506 | "
\n", 507 | "
" 508 | ], 509 | "text/plain": [ 510 | " Company Title Job Level Date \\\n", 511 | "0 Apple Software Engineer Software Engineer Engineer Aug 7, 2019 \n", 512 | "\n", 513 | " Upvotes Offer Experience Difficulty Review \n", 514 | "0 0 No offer 0.0 Medium I like Orange better. " 515 | ] 516 | }, 517 | "execution_count": 11, 518 | "metadata": {}, 519 | "output_type": "execute_result" 520 | } 521 | ], 522 | "source": [ 523 | "# Example 1: Changing the review of the first row.\n", 524 | "df[\"Review\"][0] = \"I like Orange better.\"\n", 525 | "# Even with the warning, the assignment works. The review is updated.\n", 526 | "df.head(1)" 527 | ] 528 | }, 529 | { 530 | "cell_type": "code", 531 | "execution_count": 12, 532 | "metadata": {}, 533 | "outputs": [ 534 | { 535 | "name": "stderr", 536 | "output_type": "stream", 537 | "text": [ 538 | "/Users/chip/miniconda3/envs/py3/lib/python3.6/site-packages/ipykernel_launcher.py:2: SettingWithCopyWarning: \n", 539 | "A value is trying to be set on a copy of a slice from a DataFrame.\n", 540 | "Try using .loc[row_indexer,col_indexer] = value instead\n", 541 | "\n", 542 | "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n", 543 | " \n" 544 | ] 545 | }, 546 | { 547 | "data": { 548 | "text/html": [ 549 | "
\n", 550 | "\n", 563 | "\n", 564 | " \n", 565 | " \n", 566 | " \n", 567 | " \n", 568 | " \n", 569 | " \n", 570 | " \n", 571 | " \n", 572 | " \n", 573 | " \n", 574 | " \n", 575 | " \n", 576 | " \n", 577 | " \n", 578 | " \n", 579 | " \n", 580 | " \n", 581 | " \n", 582 | " \n", 583 | " \n", 584 | " \n", 585 | " \n", 586 | " \n", 587 | " \n", 588 | " \n", 589 | " \n", 590 | " \n", 591 | " \n", 592 | " \n", 593 | " \n", 594 | "
\n", 595 | "
" 596 | ], 597 | "text/plain": [ 598 | " Company Title Job Level Date \\\n", 599 | "0 Apple Software Engineer Software Engineer Engineer Aug 7, 2019 \n", 600 | "\n", 601 | " Upvotes Offer Experience Difficulty Review \n", 602 | "0 0 No offer 0.0 Medium I like Orange better. " 603 | ] 604 | }, 605 | "execution_count": 12, 606 | "metadata": {}, 607 | "output_type": "execute_result" 608 | } 609 | ], 610 | "source": [ 611 | "# Example 2: Changing the company name Apple to Orange.\n", 612 | "df[df[\"Company\"] == \"Apple\"][\"Company\"] = \"Orange\"\n", 613 | "# With the warning, the assignment doesn't work. The company name is still Apple.\n", 614 | "df.head(1)" 615 | ] 616 | }, 617 | { 618 | "cell_type": "markdown", 619 | "metadata": {}, 620 | "source": [ 621 | "### 2.1 `View` vs. `Copy`\n", 622 | "To understand this weird behavior, we need to understand two concepts in pandas: `View` vs. `Copy`.\n", 623 | "- `View` refers to the data of the actual `DataFrame` you want to work with, so changes made through it show up in that `DataFrame`.\n", 624 | "- `Copy` is a copy of that actual `DataFrame`, which will be thrown away as soon as the operation is done.\n", 625 | "\n", 626 | "So if you try to do an assignment on a `Copy`, the change lands on the temporary copy and the original `DataFrame` stays unchanged.\n", 627 | "\n", 628 | "`SettingWithCopyWarning` doesn't mean you're making changes to a `Copy`. It means that the thing you're making changes to might be a `Copy` or a `View`, and pandas can't tell you which.\n", 629 | "\n", 630 | "The ambiguity happens because of the `__getitem__` operation.\n", 631 | "`__getitem__` sometimes returns a `Copy`, sometimes a `View`, and pandas makes no guarantee." 
632 | ] 633 | }, 634 | { 635 | "cell_type": "code", 636 | "execution_count": 13, 637 | "metadata": {}, 638 | "outputs": [], 639 | "source": [ 640 | "# df[\"Review\"][0] = \"I like Orange better.\"\n", 641 | "# can be understood as\n", 642 | "# `df.__getitem__(\"Review\").__setitem__(0, \"I like Orange better.\")`" 643 | ] 644 | }, 645 | { 646 | "cell_type": "code", 647 | "execution_count": 14, 648 | "metadata": {}, 649 | "outputs": [], 650 | "source": [ 651 | "# df[df[\"Company\"] == \"Apple\"][\"Company\"] = \"Orange\"\n", 652 | "# can be understood as\n", 653 | "# df.__getitem__(where df[\"Company\"] == \"Apple\").__setitem__(\"Company\", \"Orange\")" 654 | ] 655 | }, 656 | { 657 | "cell_type": "markdown", 658 | "metadata": {}, 659 | "source": [ 660 | "### 2.2 Solutions\n", 661 | "#### 2.2.1 Combine all chained operations into one single operation\n", 662 | "To avoid `__getitem__` ambiguity, you can combine all your operations into one single operation.\n", 663 | "`.loc[]` is usually great for that." 664 | ] 665 | }, 666 | { 667 | "cell_type": "code", 668 | "execution_count": 15, 669 | "metadata": {}, 670 | "outputs": [ 671 | { 672 | "data": { 673 | "text/html": [ 674 | "
\n", 675 | "\n", 688 | "\n", 689 | " \n", 690 | " \n", 691 | " \n", 692 | " \n", 693 | " \n", 694 | " \n", 695 | " \n", 696 | " \n", 697 | " \n", 698 | " \n", 699 | " \n", 700 | " \n", 701 | " \n", 702 | " \n", 703 | " \n", 704 | " \n", 705 | " \n", 706 | " \n", 707 | " \n", 708 | " \n", 709 | " \n", 710 | " \n", 711 | " \n", 712 | " \n", 713 | " \n", 714 | " \n", 715 | " \n", 716 | " \n", 717 | " \n", 718 | " \n", 719 | " \n", 720 | " \n", 721 | " \n", 722 | " \n", 723 | " \n", 724 | " \n", 725 | " \n", 726 | " \n", 727 | " \n", 728 | " \n", 729 | " \n", 730 | " \n", 731 | " \n", 732 | " \n", 733 | " \n", 734 | " \n", 735 | " \n", 736 | " \n", 737 | " \n", 738 | " \n", 739 | " \n", 740 | " \n", 741 | " \n", 742 | " \n", 743 | " \n", 744 | " \n", 745 | " \n", 746 | " \n", 747 | " \n", 748 | " \n", 749 | " \n", 750 | " \n", 751 | " \n", 752 | " \n", 753 | " \n", 754 | " \n", 755 | " \n", 756 | " \n", 757 | " \n", 758 | " \n", 759 | " \n", 760 | " \n", 761 | " \n", 762 | " \n", 763 | " \n", 764 | " \n", 765 | " \n", 766 | " \n", 767 | " \n", 768 | " \n", 769 | " \n", 770 | " \n", 771 | "
\n", 772 | "
" 773 | ], 774 | "text/plain": [ 775 | " Company Title Job Level Date \\\n", 776 | "0 Apple Software Engineer Software Engineer Engineer Aug 7, 2019 \n", 777 | "1 Apple Software Engineer Software Engineer Engineer Aug 8, 2019 \n", 778 | "2 Apple Software Engineer Software Engineer Engineer NaN \n", 779 | "3 Apple Software Engineer Software Engineer Engineer NaN \n", 780 | "4 Apple Software Engineer Software Engineer Engineer May 29, 2009 \n", 781 | "\n", 782 | " Upvotes Offer Experience Difficulty \\\n", 783 | "0 0 No offer 0.0 Medium \n", 784 | "1 0 Accepted offer 1.0 Hard \n", 785 | "2 0 Declined offer 0.0 Medium \n", 786 | "3 9 Declined offer -1.0 Medium \n", 787 | "4 2 No offer 0.0 Medium \n", 788 | "\n", 789 | " Review \n", 790 | "0 Orange is love. Orange is life. \n", 791 | "1 Application I applied online. The process too... \n", 792 | "2 Application The process took 4 weeks. I inter... \n", 793 | "3 Application The process took a week. I interv... \n", 794 | "4 Application I applied through an employee ref... " 795 | ] 796 | }, 797 | "execution_count": 15, 798 | "metadata": {}, 799 | "output_type": "execute_result" 800 | } 801 | ], 802 | "source": [ 803 | "# Changing the review of the first row.\n", 804 | "df.loc[0, \"Review\"] = \"Orange is love. Orange is life.\"\n", 805 | "df.head()" 806 | ] 807 | }, 808 | { 809 | "cell_type": "code", 810 | "execution_count": 16, 811 | "metadata": {}, 812 | "outputs": [ 813 | { 814 | "data": { 815 | "text/html": [ 816 | "
\n", 817 | "\n", 830 | "\n", 831 | " \n", 832 | " \n", 833 | " \n", 834 | " \n", 835 | " \n", 836 | " \n", 837 | " \n", 838 | " \n", 839 | " \n", 840 | " \n", 841 | " \n", 842 | " \n", 843 | " \n", 844 | " \n", 845 | " \n", 846 | " \n", 847 | " \n", 848 | " \n", 849 | " \n", 850 | " \n", 851 | " \n", 852 | " \n", 853 | " \n", 854 | " \n", 855 | " \n", 856 | " \n", 857 | " \n", 858 | " \n", 859 | " \n", 860 | " \n", 861 | " \n", 862 | " \n", 863 | " \n", 864 | " \n", 865 | " \n", 866 | " \n", 867 | " \n", 868 | " \n", 869 | " \n", 870 | " \n", 871 | " \n", 872 | " \n", 873 | " \n", 874 | " \n", 875 | " \n", 876 | " \n", 877 | " \n", 878 | " \n", 879 | " \n", 880 | " \n", 881 | " \n", 882 | " \n", 883 | " \n", 884 | " \n", 885 | " \n", 886 | " \n", 887 | " \n", 888 | " \n", 889 | " \n", 890 | " \n", 891 | " \n", 892 | " \n", 893 | " \n", 894 | " \n", 895 | " \n", 896 | " \n", 897 | " \n", 898 | " \n", 899 | " \n", 900 | " \n", 901 | " \n", 902 | " \n", 903 | " \n", 904 | " \n", 905 | " \n", 906 | " \n", 907 | " \n", 908 | " \n", 909 | " \n", 910 | " \n", 911 | " \n", 912 | " \n", 913 | "
\n", 914 | "
" 915 | ], 916 | "text/plain": [ 917 | " Company Title Job Level Date \\\n", 918 | "0 Orange Software Engineer Software Engineer Engineer Aug 7, 2019 \n", 919 | "1 Orange Software Engineer Software Engineer Engineer Aug 8, 2019 \n", 920 | "2 Orange Software Engineer Software Engineer Engineer NaN \n", 921 | "3 Orange Software Engineer Software Engineer Engineer NaN \n", 922 | "4 Orange Software Engineer Software Engineer Engineer May 29, 2009 \n", 923 | "\n", 924 | " Upvotes Offer Experience Difficulty \\\n", 925 | "0 0 No offer 0.0 Medium \n", 926 | "1 0 Accepted offer 1.0 Hard \n", 927 | "2 0 Declined offer 0.0 Medium \n", 928 | "3 9 Declined offer -1.0 Medium \n", 929 | "4 2 No offer 0.0 Medium \n", 930 | "\n", 931 | " Review \n", 932 | "0 Orange is love. Orange is life. \n", 933 | "1 Application I applied online. The process too... \n", 934 | "2 Application The process took 4 weeks. I inter... \n", 935 | "3 Application The process took a week. I interv... \n", 936 | "4 Application I applied through an employee ref... " 937 | ] 938 | }, 939 | "execution_count": 16, 940 | "metadata": {}, 941 | "output_type": "execute_result" 942 | } 943 | ], 944 | "source": [ 945 | "# Changing the company name Apple to Orange.\n", 946 | "df.loc[df[\"Company\"] == \"Apple\", \"Company\"] = \"Orange\"\n", 947 | "df.head()" 948 | ] 949 | }, 950 | { 951 | "cell_type": "markdown", 952 | "metadata": {}, 953 | "source": [ 954 | "#### 2.2.2 Raise an error\n", 955 | "I believe `SettingWithCopyWarning` should be an Exception instead of a warning. You can change this warning into an exception with pandas' magic `set_option()`." 
956 | ] 957 | }, 958 | { 959 | "cell_type": "code", 960 | "execution_count": 17, 961 | "metadata": {}, 962 | "outputs": [], 963 | "source": [ 964 | "pd.set_option(\"mode.chained_assignment\", \"raise\")\n", 965 | "# Running this will raise an exception\n", 966 | "# df[\"Review\"][0] = \"I like Orange better.\"" 967 | ] 968 | }, 969 | { 970 | "cell_type": "markdown", 971 | "metadata": {}, 972 | "source": [ 973 | "## 3. Indexing and slicing\n", 974 | "\n", 975 | "### 3.1 `.iloc[]`: selecting rows based on integer indices\n", 976 | "`.iloc[]` lets you select rows by their 0-based integer positions." 977 | ] 978 | }, 979 | { 980 | "cell_type": "code", 981 | "execution_count": 18, 982 | "metadata": {}, 983 | "outputs": [ 984 | { 985 | "data": { 986 | "text/plain": [ 987 | "Company Orange\n", 988 | "Title Software Engineer\n", 989 | "Job Software Engineer\n", 990 | "Level Engineer\n", 991 | "Date NaN\n", 992 | "Upvotes 9\n", 993 | "Offer Declined offer\n", 994 | "Experience -1\n", 995 | "Difficulty Medium\n", 996 | "Review Application The process took a week. I interv...\n", 997 | "Name: 3, dtype: object" 998 | ] 999 | }, 1000 | "execution_count": 18, 1001 | "metadata": {}, 1002 | "output_type": "execute_result" 1003 | } 1004 | ], 1005 | "source": [ 1006 | "# Accessing the fourth row (integer position 3) of a `DataFrame`.\n", 1007 | "df.iloc[3]" 1008 | ] 1009 | }, 1010 | { 1011 | "cell_type": "markdown", 1012 | "metadata": {}, 1013 | "source": [ 1014 | "Slicing with `.iloc[]` is similar to slicing in Python. If you want a refresher on how slicing in Python works, see [Python-is-cool](https://github.com/chiphuyen/python-is-cool)." 1015 | ] 1016 | }, 1017 | { 1018 | "cell_type": "code", 1019 | "execution_count": 19, 1020 | "metadata": {}, 1021 | "outputs": [ 1022 | { 1023 | "data": { 1024 | "text/html": [ 1025 | "
\n", 1026 | "\n", 1039 | "\n", 1040 | " \n", 1041 | " \n", 1042 | " \n", 1043 | " \n", 1044 | " \n", 1045 | " \n", 1046 | " \n", 1047 | " \n", 1048 | " \n", 1049 | " \n", 1050 | " \n", 1051 | " \n", 1052 | " \n", 1053 | " \n", 1054 | " \n", 1055 | " \n", 1056 | " \n", 1057 | " \n", 1058 | " \n", 1059 | " \n", 1060 | " \n", 1061 | " \n", 1062 | " \n", 1063 | " \n", 1064 | " \n", 1065 | " \n", 1066 | " \n", 1067 | " \n", 1068 | " \n", 1069 | " \n", 1070 | " \n", 1071 | " \n", 1072 | " \n", 1073 | " \n", 1074 | " \n", 1075 | " \n", 1076 | " \n", 1077 | " \n", 1078 | " \n", 1079 | " \n", 1080 | " \n", 1081 | " \n", 1082 | " \n", 1083 | " \n", 1084 | " \n", 1085 | " \n", 1086 | " \n", 1087 | " \n", 1088 | " \n", 1089 | " \n", 1090 | " \n", 1091 | " \n", 1092 | " \n", 1093 | " \n", 1094 | " \n", 1095 | " \n", 1096 | " \n", 1097 | " \n", 1098 | " \n", 1099 | " \n", 1100 | " \n", 1101 | " \n", 1102 | " \n", 1103 | " \n", 1104 | " \n", 1105 | " \n", 1106 | " \n", 1107 | " \n", 1108 | " \n", 1109 | " \n", 1110 | " \n", 1111 | " \n", 1112 | " \n", 1113 | " \n", 1114 | " \n", 1115 | " \n", 1116 | " \n", 1117 | " \n", 1118 | " \n", 1119 | " \n", 1120 | " \n", 1121 | " \n", 1122 | " \n", 1123 | " \n", 1124 | " \n", 1125 | " \n", 1126 | " \n", 1127 | " \n", 1128 | " \n", 1129 | " \n", 1130 | " \n", 1131 | " \n", 1132 | " \n", 1133 | " \n", 1134 | " \n", 1135 | "
\n", 1136 | "
" 1137 | ], 1138 | "text/plain": [ 1139 | " Company Title Job Level Date \\\n", 1140 | "17648 Tencent Software Engineer Software Engineer Engineer Nov 4, 2012 \n", 1141 | "17649 Tencent Software Engineer Software Engineer Engineer May 25, 2012 \n", 1142 | "17650 Tencent Software Engineer Software Engineer Engineer Mar 15, 2014 \n", 1143 | "17651 Tencent Software Engineer Software Engineer Engineer Sep 22, 2015 \n", 1144 | "17652 Tencent Software Engineer Software Engineer Engineer Jul 4, 2017 \n", 1145 | "17653 Tencent Software Engineer Software Engineer Engineer Sep 30, 2016 \n", 1146 | "\n", 1147 | " Upvotes Offer Experience Difficulty \\\n", 1148 | "17648 0 No offer NaN NaN \n", 1149 | "17649 0 Declined offer 0.0 Medium \n", 1150 | "17650 0 No offer NaN NaN \n", 1151 | "17651 0 Accepted offer 1.0 Medium \n", 1152 | "17652 0 Declined offer 1.0 Medium \n", 1153 | "17653 0 Declined offer 0.0 Easy \n", 1154 | "\n", 1155 | " Review \n", 1156 | "17648 Application I applied online. The process too... \n", 1157 | "17649 Application I applied online. The process too... \n", 1158 | "17650 Application I applied through college or univ... \n", 1159 | "17651 Application I applied through college or univ... \n", 1160 | "17652 Application I applied through college or univ... \n", 1161 | "17653 Application I applied online. The process too... " 1162 | ] 1163 | }, 1164 | "execution_count": 19, 1165 | "metadata": {}, 1166 | "output_type": "execute_result" 1167 | } 1168 | ], 1169 | "source": [ 1170 | "# Selecting the last 6 rows\n", 1171 | "df.iloc[-6:]" 1172 | ] 1173 | }, 1174 | { 1175 | "cell_type": "code", 1176 | "execution_count": 20, 1177 | "metadata": {}, 1178 | "outputs": [ 1179 | { 1180 | "data": { 1181 | "text/html": [ 1182 | "
" 1255 | ], 1256 | "text/plain": [ 1257 | " Company Title Job Level Date \\\n", 1258 | "17648 Tencent Software Engineer Software Engineer Engineer Nov 4, 2012 \n", 1259 | "17650 Tencent Software Engineer Software Engineer Engineer Mar 15, 2014 \n", 1260 | "17652 Tencent Software Engineer Software Engineer Engineer Jul 4, 2017 \n", 1261 | "\n", 1262 | " Upvotes Offer Experience Difficulty \\\n", 1263 | "17648 0 No offer NaN NaN \n", 1264 | "17650 0 No offer NaN NaN \n", 1265 | "17652 0 Declined offer 1.0 Medium \n", 1266 | "\n", 1267 | " Review \n", 1268 | "17648 Application I applied online. The process too... \n", 1269 | "17650 Application I applied through college or univ... \n", 1270 | "17652 Application I applied through college or univ... " 1271 | ] 1272 | }, 1273 | "execution_count": 20, 1274 | "metadata": {}, 1275 | "output_type": "execute_result" 1276 | } 1277 | ], 1278 | "source": [ 1279 | "# Selecting every other row from the last 6 rows\n", 1280 | "df.iloc[-6::2]" 1281 | ] 1282 | }, 1283 | { 1284 | "cell_type": "markdown", 1285 | "metadata": {}, 1286 | "source": [ 1287 | "### 3.2 `.loc[]`: selecting rows by labels or boolean masks\n", 1288 | "`.loc[]` lets you select rows based on one of two things:\n", 1289 | "- boolean masks\n", 1290 | "- labels\n", 1291 | "\n", 1292 | "#### 3.2.1 Selecting rows by boolean masks\n", 1293 | "If you want to select all the rows where candidates declined the offer, you can do it in two steps:\n", 1294 | "1. Create a boolean mask that is `True` wherever the \"Offer\" column equals \"Declined offer\"\n", 1295 | "2. Use that mask to select rows" 1296 | ] 1297 | }, 1298 | { 1299 | "cell_type": "code", 1300 | "execution_count": 21, 1301 | "metadata": {}, 1302 | "outputs": [ 1303 | { 1304 | "data": { 1305 | "text/html": [ 1306 | "
" 1484 | ], 1485 | "text/plain": [ 1486 | " Company Title Job Level Date \\\n", 1487 | "2 Orange Software Engineer Software Engineer Engineer NaN \n", 1488 | "3 Orange Software Engineer Software Engineer Engineer NaN \n", 1489 | "7 Orange Software Engineer Software Engineer Engineer Jul 26, 2019 \n", 1490 | "17 Orange Software Engineer Software Engineer Engineer Feb 27, 2010 \n", 1491 | "65 Orange Software Engineer Software Engineer Engineer May 6, 2012 \n", 1492 | "... ... ... ... ... ... \n", 1493 | "17643 Tencent Software Engineer Software Engineer Engineer Apr 9, 2016 \n", 1494 | "17646 Tencent Software Engineer Software Engineer Engineer May 28, 2010 \n", 1495 | "17649 Tencent Software Engineer Software Engineer Engineer May 25, 2012 \n", 1496 | "17652 Tencent Software Engineer Software Engineer Engineer Jul 4, 2017 \n", 1497 | "17653 Tencent Software Engineer Software Engineer Engineer Sep 30, 2016 \n", 1498 | "\n", 1499 | " Upvotes Offer Experience Difficulty \\\n", 1500 | "2 0 Declined offer 0.0 Medium \n", 1501 | "3 9 Declined offer -1.0 Medium \n", 1502 | "7 1 Declined offer -1.0 Medium \n", 1503 | "17 7 Declined offer -1.0 Medium \n", 1504 | "65 1 Declined offer 1.0 Easy \n", 1505 | "... ... ... ... ... \n", 1506 | "17643 0 Declined offer 1.0 Medium \n", 1507 | "17646 0 Declined offer 0.0 Easy \n", 1508 | "17649 0 Declined offer 0.0 Medium \n", 1509 | "17652 0 Declined offer 1.0 Medium \n", 1510 | "17653 0 Declined offer 0.0 Easy \n", 1511 | "\n", 1512 | " Review \n", 1513 | "2 Application The process took 4 weeks. I inter... \n", 1514 | "3 Application The process took a week. I interv... \n", 1515 | "7 Application The process took 4+ weeks. I inte... \n", 1516 | "17 Application The process took 1 day. I intervi... \n", 1517 | "65 Application The process took 2 days. I interv... \n", 1518 | "... ... \n", 1519 | "17643 Application I applied online. I interviewed a... \n", 1520 | "17646 Application I applied through an employee ref... 
\n", 1521 | "17649 Application I applied online. The process too... \n", 1522 | "17652 Application I applied through college or univ... \n", 1523 | "17653 Application I applied online. The process too... \n", 1524 | "\n", 1525 | "[1135 rows x 10 columns]" 1526 | ] 1527 | }, 1528 | "execution_count": 21, 1529 | "metadata": {}, 1530 | "output_type": "execute_result" 1531 | } 1532 | ], 1533 | "source": [ 1534 | "df.loc[df[\"Offer\"] == \"Declined offer\"]\n", 1535 | "# This is equivalent to:\n", 1536 | "# mask = df[\"Offer\"] == \"Declined offer\"\n", 1537 | "# df.loc[mask]" 1538 | ] 1539 | }, 1540 | { 1541 | "cell_type": "markdown", 1542 | "metadata": {}, 1543 | "source": [ 1544 | "#### 3.2.2 Selecting rows by labels\n", 1545 | "##### 3.2.2.1 Creating labels\n", 1546 | "Currently, our `DataFrame` only has the default integer labels (0, 1, 2, ...). To create your own labels, use `.set_index()`.\n", 1547 | "\n", 1548 | "1. Labels can be integers or strings\n", 1549 | "2. A `DataFrame` can use more than one column as labels" 1550 | ] 1551 | }, 1552 | { 1553 | "cell_type": "code", 1554 | "execution_count": 22, 1555 | "metadata": {}, 1556 | "outputs": [ 1557 | { 1558 | "data": { 1559 | "text/html": [ 1560 | "
" 1751 | ], 1752 | "text/plain": [ 1753 | " Company Title Job Level \\\n", 1754 | "Type \n", 1755 | "Hardware Orange Software Engineer Software Engineer Engineer \n", 1756 | "Hardware Orange Software Engineer Software Engineer Engineer \n", 1757 | "Hardware Orange Software Engineer Software Engineer Engineer \n", 1758 | "Hardware Orange Software Engineer Software Engineer Engineer \n", 1759 | "Hardware Orange Software Engineer Software Engineer Engineer \n", 1760 | "... ... ... ... ... \n", 1761 | "Software Tencent Software Engineer Software Engineer Engineer \n", 1762 | "Software Tencent Software Engineer Software Engineer Engineer \n", 1763 | "Software Tencent Software Engineer Software Engineer Engineer \n", 1764 | "Software Tencent Software Engineer Software Engineer Engineer \n", 1765 | "Software Tencent Software Engineer Software Engineer Engineer \n", 1766 | "\n", 1767 | " Date Upvotes Offer Experience Difficulty \\\n", 1768 | "Type \n", 1769 | "Hardware Aug 7, 2019 0 No offer 0.0 Medium \n", 1770 | "Hardware Aug 8, 2019 0 Accepted offer 1.0 Hard \n", 1771 | "Hardware NaN 0 Declined offer 0.0 Medium \n", 1772 | "Hardware NaN 9 Declined offer -1.0 Medium \n", 1773 | "Hardware May 29, 2009 2 No offer 0.0 Medium \n", 1774 | "... ... ... ... ... ... \n", 1775 | "Software May 25, 2012 0 Declined offer 0.0 Medium \n", 1776 | "Software Mar 15, 2014 0 No offer NaN NaN \n", 1777 | "Software Sep 22, 2015 0 Accepted offer 1.0 Medium \n", 1778 | "Software Jul 4, 2017 0 Declined offer 1.0 Medium \n", 1779 | "Software Sep 30, 2016 0 Declined offer 0.0 Easy \n", 1780 | "\n", 1781 | " Review \n", 1782 | "Type \n", 1783 | "Hardware Orange is love. Orange is life. \n", 1784 | "Hardware Application I applied online. The process too... \n", 1785 | "Hardware Application The process took 4 weeks. I inter... \n", 1786 | "Hardware Application The process took a week. I interv... \n", 1787 | "Hardware Application I applied through an employee ref... \n", 1788 | "... ... 
\n", 1789 | "Software Application I applied online. The process too... \n", 1790 | "Software Application I applied through college or univ... \n", 1791 | "Software Application I applied through college or univ... \n", 1792 | "Software Application I applied through college or univ... \n", 1793 | "Software Application I applied online. The process too... \n", 1794 | "\n", 1795 | "[17654 rows x 10 columns]" 1796 | ] 1797 | }, 1798 | "execution_count": 22, 1799 | "metadata": {}, 1800 | "output_type": "execute_result" 1801 | } 1802 | ], 1803 | "source": [ 1804 | "# Adding label \"Hardware\" if the company name is \"Orange\", \"Dell\", \"IBM\", or \"Siemens\".\n", 1805 | "# \"Orange\" because we changed \"Apple\" to \"Orange\" above.\n", 1806 | "# Adding label \"Software\" otherwise.\n", 1807 | "\n", 1808 | "def company_type(x):\n", 1809 | "    hardware_companies = {\"Orange\", \"Dell\", \"IBM\", \"Siemens\"}\n", 1810 | "    return \"Hardware\" if x[\"Company\"] in hardware_companies else \"Software\"\n", 1811 | "df[\"Type\"] = df.apply(company_type, axis=1)\n", 1812 | "\n", 1813 | "# Setting \"Type\" to be the labels (pandas calls them the index).\n", 1814 | "df = df.set_index(\"Type\")\n", 1815 | "df\n", 1816 | "# Label columns aren't considered part of the DataFrame's content.\n", 1817 | "# After adding labels to your DataFrame, it still has 10 columns, same as before." 1818 | ] 1819 | }, 1820 | { 1821 | "cell_type": "markdown", 1822 | "metadata": {}, 1823 | "source": [ 1824 | "**Warning**: when you write a `DataFrame` to a file with `.to_csv()`, its labels are stored as normal columns, and they need to be set explicitly again after loading the file. If you send your CSV file to other people without explaining, they'll have no way of knowing which columns are labels. This might cause reproducibility issues. See [this Stack Overflow answer](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples)."
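The round trip in the warning can be sketched on a toy `DataFrame`. This is only an illustration under a few assumptions: the two rows below are made up, the names `toy`, `naive`, and `restored` are hypothetical, and an in-memory buffer stands in for a real CSV file. `index_col` is the `pd.read_csv()` parameter that turns a column back into labels.

```python
import io

import pandas as pd

# Toy stand-in for the interviews data; the values are made up for illustration.
toy = pd.DataFrame(
    {"Type": ["Hardware", "Software"], "Company": ["Orange", "Tencent"]}
).set_index("Type")

# Write to an in-memory buffer instead of a real file.
buffer = io.StringIO()
toy.to_csv(buffer)

# Loading without extra arguments brings "Type" back as a normal column
# and gives the rows default integer labels.
buffer.seek(0)
naive = pd.read_csv(buffer)

# Passing index_col tells pandas which column holds the labels.
buffer.seek(0)
restored = pd.read_csv(buffer, index_col="Type")
```

Nothing in the CSV itself says that "Type" was a label, so whoever loads the file has to know to pass `index_col="Type"`.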
1825 | ] 1826 | }, 1827 | { 1828 | "cell_type": "markdown", 1829 | "metadata": {}, 1830 | "source": [ 1831 | "##### 3.2.2.2 Selecting rows by labels" 1832 | ] 1833 | }, 1834 | { 1835 | "cell_type": "code", 1836 | "execution_count": 23, 1837 | "metadata": {}, 1838 | "outputs": [ 1839 | { 1840 | "data": { 1841 | "text/html": [ 1842 | "
" 2033 | ], 2034 | "text/plain": [ 2035 | " Company Title Job Level \\\n", 2036 | "Type \n", 2037 | "Hardware Orange Software Engineer Software Engineer Engineer \n", 2038 | "Hardware Orange Software Engineer Software Engineer Engineer \n", 2039 | "Hardware Orange Software Engineer Software Engineer Engineer \n", 2040 | "Hardware Orange Software Engineer Software Engineer Engineer \n", 2041 | "Hardware Orange Software Engineer Software Engineer Engineer \n", 2042 | "... ... ... ... ... \n", 2043 | "Hardware IBM Senior Software Engineer Software Engineer Senior \n", 2044 | "Hardware IBM Senior Software Engineer Software Engineer Senior \n", 2045 | "Hardware IBM Senior Software Engineer Software Engineer Senior \n", 2046 | "Hardware IBM Senior Software Engineer Software Engineer Senior \n", 2047 | "Hardware IBM Senior Software Engineer Software Engineer Senior \n", 2048 | "\n", 2049 | " Date Upvotes Offer Experience Difficulty \\\n", 2050 | "Type \n", 2051 | "Hardware Aug 7, 2019 0 No offer 0.0 Medium \n", 2052 | "Hardware Aug 8, 2019 0 Accepted offer 1.0 Hard \n", 2053 | "Hardware NaN 0 Declined offer 0.0 Medium \n", 2054 | "Hardware NaN 9 Declined offer -1.0 Medium \n", 2055 | "Hardware May 29, 2009 2 No offer 0.0 Medium \n", 2056 | "... ... ... ... ... ... \n", 2057 | "Hardware Sep 20, 2015 0 No offer -1.0 Easy \n", 2058 | "Hardware Sep 14, 2015 0 Accepted offer -1.0 Medium \n", 2059 | "Hardware Aug 6, 2015 0 Accepted offer 1.0 Hard \n", 2060 | "Hardware Dec 13, 2015 0 Accepted offer 1.0 Medium \n", 2061 | "Hardware Feb 15, 2016 11 Accepted offer 1.0 Easy \n", 2062 | "\n", 2063 | " Review \n", 2064 | "Type \n", 2065 | "Hardware Orange is love. Orange is life. \n", 2066 | "Hardware Application I applied online. The process too... \n", 2067 | "Hardware Application The process took 4 weeks. I inter... \n", 2068 | "Hardware Application The process took a week. I interv... \n", 2069 | "Hardware Application I applied through an employee ref... \n", 2070 | "... ... 
\n", 2071 | "Hardware Application I applied through a recruiter. Th... \n", 2072 | "Hardware Application I applied in-person. The process ... \n", 2073 | "Hardware Application I applied through a recruiter. Th... \n", 2074 | "Hardware Application I applied online. The process too... \n", 2075 | "Hardware Application I applied online. The process too... \n", 2076 | "\n", 2077 | "[1676 rows x 10 columns]" 2078 | ] 2079 | }, 2080 | "execution_count": 23, 2081 | "metadata": {}, 2082 | "output_type": "execute_result" 2083 | } 2084 | ], 2085 | "source": [ 2086 | "# Selecting rows with label \"Hardware\"\n", 2087 | "df.loc[\"Hardware\"]" 2088 | ] 2089 | }, 2090 | { 2091 | "cell_type": "code", 2092 | "execution_count": 24, 2093 | "metadata": {}, 2094 | "outputs": [ 2095 | { 2096 | "data": { 2097 | "text/html": [ 2098 | "
" 2276 | ], 2277 | "text/plain": [ 2278 | " Company Title Job Level Date \\\n", 2279 | "0 Orange Software Engineer Software Engineer Engineer Aug 7, 2019 \n", 2280 | "1 Orange Software Engineer Software Engineer Engineer Aug 8, 2019 \n", 2281 | "2 Orange Software Engineer Software Engineer Engineer NaN \n", 2282 | "3 Orange Software Engineer Software Engineer Engineer NaN \n", 2283 | "4 Orange Software Engineer Software Engineer Engineer May 29, 2009 \n", 2284 | "... ... ... ... ... ... \n", 2285 | "17649 Tencent Software Engineer Software Engineer Engineer May 25, 2012 \n", 2286 | "17650 Tencent Software Engineer Software Engineer Engineer Mar 15, 2014 \n", 2287 | "17651 Tencent Software Engineer Software Engineer Engineer Sep 22, 2015 \n", 2288 | "17652 Tencent Software Engineer Software Engineer Engineer Jul 4, 2017 \n", 2289 | "17653 Tencent Software Engineer Software Engineer Engineer Sep 30, 2016 \n", 2290 | "\n", 2291 | " Upvotes Offer Experience Difficulty \\\n", 2292 | "0 0 No offer 0.0 Medium \n", 2293 | "1 0 Accepted offer 1.0 Hard \n", 2294 | "2 0 Declined offer 0.0 Medium \n", 2295 | "3 9 Declined offer -1.0 Medium \n", 2296 | "4 2 No offer 0.0 Medium \n", 2297 | "... ... ... ... ... \n", 2298 | "17649 0 Declined offer 0.0 Medium \n", 2299 | "17650 0 No offer NaN NaN \n", 2300 | "17651 0 Accepted offer 1.0 Medium \n", 2301 | "17652 0 Declined offer 1.0 Medium \n", 2302 | "17653 0 Declined offer 0.0 Easy \n", 2303 | "\n", 2304 | " Review \n", 2305 | "0 Orange is love. Orange is life. \n", 2306 | "1 Application I applied online. The process too... \n", 2307 | "2 Application The process took 4 weeks. I inter... \n", 2308 | "3 Application The process took a week. I interv... \n", 2309 | "4 Application I applied through an employee ref... \n", 2310 | "... ... \n", 2311 | "17649 Application I applied online. The process too... \n", 2312 | "17650 Application I applied through college or univ... 
\n", 2313 | "17651 Application I applied through college or univ... \n", 2314 | "17652 Application I applied through college or univ... \n", 2315 | "17653 Application I applied online. The process too... \n", 2316 | "\n", 2317 | "[17654 rows x 10 columns]" 2318 | ] 2319 | }, 2320 | "execution_count": 24, 2321 | "metadata": {}, 2322 | "output_type": "execute_result" 2323 | } 2324 | ], 2325 | "source": [ 2326 | "# To drop a label, you need to use reset_index with drop=True\n", 2327 | "df.reset_index(drop=True, inplace=True)\n", 2328 | "df" 2329 | ] 2330 | }, 2331 | { 2332 | "cell_type": "markdown", 2333 | "metadata": {}, 2334 | "source": [ 2335 | "### 3.3 Slicing Series\n", 2336 | "Slicing pandas `Series` is similar to slicing in Python." 2337 | ] 2338 | }, 2339 | { 2340 | "cell_type": "code", 2341 | "execution_count": 25, 2342 | "metadata": {}, 2343 | "outputs": [ 2344 | { 2345 | "data": { 2346 | "text/plain": [ 2347 | "0 Orange\n", 2348 | "100 Orange\n", 2349 | "200 Orange\n", 2350 | "300 Orange\n", 2351 | "400 Intel\n", 2352 | "500 Intel\n", 2353 | "600 Intel\n", 2354 | "700 Intel\n", 2355 | "800 Uber\n", 2356 | "900 Uber\n", 2357 | "Name: Company, dtype: object" 2358 | ] 2359 | }, 2360 | "execution_count": 25, 2361 | "metadata": {}, 2362 | "output_type": "execute_result" 2363 | } 2364 | ], 2365 | "source": [ 2366 | "series = df.Company\n", 2367 | "# The first 1000 companies, picking every 100th company\n", 2368 | "series[:1000:100]" 2369 | ] 2370 | }, 2371 | { 2372 | "cell_type": "markdown", 2373 | "metadata": {}, 2374 | "source": [ 2375 | "## 4. Accessors\n", 2376 | "\n", 2377 | "### 4.1 string accessor\n", 2378 | "`.str` allows you to apply built-in string functions to all strings in a column (aka a pandas Series). These built-in functions come in handy when you want to do some basic string processing."
2379 | ] 2380 | }, 2381 | { 2382 | "cell_type": "code", 2383 | "execution_count": 26, 2384 | "metadata": {}, 2385 | "outputs": [ 2386 | { 2387 | "data": { 2388 | "text/plain": [ 2389 | "0 orange is love. orange is life.\n", 2390 | "1 application i applied online. the process too...\n", 2391 | "2 application the process took 4 weeks. i inter...\n", 2392 | "3 application the process took a week. i interv...\n", 2393 | "4 application i applied through an employee ref...\n", 2394 | " ... \n", 2395 | "17649 application i applied online. the process too...\n", 2396 | "17650 application i applied through college or univ...\n", 2397 | "17651 application i applied through college or univ...\n", 2398 | "17652 application i applied through college or univ...\n", 2399 | "17653 application i applied online. the process too...\n", 2400 | "Name: Review, Length: 17654, dtype: object" 2401 | ] 2402 | }, 2403 | "execution_count": 26, 2404 | "metadata": {}, 2405 | "output_type": "execute_result" 2406 | } 2407 | ], 2408 | "source": [ 2409 | "# Lowercase all the reviews in the `Review` column.\n", 2410 | "df[\"Review\"].str.lower()" 2411 | ] 2412 | }, 2413 | { 2414 | "cell_type": "code", 2415 | "execution_count": 27, 2416 | "metadata": {}, 2417 | "outputs": [ 2418 | { 2419 | "data": { 2420 | "text/plain": [ 2421 | "0 31\n", 2422 | "1 670\n", 2423 | "2 350\n", 2424 | "3 807\n", 2425 | "4 663\n", 2426 | " ... \n", 2427 | "17649 470\n", 2428 | "17650 394\n", 2429 | "17651 524\n", 2430 | "17652 391\n", 2431 | "17653 784\n", 2432 | "Name: Review, Length: 17654, dtype: int64" 2433 | ] 2434 | }, 2435 | "execution_count": 27, 2436 | "metadata": {}, 2437 | "output_type": "execute_result" 2438 | } 2439 | ], 2440 | "source": [ 2441 | "# Or get the length (in characters) of every review\n", 2442 | "df.Review.str.len()" 2443 | ] 2444 | }, 2445 | { 2446 | "cell_type": "markdown", 2447 | "metadata": {}, 2448 | "source": [ 2449 | "`.str` can be very powerful if you use it with regex.
Imagine you want to get a sense of how long the interview process takes for each review. You notice that each review mentions how long it takes, such as \"the process took 4 weeks\". So you use this heuristic:\n", 2450 | "- a process is short if it takes days\n", 2451 | "- a process is average if it takes weeks\n", 2452 | "- a process is long if it takes at least 4 weeks (or is measured in months)" 2453 | ] 2454 | }, 2455 | { 2456 | "cell_type": "code", 2457 | "execution_count": 28, 2458 | "metadata": {}, 2459 | "outputs": [ 2460 | { 2461 | "data": { 2462 | "text/html": [ 2463 | "
\n", 2464 | "\n", 2477 | "\n", 2478 | " \n", 2479 | " \n", 2480 | " \n", 2481 | " \n", 2482 | " \n", 2483 | " \n", 2484 | " \n", 2485 | " \n", 2486 | " \n", 2487 | " \n", 2488 | " \n", 2489 | " \n", 2490 | " \n", 2491 | " \n", 2492 | " \n", 2493 | " \n", 2494 | " \n", 2495 | " \n", 2496 | " \n", 2497 | " \n", 2498 | " \n", 2499 | " \n", 2500 | " \n", 2501 | " \n", 2502 | " \n", 2503 | " \n", 2504 | " \n", 2505 | " \n", 2506 | " \n", 2507 | " \n", 2508 | " \n", 2509 | " \n", 2510 | " \n", 2511 | " \n", 2512 | " \n", 2513 | " \n", 2514 | " \n", 2515 | " \n", 2516 | " \n", 2517 | " \n", 2518 | " \n", 2519 | " \n", 2520 | " \n", 2521 | " \n", 2522 | " \n", 2523 | " \n", 2524 | " \n", 2525 | " \n", 2526 | " \n", 2527 | " \n", 2528 | " \n", 2529 | " \n", 2530 | " \n", 2531 | " \n", 2532 | " \n", 2533 | " \n", 2534 | " \n", 2535 | " \n", 2536 | " \n", 2537 | " \n", 2538 | " \n", 2539 | " \n", 2540 | " \n", 2541 | " \n", 2542 | "
\n", 2544 | "
" 2545 | ], 2546 | "text/plain": [ 2547 | " Review Process\n", 2548 | "1 Application I applied online. The process too... Long\n", 2549 | "2 Application The process took 4 weeks. I inter... Long\n", 2550 | "3 Application The process took a week. I interv... Average\n", 2551 | "5 Application I applied through college or univ... Long\n", 2552 | "6 Application The process took 2 days. I interv... Short\n", 2553 | "... ... ...\n", 2554 | "17645 Application I applied online. The process too... Average\n", 2555 | "17647 Application I applied through college or univ... Average\n", 2556 | "17648 Application I applied online. The process too... Short\n", 2557 | "17649 Application I applied online. The process too... Average\n", 2558 | "17653 Application I applied online. The process too... Average\n", 2559 | "\n", 2560 | "[12045 rows x 2 columns]" 2561 | ] 2562 | }, 2563 | "execution_count": 28, 2564 | "metadata": {}, 2565 | "output_type": "execute_result" 2566 | } 2567 | ], 2568 | "source": [ 2569 | "df.loc[df[\"Review\"].str.contains(\"days\"), \"Process\"] = \"Short\"\n", 2570 | "df.loc[df[\"Review\"].str.contains(\"week\"), \"Process\"] = \"Average\"\n", 2571 | "df.loc[df[\"Review\"].str.contains(\"month|[4-9]+[^ ]* weeks|[1-9]\\d{1,}[^ ]* weeks\"), \"Process\"] = \"Long\"\n", 2572 | "df[~df.Process.isna()][[\"Review\", \"Process\"]]" 2573 | ] 2574 | }, 2575 | { 2576 | "cell_type": "markdown", 2577 | "metadata": {}, 2578 | "source": [ 2579 | "We want to sanity check if `Process` corresponds to `Review`, but `Review` is cut off in the display above. To show wider columns, you can set `display.max_colwidth` to `100`.\n", 2580 | "\n", 2581 | "**Note**: set_option has several great options you should check out." 2582 | ] 2583 | }, 2584 | { 2585 | "cell_type": "code", 2586 | "execution_count": 29, 2587 | "metadata": {}, 2588 | "outputs": [ 2589 | { 2590 | "data": { 2591 | "text/html": [ 2592 | "
\n", 2593 | "\n", 2606 | "\n", 2607 | " \n", 2608 | " \n", 2609 | " \n", 2610 | " \n", 2611 | " \n", 2612 | " \n", 2613 | " \n", 2614 | " \n", 2615 | " \n", 2616 | " \n", 2617 | " \n", 2618 | " \n", 2619 | " \n", 2620 | " \n", 2621 | " \n", 2622 | " \n", 2623 | " \n", 2624 | " \n", 2625 | " \n", 2626 | " \n", 2627 | " \n", 2628 | " \n", 2629 | " \n", 2630 | " \n", 2631 | " \n", 2632 | " \n", 2633 | " \n", 2634 | " \n", 2635 | " \n", 2636 | " \n", 2637 | " \n", 2638 | " \n", 2639 | " \n", 2640 | " \n", 2641 | " \n", 2642 | " \n", 2643 | " \n", 2644 | " \n", 2645 | " \n", 2646 | " \n", 2647 | " \n", 2648 | " \n", 2649 | " \n", 2650 | " \n", 2651 | " \n", 2652 | " \n", 2653 | " \n", 2654 | " \n", 2655 | " \n", 2656 | " \n", 2657 | " \n", 2658 | " \n", 2659 | " \n", 2660 | " \n", 2661 | " \n", 2662 | " \n", 2663 | " \n", 2664 | " \n", 2665 | " \n", 2666 | " \n", 2667 | " \n", 2668 | " \n", 2669 | " \n", 2670 | " \n", 2671 | "
\n", 2673 | "
" 2674 | ], 2675 | "text/plain": [ 2676 | " Review \\\n", 2677 | "1 Application I applied online. The process took 2+ months. I interviewed at Apple (San Jose, CA)... \n", 2678 | "2 Application The process took 4 weeks. I interviewed at Apple (San Antonio, TX) in February 2016... \n", 2679 | "3 Application The process took a week. I interviewed at Apple (Cupertino, CA) in December 2008. ... \n", 2680 | "5 Application I applied through college or university. The process took 6 weeks. I interviewed at... \n", 2681 | "6 Application The process took 2 days. I interviewed at Apple (Cupertino, CA) in March 2009. Int... \n", 2682 | "... ... \n", 2683 | "17645 Application I applied online. The process took a week. I interviewed at Tencent (Palo Alto, CA)... \n", 2684 | "17647 Application I applied through college or university. The process took 4+ weeks. I interviewed a... \n", 2685 | "17648 Application I applied online. The process took 2 days. I interviewed at Tencent. Interview I ... \n", 2686 | "17649 Application I applied online. The process took a week. I interviewed at Tencent (Beijing, Beiji... \n", 2687 | "17653 Application I applied online. The process took 3+ weeks. I interviewed at Tencent (Beijing, Bei... \n", 2688 | "\n", 2689 | " Process \n", 2690 | "1 Long \n", 2691 | "2 Long \n", 2692 | "3 Average \n", 2693 | "5 Long \n", 2694 | "6 Short \n", 2695 | "... ... 
\n", 2696 | "17645 Average \n", 2697 | "17647 Average \n", 2698 | "17648 Short \n", 2699 | "17649 Average \n", 2700 | "17653 Average \n", 2701 | "\n", 2702 | "[12045 rows x 2 columns]" 2703 | ] 2704 | }, 2705 | "execution_count": 29, 2706 | "metadata": {}, 2707 | "output_type": "execute_result" 2708 | } 2709 | ], 2710 | "source": [ 2711 | "pd.set_option('display.max_colwidth', 100)\n", 2712 | "df[~df.Process.isna()][[\"Review\", \"Process\"]]" 2713 | ] 2714 | }, 2715 | { 2716 | "cell_type": "code", 2717 | "execution_count": 30, 2718 | "metadata": {}, 2719 | "outputs": [ 2720 | { 2721 | "data": { 2722 | "text/plain": [ 2723 | "dict_keys(['__module__', '__annotations__', '__doc__', '__init__', '_validate', '__getitem__', '__iter__', '_wrap_result', '_get_series_list', 'cat', 'split', 'rsplit', 'partition', 'rpartition', 'get', 'join', 'contains', 'match', 'fullmatch', 'replace', 'repeat', 'pad', 'center', 'ljust', 'rjust', 'zfill', 'slice', 'slice_replace', 'decode', 'encode', 'strip', 'lstrip', 'rstrip', 'wrap', 'get_dummies', 'translate', 'count', 'startswith', 'endswith', 'findall', 'extract', 'extractall', 'find', 'rfind', 'normalize', 'index', 'rindex', 'len', '_doc_args', 'lower', 'upper', 'title', 'capitalize', 'swapcase', 'casefold', 'isalnum', 'isalpha', 'isdigit', 'isspace', 'islower', 'isupper', 'istitle', 'isnumeric', 'isdecimal', '_make_accessor'])" 2724 | ] 2725 | }, 2726 | "execution_count": 30, 2727 | "metadata": {}, 2728 | "output_type": "execute_result" 2729 | } 2730 | ], 2731 | "source": [ 2732 | "# To see the built-in functions available for `.str`, use this\n", 2733 | "pd.Series.str.__dict__.keys()" 2734 | ] 2735 | }, 2736 | { 2737 | "cell_type": "markdown", 2738 | "metadata": {}, 2739 | "source": [ 2740 | "### 4.2 Other accessors\n", 2741 | "pandas `Series` has 3 other accessors.\n", 2742 | "- `.dt`: handles date formats\n", 2743 | "- `.cat`: handles categorical data\n", 2744 | "- `.sparse`: handles sparse matrices" 2745 | ] 2746 | }, 2747 | { 
2748 | "cell_type": "code", 2749 | "execution_count": 31, 2750 | "metadata": {}, 2751 | "outputs": [ 2752 | { 2753 | "data": { 2754 | "text/plain": [ 2755 | "{'cat', 'dt', 'sparse', 'str'}" 2756 | ] 2757 | }, 2758 | "execution_count": 31, 2759 | "metadata": {}, 2760 | "output_type": "execute_result" 2761 | } 2762 | ], 2763 | "source": [ 2764 | "pd.Series._accessors" 2765 | ] 2766 | }, 2767 | { 2768 | "cell_type": "markdown", 2769 | "metadata": {}, 2770 | "source": [ 2771 | "## 5. Data exploration\n", 2772 | "When analyzing data, you might want to take a look at the data. pandas has some great built-in functions for that.\n", 2773 | "\n", 2774 | "### 5.1 `.head()`, `.tail()`, `.describe()`, `.info()`\n", 2775 | "You're probably familiar with `.head()` and `.tail()` methods for showing the first/last rows of `DataFrame`. By default, 5 rows are shown, but you can specify the exact number." 2776 | ] 2777 | }, 2778 | { 2779 | "cell_type": "code", 2780 | "execution_count": 32, 2781 | "metadata": {}, 2782 | "outputs": [ 2783 | { 2784 | "data": { 2785 | "text/html": [ 2786 | "
\n", 2787 | "\n", 2800 | "\n", 2801 | " \n", 2802 | " \n", 2803 | " \n", 2804 | " \n", 2805 | " \n", 2806 | " \n", 2807 | " \n", 2808 | " \n", 2809 | " \n", 2810 | " \n", 2811 | " \n", 2812 | " \n", 2813 | " \n", 2814 | " \n", 2815 | " \n", 2816 | " \n", 2817 | " \n", 2818 | " \n", 2819 | " \n", 2820 | " \n", 2821 | " \n", 2822 | " \n", 2823 | " \n", 2824 | " \n", 2825 | " \n", 2826 | " \n", 2827 | " \n", 2828 | " \n", 2829 | " \n", 2830 | " \n", 2831 | " \n", 2832 | " \n", 2833 | " \n", 2834 | " \n", 2835 | " \n", 2836 | " \n", 2837 | " \n", 2838 | " \n", 2839 | " \n", 2840 | " \n", 2841 | " \n", 2842 | " \n", 2843 | " \n", 2844 | " \n", 2845 | " \n", 2846 | " \n", 2847 | " \n", 2848 | " \n", 2849 | " \n", 2850 | " \n", 2851 | " \n", 2852 | " \n", 2853 | " \n", 2854 | " \n", 2855 | " \n", 2856 | " \n", 2857 | " \n", 2858 | " \n", 2859 | " \n", 2860 | " \n", 2861 | " \n", 2862 | " \n", 2863 | " \n", 2864 | " \n", 2865 | " \n", 2866 | " \n", 2867 | " \n", 2868 | " \n", 2869 | " \n", 2870 | " \n", 2871 | " \n", 2872 | " \n", 2873 | " \n", 2874 | " \n", 2875 | " \n", 2876 | " \n", 2877 | " \n", 2878 | " \n", 2879 | " \n", 2880 | " \n", 2881 | " \n", 2882 | " \n", 2883 | " \n", 2884 | " \n", 2885 | " \n", 2886 | " \n", 2887 | " \n", 2888 | " \n", 2889 | " \n", 2890 | " \n", 2891 | " \n", 2892 | " \n", 2893 | " \n", 2894 | " \n", 2895 | " \n", 2896 | " \n", 2897 | " \n", 2898 | " \n", 2899 | " \n", 2900 | " \n", 2901 | " \n", 2902 | " \n", 2903 | " \n", 2904 | " \n", 2905 | " \n", 2906 | " \n", 2907 | " \n", 2908 | " \n", 2909 | " \n", 2910 | " \n", 2911 | " \n", 2912 | " \n", 2913 | " \n", 2914 | " \n", 2915 | " \n", 2916 | " \n", 2917 | " \n", 2918 | " \n", 2919 | " \n", 2920 | " \n", 2921 | " \n", 2922 | " \n", 2923 | " \n", 2924 | " \n", 2925 | " \n", 2926 | " \n", 2927 | " \n", 2928 | " \n", 2929 | " \n", 2930 | " \n", 2931 | "
\n", 2932 | "
" 2933 | ], 2934 | "text/plain": [ 2935 | " Company Title Job Level Date \\\n", 2936 | "17646 Tencent Software Engineer Software Engineer Engineer May 28, 2010 \n", 2937 | "17647 Tencent Software Engineer Software Engineer Engineer Apr 11, 2019 \n", 2938 | "17648 Tencent Software Engineer Software Engineer Engineer Nov 4, 2012 \n", 2939 | "17649 Tencent Software Engineer Software Engineer Engineer May 25, 2012 \n", 2940 | "17650 Tencent Software Engineer Software Engineer Engineer Mar 15, 2014 \n", 2941 | "17651 Tencent Software Engineer Software Engineer Engineer Sep 22, 2015 \n", 2942 | "17652 Tencent Software Engineer Software Engineer Engineer Jul 4, 2017 \n", 2943 | "17653 Tencent Software Engineer Software Engineer Engineer Sep 30, 2016 \n", 2944 | "\n", 2945 | " Upvotes Offer Experience Difficulty \\\n", 2946 | "17646 0 Declined offer 0.0 Easy \n", 2947 | "17647 0 Accepted offer 1.0 Medium \n", 2948 | "17648 0 No offer NaN NaN \n", 2949 | "17649 0 Declined offer 0.0 Medium \n", 2950 | "17650 0 No offer NaN NaN \n", 2951 | "17651 0 Accepted offer 1.0 Medium \n", 2952 | "17652 0 Declined offer 1.0 Medium \n", 2953 | "17653 0 Declined offer 0.0 Easy \n", 2954 | "\n", 2955 | " Review \\\n", 2956 | "17646 Application I applied through an employee referral. The process took 1 day. I interviewed at Te... \n", 2957 | "17647 Application I applied through college or university. The process took 4+ weeks. I interviewed a... \n", 2958 | "17648 Application I applied online. The process took 2 days. I interviewed at Tencent. Interview I ... \n", 2959 | "17649 Application I applied online. The process took a week. I interviewed at Tencent (Beijing, Beiji... \n", 2960 | "17650 Application I applied through college or university. I interviewed at Tencent. Interview Prof... \n", 2961 | "17651 Application I applied through college or university. The process took 1 day. I interviewed at T... \n", 2962 | "17652 Application I applied through college or university. 
I interviewed at Tencent (London, England ... \n", 2963 | "17653 Application I applied online. The process took 3+ weeks. I interviewed at Tencent (Beijing, Bei... \n", 2964 | "\n", 2965 | " Process \n", 2966 | "17646 NaN \n", 2967 | "17647 Average \n", 2968 | "17648 Short \n", 2969 | "17649 Average \n", 2970 | "17650 NaN \n", 2971 | "17651 NaN \n", 2972 | "17652 NaN \n", 2973 | "17653 Average " 2974 | ] 2975 | }, 2976 | "execution_count": 32, 2977 | "metadata": {}, 2978 | "output_type": "execute_result" 2979 | } 2980 | ], 2981 | "source": [ 2982 | "df.tail(8)" 2983 | ] 2984 | }, 2985 | { 2986 | "cell_type": "code", 2987 | "execution_count": 33, 2988 | "metadata": {}, 2989 | "outputs": [ 2990 | { 2991 | "data": { 2992 | "text/html": [ 2993 | "
\n", 2994 | "\n", 3007 | "\n", 3008 | " \n", 3009 | " \n", 3010 | " \n", 3011 | " \n", 3012 | " \n", 3013 | " \n", 3014 | " \n", 3015 | " \n", 3016 | " \n", 3017 | " \n", 3018 | " \n", 3019 | " \n", 3020 | " \n", 3021 | " \n", 3022 | " \n", 3023 | " \n", 3024 | " \n", 3025 | " \n", 3026 | " \n", 3027 | " \n", 3028 | " \n", 3029 | " \n", 3030 | " \n", 3031 | " \n", 3032 | " \n", 3033 | " \n", 3034 | " \n", 3035 | " \n", 3036 | " \n", 3037 | " \n", 3038 | " \n", 3039 | " \n", 3040 | " \n", 3041 | " \n", 3042 | " \n", 3043 | " \n", 3044 | " \n", 3045 | " \n", 3046 | " \n", 3047 | " \n", 3048 | " \n", 3049 | " \n", 3050 | " \n", 3051 | " \n", 3052 | " \n", 3053 | " \n", 3054 | " \n", 3055 | " \n", 3056 | " \n", 3057 | "
\n", 3058 | "
" 3059 | ], 3060 | "text/plain": [ 3061 | " Upvotes Experience\n", 3062 | "count 17654.000000 16365.000000\n", 3063 | "mean 2.298459 0.431714\n", 3064 | "std 28.252562 0.759964\n", 3065 | "min 0.000000 -1.000000\n", 3066 | "25% 0.000000 0.000000\n", 3067 | "50% 0.000000 1.000000\n", 3068 | "75% 1.000000 1.000000\n", 3069 | "max 1916.000000 1.000000" 3070 | ] 3071 | }, 3072 | "execution_count": 33, 3073 | "metadata": {}, 3074 | "output_type": "execute_result" 3075 | } 3076 | ], 3077 | "source": [ 3078 | "# Generate statistics about numeric columns.\n", 3079 | "df.describe()\n", 3080 | "\n", 3081 | "# Note:\n", 3082 | "# 1. `.describe()` ignores all non-numeric columns.\n", 3083 | "# 2. It doesn't take into account NaN values. So, the number shown in `count` below is the number of non-NaN entries." 3084 | ] 3085 | }, 3086 | { 3087 | "cell_type": "code", 3088 | "execution_count": 34, 3089 | "metadata": {}, 3090 | "outputs": [ 3091 | { 3092 | "name": "stdout", 3093 | "output_type": "stream", 3094 | "text": [ 3095 | "\n", 3096 | "RangeIndex: 17654 entries, 0 to 17653\n", 3097 | "Data columns (total 11 columns):\n", 3098 | " # Column Non-Null Count Dtype \n", 3099 | "--- ------ -------------- ----- \n", 3100 | " 0 Company 17654 non-null object \n", 3101 | " 1 Title 17654 non-null object \n", 3102 | " 2 Job 17654 non-null object \n", 3103 | " 3 Level 17654 non-null object \n", 3104 | " 4 Date 17652 non-null object \n", 3105 | " 5 Upvotes 17654 non-null int64 \n", 3106 | " 6 Offer 17654 non-null object \n", 3107 | " 7 Experience 16365 non-null float64\n", 3108 | " 8 Difficulty 16376 non-null object \n", 3109 | " 9 Review 17654 non-null object \n", 3110 | " 10 Process 12045 non-null object \n", 3111 | "dtypes: float64(1), int64(1), object(9)\n", 3112 | "memory usage: 1.5+ MB\n" 3113 | ] 3114 | } 3115 | ], 3116 | "source": [ 3117 | "# Show non-null count and types of all columns\n", 3118 | "df.info()\n", 3119 | "\n", 3120 | "# Note: pandas treats string type as object" 3121 
| ] 3122 | }, 3123 | { 3124 | "cell_type": "code", 3125 | "execution_count": 35, 3126 | "metadata": {}, 3127 | "outputs": [ 3128 | { 3129 | "data": { 3130 | "text/plain": [ 3131 | "Company 1126274\n", 3132 | "Title 1358305\n", 3133 | "Job 1345085\n", 3134 | "Level 1144784\n", 3135 | "Date 1212964\n", 3136 | "Upvotes 141384\n", 3137 | "Offer 1184190\n", 3138 | "Experience 141384\n", 3139 | "Difficulty 1058008\n", 3140 | "Review 20503467\n", 3141 | "Process 1274180\n", 3142 | "dtype: int64" 3143 | ] 3144 | }, 3145 | "execution_count": 35, 3146 | "metadata": {}, 3147 | "output_type": "execute_result" 3148 | } 3149 | ], 3150 | "source": [ 3151 | "# You can also see how much memory (in bytes) each column of your DataFrame takes up\n", 3152 | "import sys\n", 3153 | "df.apply(sys.getsizeof)" 3154 | ] 3155 | }, 3156 | { 3157 | "cell_type": "markdown", 3158 | "metadata": {}, 3159 | "source": [ 3160 | "### 5.2 Count unique values\n", 3161 | "You can get the number of unique values in a column (excluding NaN) with `nunique()`."
3162 | ] 3163 | }, 3164 | { 3165 | "cell_type": "code", 3166 | "execution_count": 36, 3167 | "metadata": {}, 3168 | "outputs": [ 3169 | { 3170 | "data": { 3171 | "text/plain": [ 3172 | "28" 3173 | ] 3174 | }, 3175 | "execution_count": 36, 3176 | "metadata": {}, 3177 | "output_type": "execute_result" 3178 | } 3179 | ], 3180 | "source": [ 3181 | "# Get the number of unique companies in our data\n", 3182 | "df.Company.nunique()" 3183 | ] 3184 | }, 3185 | { 3186 | "cell_type": "code", 3187 | "execution_count": 37, 3188 | "metadata": {}, 3189 | "outputs": [ 3190 | { 3191 | "data": { 3192 | "text/plain": [ 3193 | "Amazon 3469\n", 3194 | "Google 3445\n", 3195 | "Facebook 1817\n", 3196 | "Microsoft 1790\n", 3197 | "IBM 873\n", 3198 | "Cisco 787\n", 3199 | "Oracle 701\n", 3200 | "Uber 445\n", 3201 | "Yelp 404\n", 3202 | "Orange 363\n", 3203 | "Intel 338\n", 3204 | "Salesforce 313\n", 3205 | "SAP 275\n", 3206 | "Twitter 258\n", 3207 | "Dell 258\n", 3208 | "Airbnb 233\n", 3209 | "NVIDIA 229\n", 3210 | "Adobe 211\n", 3211 | "Intuit 203\n", 3212 | "PayPal 193\n", 3213 | "Siemens 182\n", 3214 | "Square 177\n", 3215 | "Samsung 159\n", 3216 | "eBay 148\n", 3217 | "Symantec 147\n", 3218 | "Snap 113\n", 3219 | "Netflix 109\n", 3220 | "Tencent 14\n", 3221 | "Name: Company, dtype: int64" 3222 | ] 3223 | }, 3224 | "execution_count": 37, 3225 | "metadata": {}, 3226 | "output_type": "execute_result" 3227 | } 3228 | ], 3229 | "source": [ 3230 | "# You can also see how many reviews there are for each company, sorted in descending order.\n", 3231 | "df.Company.value_counts()" 3232 | ] 3233 | }, 3234 | { 3235 | "cell_type": "markdown", 3236 | "metadata": {}, 3237 | "source": [ 3238 | "### 5.3 Plotting\n", 3239 | "If you want to see the breakdown of process lengths for different companies, you can use `.plot()` with `.groupby()`.\n", 3240 | "\n", 3241 | "**Note**: Plotting in pandas is both mind-boggling and mind-blowing. If you're not familiar with it, you might want to check out some tutorials, e.g.
[this simple tutorial](https://realpython.com/pandas-plot-python/) or [this saiyan-level pandas plotting with seaborn](https://jakevdp.github.io/PythonDataScienceHandbook/04.14-visualization-with-seaborn.html)." 3242 | ] 3243 | }, 3244 | { 3245 | "cell_type": "code", 3246 | "execution_count": 38, 3247 | "metadata": {}, 3248 | "outputs": [ 3249 | { 3250 | "data": { 3251 | "text/plain": [ 3252 | "" 3253 | ] 3254 | }, 3255 | "execution_count": 38, 3256 | "metadata": {}, 3257 | "output_type": "execute_result" 3258 | }, 3259 | { 3260 | "data": { 3261 | "image/png": "iVBORw0KGgoAAAANSUhEUgAAA3MAAAIKCAYAAACJExbHAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8li6FKAAAgAElEQVR4nOzde9xVZZ3//9dHUPGIJ9ISFXI8ESAaEnko1FKTRi0zLVNKy85azs8kp9KOY5Mz/ZIpjQmVGoc0rTS1g4c84IgJxHjASsZQMDM8kWdFPt8/1rpxc3Nz8L732pt136/n43E/uNe1196ftTf73nu913Wta0VmIkmSJEmql3XavQGSJEmSpFfPMCdJkiRJNWSYkyRJkqQaMsxJkiRJUg0Z5iRJkiSphgxzkiRJklRD/du9Aauy1VZb5ZAhQ9q9GZIkSZLUFrNmzXo0Mwd1ddtaHeaGDBnCzJkz270ZkiRJktQWEfHAym5zmKUkSZIk1ZBhTpIkSZJqyDAnSZIkSTW0Vp8zJ0mSJKmeXnrpJRYuXMjzzz/f7k2phQEDBjB48GDWXXfdNb6PYU6SJElS0y1cuJBNNtmEIUOGEBHt3py1Wmby2GOPsXDhQoYOHbrG93OYpSRJkqSme/7559lyyy0NcmsgIthyyy1fdS+mYU6SJElSJQxya647r5VhTpIkSVIt9OvXj1GjRjF8+HCOOuoonn322XZvUlsZ5iRJkiTVwgYbbMCcOXO4++67WW+99Tj//POXuz0zWbp0aZu2rvUMc5IkSZJqZ7/99mPevHnMnz+fXXbZheOPP57hw4ezYMECpk2bxogRIxg+fDinn376svv86le/Ys8992T33XfnwAMPBOCZZ57hhBNOYMyYMeyxxx5cccUVANxzzz2MGTOGUaNGMXLkSO677z6eeeYZxo8fz+67787w4cO55JJL2vLcOzibpSRJkqRaWbJkCb/85S855JBDALjvvvuYOnUqY8eO5S9/+Qunn346s2bNYvPNN+eggw7i5z//Ofvssw8f+chHuPnmmxk6dCiPP/44AF//+tc54IADuOCCC3jyyScZM2YMb3vb2zj//PM55ZRTOPbYY3nxxRd5+eWXueaaa3jd617H1VdfDcDixYvb9hqAPXOSJEmSauK5555j1KhRjB49mu23354TTzwRgB122IGxY8cCcMcddzBu3DgGDRpE//79OfbYY7n55puZMWMGb3nLW5ZN/b/FFlsA8Jvf/Iazzz6bUaNGMW7cOJ5//nkefPBB3vzmN/ONb3yDb37zmzzwwANssMEGjBgxgmuvvZbTTz+dW265hYEDB7bnhSjZMydJkiSpFjrOmetso4026vZjZiaXX345u+yyy3Ltu+22G29605u4+uqrOfTQQ/n+97/PAQccwOzZs7nmmmv4
STVkmJMkSZKkGjLMSZIkSVINGeYkSZIkqYYMc5IkSZJUQ4Y5SZIkSaohw5wkSZIk1ZBhTpIkSZJqyDAnSZIkSTVkmJMkSZKkGjLMSZIkSVINGeYkSZIkqYYMc5IkSZJUQ4Y5SZIkSaohw5wkSZIk1ZBhTpIkSZJqqH+7N0BamwyZePUKbfPPHt+GLZEkSZJWzZ45SZIkSaohw5wkSZIk1ZBhTpIkSZJqyDAnSZIkSTVkmJMkSZKkGjLMSZIkSVINGeYkSZIkqYYMc5IkSZJUQ4Y5SZIkSaohw5wkSZIk1ZBhTpIkSZJqyDAnSZIkSTVkmJMkSZKkGjLMSZIkSVINGeYkSZIkqYYMc5IkSZJUQ4Y5SZIkSaohw5wkSZIk1ZBhTpIkSZJqyDAnSZIkSTXUozAXEZ+NiHsi4u6ImBYRAyJiaETcHhHzIuKSiFivXHf9cnleefuQZjwBSZIkSeqLuh3mImJb4GRgdGYOB/oBxwDfBL6dmf8APAGcWN7lROCJsv3b5XqSJEmSpG7o6TDL/sAGEdEf2BB4GDgAuKy8fSpwRPn74eUy5e0HRkT0sL4kSZIk9UndDnOZ+RBwDvAgRYhbDMwCnszMJeVqC4Fty9+3BRaU911Srr9ld+tLkiRJUl/Wk2GWm1P0tg0FXgdsBBzS0w2KiJMiYmZEzFy0aFFPH06SJEmSeqWeDLN8G/DnzFyUmS8BPwX2ATYrh10CDAYeKn9/CNgOoLx9IPBY5wfNzMmZOTozRw8aNKgHmydJkiRJvVdPwtyDwNiI2LA89+1AYC7wW+A95ToTgCvK368slylvvyEzswf1JUmSJKnP6sk5c7dTTGQyG7irfKzJwOnAqRExj+KcuCnlXaYAW5btpwITe7DdkiRJktSn9V/9KiuXmWcCZ3Zqvh8Y08W6zwNH9aSeJEmSJKnQ00sTSJIkSZLawDAnSZIkSTVkmJMkSZKkGjLMSZIkSVINGeYkSZIkqYYMc5IkSZJUQ4Y5SZIkSaohw5wkSZIk1ZBhTpIkSZJqyDAnSZIkSTVkmJMkSZKkGjLMSZIkSVINGeYkSZIkqYYMc5IkSZJUQ4Y5SZIkSaohw5wkSZIk1ZBhTpIkSZJqyDAnSZIkSTVkmJMkSZKkGjLMSZIkSVINGeYkSZIkqYYMc5IkSZJUQ4Y5SZIkSaohw5wkSZIk1ZBhTpIkSZJqyDAnSZIkSTXUv90bIEm93ZCJV6/QNv/s8W3YEkmS1JvYMydJkiRJNWSYkyRJkqQaMsxJkiRJUg0Z5iRJkiSphgxzkiRJklRDhjlJkiRJqiHDnCRJkiTVkGFOkiRJkmrIMCdJkiRJNWSYkyRJkqQaMsxJkiRJUg0Z5iRJkiSphgxzkiRJklRDhjlJkiRJqiHDnCRJkiTVkGFOkiRJkmrIMCdJkiRJNWSYkyRJkqQaMsxJkiRJUg0Z5iRJkiSphnoU5iJis4i4LCL+EBH3RsSbI2KLiLg2Iu4r/928XDci4tyImBcRd0bEns15CpIkSZLU9/Tv4f2/A/wqM98TEesBGwJnANdn5tkRMRGYCJwOvAPYqfx5E3Be+a964qyBXbQtbv12SJIkSWqpbvfMRcRA4C3AFIDMfDEznwQOB6aWq00Fjih/Pxz4YRZmAJtFxGu7veWSJEmS1If1ZJjlUGARcGFE/D4ifhARGwFbZ+bD5Tp/BbYuf98WWNBw/4VlmyRJkiTpVepJmOsP7Amcl5l7AM9QDKlcJjMTyFfzoBFxUkTMjIiZixYt6sHmSZIkSVLv1ZMwtxBYmJm3l8uXUYS7RzqGT5b//q28/SFgu4b7Dy7blpOZkzNzdGaOHjRoUA82T5IkSZJ6r26Hucz8K7AgInYpmw4E5gJXAhPKtgnAFeXvVwLHl7NajgUWNwzHlCRJkiS9Cj2dzfLTwMXlTJb3Ax+iCIiXRsSJwAPAe8t1rwEOBeYBz5brSpIkSZK6oUdhLjPnAKO7uOnALtZN4JM9qSdJkiRJKvToouGS
JEmSpPYwzEmSJElSDRnmJEmSJKmGDHOSJEmSVEOGOUmSJEmqIcOcJEmSJNWQYU6SJEmSasgwJ0mSJEk1ZJiTJEmSpBoyzEmSJElSDRnmJEmSJKmGDHOSJEmSVEOGOUmSJEmqIcOcJEmSJNWQYU6SJEmSasgwJ0mSJEk1ZJiTJEmSpBoyzEmSJElSDRnmJEmSJKmGDHOSJEmSVEOGOUmSJEmqIcOcJEmSJNWQYU6SJEmSasgwJ0mSJEk1ZJiTJEmSpBoyzEmSJElSDRnmJEmSJKmGDHOSJEmSVEOGOUmSJEmqIcOcJEmSJNWQYU6SJEmSasgwJ0mSJEk1ZJiTJEmSpBoyzEmSJElSDRnmJEmSJKmGDHOSJEmSVEOGOUmSJEmqIcOcJEmSJNWQYU6SJEmSasgwJ0mSJEk1ZJiTJEmSpBoyzEmSJElSDRnmJEmSJKmGDHOSJEmSVEOGOUmSJEmqIcOcJEmSJNWQYU6SJEmSasgwJ0mSJEk11OMwFxH9IuL3EXFVuTw0Im6PiHkRcUlErFe2r18uzytvH9LT2pIkSZLUVzWjZ+4U4N6G5W8C387MfwCeAE4s208Enijbv12uJ0mSJEnqhh6FuYgYDIwHflAuB3AAcFm5ylTgiPL3w8tlytsPLNeXJEmSJL1KPe2Z+/+BzwFLy+UtgSczc0m5vBDYtvx9W2ABQHn74nL95UTESRExMyJmLlq0qIebJ0mSJEm9U7fDXES8E/hbZs5q4vaQmZMzc3Rmjh40aFAzH1qSJEmSeo3+PbjvPsBhEXEoMADYFPgOsFlE9C973wYDD5XrPwRsByyMiP7AQOCxHtSXJEmSpD6r2z1zmfn5zBycmUOAY4AbMvNY4LfAe8rVJgBXlL9fWS5T3n5DZmZ360uSJElSX1bFdeZOB06NiHkU58RNKdunAFuW7acCEyuoLUmSJEl9Qk+GWS6TmTcCN5a/3w+M6WKd54GjmlFPkiRJkvq6KnrmJEmSJEkVM8xJkiRJUg0Z5iRJkiSphgxzkiRJklRDhjlJkiRJqiHDnCRJkiTVkGFOkiRJkmrIMCdJkiRJNWSYkyRJkqQaMsxJkiRJUg0Z5iRJkiSphgxzkiRJklRDhjlJkiRJqiHDnCRJkiTVUP92b4Ak9UlnDeyibXHrt0OSJNWWPXOSJEmSVEP2zEl92JCJV6/QNv/s8W3YEkmSJL1a9sxJkiRJUg0Z5iRJkiSphgxzkiRJklRDhjlJkiRJqiEnQJFWxynkJUmStBayZ06SJEmSasgwJ0mSJEk1ZJiTJEmSpBoyzEmSJElSDRnmJEmSJKmGnM1SqzVk4tUrtM0/e3wbtkSSJElSB3vmJEmSJKmGDHOSJEmSVEOGOUmSJEmqIc+Z64VGTB2xQttdE+5qw5ZIkiRJqoo9c5IkSZJUQ4Y5SZIkSaohw5wkSZIk1ZBhTpIkSZJqyDAnSZIkSTVkmJMkSZKkGjLMSZIkSVINGeYkSZIkqYYMc5IkSZJUQ4Y5SZIkSaohw5wkSZIk1VD/dm+AJEmSpN5nyMSrV2ibf/b4NmxJ72XPnCRJkiTVkGFOkiRJkmrIMCdJkiRJNdTtMBcR20XEbyNibkTcExGnlO1bRMS1EXFf+e/mZXtExLkRMS8i7oyIPZv1JCRJkiSpr+lJz9wS4J8ycxgwFvhkRAwDJgLXZ+ZOwPXlMptNcpQAACAASURBVMA7gJ3Kn5OA83pQW5IkSZL6tG6Hucx8ODNnl78/BdwLbAscDkwtV5sKHFH+fjjwwyzMADaLiNd2e8slSZIkqQ9ryjlzETEE2AO4Hdg6Mx8ub/orsHX5+7bAgoa7LSzbJEmSJEmvUo+vMxcRGwOXA5/JzL9HxLLbMjMjIl/l451EMQyT7bffvqebJ0mSJKkP6UvXt+tRmIuIdSmC3MWZ+dOy+ZGIeG1mPlwOo/xb2f4QsF3D3QeXbcvJzMnAZIDRo0e/qiAoSeq7+tKXtyRJ0LPZLAOYAtybmf/ecNOVwITy
9wnAFQ3tx5ezWo4FFjcMx5QkSZIkvQo96ZnbBzgOuCsi5pRtZwBnA5dGxInAA8B7y9uuAQ4F5gHPAh/qQW1JkiRJ6tO6HeYyczoQK7n5wC7WT+CT3a0nSZIkSXpFU2azlCRJkiS1lmFOkiRJkmrIMCdJkiRJNWSYkyRJkqQaMsxJkiRJUg0Z5iRJkiSphnpynTlJLTZi6ogV2u6acFcbtkSSWmfIxKtXaJt/9vg2bIkkrV0Mc5IkdZMhQ5LUTg6zlCRJkqQasmdOUsvZmyFJktRzhjlJWkt4TqQkSXo1HGYpSZIkSTVkmJMkSZKkGjLMSZIkSVINec6cJEmS1Is58VjvZZjrBv8gJEmSJLWbYU7S8s4a2EXb4tZvhyRJklbJc+YkSZIkqYYMc5IkSZJUQw6zVNN4wePey/9b9Ra+lyVJvYk9c5IkSZJUQ4Y5SZIkSaohw5wkSZIk1ZDnzGmttLZfy8/zblQHa/vfkSRJ6hnDnCSp9+p83cSh27dnO3oJDxBI0trFMCdJkqS28SCB1H2eMydJkiRJNWSYkyRJkqQacpilpD7DoTySJKk3McxJklQzHpiQJIFhTpIk1VHnmUoBzlrc+u2QpDaqdZjzyKQkSZKkvqrWYU5t1NURUa/fJAkPtNljJElqFcOcJEmS+pw+f+CpXXrxAa92vKcMc5IkSWsRQ4akNeV15iRJkiSphuyZkyRJktS79dLhnYY5SZIkObxTqiHDnCRJaqoRU0es0HbXhLvasCU9Y7iRtLYzzElaO/TS4Q+SJElV6RNhrrccIdSK/L+VJPVVfgeqR9aig6i+l7uvT4Q5SZLayR0VSVIVDHPNshYd3ZB6C3eAK+BnlSRJvYZhTlLfZriRpLWPn83SGjHMSVIn9giqt2jJe7mrne6h2ze3hiRVoDd83/e+MOeXSu/l/60kaRXaFV5HdPFdVLcdQgl6R7hZ61Tcy9z7wpwk6VXxy1uSSoZ11UzLw1xEHAJ8B+gH/CAzz271NrSKO0iStHJ+RjaZoxdUhbXo3DU/M6QVtTTMRUQ/4LvA24GFwB0RcWVmzm3ldkiSJNXKWhTWDVVSzzTzb6jVPXNjgHmZeT9ARPwYOBwwzEmSJKlXGzLx6hXa5p89vg1bot6i1WFuW2BBw/JC4E0t3gZJUqutRb0KkiT1FpGZrSsW8R7gkMz8cLl8HPCmzPxUwzonASeVi7sAf+xmua2AR3uwud1lXetat551+9Jzta51rVvPmta1rnX7Zt0dMnNQVze0umfuIWC7huXBZdsymTkZmNzTQhExMzNH9/RxrGtd6/aNun3puVrXutatZ03rWte61u1snWY+2Bq4A9gpIoZGxHrAMcCVLd4GSZIkSaq9lvbMZeaSiPgU8GuKSxNckJn3tHIbJEmSJKk3aPl15jLzGuCaFpTq8VBN61rXun2qbl96rta1rnXrWdO61rWudZfT0glQJEmSJEnN0epz5iRJkiRJTWCYkyRJktogIla44GZXbdLKGOYk9VkRsf6atEmSVJGfr2Fb00REv4joUxMQRsT6EXFyRFwaEZdExKd7y/d9yydAqVpEbJiZz7a45nrArkACf8zMF1tUd1tgBxr+HzPz5lbU7ksiYovMfLxT29DM/HOFNQM4Fnh9Zn6lPEq3TWb+rsKap67q9sz894rq7rmaurOrqFu6Dehcv6u2poqIb2bm6atrq1pEbAcck5nfqrjOazLzb53adsnMP1ZZt10iYgdgp8y8LiI2APpn5lMV1ttiVbd3/vxqUs3XAGcA/wDcBfxLZv692XVWsw37UrzOF0bEIGDjKj+Xy5pdfU4uBmZl5pwqazdswzoUz7Ulr3f5Gb0vxT7OrRV/JnfU3Br4BvC6zHxHRAwD3pyZUyquuxHwXGYuLZfXAQZUsV8ZETsDuwEDI+Kwhps2BQY0u16jzHw5Iu6PiG0z86HV36N5IqIfcF1m7t/KusBU4AXgP8vl95dtx1RZNCL+jYpn7+81YS4i9gZ+
AGwMbB8RuwMfzcxPVFx3PHA+8H9AAEMj4qOZ+cuK634TOBqYC7xcNidQeZiLiMHAJF75cL8FOCUzF1ZU7xdlnS5l5mEru61JfhER7+j44iy/VC4FhldY83vAUuAA4CvAU8DlwF4V1jwHmAP8kuIDLyqs1ejfVnFbUrwGTRUR2wDbAhtExB688lw3BTZsdr0uvB3oHNze0UVb05U7vUcB7wNeB/ys6prALRHxxcy8tNyGfwJOBIZVWbTcWToP2Dozh0fESOCwzPxahTU/ApwEbAHsCAym+I44sKqawCyKv5Wu/mYTeH0FNX9Y1p0EvBM4F/hgBXW6FBFnAqOBXYALgXWB/wL2qbj06PLnF+XyO4E7gY9FxE8y81+rKBoR/w18jOL7/g5g04j4TgsOxHyJ4vPip2XTheXzrOxvqHQRxf/rP5fLfwIuASoNc8D1wNuAp8vlDYHfAHtXUOsNwLuBzShe4w5PAR+toF5nGwP3RsRtwDMdjZn57iqLlkFyaUQMzMzFVdbqZGRmNn7nXBsRc1tQ915gckT0p3hPT2v28+41s1lGxO3Ae4ArM3OPsu3uzKxyh5uI+APwzsycVy7vCFydmbtWXPePFG/MF6qss5La1wL/DfyobPoAcGxmvr2iem9d1e2ZeVMVdRvqjwc+B4yn2HH4IcXzrewobETMzsw9I+L3De/n/83M3SusuTvFDv4hFDtp04Drs7d8SDSIiAkUO56jgZkNNz0FXJSZP+3qfk2o+3HgExQ7+fMabtqE4oj3ByqquwnFTsP7gZ0pdsyOzszBVdTrov5rKaZjfh7YmuLL7Z8y8+lV3rHndW8CTgO+36rvhYiYA4wBbm+oeVdmjqiqZjt0/jzq+MxqYf05wB7A7IbX+c7MHFlx3ZuBQzveuxGxMXA15edmp53FZtadk5mjIuJYipEDE8t6VT/fPwK7Z+bz5fIGwJzM3KXiundk5l6dvgPnZOaoiuuuUKPquhGxb2ZOr+rxV1G3ywNMmXl9C2pfQfH3ey3LB8mTK6w5Dfj3zLyjXH4jcGpmHltVzU71dwE+RLGfdSvwn5n522Y8dq/pmQPIzAXF6LRlXl7Zuk30VEeQK91PsUNYtfspjkS2PMwBgzLzwobliyLiM1UVqzqsrUH9qyNiXYqjc5sA78rMP1Vc9qVyKELCst6UpVUWzMz/Bf4XmFj2dL8PmBQRp2fmlVXWhmKINHAqsH1mnhQROwG7ZOZVza6VmVOBqRFxZGZe3uzHX4UZFD2f/0KxM9bhqSqGwjX4G/A74AvA9MzMiHhXhfWWk5kPR8SvgM9TvI8nVh3kShtm5u86fS8sqbjmC5n5YkfN8mhsyw6IRMTmwE40DNOqavh9Wavjxe3XuFzx+xngxfJ93PEZuVHF9Tq8huW/d1+i6Pl9LiKq/D5et/weOgL4j8x8qeO5V+wvFO+l58vl9YFWDMt7JiK25JXvwLEUw1lbUXfPjqGk5Q7/cxXX/ENEfA4YwvKnzZxUZdHMvL4cabVTZv42IgYA/aqs2eCnvNLb2yojgBkR0TEUeyhFz+TvgazyYFS5P7dr+fMoxb7WqeVIvh4P8+xNYW5BuQOa5QfeKRRHfysRER3d0DMj4hqKYXdJ0VV+R1V1GzwLzImI62n4YqnyqEaDxyLiAxQ9N1Ds9D9WVbGIuIuud4aC4g+wkiOTETGpU92BFMNpPxURVb/W51IMf3tNRHydotf5CxXWW6YMjntQfPAtpAgCrXAhRY9gx3CWh4CfAE0PcxHxgcz8L2BIdHEOTFXnBwI/yMw3RnEO2QMV1ejK5ynOC/geMC0iLmlhbSLiOoqdwuHAdsCUiLg5M/+/iks/Wo6W6NghfA/wcMU1b4qIMyiG8L6doif2F6u5T1NExIcpvvsGUwyZHktxDmjThypTfB7OYvmhnR3nUlU1tLPRpRHxfWCzcmjrCbxyLkyVLgZuL3sWAP4R+O8yTFY5ZOv7wHyKncCbozgvsxXnzC0G7ilH5CTF
EPHfRcS5UOk+x6nAlcCOEXErMIjie7BqnwF+EhF/oXhvb0NxSkuVrqA40Ded1nRCABARJwCfovhb3hHYnuI74m1V187MqWUv7/YtPHf68BbVWU5EfJtiOPYNwDca5j74Ztnz3fMavWUEVURsBXyH4k24DvBrivO4KgkZEXHhqm7PzA9VUbeh/oSV1J1aZd2y9g4U50m8meLD/X+AkzPzwQrrrVRVO8Ure40b6lb6WkfErhTn2QTFcMfKDk6U9U4A3ktxFPYy4NLsNGlFxfVnZuboVgwtLY+GfT+K825WkJlfbnbNsu7vKQLqx4Fvd1G3qhDZUf/1FKHufRS9N2cCP6u6pzkijsjMnzcs9wc+n5lfrbju6ymGd+4NPAH8GfhAZs6vsOY6FOcDHkTxt/trihBf+ZdteeBrL2BGOSRvV4qdh0rPgWmXMiwve50z89oW1d2LVw463ZqZM1e1foXb0T8zK+1pbsf3YPk3NJZiNMEuFP+/f8zMl5pdayX11y3r0oq6rRg+urK6tGlIeET8I8V5+utl5tCIGAV8JSuYAyEiNl3V7VnxREIR8SGK/alnuritKecN9pow1xdFMYvmzuViyz7o2ilaPEtcQ92WvNbRhlnpGmovBe4GOsLxch8OVXzIdqr/PxTh9dbyfMEdKU4UHlNl3VYqx8wfQXH09/zOt1cVIleyLcMpzqF7b2b+Q6vqtkPZa7JOKz4r2ileOc9oDvCmzHwhIu7JzDe0qP6OFO+pY6quGRFDgYc7ncu1dZVBvaF2P4pzPxuHxFVyMLOh5vrAkaw4FO8rVdZtl8aDei2u2zHcf4fM/EiVw/0bav4L8NvM/E1VNVZSd0Zmju14rcv39ZwWhblZFCMGbsyKz2eOiAWsOEFUx3JmZuXX9Kt6+HuvGWZZHoH9DsXRnKQYWvLZzLy/4rqDgI+w4gfsCRXXHUcxpep8ijfkdhExoapzIzrVnkrR6/lkubw58G8teM7tmCWu1a91V7PSLfvQodqhS62eJrizs4BfUby+F1PMSld1D/eFdDGEt6r3cjmc5JtRTNRQ6Yy3a7Atd1NMLX9GVTUiYnpm7hsRT7H869zxJbrKI6Y9qNvlZTaiPI+tyh7QlQwLX0wx0c7XqhotUloYEZtRXKPq2oh4glcOzlQiIl5HMQzt/RRDs/+Fiqf6Lv2E5WcYfLlsq3LGXyLi0xQ92o+UNTs+myudiIRiKN5iiu+Ilp0rX4aZf6GYebZxR7TqYbTXR8SRwE9b0avdoGO4/5vL5cqG+zf4GHB6RDwLvMgrn4+rPLjbBLdGca7egIjYH/gk1T7PRi9l5uJY/nzmSuYFyMztqnjcNdWK4e+9pmcuImYA3+WV87iOAT6dmW+quO7/UEzNP4uGsc5Z8aQK5VGN93eMNY5iCu5pmfnGKuuWtVY4YtaKo2jtGhLQzte6r4nihPexFF9mMzLz0YrrHdmwOAB4F/CXqs89jYiBFDuEbymbbqIYYlLJCf6dwlTHt2fjkclKQlW7rGz4bIcqe0Aj4l8pvgv+u2w6hmJ6878C+2bmP1ZVu9N2vJXiXJhfZQXXPo2IkyiG625Lcc74pcAVmTm02bVWUr+rWQcrnfG3rDGPotezylDeVd3KZ+deSd3pFJ9V36Y4P/BDFL3cX6q47lPARhQTFj1Piz6rWjncv6Fml5OOZGal58+VdU9i+SHh57doSPgUistATKTocT4ZWDczP1Zx3WMort/7jSgmf9k6M2dVXLPy4e+9pmeOYtayHzUs/1dEnNaiui290G9p3Ww4aTQz/1SO826FdSJi88x8ApYNDWzFe6lds8S1/LWOri+kvRh4oKpzJMojsGdQnFv07xSTCexHMenLiVWfFxIR12fmgRTTfHduq0Tngy5RTF3ciimiL6AY0vrecvk4iiPClZzblJmbVPG4q9OuYcOtHK7ahbfl8rOi3RWvXGqkqktPDKA4ut9xAe8pWf0swP9BcXT5/R2fDdGa2RU7LIqIw7Kc
aTciDqeYJa5qC2jNrIqd/U9EjMjMu1pcd4MsZj2MLM5PP6s8wFlpmGvXZxbwYjlkt2PSpB2puCc0i+uurRAyKDoJqvTxzPwPimtxAhARn6L4267apymuIfgCRSfMr4Gqz6H+D4pZ4N9CcUH6ZylGd1Xamw88n5nPRwQRsX5m/qE85aJpah/mGnYWfhkRE4EfU/wRHg1c04JNuCoiDs3MVtRqNDMifkBxkVSAY1n+ellV+jfgtoj4CcXRnPcAX29B3ZuiPbPEteO1/h7FtYTupHiNR1Ds/A+MiI9XNLb+Qopr6G0K3E5xXte7KALdd4FKernLHdENga1i+anON6U46t9KO1FMPV61HTOzsVfwy2XPc6UiYgTF1MgAczPznopLtuNi1su0aUh4v4gYk+WMZVFMltFx5L2qySqmUkyRfwvFxeeHUQzrqdJrKWZv/reI2IaiZ65VBxShCK8XlztoUMy8e1wL6t4P3BgRV7P8TNKVTl4E7At8MIpp1V+AamdzbvBCFBOS3Ffu6D9EcbHpylV9ntFKnEXrh/u3K2ScwIrB7cQu2pouM58F/jkivlkstuR85r3Lg2q/L7fh8SjmQ6ha5cPfaz/MsvxgW+nOQtXjuhuGArxI8WXaUbfqoQDrU4xv3rdsugX4bhXDaVZSfxivjPe9ITOrnJK5o2ZbZolbyWv9vazwgu0R8VPgix072+Xr/RWKi5f/tPPwoibVXDZsKSLmZcOkGF0NaWpi3VMoguPrKHYUOv6W/05xUc3Kvli6OJfrrxSzLFY9TPo24LQsLxQbEfsA52Tmm1d9z27XG0hxzs32FFObdxwgeBA4PCuazSvKi+FGxIAsJ6popXYMCS/D2wUUO7xB8T7+MHAPMD4zL62g5rLh5uWIhd9lay/gPZjiAOr7KL4Pf5aZVZ6LuQ7wnsy8NIqLdpOtuW7hSofwVt0bHCuZ1TkrvsRJ+X6+F9iMoudkIPCvmTmj4rpdnmeUmVVcZqNz7VYP9+/ouW/J0M6IOJpi+Pc4oPGi1ZtQTCpX+fnzDZ+THT2wi4ETqhzyGBG3U5wLObN8vbcErqv6FKFO21DJ8Pfa98y1anz+Kuq3ayjAeyl2dJcdDYyId9KCk1fLYQf/l5lzo5gc5G0R8ZeOo99VycylEfFz4OeZuajKWp3qvlAeOeu4zk4rZg7dubHXpHytd83M+yO6Om7RFI0nH3feua/sguWZ+R3gOxHx6cycVFWdldRu19/vxykuWj6QYofhceCDFdb7KkVv8gGZuRSW7RCfTdGr/umK6n4HeCPF5UtaFi4atHxIeGbeAYwo/2/J5c+DbHqQKy37PMrMJRV+RixT7owtyMy/ZubCiHiU4lqC/aj4Isvld8HnKKb7bkmIa6jdliG8mflAROxLMZvzhVFMvlZ5D1n5fgZ4mop7qDo5hVfOM9q/4zyjqou2Y7g/8FL5edwxtHNLKvzOpbjkw2MUQfm7De1PAb+vsG6jKcAnMvMWKA78UYwOqrKn+bvA5cCgiPgyxX50K2eQ3hB4hmJUTFM7XmrfM9chinOYPs4rEwrcCHy/BTvdRHEB8X0p/hBvyYbrKVVY80mK2RXfl+X1xzqO7rSg9hxgNMUMnldTXNjzDZl5aEX1guIE7E9RXEMQigkGJmULpmWOLmazBCZUOdwjios6P04xbBiKo95bUQwjmp6ZTR9+EcVMWvMonuOO5e+Uy6/PzI2aXbOLbdibFWeG/WGF9Vb4km7BF3djrU2hJde5mQuMzE7nW5a9OHdl5m4V1Z1BMVT4CF55Ly+T1U80czzFeaDLDQnP5c+vbnbNlk8hHxEvU+wkdKS4DSiGalU2aUREzKY4P/DxiHgLxf/vp4FRwG6ZWekFniPibIpz5C6heO5AtZdvKev+lq5nwK20x6jsERxNMU3+zlHMIvqTzNyn4ro7A6cBO7D8+7nq59vSy2w0DPf/LUWPVeNw/19l5q4ruWszah9PcUrDaIreqvcCX87MFT4zm1z3
G5170Ltqq6h2V6MmKtmHjYhrKILj/Ih4A8X1qIOiV+7uZtdrqHsYcC7FvtwXKMLkIxTfDadnE6/RWPueuQbnUYw5/l65fFzZ9uEqi0bE9yhOOu+YRfNjEfH2zPxklXUpLn57InBZRJyVmR07K62wtDz6+27gPzJzUscY5Ip8lmLc+l6Z+WeAKC5FcV7E/2vvzKMkK6ts/9tVMihI0SiICggNytDK1BST2s0g7YCtQouAAkrzGlDWa0B9Iq/xLRRFhaaVSWWyBBphgcwqWDIj3cgkowIK+kSZRZ5YIAK13x/nu5WRWZlZVhHfvRWR57dWrcq4NyPOzcyIe+/5zjl760Db85kv95mjgH/wGDVLouJQi48QM4EHlMfXAZ8kVuBrtUBUuaH/S5F0OpFE3sqIMqyJOb5+x+p0Tq+0ls4iVkJPUgjefNr1fIb+PDaRg3lVnJrD/e8mLpxvp/4w/3zYPk0h1tB8ZnZsoSW8dQl52+Oq4VVmek/itDNwYmlPPlctzH+WmBAt8A3V5zCJ83DD0kTiXtW4u7ADsBFwC4DtByW10VlwDjG/dRI9it0t0LbNxj6MtPvfzOh2/6ozZD3nqSbJ2KlmktHDO5jfmmb7cbb1DY2Iu10t6QTiXqrRubiqUthZwGzFDPURrj8r3nAYMRY0g1gkWL90V61EKHlmMjcOM8f0F18h6bYW4m5DrEI25fFTidmI2tj2LaX/9kxJmzEyZF+b5yTtCuxBSBVD3cH33YHtevvWywdiN2A2IZlck9bVLG0/I+lY4ucb29pZpa2o9uzFX8AmwHrNZ6kynV24C/9s+2hJbwdeQbzHTyf+3jVYWtJGzL/gI2CpSjEpn9mzJP3GZT5wXuCYE6yO7bskPUYRUZC0musaPK9i+x0VX39CJB1FKFlWn2EmhF5eUhYJtiUkzhuq31t0NWIxzkzPdZJuaCH0n21bRTFUUvVOicLztr++4G/rL7Z3KF8eWqqhMwhhklrxOmv3b+YSyzEg6eWSNnElBWlJ+xACQmuXCnvDy6m/6HYUo3UuGlXUxq+x79g+R9IlwGcIQbvT6WljdT3xorm274XQ93Dxvbb9qKS+LgANUzL3gqQ1bd8H8yo3bawi/YIQFWhuhFdlpD2tJg9B3CyVG8IvA2150OxJnAi+YPuXktYgbkRrsYTHGUC2/VjtpKrQuprleK2dqmwKr/nFQObtoh0vsjuBlSnv7Zp0eeEuNBeydwGnlYSjZmX9YcJuYqJ9tTmG+Wfmjh1nW18pbS5HEUn7o0Sr2M+AKq1aha4k5CF+tpNK++wswg+zloz+mcTq+uPEjFwz+7IWLUj3dzVaodF2G9OIDo0ZNWMWzi6VjOUl/QuhRHhSC3EvlvQx4HxGq3fWbmddrefhL8v/KxOiTdUonUattvsDJzK602cOcAL1un/OJipDXyR83hqesv1opZgNY3UdDDxGjI/8cpzv7xd/Jn6vSxFJa82ZxIZppfNnGjC3pwtIjIwM9YVhmpnblrh43U/8ol4H7Gn7ykmfuOjxLibehDOIId0byuPNCCWxrWrEnYpM1kfdxpygulGznHJG5WX1dUPis9R70/CeynHbvnAjaRbRzrkGsAFRVb9q2P6+krYAtiSqoL0V9OWAHVzf4Pk2onviMtsbSdoa2M32XhVj/pRovW9bQr73GNYmFt12JVq0T6pxLZS0OWFRMNv2nLLtDcCytm+Z9MkvPvbJREdI06q0O/CC7dqjFb0K2s8Tf+fPja08V4q9HT1qzrZ/2ELM5gZ71M2i6yuF38HI73lp4lx5T62ZuZ6447b715zv1TjKlZJur3XOkLSM7TkqM9tjccUZbo2vBrsC0Yp/aI05QUnvIBYzLyI+q0/3O8YEcX9FJI3NQm1vRdL9/AwNTTIH8266GyO+eyrfbP/9ZPtd2bBVoWR1EOEn1OvBUm0oWdLZtj/Qc5IdRcUTTzPcP98uYGnb1atzCi+SdYkP5j2ubAEx3om85sl9cWCiz1TN
z1IXF+4SdxqRuN5v+0mFetlrbd9eKd6kZuS2z6sU9+8JMYF9ibmbhqeAi23/vEbcnvg32d6kJHUbOVQQq0l+l5idSMj3xJ9OzCruSXSKnE0sRM2xvUsbx9AGE9wAV/3bLg6UG/DehacqFTL1qJWWxx8m5gN/Rdx0V63MjXM8GxMiFrWT9Z/RXrt/E/MCwmrpROLe6qPA22stZEq6xPY7JT3ASIIx73/bq036AnWOaQVi0a2GAMq1wL5ub1ZubPxpREfXGrY/V6rOK7t4kfYlxqAnc13dpJTY04k3X3VPjnFizyZUvD5J3Ch9GHjM9kEVY76VSGZ+M2bXqsDDtttoL20dSdsTN6L3ESe7NYB9bF9SMeY3id91b2vndNc1O55ydHHh7on9HkZaxK62fXHFWLMm2e3a7ytJr3NIq7+srVXREvcyQknzi4Qa7KPEfPWWLcReidELbVXbw0rMrxCJ3BXE7NwNPfvusb32hE8eMMqsz05jRiu+00Knxk6EuuFTkg4hWoU/30Ilch9CRv1PjKz293V1f0y8TtVKJzimeX6KFWOcA/yr7ert/j0xX0UoHW5FJFVXAv/T9iNtHcPigCp7gHaFpK8Tn9ltbK9b2i1nu4+qCQZ3jQAAFBFJREFU5MMwM9cIcKxEtPNcTpzktia8jaolc7ZfkDRX0oyKcwkT8Qrbp0jav1QurpZ04wKf9eI4iDBUHrXCXFYKv8LI32LYOArYuklWFT573wOqJXPEytx+QFMhupYRpdahouNZvdbm9HpRyKrPBM4om/5V0hauJwl9cLPC3hGvUQygLwusJmkDYkHkY5XjvpeY5zqQWBCZAVS1M+loTq/hduCQpuVxDJu2EL9N/hdwpaRRoxUtxP2MQ1DhLYT64JGEcvZmleN+EnijKxtY99CpWqmkj/c8nEYkzQ/Wjkss+vxUIWrTSrt/Sdq6SI5nAdcQllqdLsaXFvjfd3kMFdnMxRQewPbvS7dX3xj4ZM72njCvUrVes5oi6dXAt1o4hD8Cd0j6IaO9bqq2aTFiEvtQqRw9SPQd1+RVHmeo3/YdklavHLtLnhpzorufaBOrhrsxKu8Ed2fcDR1cuAvvAjb0iIH3qYRZa61k7lZJdxKiFefafrJSnIn4KjETcRGA7dvKan81SufEd0vnxFz6KAO9AA4DNmfMnF4bgR1m0n+l8FLqrQpe08GCYxUk7eSw4rkfeD0tjVb00LRjb08kON+T9PkW4t5HeAe2RadqpYRIRcPzxALquS3EPbSFGKOQ9EpC0GZ1RrfQ7j3Rc/rEt4G3AnuU1r+bgGtsHz/50xadCcZ0ViDuYfeoFbdjnivXo0aJdkX6LMAy8MlcD6uOKYs/QqzU1eY8Klb/JuHzkmYAnyBU4ZYjVp9rsvwk+15aOXbr9LTw3qQwnTyb+DDuBFStgqoDNcspyqEdxl6eMBOF+mp4ryWqCLsAhyvMvM8ELrT9TOXYANh+QKMFO6uqDXfYOfGc7d9JmiZpmu0rJX21jcCS/gewP7AKMQe6OfDfhAjMsHAw4X12bmmprDJnOgm/VahKbgd8uczq91WZbgIOJpRSf8zohadaC8edqpXa/mztGBPErap3MAEXAtcDP6JFLz/bPyyt6H9LJOz7la+rJXNEG/iowwB+N0E3wbBwDKEGu5KkLxBV2EP6GWCYkrnLJf2AEfPunann2TQP99HBfSHjNvKu/496JtJjuUnSv9geJYdcbiBaNwRugd620UeARqBjnmdVRbowKp9ydHThhpjh+olCwVPE7NynJ3/KomP7BWLA/gelveOdRGL3VUmX2/5QrdiFBxSqoVZIyu9PtB7WpovOiSclLUu0L50h6VHGF3Cqwf5E++71treWtA5weEux2+J3pRNnDUkXjd3ZQlX9A4TZ8r8X8aJXEy2ftTmBmIW8gxZk1W1/QdLljKiVNtWUacTsXFU0ohg+0fHVEgfZnFggXxdYklAanlO53X8Z25+o+PrjUu6ZZxCL09cC
m9uu2so6dkxnKmD7DIVC+bbE9f59tvt6/Rt4AZReJO3AiKDAE4RazH6VYk2m7GjXl9xegzihrs7osny1C1kZ0j2f8OtokrdNiBPeDh3P5AwVmoJqlm3S5Zyeojy1CtE61AxA39Dm50fS6wnZ+t2AP7YgGvFK4GiiOihioW1/27+rHPejxPnRxO/7Gai7CKcwc36GuOlt5vTOqP2zltg32p5ZZpo2K+3ad7mynHublMWIjQlv0/mUDdtYoCkCBqsy+tpbWwBlKMUhJkLS0cQ8cyMCtiuxqHoB1Ps7S7qJWOg6h7i/2QN4g+2Da8QrMb8IXGm7egFiTNxjgY2IRa8fEQtQ17uyWnfSf4YtmdsI+CDRBvdLog3juEqxXm37IY2WoRZxgj/Y9rtqxO2JfxtwCmNW6Vq6kG3NiEH5XbavqB2zS0pV7OvEzOAbJa0PvMd2tTkJza9muRswzalmORS0oco2TsxViZuUXYFliErvWbbvbvM42kBhmn04MYfyf4lz82qEF+n/rjV/qg4Vjkv88wkRkAOI1srfA0vUvh51gaQVbT9Wvp5GeNtV88fqiXsY8BFihq25gbIr2gKVuIcTbfcX06J5d1eo2IosaFutuL2Lp7UTaUm/JxZ9niYWy5tFxdo6CE38GUTS+klgJdtDNzYz7Ax8MldutHct/x6nyPXbbmNerjmG1pLInpg/tl1bPSsBJF1NtNGc0JzQJd1p+42TP/NFxWyMyt9cNjVG5bliNgQUwZPjbNdWoG3i/RcxN3cOYT7fSlu0pP8zyW7bPqxS3K8QAgoH2n6qbFsO+HfgadsH1Ihb4lwO7Ni14IjC428GIaM/dOcNSd8mbHleINrElgOOtn1k5bj3AG9q+3eqEfPuXuzK5t1dobCN2d72/eXxGsD3ba9bOe41RAfBycDDhNLxR2p2W5VFoPko7fE14r3E9vOS9iUEUGYSAiTXEsqWrVYIkxfPMCRzc4k34F4ekY6/v/YJruskUtIHCSWv2Yxepava6jEV6Wld+klPMner7Q0rxHovsEqjJqVQWVyRWAH+lO3v9Dtm0j6S7gbWIqpGcxhZia3SRqtQjrzWLZ/wJY03B7IMsBdhr7Jspbg/J1qjPGb7dOBu26+vEbfEuJBoXWptTk/S0kRisxbRrXGKQ4VwaGnOwZI+RLRdfhq4uXYruqRzgY/afrRmnKmOpHcQJtr3l02rA3vXTjRKt9WjwBKEqNwMYiG1qnS/pF2Av7Z9uKRViE6gKotukm5xSOV/mrh/vnEYF3ymEsMggLIj0Tp0paRLCWNLTf6UvnA38SF4d08SWVtNspc3AbsTrTRNm6UZLtWyxYXHFd5yjazs+6nnS/Yp4v3csCQherIs0SKWydxw8PaW420FbCWNe2qsViGzfVTztaSXEwIdexLn6aMmel5/Qs+fuDoULmsntJcClzFmTq8ypxJ2NdcS4jbrEb/rYWaJIqbzPqLK/VwLf1sYES+6kxbtTCS9DPg4sJrtvcvc69oeEUMbCiTNBB6wfWn5GfclhCNmE9L5VekR6HiGMGmvjsKGaAlC8+Fwot3yG4zMVPc9JIDtL1V6/aRlBj6Zs30BcEEZOn8vMSuwksJx/fyKqzhdJZENOxGrOLmaUp/9iBXCdST9lmilreUbtaTtB3oe/6jMRDxR3uPJcPBqYt60twVwXaJSV4Px1BRfRghIvILwRquCpBWIm9APEUnHxrZrm8P+VNIetk8bcyy7EQtxfWdBc3o1YvawXjODKekU4IbK8RYHTiBmyG4DrikVleozc8R7+Mu0pCrZwyxCeGzL8vi3RNv0UCVzxN/1beXrzYCDCLG3DYnrcFVzbUnvJs6HryPukauLYgFberSp9BPqs6n0GFbUaFP2Udj+j4qxkwoMfJvleBSlqZ2AnW1vWzlWk0TuSlTFTqNuEtnEvYBoOchWj5Yof+tpzQ14pRi/sL3WBPvus71mrdhJe5SL9sZN9agIONzkyqqSJVZTIduL8E48
qtZ5RNKRxMLXicDxtv9YI844cV9L+H8+w2jl3ZcSyru/rRCzyzm9W3rfO2MfTxU0YnJdM8aNtmtVTCaL2whz9Lb731ZzlqsLen8mSccDj9k+tDyuMt4wJv4viHPWHW21pSu8A7egXAMkvYIQUqoiuiLpIULUbaJWjU48/pJFZyiTua5oOYm8ClifGPxuWj1s+701405FiorYEbafLI//CviE7b6aPpbXPgO4yvN7+e0DbGV7137HTNpnvJsSVbaeGKdCdnTtClmZaX6WaDfsvdi0sdqNpG2ARpb/p7Yvrxiryzm9FxipvopIWp+mpd9zFyiscg4HXmP7nZLWA7awfUrluP9BvKcvosV59SJitC1wXbnhX5MQM9q0Zty2Ke2rGxaBjruJRetrmn2uKDxWYlwJbGu7etW1R4hkD2AHYsHpm4SX4Wdtn1Up7pRc7BlmMpkbUIpS2byHhCLRLh4iP6HFBY0jS1zrZChpJcJH51mguTn4W2ApwmjykX7HTNpH0nnAVcTqKMDHgK1tv69SvE4qZFMJSffafsPC7ksWDUmXEK2H/2Z7g9Lm+hNXtvwoN/tjsetbE2wHHELMQ84mlI4/YvuqmnHbRtK/Ae8ihOVWo3QwSFoLONX2myd9gRcffybRZnk1o5P1vrce9t5HSPobRnw4L7N9Z7/j9cSdUp6FU4FM5gYYzW+JcJ7tY7s9quFD0u3ATNvPlscvJdohqiXOYyoKQ+/lN9UoSfsxRGu2gcuBAyq2O3ZaIZsKlNb38yaY0/tAbYGMqYZaVBleXCjtd5sTn9vrbT/e8SFVQdLmxFzxbNtzyrY3EF6CtSugswkT7bEevn1vPewqqZK0gofUn3CqMvACKFMNjW+JIHdkUjtFOAO4XNKs8nhPok2tGiV5ywRuSClJ2y4L/Mb+xZvWVqwpzH7AeZL+mXHm9Do7quFlTklumrnTzYFWvP0kbU8sti3dbLP9uUqx1rF9t6SmE6RRUl5N0mq1k5susH39ONvubSn8a2q3cvbQiRBJJnLDRyZzg0fXlghTDttfLtW5Zg7yMNs/6PKYksFE0qdsHyHpWEZXyIC6XmRJXYqoymZjqurfrzmnN8X5ODG3tqak6wg/zqpKhwCSvkEowW5NGEu/n7rqoR8H9mbEymPseSPtiPrL9yX9Q20Ru8J0wnaoTSX0ZAjJNssBQ9L7iBX9NxN+RmcBJ9teo9MDS5JkgUj6R9sXS/rwePttV634JskwUebk1iZuhu+x/VwLMW+3vX7P/8sCl9h+a6V4mwK/tv1wefxh4J8IW4ZDs8rSXyQ9BSxDtKU/R8VW9BQiSfpFJnMDSleWCFOR0r5zLOEDtiSxmjYn54ySJEnaRdKOk+23fV7l+D+2vZmk6wlRoSeAOz2BpUwf4t0CvM3hPfZ3xAJu47u2ru3q1cikDilEkvSLbLMcUMpQ8LeBb/dYIhxEqFwl/eU4ohp6DjEDsweQynTJQiPposn2p0hGkiyQf5xknwl/wZp8V9LywBGMzEaeXDHe9J7q287AibbPBc6VdGvFuFMSSecCpwCXtmBPUNXCKpk6ZGUuSRZAj1nrPB+wXFFLFgVJjwEPAGcCP2bMrITtq7s4riRJJqdI1j/Q0+64B7AbMcderd2xa9+1qYaktxEiZ5sTC7izbN/T7VElyeRkZS5JFszTkpYEbpV0BKEmluqAyaKwMrAd0R79QeB7hPHvXZ0eVZIMIG2qSgInED5glHbHLzHS7ngi9cRXzgSulvQ48AwhgEbxXWtFvXMqYfsy4DJJM4jz9GWSHgBOAv6zjbnMJFlYsjKXJAtA0uuAR4h5uQOBGcDXGjXRJFkUJC1F3CwcCXzW9nEdH1KSDAwTqUra3qtSvNtsb1C+Ph54zPah5XFVf7sufdemIsXyYjdgd+BBwp7oLcCbbG/V4aElybhkMpckE1A8fH7d9XEkw0VJ4rYnErnVCXn1bxZp+yRJ/gI6UJXMdscpgKTzCYXU04kWy4d79t1k
e5PODi5JJiBbxZJkYi5ovihD0UnyopB0GvDfwMZENW6m7cMykUuSheaZ8v/Tkl4DPE9Ur2rRtDteSLY7Dh2SZkpaGTjG9npERe4EScdIWgEgE7lkcSUrc0kyAb0iJyl4kvQDSXOBOeVh78m3mpdRkgwjkj5DWMZsAxxfNp9s+zMVY2a745CSFhDJIJMCKEkyMZ7g6yRZJGxnN0SSvAh6VCUPK4+XBe4gVCW/UjO27evH2XZvzZhJa6QFRDKw5I1FkkzMBpL+IOkpYP3y9R8kPSXpD10fXJIkyRTkBODPMEpV8gSi1fHEDo8rGWymS2oKHNsCV/Tsy8JHsliTb9AkmQDb07s+hiRJkmQUWUFJapAWEMnAkslckiRJkiSDwnRJL7H9PFFB2btnX97TJIuE7S9IupyRmchmtGIaMTuXJIsteeJLkiRJkmRQyApKUoWciUwGlVSzTJIkSZJkYEhVySRJkhEymUuSJEmSJEmSJBlAUs0ySZIkSZIkSZJkAMlkLkmSJEmSJEmSZADJZC5JkiQZCiStLOksSfdJulnS98ssVZIkSZIMJalmmSRJkgw8kgScD5xqe5eybQPgVUAq0iVJkiRDSVbmkiRJkmFga+A5299oNti+DfiRpCMl3SnpDkk7A0jaStLVki6UdL+kL0n6kKQbyvetWb7vW5K+IekmSfdKenfZvrqkayXdUv5t2fO6V0n6jqS7JZ2hYBtJFzTHJmk7See3+QtKkiRJho+szCVJkiTDwBuBm8fZviOwIbAB8ErgRknXlH0bAOsCTwD3Ayfb3lTS/oRR8AHl+1YHNgXWBK4snmaPAtvZ/pOk1xP+Z5uU798I+BvgQeA64M3AlcDXJK1o+zFgT+CbffrZkyRJkilKVuaSJEmSYeYtwJm2X7D9CHA1MLPsu9H2Q7afBe4DZpftdxAJXMPZtufa/jmR9K0DLAGcJOkO4BxgvZ7vv8H2b2zPBW4FVnf4AJ0O7CZpeWAL4JIKP2+SJEkyhcjKXJIkSTIM3AW8fyGf82zP13N7Hs9l9PVxrCGrgQOBR4jq3jTgTxO87gs9rzULuLh87zm2n1/I402SJEmSUWRlLkmSJBkGrgCWkrR3s0HS+sCTwM6SpktaEfg74IaFfO2dJE0rc3R/DdwDzAAeKtW33YHpC3oR2w8SrZeHEIldkiRJkrwosjKXJEmSDDy2LWkH4KuSDiKqX78i5t6WBW4jKmqfsv2wpHUW4uV/TSSAywH7ljm5rwHnStoDuBSY8xe+1hnAirZ/thDxkyRJkmRcFG38SZIkSZKMRdK3gO/a/k6fXu844Ce2T+nH6yVJkiRTm6zMJUmSJEkLSLqZqOB9outjSZIkSYaDrMwlSZIkSZIkSZIMICmAkiRJkiRJkiRJMoBkMpckSZIkSZIkSTKAZDKXJEmSJEmSJEkygGQylyRJkiRJkiRJMoBkMpckSZIkSZIkSTKAZDKXJEmSJEmSJEkygPx/tyJmbmHQ0e0AAAAASUVORK5CYII=\n", 3262 | "text/plain": [ 3263 | "
" 3264 | ] 3265 | }, 3266 | "metadata": { 3267 | "needs_background": "light" 3268 | }, 3269 | "output_type": "display_data" 3270 | } 3271 | ], 3272 | "source": [ 3273 | "# Group the DataFrame by \"Company\" and \"Process\", count the number of elements,\n", 3274 | "# then unstack by \"Process\", then plot a bar chart\n", 3275 | "df.groupby([\"Company\", \"Process\"]).size().unstack(level=1).plot(kind=\"bar\", figsize=(15, 8))" 3276 | ] 3277 | }, 3278 | { 3279 | "cell_type": "markdown", 3280 | "metadata": {}, 3281 | "source": [ 3282 | "## 6. Common pitfalls\n", 3283 | "pandas is great for most day-to-day data analysis. It's instrumental to my job and I'm grateful that the entire pandas community is actively developing it. However, I think some of pandas design decisions are a bit questionable.\n", 3284 | "\n", 3285 | "Some of the common pandas pitfalls:\n", 3286 | "### 6.1 NaNs\n", 3287 | "NaNs are stored as floats in pandas, so when an operation fails because of NaNs, it doesn't say that there's a NaN but because that operation doesn't exist for floats.\n", 3288 | "\n", 3289 | "### 6.2 Changes not Inplace \n", 3290 | "Most pandas operations aren't inplace by default, so if you make changes to your `DataFrame`, you need to assign the changes back to your DataFrame. You can make changes inplace by setting argument `inplace=True`." 
3291 | ] 3292 | }, 3293 | { 3294 | "cell_type": "code", 3295 | "execution_count": 39, 3296 | "metadata": {}, 3297 | "outputs": [ 3298 | { 3299 | "data": { 3300 | "text/plain": [ 3301 | "Index(['Company', 'Title', 'Job', 'Level', 'Date', 'Upvotes', 'Offer',\n", 3302 | " 'Experience', 'Difficulty', 'Review', 'Process'],\n", 3303 | " dtype='object')" 3304 | ] 3305 | }, 3306 | "execution_count": 39, 3307 | "metadata": {}, 3308 | "output_type": "execute_result" 3309 | } 3310 | ], 3311 | "source": [ 3312 | "# drop() returns a new DataFrame, so without assignment \"Process\" is still in df\n", 3313 | "df.drop(columns=[\"Process\"])\n", 3314 | "df.columns" 3315 | ] 3316 | }, 3317 | { 3318 | "cell_type": "code", 3319 | "execution_count": 40, 3320 | "metadata": {}, 3321 | "outputs": [ 3322 | { 3323 | "data": { 3324 | "text/plain": [ 3325 | "Index(['Company', 'Title', 'Job', 'Level', 'Date', 'Upvotes', 'Offer',\n", 3326 | " 'Experience', 'Difficulty', 'Review'],\n", 3327 | " dtype='object')" 3328 | ] 3329 | }, 3330 | "execution_count": 40, 3331 | "metadata": {}, 3332 | "output_type": "execute_result" 3333 | } 3334 | ], 3335 | "source": [ 3336 | "# To modify df itself, pass `inplace=True`\n", 3337 | "df.drop(columns=[\"Process\"], inplace=True)\n", 3338 | "df.columns\n", 3339 | "# This is equivalent to\n", 3340 | "# df = df.drop(columns=[\"Process\"])" 3341 | ] 3342 | }, 3343 | { 3344 | "cell_type": "markdown", 3345 | "metadata": {}, 3346 | "source": [ 3347 | "### 6.3 Performance issues with very large datasets\n", 3348 | "\n", 3349 | "### 6.4 Reproducibility issues\n", 3350 | "These show up especially when dumping a `DataFrame` to a file and loading it back. There are two main causes:\n", 3351 | "\n", 3352 | "- Problems with labels (see the section about labels above).\n", 3353 | "- [Weird rounding issues for floats](https://stackoverflow.com/questions/47368296/pandas-read-csv-file-with-float-values-results-in-weird-rounding-and-decimal-dig). 
\n", 3354 | "\n", 3355 | "### 6.5 Not GPU compatible\n", 3356 | "pandas can't take advantage of GPUs, so if your computations are on GPUs and your feature engineering is on CPUs, moving data from CPUs to GPUs can become a time bottleneck. If you want something like pandas that works on GPUs, check out cuDF from RAPIDS. dask and modin scale pandas-style code across cores or machines, but they mainly target CPUs." 3357 | ] 3358 | }, 3359 | { 3360 | "cell_type": "markdown", 3361 | "metadata": {}, 3362 | "source": [ 3363 | "Oh, and this is [the analysis I did based on this data](https://huyenchip.com/2019/08/21/glassdoor-interview-reviews-tech-hiring-cultures.html), in case you're interested!" 3364 | ] 3365 | } 3366 | ], 3367 | "metadata": { 3368 | "kernelspec": { 3369 | "display_name": "Python 3", 3370 | "language": "python", 3371 | "name": "python3" 3372 | }, 3373 | "language_info": { 3374 | "codemirror_mode": { 3375 | "name": "ipython", 3376 | "version": 3 3377 | }, 3378 | "file_extension": ".py", 3379 | "mimetype": "text/x-python", 3380 | "name": "python", 3381 | "nbconvert_exporter": "python", 3382 | "pygments_lexer": "ipython3", 3383 | "version": "3.6.10" 3384 | }, 3385 | "varInspector": { 3386 | "cols": { 3387 | "lenName": 16, 3388 | "lenType": 16, 3389 | "lenVar": 40 3390 | }, 3391 | "kernels_config": { 3392 | "python": { 3393 | "delete_cmd_postfix": "", 3394 | "delete_cmd_prefix": "del ", 3395 | "library": "var_list.py", 3396 | "varRefreshCmd": "print(var_dic_list())" 3397 | }, 3398 | "r": { 3399 | "delete_cmd_postfix": ") ", 3400 | "delete_cmd_prefix": "rm(", 3401 | "library": "var_list.r", 3402 | "varRefreshCmd": "cat(var_dic_list()) " 3403 | } 3404 | }, 3405 | "types_to_exclude": [ 3406 | "module", 3407 | "function", 3408 | "builtin_function_or_method", 3409 | "instance", 3410 | "_Feature" 3411 | ], 3412 | "window_display": false 3413 | } 3414 | }, 3415 | "nbformat": 4, 3416 | "nbformat_minor": 2 3417 | } 3418 | --------------------------------------------------------------------------------