├── 2020_election_results.txt
├── 2024 Election Prediction - Bayesian Method.ipynb
├── ARMA Stock Forecasting.ipynb
├── Backtesting.ipynb
├── Bayesian Time Series.ipynb
├── Bayesian Treasure Hunt.ipynb
├── Capture Recapture.ipynb
├── Coding PCA.ipynb
├── Decision Tree Coding.ipynb
├── Easy Community Detection.ipynb
├── Gibbs Sampling Code.ipynb
├── Gradient Descent.ipynb
├── Linear Tree.ipynb
├── Logistic Regression.ipynb
├── MCMC Board Game.ipynb
├── MCMC Experiments.ipynb
├── Monte Carlo Code.ipynb
├── Probability Calibration.ipynb
├── PyMC3.ipynb
├── RNNExample.ipynb
├── SVM_Kernels.Rmd
├── Sample Size.ipynb
├── Shift Scheduling.ipynb
├── Spam Filtering.ipynb
├── Starbucks Dists.ipynb
├── Thompson Sampling.ipynb
├── Unbalanced Data.ipynb
├── UnsubscribeBot.ipynb
├── Word2Vec.ipynb
├── YouTubeAgent - Driver.ipynb
├── fish.csv
├── stopwords.txt
├── training_text.txt
└── youtube_agent_utils.py

/2020_election_results.txt:
--------------------------------------------------------------------------------
1 | STATE RESULTS 2 | President: Alabama 3 | 9 Electoral Votes 4 | Trump 5 | PROJECTED WINNER 6 | + FOLLOW 7 | Candidate % Votes 8 | Trump 9 | 62.0% 10 | 1,441,170 11 | Biden 12 | 36.6% 13 | 849,624 14 | Est. 99% In 15 | Updated 10:17 p.m. ET, Mar. 6 16 | Full Details 17 | President: Alaska 18 | 3 Electoral Votes 19 | Trump 20 | PROJECTED WINNER 21 | + FOLLOW 22 | Candidate % Votes 23 | Trump 24 | 52.8% 25 | 189,951 26 | Biden 27 | 42.8% 28 | 153,778 29 | Est. 99% In 30 | Updated 09:51 a.m. ET, Dec. 2 31 | Full Details 32 | President: Arizona 33 | Party change 34 | BATTLEGROUND 35 | 11 Electoral Votes 36 | Biden 37 | PROJECTED WINNER 38 | + FOLLOW 39 | Candidate % Votes 40 | Biden 41 | 49.4% 42 | 1,672,143 43 | Trump 44 | 49.0% 45 | 1,661,686 46 | Est. 99% In 47 | Updated 04:11 p.m. ET, Nov. 
30 48 | Full Details 49 | President: Arkansas 50 | 6 Electoral Votes 51 | Trump 52 | PROJECTED WINNER 53 | + FOLLOW 54 | Candidate % Votes 55 | Trump 56 | 62.4% 57 | 760,647 58 | Biden 59 | 34.8% 60 | 423,932 61 | Est. 99% In 62 | Updated 01:42 p.m. ET, Nov. 30 63 | Full Details 64 | President: California 65 | 55 Electoral Votes 66 | Biden 67 | PROJECTED WINNER 68 | + FOLLOW 69 | Candidate % Votes 70 | Biden 71 | 63.5% 72 | 11,110,250 73 | Trump 74 | 34.3% 75 | 6,006,429 76 | Est. 99% In 77 | Updated 12:28 p.m. ET, Jan. 12 78 | Full Details 79 | President: Colorado 80 | BATTLEGROUND 81 | 9 Electoral Votes 82 | Biden 83 | PROJECTED WINNER 84 | + FOLLOW 85 | Candidate % Votes 86 | Biden 87 | 55.4% 88 | 1,804,352 89 | Trump 90 | 41.9% 91 | 1,364,607 92 | Est. 99% In 93 | Updated 01:21 p.m. ET, Jan. 26 94 | Full Details 95 | President: Connecticut 96 | 7 Electoral Votes 97 | Biden 98 | PROJECTED WINNER 99 | + FOLLOW 100 | Candidate % Votes 101 | Biden 102 | 59.3% 103 | 1,080,831 104 | Trump 105 | 39.2% 106 | 714,717 107 | Est. 99% In 108 | Updated 01:01 p.m. ET, Mar. 8 109 | Full Details 110 | President: Delaware 111 | 3 Electoral Votes 112 | Biden 113 | PROJECTED WINNER 114 | + FOLLOW 115 | Candidate % Votes 116 | Biden 117 | 58.7% 118 | 296,268 119 | Trump 120 | 39.8% 121 | 200,603 122 | Est. 99% In 123 | Updated 12:30 p.m. ET, Apr. 28 124 | Full Details 125 | President: District Of Columbia 126 | 3 Electoral Votes 127 | Biden 128 | PROJECTED WINNER 129 | + FOLLOW 130 | Candidate % Votes 131 | Biden 132 | 92.1% 133 | 317,323 134 | Trump 135 | 5.4% 136 | 18,586 137 | Est. 99% In 138 | Updated 10:00 p.m. ET, Dec. 2 139 | Full Details 140 | President: Florida 141 | BATTLEGROUND 142 | 29 Electoral Votes 143 | Trump 144 | PROJECTED WINNER 145 | + FOLLOW 146 | Candidate % Votes 147 | Trump 148 | 51.2% 149 | 5,668,731 150 | Biden 151 | 47.9% 152 | 5,297,045 153 | Est. 99% In 154 | Updated 01:55 p.m. ET, Jan. 
26 155 | Full Details 156 | President: Georgia 157 | Party change 158 | BATTLEGROUND 159 | 16 Electoral Votes 160 | Biden 161 | PROJECTED WINNER 162 | + FOLLOW 163 | Candidate % Votes 164 | Biden 165 | 49.5% 166 | 2,473,633 167 | Trump 168 | 49.2% 169 | 2,461,854 170 | Est. 99% In 171 | Updated 10:10 p.m. ET, Mar. 6 172 | Full Details 173 | President: Hawaii 174 | 4 Electoral Votes 175 | Biden 176 | PROJECTED WINNER 177 | + FOLLOW 178 | Candidate % Votes 179 | Biden 180 | 63.7% 181 | 366,130 182 | Trump 183 | 34.3% 184 | 196,864 185 | Est. 99% In 186 | Updated 08:23 a.m. ET, Nov. 24 187 | Full Details 188 | President: Idaho 189 | 4 Electoral Votes 190 | Trump 191 | PROJECTED WINNER 192 | + FOLLOW 193 | Candidate % Votes 194 | Trump 195 | 63.8% 196 | 554,119 197 | Biden 198 | 33.1% 199 | 287,021 200 | Est. 99% In 201 | Updated 10:39 p.m. ET, Mar. 6 202 | Full Details 203 | President: Illinois 204 | 20 Electoral Votes 205 | Biden 206 | PROJECTED WINNER 207 | + FOLLOW 208 | Candidate % Votes 209 | Biden 210 | 57.5% 211 | 3,471,915 212 | Trump 213 | 40.6% 214 | 2,446,891 215 | Est. 99% In 216 | Updated 10:44 a.m. ET, May. 5 217 | Full Details 218 | President: Indiana 219 | 11 Electoral Votes 220 | Trump 221 | PROJECTED WINNER 222 | + FOLLOW 223 | Candidate % Votes 224 | Trump 225 | 57.0% 226 | 1,729,857 227 | Biden 228 | 41.0% 229 | 1,242,498 230 | Est. 99% In 231 | Updated 10:40 p.m. ET, Mar. 6 232 | Full Details 233 | President: Iowa 234 | BATTLEGROUND 235 | 6 Electoral Votes 236 | Trump 237 | PROJECTED WINNER 238 | + FOLLOW 239 | Candidate % Votes 240 | Trump 241 | 53.1% 242 | 897,672 243 | Biden 244 | 44.9% 245 | 759,061 246 | Est. 99% In 247 | Updated 06:28 p.m. ET, Dec. 5 248 | Full Details 249 | President: Kansas 250 | 6 Electoral Votes 251 | Trump 252 | PROJECTED WINNER 253 | + FOLLOW 254 | Candidate % Votes 255 | Trump 256 | 56.1% 257 | 771,406 258 | Biden 259 | 41.5% 260 | 570,323 261 | Est. 99% In 262 | Updated 12:16 p.m. ET, Apr. 
29 263 | Full Details 264 | President: Kentucky 265 | 8 Electoral Votes 266 | Trump 267 | PROJECTED WINNER 268 | + FOLLOW 269 | Candidate % Votes 270 | Trump 271 | 62.1% 272 | 1,326,646 273 | Biden 274 | 36.1% 275 | 772,474 276 | Est. 99% In 277 | Updated 10:32 p.m. ET, Mar. 6 278 | Full Details 279 | President: Louisiana 280 | 8 Electoral Votes 281 | Trump 282 | PROJECTED WINNER 283 | + FOLLOW 284 | Candidate % Votes 285 | Trump 286 | 58.5% 287 | 1,255,776 288 | Biden 289 | 39.9% 290 | 856,034 291 | Est. 99% In 292 | Updated 10:47 a.m. ET, Jan. 26 293 | Full Details 294 | President: Maine 295 | 4 Electoral Votes 296 | Biden 297 | PROJECTED WINNER 298 | + FOLLOW 299 | Electoral Vote Breakdown 300 | Biden 3 301 | Trump 1 302 | Candidate % Votes 303 | Biden 304 | 53.1% 305 | 435,072 306 | Trump 307 | 44.0% 308 | 360,737 309 | Est. 99% In 310 | Updated 01:33 p.m. ET, Feb. 24 311 | Full Details 312 | President: Maryland 313 | 10 Electoral Votes 314 | Biden 315 | PROJECTED WINNER 316 | + FOLLOW 317 | Candidate % Votes 318 | Biden 319 | 65.4% 320 | 1,985,023 321 | Trump 322 | 32.1% 323 | 976,414 324 | Est. 99% In 325 | Updated 06:14 p.m. ET, Mar. 6 326 | Full Details 327 | President: Massachusetts 328 | 11 Electoral Votes 329 | Biden 330 | PROJECTED WINNER 331 | + FOLLOW 332 | Candidate % Votes 333 | Biden 334 | 65.6% 335 | 2,382,202 336 | Trump 337 | 32.1% 338 | 1,167,202 339 | Est. 99% In 340 | Updated 10:48 p.m. ET, Mar. 6 341 | Full Details 342 | President: Michigan 343 | Party change 344 | BATTLEGROUND 345 | 16 Electoral Votes 346 | Biden 347 | PROJECTED WINNER 348 | + FOLLOW 349 | Candidate % Votes 350 | Biden 351 | 50.6% 352 | 2,804,040 353 | Trump 354 | 47.8% 355 | 2,649,852 356 | Est. 99% In 357 | Updated 10:49 p.m. ET, Mar. 
6 358 | Full Details 359 | President: Minnesota 360 | BATTLEGROUND 361 | 10 Electoral Votes 362 | Biden 363 | PROJECTED WINNER 364 | + FOLLOW 365 | Candidate % Votes 366 | Biden 367 | 52.4% 368 | 1,717,077 369 | Trump 370 | 45.3% 371 | 1,484,065 372 | Est. 99% In 373 | Updated 10:13 a.m. ET, Jan. 26 374 | Full Details 375 | President: Mississippi 376 | 6 Electoral Votes 377 | Trump 378 | PROJECTED WINNER 379 | + FOLLOW 380 | Candidate % Votes 381 | Trump 382 | 57.5% 383 | 756,764 384 | Biden 385 | 41.0% 386 | 539,398 387 | Est. 99% In 388 | Updated 03:45 p.m. ET, Apr. 29 389 | Full Details 390 | President: Missouri 391 | 10 Electoral Votes 392 | Trump 393 | PROJECTED WINNER 394 | + FOLLOW 395 | Candidate % Votes 396 | Trump 397 | 56.8% 398 | 1,718,736 399 | Biden 400 | 41.4% 401 | 1,253,014 402 | Est. 99% In 403 | Updated 07:37 a.m. ET, Dec. 10 404 | Full Details 405 | President: Montana 406 | 3 Electoral Votes 407 | Trump 408 | PROJECTED WINNER 409 | + FOLLOW 410 | Candidate % Votes 411 | Trump 412 | 56.9% 413 | 343,602 414 | Biden 415 | 40.5% 416 | 244,786 417 | Est. 99% In 418 | Updated 06:15 p.m. ET, Mar. 6 419 | Full Details 420 | President: Nebraska 421 | 5 Electoral Votes 422 | Trump 423 | PROJECTED WINNER 424 | + FOLLOW 425 | Electoral Vote Breakdown 426 | Trump 4 427 | Biden 1 428 | Candidate % Votes 429 | Trump 430 | 58.2% 431 | 556,846 432 | Biden 433 | 39.2% 434 | 374,583 435 | Est. 99% In 436 | Updated 03:47 p.m. ET, Jan. 15 437 | Full Details 438 | President: Nevada 439 | BATTLEGROUND 440 | 6 Electoral Votes 441 | Biden 442 | PROJECTED WINNER 443 | + FOLLOW 444 | Candidate % Votes 445 | Biden 446 | 50.1% 447 | 703,486 448 | Trump 449 | 47.7% 450 | 669,890 451 | Est. 99% In 452 | Updated 10:50 p.m. ET, Mar. 
6 453 | Full Details 454 | President: New Hampshire 455 | BATTLEGROUND 456 | 4 Electoral Votes 457 | Biden 458 | PROJECTED WINNER 459 | + FOLLOW 460 | Candidate % Votes 461 | Biden 462 | 52.7% 463 | 424,937 464 | Trump 465 | 45.4% 466 | 365,660 467 | Est. 99% In 468 | Updated 10:35 a.m. ET, Mar. 8 469 | Full Details 470 | President: New Jersey 471 | 14 Electoral Votes 472 | Biden 473 | PROJECTED WINNER 474 | + FOLLOW 475 | Candidate % Votes 476 | Biden 477 | 57.1% 478 | 2,608,400 479 | Trump 480 | 41.3% 481 | 1,883,313 482 | Est. 99% In 483 | Updated 04:24 p.m. ET, Mar. 11 484 | Full Details 485 | President: New Mexico 486 | 5 Electoral Votes 487 | Biden 488 | PROJECTED WINNER 489 | + FOLLOW 490 | Candidate % Votes 491 | Biden 492 | 54.3% 493 | 501,614 494 | Trump 495 | 43.5% 496 | 401,894 497 | Est. 99% In 498 | Updated 10:37 p.m. ET, Mar. 6 499 | Full Details 500 | President: New York 501 | 29 Electoral Votes 502 | Biden 503 | PROJECTED WINNER 504 | + FOLLOW 505 | Candidate % Votes 506 | Biden 507 | 60.9% 508 | 5,244,886 509 | Trump 510 | 37.7% 511 | 3,251,997 512 | Est. 99% In 513 | Updated 10:00 a.m. ET, Apr. 2 514 | Full Details 515 | President: North Carolina 516 | BATTLEGROUND 517 | 15 Electoral Votes 518 | Trump 519 | PROJECTED WINNER 520 | + FOLLOW 521 | Candidate % Votes 522 | Trump 523 | 49.9% 524 | 2,758,775 525 | Biden 526 | 48.6% 527 | 2,684,292 528 | Est. 99% In 529 | Updated 04:15 p.m. ET, Mar. 7 530 | Full Details 531 | President: North Dakota 532 | 3 Electoral Votes 533 | Trump 534 | PROJECTED WINNER 535 | + FOLLOW 536 | Candidate % Votes 537 | Trump 538 | 65.1% 539 | 235,595 540 | Biden 541 | 31.8% 542 | 114,902 543 | Est. 99% In 544 | Updated 08:22 a.m. ET, Nov. 24 545 | Full Details 546 | President: Ohio 547 | BATTLEGROUND 548 | 18 Electoral Votes 549 | Trump 550 | PROJECTED WINNER 551 | + FOLLOW 552 | Candidate % Votes 553 | Trump 554 | 53.3% 555 | 3,154,834 556 | Biden 557 | 45.2% 558 | 2,679,165 559 | Est. 99% In 560 | Updated 06:14 p.m. 
ET, Mar. 6 561 | Full Details 562 | President: Oklahoma 563 | 7 Electoral Votes 564 | Trump 565 | PROJECTED WINNER 566 | + FOLLOW 567 | Candidate % Votes 568 | Trump 569 | 65.4% 570 | 1,020,280 571 | Biden 572 | 32.3% 573 | 503,890 574 | Est. 99% In 575 | Updated 08:21 a.m. ET, Nov. 24 576 | Full Details 577 | President: Oregon 578 | 7 Electoral Votes 579 | Biden 580 | PROJECTED WINNER 581 | + FOLLOW 582 | Candidate % Votes 583 | Biden 584 | 56.5% 585 | 1,340,383 586 | Trump 587 | 40.4% 588 | 958,448 589 | Est. 99% In 590 | Updated 08:37 p.m. ET, Dec. 5 591 | Full Details 592 | President: Pennsylvania 593 | Party change 594 | BATTLEGROUND 595 | 20 Electoral Votes 596 | Biden 597 | PROJECTED WINNER 598 | + FOLLOW 599 | Candidate % Votes 600 | Biden 601 | 50.0% 602 | 3,459,923 603 | Trump 604 | 48.8% 605 | 3,378,263 606 | Est. 99% In 607 | Updated 02:32 p.m. ET, May. 3 608 | Full Details 609 | President: Rhode Island 610 | 4 Electoral Votes 611 | Biden 612 | PROJECTED WINNER 613 | + FOLLOW 614 | Candidate % Votes 615 | Biden 616 | 59.4% 617 | 307,486 618 | Trump 619 | 38.6% 620 | 199,922 621 | Est. 99% In 622 | Updated 11:23 a.m. ET, Dec. 11 623 | Full Details 624 | President: South Carolina 625 | 9 Electoral Votes 626 | Trump 627 | PROJECTED WINNER 628 | + FOLLOW 629 | Candidate % Votes 630 | Trump 631 | 55.1% 632 | 1,385,103 633 | Biden 634 | 43.4% 635 | 1,091,541 636 | Est. 99% In 637 | Updated 10:52 p.m. ET, Mar. 6 638 | Full Details 639 | President: South Dakota 640 | 3 Electoral Votes 641 | Trump 642 | PROJECTED WINNER 643 | + FOLLOW 644 | Candidate % Votes 645 | Trump 646 | 61.8% 647 | 261,043 648 | Biden 649 | 35.6% 650 | 150,471 651 | Est. 99% In 652 | Updated 08:21 a.m. ET, Nov. 24 653 | Full Details 654 | President: Tennessee 655 | 11 Electoral Votes 656 | Trump 657 | PROJECTED WINNER 658 | + FOLLOW 659 | Candidate % Votes 660 | Trump 661 | 60.7% 662 | 1,852,475 663 | Biden 664 | 37.4% 665 | 1,143,711 666 | Est. 99% In 667 | Updated 06:20 p.m. ET, Mar. 
6 668 | Full Details 669 | President: Texas 670 | BATTLEGROUND 671 | 38 Electoral Votes 672 | Trump 673 | PROJECTED WINNER 674 | + FOLLOW 675 | Candidate % Votes 676 | Trump 677 | 52.1% 678 | 5,890,347 679 | Biden 680 | 46.5% 681 | 5,259,126 682 | Est. 99% In 683 | Updated 03:14 p.m. ET, Feb. 22 684 | Full Details 685 | President: Utah 686 | 6 Electoral Votes 687 | Trump 688 | PROJECTED WINNER 689 | + FOLLOW 690 | Candidate % Votes 691 | Trump 692 | 58.1% 693 | 865,140 694 | Biden 695 | 37.6% 696 | 560,282 697 | Est. 99% In 698 | Updated 10:32 a.m. ET, Jan. 27 699 | Full Details 700 | President: Vermont 701 | 3 Electoral Votes 702 | Biden 703 | PROJECTED WINNER 704 | + FOLLOW 705 | Candidate % Votes 706 | Biden 707 | 66.1% 708 | 242,820 709 | Trump 710 | 30.7% 711 | 112,704 712 | Est. 99% In 713 | Updated 08:22 a.m. ET, Nov. 24 714 | Full Details 715 | President: Virginia 716 | BATTLEGROUND 717 | 13 Electoral Votes 718 | Biden 719 | PROJECTED WINNER 720 | + FOLLOW 721 | Candidate % Votes 722 | Biden 723 | 54.1% 724 | 2,413,568 725 | Trump 726 | 44.0% 727 | 1,962,430 728 | Est. 99% In 729 | Updated 08:23 a.m. ET, Nov. 24 730 | Full Details 731 | President: Washington 732 | 12 Electoral Votes 733 | Biden 734 | PROJECTED WINNER 735 | + FOLLOW 736 | Candidate % Votes 737 | Biden 738 | 58.0% 739 | 2,369,612 740 | Trump 741 | 38.8% 742 | 1,584,651 743 | Est. 99% In 744 | Updated 10:45 a.m. ET, Dec. 8 745 | Full Details 746 | President: West Virginia 747 | 5 Electoral Votes 748 | Trump 749 | PROJECTED WINNER 750 | + FOLLOW 751 | Candidate % Votes 752 | Trump 753 | 68.6% 754 | 545,382 755 | Biden 756 | 29.7% 757 | 235,984 758 | Est. 99% In 759 | Updated 09:11 a.m. ET, Jan. 27 760 | Full Details 761 | President: Wisconsin 762 | Party change 763 | BATTLEGROUND 764 | 10 Electoral Votes 765 | Biden 766 | PROJECTED WINNER 767 | + FOLLOW 768 | Candidate % Votes 769 | Biden 770 | 49.4% 771 | 1,630,866 772 | Trump 773 | 48.8% 774 | 1,610,184 775 | Est. 
99% In 776 | Updated 03:17 p.m. ET, Mar. 8 777 | Full Details 778 | President: Wyoming 779 | 3 Electoral Votes 780 | Trump 781 | PROJECTED WINNER 782 | + FOLLOW 783 | Candidate % Votes 784 | Trump 785 | 69.9% 786 | 193,559 787 | Biden 788 | 26.6% 789 | 73,491 790 | Est. 99% In 791 | Updated 08:22 a.m. ET, Nov. 24 792 | Full Details -------------------------------------------------------------------------------- /Coding PCA.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "from sklearn.decomposition import PCA\n", 10 | "import numpy as np\n", 11 | "import matplotlib.pyplot as plt" 12 | ] 13 | }, 14 | { 15 | "cell_type": "markdown", 16 | "metadata": {}, 17 | "source": [ 18 | "# Generate Data" 19 | ] 20 | }, 21 | { 22 | "cell_type": "code", 23 | "execution_count": 2, 24 | "metadata": {}, 25 | "outputs": [], 26 | "source": [ 27 | "#generate some data\n", 28 | "X = np.random.normal(0, 1, (100, 4))\n", 29 | "X[:,2] = 3 * X[:,0] - 2 * X[:,1] + np.random.normal(0, 0.1, 100)\n", 30 | "X[:,3] = 1.5 * X[:,0] - 0.5 * X[:,1] + np.random.normal(0, 0.1, 100)" 31 | ] 32 | }, 33 | { 34 | "cell_type": "code", 35 | "execution_count": 3, 36 | "metadata": {}, 37 | "outputs": [], 38 | "source": [ 39 | "#each feature will have zero mean\n", 40 | "X = X - np.mean(X, axis=0)" 41 | ] 42 | }, 43 | { 44 | "cell_type": "code", 45 | "execution_count": 4, 46 | "metadata": {}, 47 | "outputs": [ 48 | { 49 | "data": { 50 | "image/png": "\n", 51 | "text/plain": [ 52 | "
" 53 | ] 54 | }, 55 | "metadata": { 56 | "needs_background": "light" 57 | }, 58 | "output_type": "display_data" 59 | } 60 | ], 61 | "source": [ 62 | "plt.figure(figsize=(10,10))\n", 63 | "for i in range(4):\n", 64 | " for j in range(4):\n", 65 | " if j > i:\n", 66 | " plt.subplot(4,4,i*4+j+1)\n", 67 | " plt.scatter(X[:,i], X[:,j])\n", 68 | " plt.xlabel(f'x{i+1}', fontsize=20)\n", 69 | " plt.ylabel(f'x{j+1}', fontsize=20)\n", 70 | "plt.tight_layout()" 71 | ] 72 | }, 73 | { 74 | "cell_type": "markdown", 75 | "metadata": {}, 76 | "source": [ 77 | "## Observations:\n", 78 | "### - x1 and x2 do not seem correlated\n", 79 | "### - x1 seems very correlated with both x3 and x4\n", 80 | "### - x2 seems somewhat correlated with both x3 and x4\n", 81 | "### - x3 and x4 seem very correlated" 82 | ] 83 | }, 84 | { 85 | "cell_type": "markdown", 86 | "metadata": {}, 87 | "source": [ 88 | "# Apply PCA" 89 | ] 90 | }, 91 | { 92 | "cell_type": "code", 93 | "execution_count": 5, 94 | "metadata": {}, 95 | "outputs": [ 96 | { 97 | "data": { 98 | "text/plain": [ 99 | "PCA(n_components=4)" 100 | ] 101 | }, 102 | "execution_count": 5, 103 | "metadata": {}, 104 | "output_type": "execute_result" 105 | } 106 | ], 107 | "source": [ 108 | "#initialize\n", 109 | "pca = PCA(n_components=4)\n", 110 | "\n", 111 | "#fit\n", 112 | "pca.fit(X)" 113 | ] 114 | }, 115 | { 116 | "cell_type": "code", 117 | "execution_count": 6, 118 | "metadata": {}, 119 | "outputs": [], 120 | "source": [ 121 | "#get principal components\n", 122 | "principal_comps_builtin = pca.components_.T" 123 | ] 124 | }, 125 | { 126 | "cell_type": "code", 127 | "execution_count": 7, 128 | "metadata": {}, 129 | "outputs": [ 130 | { 131 | "name": "stdout", 132 | "output_type": "stream", 133 | "text": [ 134 | "principal component 0\n", 135 | "[ 0.21836467 -0.11571309 0.88882471 0.38589893]\n", 136 | "\n", 137 | "principal component 1\n", 138 | "[ 0.48454841 0.80382872 -0.14943543 0.31103261]\n", 139 | "\n", 140 | "principal component 
2\n", 141 | "[ 0.18131723 0.29326099 0.36846687 -0.863339 ]\n", 142 | "\n", 143 | "principal component 3\n", 144 | "[ 0.82743808 -0.50444808 -0.22779784 -0.0947972 ]\n", 145 | "\n" 146 | ] 147 | } 148 | ], 149 | "source": [ 150 | "#print each principal component\n", 151 | "for i,component in enumerate(pca.components_):\n", 152 | " print(f'principal component {i}')\n", 153 | " print(component)\n", 154 | " print()" 155 | ] 156 | }, 157 | { 158 | "cell_type": "markdown", 159 | "metadata": {}, 160 | "source": [ 161 | "# Can we do this by hand?" 162 | ] 163 | }, 164 | { 165 | "cell_type": "markdown", 166 | "metadata": {}, 167 | "source": [ 168 | "## Principal components are the eigenvectors of the covariance matrix" 169 | ] 170 | }, 171 | { 172 | "cell_type": "code", 173 | "execution_count": 8, 174 | "metadata": {}, 175 | "outputs": [], 176 | "source": [ 177 | "#compute covariance matrix (X is already mean-centered; dividing by n instead of n-1 only rescales eigenvalues, so the eigenvectors are unchanged)\n", 178 | "#https://www.youtube.com/watch?v=F-aku75OpoM\n", 179 | "cov_matrix = sum([X[i].reshape(-1,1) @ X[i].reshape(1,-1) for i in range(len(X))]) / len(X)" 180 | ] 181 | }, 182 | { 183 | "cell_type": "code", 184 | "execution_count": 9, 185 | "metadata": {}, 186 | "outputs": [], 187 | "source": [ 188 | "#eigenvalues and eigenvectors of covariance matrix (eigenvectors are the columns of eigvecs)\n", 189 | "eigvals, eigvecs = np.linalg.eig(cov_matrix)" 190 | ] 191 | }, 192 | { 193 | "cell_type": "code", 194 | "execution_count": 10, 195 | "metadata": {}, 196 | "outputs": [], 197 | "source": [ 198 | "#sort order from largest to smallest eigenvalue\n", 199 | "ordering = np.argsort(eigvals)[::-1]" 200 | ] 201 | }, 202 | { 203 | "cell_type": "code", 204 | "execution_count": 11, 205 | "metadata": {}, 206 | "outputs": [], 207 | "source": [ 208 | "#reorder the eigenvector columns to match\n", 209 | "principal_comps_byhand = eigvecs[:,ordering]" 210 | ] 211 | }, 212 | { 213 | "cell_type": "code", 214 | "execution_count": 12, 215 | "metadata": {}, 216 | "outputs": [ 217 | { 218 | "data": { 219 | "text/plain": [ 220 | "array([[ 0.21836467, 0.48454841, -0.18131723, 
0.82743808],\n", 221 | " [-0.11571309, 0.80382872, -0.29326099, -0.50444808],\n", 222 | " [ 0.88882471, -0.14943543, -0.36846687, -0.22779784],\n", 223 | " [ 0.38589893, 0.31103261, 0.863339 , -0.0947972 ]])" 224 | ] 225 | }, 226 | "execution_count": 12, 227 | "metadata": {}, 228 | "output_type": "execute_result" 229 | } 230 | ], 231 | "source": [ 232 | "#our by-hand eigenvectors\n", 233 | "principal_comps_byhand" 234 | ] 235 | }, 236 | { 237 | "cell_type": "code", 238 | "execution_count": 13, 239 | "metadata": {}, 240 | "outputs": [ 241 | { 242 | "data": { 243 | "text/plain": [ 244 | "array([[ 0.21836467, 0.48454841, 0.18131723, 0.82743808],\n", 245 | " [-0.11571309, 0.80382872, 0.29326099, -0.50444808],\n", 246 | " [ 0.88882471, -0.14943543, 0.36846687, -0.22779784],\n", 247 | " [ 0.38589893, 0.31103261, -0.863339 , -0.0947972 ]])" 248 | ] 249 | }, 250 | "execution_count": 13, 251 | "metadata": {}, 252 | "output_type": "execute_result" 253 | } 254 | ], 255 | "source": [ 256 | "#results from built-in call\n", 257 | "principal_comps_builtin" 258 | ] 259 | }, 260 | { 261 | "cell_type": "markdown", 262 | "metadata": {}, 263 | "source": [ 264 | "# ✔️ (the two results match up to sign: the third column is negated, and the sign of an eigenvector is arbitrary)" 265 | ] 266 | } 267 | ], 268 | "metadata": { 269 | "kernelspec": { 270 | "display_name": "Python 3", 271 | "language": "python", 272 | "name": "python3" 273 | }, 274 | "language_info": { 275 | "codemirror_mode": { 276 | "name": "ipython", 277 | "version": 3 278 | }, 279 | "file_extension": ".py", 280 | "mimetype": "text/x-python", 281 | "name": "python", 282 | "nbconvert_exporter": "python", 283 | "pygments_lexer": "ipython3", 284 | "version": "3.7.7" 285 | } 286 | }, 287 | "nbformat": 4, 288 | "nbformat_minor": 4 289 | } 290 | -------------------------------------------------------------------------------- /Logistic Regression.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Logistic 
Regression\n", 8 | "# $\\ln\\left(\\frac{p}{1-p}\\right) = \\beta_0 + \\beta_1 x$\n", 9 | "# $p = \\frac{e^{\\beta_0 + \\beta_1 x}}{1+e^{\\beta_0 + \\beta_1 x}}$" 10 | ] 11 | }, 12 | { 13 | "cell_type": "markdown", 14 | "metadata": {}, 15 | "source": [ 16 | "# $\\text{Likelihood} = \\prod_{i=1}^{N} p_{i}^{y_i}(1-p_{i})^{1-y_i} $\n", 17 | "# $LL = \\log(\\text{Likelihood}) = \\sum_{i=1}^{N} \\left[ y_i\\log(p_i) + (1-y_i)\\log(1-p_i) \\right]$" 18 | ] 19 | }, 20 | { 21 | "cell_type": "markdown", 22 | "metadata": {}, 23 | "source": [ 24 | "# $\\frac{\\partial LL}{\\partial \\beta_0} = \\sum_{i=1}^{N} (y_i - p_i) $\n", 25 | "# $\\frac{\\partial LL}{\\partial \\beta_1} = \\sum_{i=1}^{N} (y_i - p_i) x_i $" 26 | ] 27 | }, 28 | { 29 | "cell_type": "markdown", 30 | "metadata": {}, 31 | "source": [ 32 | "# Goal: Maximize Log Likelihood" 33 | ] 34 | }, 35 | { 36 | "cell_type": "code", 37 | "execution_count": 1, 38 | "metadata": {}, 39 | "outputs": [], 40 | "source": [ 41 | "import numpy as np\n", 42 | "import pandas as pd\n", 43 | "import matplotlib.pyplot as plt\n", 44 | "from sklearn.linear_model import LogisticRegression" 45 | ] 46 | }, 47 | { 48 | "cell_type": "code", 49 | "execution_count": 2, 50 | "metadata": {}, 51 | "outputs": [], 52 | "source": [ 53 | "class_1 = np.random.random(20)*2 + 1\n", 54 | "class_2 = np.random.random(20)*2 - 0.5" 55 | ] 56 | }, 57 | { 58 | "cell_type": "code", 59 | "execution_count": 3, 60 | "metadata": {}, 61 | "outputs": [], 62 | "source": [ 63 | "data = pd.DataFrame()\n", 64 | "data['x'] = np.concatenate([class_1, class_2])\n", 65 | "data['y'] = [0]*20 + [1]*20\n", 66 | "data = data.sample(frac=1)" 67 | ] 68 | }, 69 | { 70 | "cell_type": "code", 71 | "execution_count": 4, 72 | "metadata": {}, 73 | "outputs": [ 74 | { 75 | "data": { 76 | "text/html": [ 77 | "
" 128 | ], 129 | "text/plain": [ 130 | " x y\n", 131 | "38 -0.445893 1\n", 132 | "22 0.787849 1\n", 133 | "6 1.467442 0\n", 134 | "4 2.478344 0\n", 135 | "28 -0.456236 1" 136 | ] 137 | }, 138 | "execution_count": 4, 139 | "metadata": {}, 140 | "output_type": "execute_result" 141 | } 142 | ], 143 | "source": [ 144 | "data.head()" 145 | ] 146 | }, 147 | { 148 | "cell_type": "code", 149 | "execution_count": 5, 150 | "metadata": {}, 151 | "outputs": [ 152 | { 153 | "data": { 154 | "text/plain": [ 155 | "" 156 | ] 157 | }, 158 | "execution_count": 5, 159 | "metadata": {}, 160 | "output_type": "execute_result" 161 | }, 162 | { 163 | "data": { 164 | "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXQAAAD4CAYAAAD8Zh1EAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAR1UlEQVR4nO3db4xd913n8ffH47hQnN04eFpC/thBSgFTUTYe3FRod7OCpUn6ICCBlBTxp6qwgpoV+6zZXS3sige7rLSrVUUgmBK1SBALiUJDlRKKROkDSJuZKP3jhhQTYsekW08St0kdEq8z330w1+7N5P45d+bOzPVv3y9p5Hvu73fO+Z7vTD4598y9c1JVSJIufTu2uwBJ0nQY6JLUCANdkhphoEtSIwx0SWrEzu3a8d69e2v//v3btXtJuiQtLS09V1Xzg8a2LdD379/P4uLidu1eki5JSU4MG/OSiyQ1wkCXpEYY6JLUCANdkhphoEtSI8YGepL7k5xO8qUh40nyoSTHk3whyY3TL1OSNE6XM/SPALeMGL8VuKH3dRj4rY2XNdzKSvG1b7zC6RdfYdK/FLmyUiy/9CqvvbbC8kuvXlz/wvMb/cuTG93Oetcft96w8WH92Ki1++tfnvQYB83vr/trL47/WRi1z2l976VZMPZ96FX1mST7R0y5Hfi9Wv0v4pEkVyS5qqq+OqUaL1pZKe448gife/oFAA7t38PRw+9ix450WvfO33mExadf4M1v2snL515jYd8efv/97+RnfvezLJ04w8F9e3jgF2/qtL1h21/vdta7/rj1ho1f7MeJM7x51xwvv3qehf1Xrvv4h+2vv783XrcHKB47+fVOxzioduBbdV82x0uvngfg0P4rOXr4jdsb1Z+Nfs+kWTONa+hXA8/0LZ/qPfcGSQ4nWUyyuLy8PPGOnj97jqWTZy4uL504w/Nnz3Vf98QZXit46ZXzvLZSLJ04w/Hlb7J04gzne8tdtzds++vdznrXH7fesPGL/Vip1X7UZP3sWs/r+nvyzETHOKj219XdC3OApZODtzeqPxv9nkmzZhqBPuiUZuDr16o6UlULVbUwPz/wk6sj7d29i4P79lxcPrhvD3t375po3bnA5d+2k7kd4eC+Pbztrbs5uG8PO3vLXbc3bPvr3c561x+33rDxi/3YkdV+ZLJ+dq1nbX8nOcZBtb+u7jd96wXmsO2N6s9Gv2fSrEmXa4e9Sy6fqKq3Dxj7beDTVfVAb/lJ4OZxl1wWFhZqPR/9v3DNM4H5y99EMtlljefPnuPKN1/GCy//X/bu3kWSi89fWF6vjW5
/yx3rnak+9/q2u/f4MuCzwE3T7PFMnqGzetPpj/YefxT4iW2sZZSLN9CuqnPAhRto97t4A+2qegS4IslVW11ony41z5Sq+gzwwogpM9XjDvXOnKr6alU91nv8EvAEb7wv8Mz0uWO9M6XXt2/2Fi/rfa19V8qGejyrgf7W6t2+rvfvW4bMK+DPkywlObxl1X3L1G6gvYW61vOu3kvDTyb5ga0pbd1mrcddzGx/k+wH/gWrZ5D9ZrLPI+qFGetzkrkkjwOngU9V1VR73OkGF5shyV8A3zVg6D9NsJkfqapnk7wF+FSSv+2dHW2Vqd1Aewt1qecxVv9exDeT3Ab8CXDDple2frPW43Fmtr9JdgN/BPz7qnpx7fCAVba1z2Pqnbk+V9VrwA8luQL44yRvr6r+37VsqMfbdoZeVT9WVW8f8PVx4GsXXmb0/j09ZBvP9v49Dfwxq5cTttIp4Nq+5WuAZ9cxZyuNraeqXrzw0rBW71Z1WZK9W1fixGatxyPNan+TXMZqOP5+VX1swJSZ6vO4eme1zwBV9XXg08Ata4Y21ONZveTyIPDzvcc/D3x87YQk35Hk8guPgR8HBr6rYBM9CtyQ5Poku4A7WK2934PAz/V+e30T8I0Ll5O2ydiak3xXsno78iSHWP05eX7LK+1u1no80iz2t1fP7wJPVNX/GjJtZvrcpd5Z63OS+d6ZOUm+Hfgx4G/XTNtQj7ftkssY/x34wyTvB04CPw2Q5LuBD1fVbcBbWX3JAqvH8QdV9WdbWWRdgjfQ7ljzTwG/lOQ88E/AHdX7Ffx2SPIAq+9Y2JvkFPCrrP5CaSZ73KHemepvz48APwt8sXeNF+A/AtfBTPa5S72z1uergI8mmWP1fy5/WFWfmGZe+NF/SWrErF5ykSRNyECXpEYY6JLUCANdkhphoEtSIwx0SWqEgS5Jjfh/NCZAkTKIo6MAAAAASUVORK5CYII=\n", 165 | "text/plain": [ 166 | "
" 167 | ] 168 | }, 169 | "metadata": { 170 | "needs_background": "light" 171 | }, 172 | "output_type": "display_data" 173 | } 174 | ], 175 | "source": [ 176 | "plt.scatter(data.x, data.y, s=5)" 177 | ] 178 | }, 179 | { 180 | "cell_type": "code", 181 | "execution_count": 7, 182 | "metadata": {}, 183 | "outputs": [], 184 | "source": [ 185 | "def calculate_gradient_log_likelihood(curr_betas, data):\n", 186 | " numerator = np.exp(curr_betas[0] + curr_betas[1]*data.x)\n", 187 | " p = numerator / (1 + numerator)\n", 188 | " \n", 189 | " partial_0 = np.sum(data.y - p)\n", 190 | " partial_1 = np.sum((data.y - p)*data.x)\n", 191 | " \n", 192 | " return np.array([partial_0, partial_1])" 193 | ] 194 | }, 195 | { 196 | "cell_type": "code", 197 | "execution_count": 14, 198 | "metadata": {}, 199 | "outputs": [ 200 | { 201 | "name": "stdout", 202 | "output_type": "stream", 203 | "text": [ 204 | "[ 20.3250053 -17.71620914]\n" 205 | ] 206 | } 207 | ], 208 | "source": [ 209 | "curr_betas = np.array([0.0,0.0])\n", 210 | "diff = np.inf\n", 211 | "eta = 0.1\n", 212 | "\n", 213 | "while diff > .001:\n", 214 | " grad = calculate_gradient_log_likelihood(curr_betas, data)\n", 215 | " diff = abs(grad).sum()\n", 216 | " curr_betas += eta*grad\n", 217 | " \n", 218 | "print(curr_betas)" 219 | ] 220 | }, 221 | { 222 | "cell_type": "code", 223 | "execution_count": 15, 224 | "metadata": {}, 225 | "outputs": [ 226 | { 227 | "data": { 228 | "text/plain": [ 229 | "[]" 230 | ] 231 | }, 232 | "execution_count": 15, 233 | "metadata": {}, 234 | "output_type": "execute_result" 235 | }, 236 | { 237 | "data": { 238 | "image/png": "\n", 239 | "text/plain": [ 240 | "
" 241 | ] 242 | }, 243 | "metadata": { 244 | "needs_background": "light" 245 | }, 246 | "output_type": "display_data" 247 | } 248 | ], 249 | "source": [ 250 | "plt.scatter(data.x, data.y, s=5)\n", 251 | "\n", 252 | "x_vals = np.arange(data.x.min(), data.x.max(), .01)\n", 253 | "p_vals = 1 / (1 + np.exp(-(curr_betas[0] + curr_betas[1]*x_vals)))\n", 254 | "plt.plot(x_vals, p_vals)" 255 | ] 256 | }, 257 | { 258 | "cell_type": "markdown", 259 | "metadata": {}, 260 | "source": [ 261 | "# Built-In Solution" 262 | ] 263 | }, 264 | { 265 | "cell_type": "code", 266 | "execution_count": 16, 267 | "metadata": {}, 268 | "outputs": [ 269 | { 270 | "name": "stdout", 271 | "output_type": "stream", 272 | "text": [ 273 | "beta_0: 20.45573473276307\n", 274 | "beta_1: -17.83121265209636\n" 275 | ] 276 | } 277 | ], 278 | "source": [ 279 | "clf = LogisticRegression(penalty='none')\n", 280 | "clf.fit(np.array(data.x).reshape(-1,1), data.y)\n", 281 | "print('beta_0: %s'%clf.intercept_[0])\n", 282 | "print('beta_1: %s'%clf.coef_[0][0])" 283 | ] 284 | }, 285 | { 286 | "cell_type": "code", 287 | "execution_count": 17, 288 | "metadata": {}, 289 | "outputs": [ 290 | { 291 | "data": { 292 | "text/plain": [ 293 | "[]" 294 | ] 295 | }, 296 | "execution_count": 17, 297 | "metadata": {}, 298 | "output_type": "execute_result" 299 | }, 300 | { 301 | "data": { 302 | "image/png": "\n", 303 | "text/plain": [ 304 | "
" 305 | ] 306 | }, 307 | "metadata": { 308 | "needs_background": "light" 309 | }, 310 | "output_type": "display_data" 311 | } 312 | ], 313 | "source": [ 314 | "plt.scatter(data.x, data.y, s=5)\n", 315 | "\n", 316 | "x_vals = np.arange(data.x.min(), data.x.max(), .01)\n", 317 | "p_vals = 1 / (1 + np.exp(-(clf.intercept_[0] + clf.coef_[0][0]*x_vals)))\n", 318 | "plt.plot(x_vals, p_vals)" 319 | ] 320 | }, 321 | { 322 | "cell_type": "code", 323 | "execution_count": null, 324 | "metadata": {}, 325 | "outputs": [], 326 | "source": [] 327 | } 328 | ], 329 | "metadata": { 330 | "kernelspec": { 331 | "display_name": "Python 3", 332 | "language": "python", 333 | "name": "python3" 334 | }, 335 | "language_info": { 336 | "codemirror_mode": { 337 | "name": "ipython", 338 | "version": 3 339 | }, 340 | "file_extension": ".py", 341 | "mimetype": "text/x-python", 342 | "name": "python", 343 | "nbconvert_exporter": "python", 344 | "pygments_lexer": "ipython3", 345 | "version": "3.7.7" 346 | } 347 | }, 348 | "nbformat": 4, 349 | "nbformat_minor": 4 350 | } 351 | -------------------------------------------------------------------------------- /Monte Carlo Code.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Monte Carlo Coding Video: \n", 8 | "# https://www.youtube.com/watch?v=yA6_V-v3ODo" 9 | ] 10 | }, 11 | { 12 | "cell_type": "code", 13 | "execution_count": 29, 14 | "metadata": {}, 15 | "outputs": [], 16 | "source": [ 17 | "import numpy as np\n", 18 | "import matplotlib.pyplot as plt\n", 19 | "from time import time" 20 | ] 21 | }, 22 | { 23 | "cell_type": "code", 24 | "execution_count": 50, 25 | "metadata": {}, 26 | "outputs": [], 27 | "source": [ 28 | "def get_calories_burned(lower_temp, upper_temp, avg_tol, sd_tol, avg_calories_burned):\n", 29 | " #get the temp\n", 30 | " temp = np.random.uniform(lower_temp, upper_temp)\n", 31 | " \n", 32 | " 
#get the tolerance\n", 33 | " tol = np.random.normal(avg_tol, sd_tol)\n", 34 | " \n", 35 | " #if the temp is higher than our tolerance, then run\n", 36 | " if temp > tol:\n", 37 | " cals = np.random.exponential(avg_calories_burned)\n", 38 | " else:\n", 39 | " cals = 0\n", 40 | " \n", 41 | " return cals" 42 | ] 43 | }, 44 | { 45 | "cell_type": "code", 46 | "execution_count": 62, 47 | "metadata": {}, 48 | "outputs": [ 49 | { 50 | "name": "stdout", 51 | "output_type": "stream", 52 | "text": [ 53 | "18.804136753082275\n" 54 | ] 55 | } 56 | ], 57 | "source": [ 58 | "num_days = 1000000\n", 59 | "daily_calories = []\n", 60 | "\n", 61 | "start = time()\n", 62 | "for _ in range(num_days):\n", 63 | " cals = get_calories_burned(40, 60, 55, 5, 200)\n", 64 | " daily_calories.append(cals)\n", 65 | "end = time()\n", 66 | "print(end - start)" 67 | ] 68 | }, 69 | { 70 | "cell_type": "code", 71 | "execution_count": 63, 72 | "metadata": {}, 73 | "outputs": [ 74 | { 75 | "data": { 76 | "text/plain": [ 77 | "Text(0.5, 1.0, '54.060966025098885')" 78 | ] 79 | }, 80 | "execution_count": 63, 81 | "metadata": {}, 82 | "output_type": "execute_result" 83 | }, 84 | { 85 | "data": { 86 | "image/png": "\n", 87 | "text/plain": [ 88 | "
" 89 | ] 90 | }, 91 | "metadata": { 92 | "needs_background": "light" 93 | }, 94 | "output_type": "display_data" 95 | } 96 | ], 97 | "source": [ 98 | "plt.hist(daily_calories)\n", 99 | "plt.title(np.mean(daily_calories))" 100 | ] 101 | }, 102 | { 103 | "cell_type": "code", 104 | "execution_count": 64, 105 | "metadata": {}, 106 | "outputs": [ 107 | { 108 | "name": "stdout", 109 | "output_type": "stream", 110 | "text": [ 111 | "0.730156\n" 112 | ] 113 | } 114 | ], 115 | "source": [ 116 | "print(len([i for i in daily_calories if i == 0]) / num_days)" 117 | ] 118 | }, 119 | { 120 | "cell_type": "markdown", 121 | "metadata": {}, 122 | "source": [ 123 | "# Efficient Method : Vectorization" 124 | ] 125 | }, 126 | { 127 | "cell_type": "code", 128 | "execution_count": 66, 129 | "metadata": {}, 130 | "outputs": [ 131 | { 132 | "name": "stdout", 133 | "output_type": "stream", 134 | "text": [ 135 | "0.2194535732269287\n" 136 | ] 137 | } 138 | ], 139 | "source": [ 140 | "start = time()\n", 141 | "\n", 142 | "#get all temps at once\n", 143 | "temps = np.random.uniform(40, 60, num_days)\n", 144 | "\n", 145 | "#get all the tolerances at once\n", 146 | "tols = np.random.normal(55, 5, num_days)\n", 147 | "\n", 148 | "#get all the calories at once\n", 149 | "daily_calories = np.random.exponential(200, num_days)\n", 150 | "\n", 151 | "#if temp is less than tol, then you didnt run\n", 152 | "daily_calories[temps < tols] = 0\n", 153 | "\n", 154 | "end = time()\n", 155 | "print(end - start)" 156 | ] 157 | }, 158 | { 159 | "cell_type": "code", 160 | "execution_count": 67, 161 | "metadata": {}, 162 | "outputs": [ 163 | { 164 | "data": { 165 | "text/plain": [ 166 | "Text(0.5, 1.0, '54.275049753539044')" 167 | ] 168 | }, 169 | "execution_count": 67, 170 | "metadata": {}, 171 | "output_type": "execute_result" 172 | }, 173 | { 174 | "data": { 175 | "image/png": "\n", 176 | "text/plain": [ 177 | "
" 178 | ] 179 | }, 180 | "metadata": { 181 | "needs_background": "light" 182 | }, 183 | "output_type": "display_data" 184 | } 185 | ], 186 | "source": [ 187 | "plt.hist(daily_calories)\n", 188 | "plt.title(np.mean(daily_calories))" 189 | ] 190 | }, 191 | { 192 | "cell_type": "code", 193 | "execution_count": 68, 194 | "metadata": {}, 195 | "outputs": [ 196 | { 197 | "name": "stdout", 198 | "output_type": "stream", 199 | "text": [ 200 | "0.729439\n" 201 | ] 202 | } 203 | ], 204 | "source": [ 205 | "print(len([i for i in daily_calories if i == 0]) / num_days)" 206 | ] 207 | }, 208 | { 209 | "cell_type": "code", 210 | "execution_count": null, 211 | "metadata": {}, 212 | "outputs": [], 213 | "source": [] 214 | } 215 | ], 216 | "metadata": { 217 | "kernelspec": { 218 | "display_name": "Python 3", 219 | "language": "python", 220 | "name": "python3" 221 | }, 222 | "language_info": { 223 | "codemirror_mode": { 224 | "name": "ipython", 225 | "version": 3 226 | }, 227 | "file_extension": ".py", 228 | "mimetype": "text/x-python", 229 | "name": "python", 230 | "nbconvert_exporter": "python", 231 | "pygments_lexer": "ipython3", 232 | "version": "3.7.7" 233 | } 234 | }, 235 | "nbformat": 4, 236 | "nbformat_minor": 4 237 | } 238 | -------------------------------------------------------------------------------- /SVM_Kernels.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Week 9 Discussion" 3 | author: "Ritvik Kharkar" 4 | date: "12/1/2020" 5 | output: pdf_document 6 | --- 7 | 8 | ```{r} 9 | library(ggplot2) 10 | 11 | #SVM 12 | library(e1071) 13 | ``` 14 | 15 | # Generate Data According to Complex Polynomial 16 | 17 | ```{r} 18 | generate_data <- function(n) 19 | { 20 | coefs = c(-7.1, 2.02, 5.01, 0.97) 21 | + 0.1*(runif(4) - .5) 22 | x1_vals = -5 + 6.5*runif(n) 23 | y1_vals = coefs[1] + coefs[2]*x1_vals + coefs[3]*x1_vals^2 + coefs[4]*x1_vals^3 + 10*(runif(n)-.2) 24 | 25 | x2_vals = -5 + 6.5*runif(n) 26 | y2_vals = coefs[1] 
+ coefs[2]*x2_vals + coefs[3]*x2_vals^2 + coefs[4]*x2_vals^3 - 10*(runif(n)-.2) 27 | 28 | D = as.data.frame(rbind(cbind(x1_vals, y1_vals), cbind(x2_vals, y2_vals))) 29 | D = as.data.frame(scale(D)) 30 | D$label = 0 31 | D$label[(n+1):(2*n)] = 1 32 | D$label = as.factor(D$label) 33 | colnames(D) = c("x1", "x2", "label") 34 | 35 | return(D) 36 | } 37 | 38 | D_train = generate_data(500) 39 | ggplot(D_train, aes(x1, x2, color=label)) + geom_point() + ggtitle('Training Data') 40 | 41 | D_test = generate_data(500) 42 | ggplot(D_test, aes(x1, x2, color=label)) + geom_point() + ggtitle('Testing Data') 43 | ``` 44 | 45 | # Test Several Kernels 46 | 47 | ```{r} 48 | decision_grid = expand.grid(x1=seq(-2,2,.1),x2=seq(-3,3,.1)) 49 | 50 | for (k in c("linear", "polynomial", "radial", "sigmoid")) 51 | { 52 | svc_model = svm(label~., data = D_train, kernel=k) 53 | preds = predict(svc_model, D_test) 54 | acc = mean(preds == D_test$label) 55 | 56 | D_test_k = D_test 57 | D_test_k$pred = preds 58 | 59 | labels = predict(svc_model, decision_grid) 60 | decision_grid$pred = labels 61 | p = ggplot(D_test_k, aes(x1, x2, color=pred)) + geom_point() + geom_point(data=decision_grid, pch=20, cex=0.3) + ggtitle(paste(k,'kernel - Accuracy=',acc)) 62 | print(p) 63 | 64 | } 65 | ``` 66 | 67 | # Effect of Gamma on Result 68 | 69 | ```{r} 70 | decision_grid = expand.grid(x1=seq(-2,2,.1),x2=seq(-3,3,.1)) 71 | 72 | for (g in c(0.01,0.1,1,10,100,1000)) 73 | { 74 | svc_model = svm(label~., data = D_train, kernel='radial', gamma=g) 75 | preds = predict(svc_model, D_test) 76 | acc = mean(preds == D_test$label) 77 | 78 | D_test_k = D_test 79 | D_test_k$pred = preds 80 | 81 | labels = predict(svc_model, decision_grid) 82 | decision_grid$pred = labels 83 | p = ggplot(D_test_k, aes(x1, x2, color=pred)) + geom_point() + geom_point(data=decision_grid, pch=20, cex=0.3) + ggtitle(paste('Gamma=',g,' Accuracy=',acc)) 84 | print(p) 85 | 86 | } 87 | ``` 88 | 89 | ## CYU: Talk about changing gamma in terms of Bias 
Variance Tradeoff 90 | 91 | # Effect of Cost on Result 92 | 93 | ```{r} 94 | decision_grid = expand.grid(x1=seq(-2,2,.1),x2=seq(-3,3,.1)) 95 | 96 | for (c in c(0.01,1,100,10000,1000000)) 97 | { 98 | svc_model = svm(label~., data = D_train, kernel='radial', gamma=1, cost=c) 99 | preds = predict(svc_model, D_test) 100 | acc = mean(preds == D_test$label) 101 | 102 | D_test_k = D_test 103 | D_test_k$pred = preds 104 | 105 | labels = predict(svc_model, decision_grid) 106 | decision_grid$pred = labels 107 | p = ggplot(D_test_k, aes(x1, x2, color=pred)) + geom_point() + geom_point(data=decision_grid, pch=20, cex=0.3) + ggtitle(paste('Cost=',c,' Accuracy=',acc)) 108 | print(p) 109 | 110 | } 111 | ``` 112 | 113 | ## CYU: Talk about changing cost in terms of Bias Variance Tradeoff -------------------------------------------------------------------------------- /Spam Filtering.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "import numpy as np\n", 10 | "import matplotlib.pyplot as plt\n", 11 | "import pandas as pd\n", 12 | "import string" 13 | ] 14 | }, 15 | { 16 | "cell_type": "markdown", 17 | "metadata": {}, 18 | "source": [ 19 | "# Read Data" 20 | ] 21 | }, 22 | { 23 | "cell_type": "code", 24 | "execution_count": 2, 25 | "metadata": {}, 26 | "outputs": [], 27 | "source": [ 28 | "#read dataset\n", 29 | "spam_df = pd.read_csv('spam.csv', encoding=\"ISO-8859-1\")\n", 30 | "\n", 31 | "#subset and rename columns\n", 32 | "spam_df = spam_df[['v1', 'v2']]\n", 33 | "spam_df.rename(columns={'v1': 'spam', 'v2': 'text'}, inplace=True)\n", 34 | "\n", 35 | "#convert spam column to binary\n", 36 | "spam_df.spam = spam_df.spam.apply(lambda s: True if s=='spam' else False)\n", 37 | "\n", 38 | "#lowercase everything and remove punctuation\n", 39 | "spam_df.text = spam_df.text.apply(lambda t: 
t.lower().translate(str.maketrans('', '', string.punctuation)))\n", 40 | "\n", 41 | "#shuffle\n", 42 | "spam_df = spam_df.sample(frac=1)" 43 | ] 44 | }, 45 | { 46 | "cell_type": "code", 47 | "execution_count": 3, 48 | "metadata": {}, 49 | "outputs": [ 50 | { 51 | "data": { 52 | "text/html": [ 53 | "
\n", 54 | "\n", 67 | "\n", 68 | " \n", 69 | " \n", 70 | " \n", 71 | " \n", 72 | " \n", 73 | " \n", 74 | " \n", 75 | " \n", 76 | " \n", 77 | " \n", 78 | " \n", 79 | " \n", 80 | " \n", 81 | " \n", 82 | " \n", 83 | " \n", 84 | " \n", 85 | " \n", 86 | " \n", 87 | " \n", 88 | " \n", 89 | " \n", 90 | " \n", 91 | " \n", 92 | " \n", 93 | " \n", 94 | " \n", 95 | " \n", 96 | " \n", 97 | " \n", 98 | " \n", 99 | " \n", 100 | " \n", 101 | " \n", 102 | " \n", 103 | " \n", 104 | " \n", 105 | " \n", 106 | " \n", 107 | " \n", 108 | " \n", 109 | " \n", 110 | " \n", 111 | " \n", 112 | " \n", 113 | " \n", 114 | " \n", 115 | " \n", 116 | " \n", 117 | " \n", 118 | " \n", 119 | " \n", 120 | " \n", 121 | " \n", 122 | " \n", 123 | " \n", 124 | " \n", 125 | " \n", 126 | " \n", 127 | " \n", 128 | " \n", 129 | " \n", 130 | " \n", 131 | " \n", 132 | "
\n", 133 | "


\n", 134 | "
" 135 | ], 136 | "text/plain": [ 137 | " spam text\n", 138 | "5195 False darren was saying dat if u meeting da ge den w...\n", 139 | "5533 False hey chief can you give me a bell when you get ...\n", 140 | "1348 False nothing much chillin at home any super bowl plan\n", 141 | "3125 False u coming 2 pick me\n", 142 | "865 True congratulations ur awarded either a yrs supply...\n", 143 | "... ... ...\n", 144 | "4351 False hows the pain deary r u smiling\n", 145 | "1898 False wat would u like 4 ur birthday\n", 146 | "4993 False my drive can only be read i need to write\n", 147 | "283 False okie\n", 148 | "1832 False what time is ur flight tmr\n", 149 | "\n", 150 | "[5572 rows x 2 columns]" 151 | ] 152 | }, 153 | "execution_count": 3, 154 | "metadata": {}, 155 | "output_type": "execute_result" 156 | } 157 | ], 158 | "source": [ 159 | "spam_df" 160 | ] 161 | }, 162 | { 163 | "cell_type": "code", 164 | "execution_count": 4, 165 | "metadata": {}, 166 | "outputs": [ 167 | { 168 | "name": "stdout", 169 | "output_type": "stream", 170 | "text": [ 171 | "congratulations ur awarded either a yrs supply of cds from virgin records or a mystery gift guaranteed call 09061104283 tscs wwwsmsconet å£150pm approx 3mins\n", 172 | "-------\n", 173 | "please call our customer service representative on freephone 0808 145 4742 between 9am11pm as you have won a guaranteed å£1000 cash or å£5000 prize\n", 174 | "-------\n", 175 | "this msg is for your mobile content order it has been resent as previous attempt failed due to network error queries to customersqueriesnetvisionukcom\n", 176 | "-------\n", 177 | "urgent we are trying to contact u todays draw shows that you have won a å£800 prize guaranteed call 09050003091 from land line claim c52 valid 12hrs only\n", 178 | "-------\n", 179 | "sms ac sun0819 posts helloyou seem cool\n", 180 | "-------\n" 181 | ] 182 | } 183 | ], 184 | "source": [ 185 | "for t in spam_df[spam_df.spam == True].iloc[:5].text:\n", 186 | " print(t)\n", 187 | " 
print('-------')" 188 | ] 189 | }, 190 | { 191 | "cell_type": "code", 192 | "execution_count": 5, 193 | "metadata": {}, 194 | "outputs": [ 195 | { 196 | "name": "stdout", 197 | "output_type": "stream", 198 | "text": [ 199 | "darren was saying dat if u meeting da ge den we dun meet 4 dinner cos later u leave xy will feel awkward den u meet him 4 lunch lor\n", 200 | "-------\n", 201 | "hey chief can you give me a bell when you get this need to talk to you about this royal visit on the 1st june \n", 202 | "-------\n", 203 | "nothing much chillin at home any super bowl plan\n", 204 | "-------\n", 205 | "u coming 2 pick me\n", 206 | "-------\n", 207 | "see i knew giving you a break a few times woul lead to you always wanting to miss curfew i was gonna gibe you til one but a midnight movie is not gonna get out til after 2 you need to come home you need to getsleep and if anything you need to b studdying ear training\n", 208 | "-------\n" 209 | ] 210 | } 211 | ], 212 | "source": [ 213 | "for t in spam_df[spam_df.spam == False].iloc[:5].text:\n", 214 | " print(t)\n", 215 | " print('-------')" 216 | ] 217 | }, 218 | { 219 | "cell_type": "code", 220 | "execution_count": 6, 221 | "metadata": {}, 222 | "outputs": [], 223 | "source": [ 224 | "#get training set\n", 225 | "train_spam_df = spam_df.iloc[:int(len(spam_df)*0.7)]\n", 226 | "\n", 227 | "#get testing set\n", 228 | "test_spam_df = spam_df.iloc[int(len(spam_df)*0.7):]" 229 | ] 230 | }, 231 | { 232 | "cell_type": "code", 233 | "execution_count": 7, 234 | "metadata": {}, 235 | "outputs": [ 236 | { 237 | "name": "stdout", 238 | "output_type": "stream", 239 | "text": [ 240 | "0.13794871794871794\n" 241 | ] 242 | } 243 | ], 244 | "source": [ 245 | "FRAC_SPAM_TEXTS = train_spam_df.spam.mean()\n", 246 | "print(FRAC_SPAM_TEXTS)" 247 | ] 248 | }, 249 | { 250 | "cell_type": "markdown", 251 | "metadata": {}, 252 | "source": [ 253 | "# Create Spam Bag of Words and Non-Spam Bag of Words" 254 | ] 255 | }, 256 | { 257 | "cell_type": 
"code", 258 | "execution_count": 8, 259 | "metadata": {}, 260 | "outputs": [], 261 | "source": [ 262 | "#get all words from spam and non-spam datasets\n", 263 | "train_spam_words = ' '.join(train_spam_df[train_spam_df.spam == True].text).split(' ')\n", 264 | "train_non_spam_words = ' '.join(train_spam_df[train_spam_df.spam == False].text).split(' ')\n", 265 | "\n", 266 | "common_words = set(train_spam_words).intersection(set(train_non_spam_words))" 267 | ] 268 | }, 269 | { 270 | "cell_type": "code", 271 | "execution_count": 9, 272 | "metadata": {}, 273 | "outputs": [], 274 | "source": [ 275 | "train_spam_bow = dict()\n", 276 | "for w in common_words:\n", 277 | " train_spam_bow[w] = train_spam_words.count(w) / len(train_spam_words)" 278 | ] 279 | }, 280 | { 281 | "cell_type": "code", 282 | "execution_count": 10, 283 | "metadata": {}, 284 | "outputs": [], 285 | "source": [ 286 | "train_non_spam_bow = dict()\n", 287 | "for w in common_words:\n", 288 | " train_non_spam_bow[w] = train_non_spam_words.count(w) / len(train_non_spam_words)" 289 | ] 290 | }, 291 | { 292 | "cell_type": "markdown", 293 | "metadata": {}, 294 | "source": [ 295 | "# Predict on Test Set" 296 | ] 297 | }, 298 | { 299 | "cell_type": "markdown", 300 | "metadata": {}, 301 | "source": [ 302 | "# $ P(\\text{SPAM} | \\text{\"urgent please call this number\"}) $\n", 303 | "# $\\propto P(\\text{\"urgent please call this number\"} | \\text{SPAM}) \\times P(\\text{SPAM}) $\n", 304 | "# $= P(\\text{\"urgent\"} | \\text{SPAM}) \\times P(\\text{\"please\"} | \\text{SPAM}) \\times \\dots \\times P(\\text{SPAM})$" 305 | ] 306 | }, 307 | { 308 | "cell_type": "markdown", 309 | "metadata": {}, 310 | "source": [ 311 | "# Due to numerical issues, equivalently compute:\n", 312 | "\n", 313 | "# $log(P(\\text{\"urgent\"} | \\text{SPAM}) \\times P(\\text{\"please\"} | \\text{SPAM}) \\times \\dots \\times P(\\text{SPAM}))$\n", 314 | "# $ = log(P(\\text{\"urgent\"} | \\text{SPAM})) + log(P(\\text{\"please\"} | 
\\text{SPAM})) + \\dots + log(P(\\text{SPAM}))$" 315 | ] 316 | }, 317 | { 318 | "cell_type": "code", 319 | "execution_count": 11, 320 | "metadata": {}, 321 | "outputs": [], 322 | "source": [ 323 | "def predict_text(t, verbose=False):\n", 324 | " #if some word doesnt appear in either spam or non-spam BOW, disregard it\n", 325 | " valid_words = [w for w in t if w in train_spam_bow]\n", 326 | " \n", 327 | " #get the probabilities of each valid word showing up in spam and non-spam BOW\n", 328 | " spam_probs = [train_spam_bow[w] for w in valid_words]\n", 329 | " non_spam_probs = [train_non_spam_bow[w] for w in valid_words]\n", 330 | " \n", 331 | " #print probs if requested\n", 332 | " if verbose:\n", 333 | " data_df = pd.DataFrame()\n", 334 | " data_df['word'] = valid_words\n", 335 | " data_df['spam_prob'] = spam_probs\n", 336 | " data_df['non_spam_prob'] = non_spam_probs\n", 337 | " data_df['ratio'] = [s/n if n > 0 else np.inf for s,n in zip(spam_probs, non_spam_probs)]\n", 338 | " print(data_df)\n", 339 | " \n", 340 | " #calculate spam score as sum of logs for all probabilities\n", 341 | " spam_score = sum([np.log(p) for p in spam_probs]) + np.log(FRAC_SPAM_TEXTS)\n", 342 | " \n", 343 | " #calculate non-spam score as sum of logs for all probabilities\n", 344 | " non_spam_score = sum([np.log(p) for p in non_spam_probs]) + np.log(1-FRAC_SPAM_TEXTS)\n", 345 | " \n", 346 | " #if verbose, report the two scores\n", 347 | " if verbose:\n", 348 | " print('Spam Score: %s'%spam_score)\n", 349 | " print('Non-Spam Score: %s'%non_spam_score)\n", 350 | " \n", 351 | " #if spam score is higher, mark this as spam\n", 352 | " return (spam_score >= non_spam_score)" 353 | ] 354 | }, 355 | { 356 | "cell_type": "code", 357 | "execution_count": 12, 358 | "metadata": {}, 359 | "outputs": [ 360 | { 361 | "name": "stdout", 362 | "output_type": "stream", 363 | "text": [ 364 | " word spam_prob non_spam_prob ratio\n", 365 | "0 urgent 0.003879 0.000021 188.634600\n", 366 | "1 call 0.018929 
0.003311 5.717620\n", 367 | "2 this 0.005275 0.003537 1.491529\n", 368 | "3 number 0.001629 0.001049 1.553461\n", 369 | "Spam Score: -23.16448028206801\n", 370 | "Non-Spam Score: -29.15569965721826\n" 371 | ] 372 | }, 373 | { 374 | "data": { 375 | "text/plain": [ 376 | "True" 377 | ] 378 | }, 379 | "execution_count": 12, 380 | "metadata": {}, 381 | "output_type": "execute_result" 382 | } 383 | ], 384 | "source": [ 385 | "predict_text('urgent call this number'.split(), verbose=True)" 386 | ] 387 | }, 388 | { 389 | "cell_type": "code", 390 | "execution_count": 13, 391 | "metadata": {}, 392 | "outputs": [ 393 | { 394 | "name": "stdout", 395 | "output_type": "stream", 396 | "text": [ 397 | " word spam_prob non_spam_prob ratio\n", 398 | "0 hey 0.000310 0.001522 0.203929\n", 399 | "1 do 0.001241 0.005223 0.237650\n", 400 | "2 you 0.016447 0.025992 0.632762\n", 401 | "3 want 0.001474 0.002365 0.623314\n", 402 | "4 to 0.039488 0.022311 1.769862\n", 403 | "5 go 0.001474 0.003619 0.407279\n", 404 | "6 a 0.021567 0.014950 1.442653\n", 405 | "7 movie 0.000155 0.000288 0.538956\n", 406 | "8 tonight 0.000078 0.000864 0.089826\n", 407 | "Spam Score: -59.20117350818938\n", 408 | "Non-Spam Score: -50.42256289899204\n" 409 | ] 410 | }, 411 | { 412 | "data": { 413 | "text/plain": [ 414 | "False" 415 | ] 416 | }, 417 | "execution_count": 13, 418 | "metadata": {}, 419 | "output_type": "execute_result" 420 | } 421 | ], 422 | "source": [ 423 | "predict_text('hey do you want to go a movie tonight'.split(), verbose=True)" 424 | ] 425 | }, 426 | { 427 | "cell_type": "code", 428 | "execution_count": 14, 429 | "metadata": {}, 430 | "outputs": [ 431 | { 432 | "name": "stdout", 433 | "output_type": "stream", 434 | "text": [ 435 | " word spam_prob non_spam_prob ratio\n", 436 | "0 offer 0.001552 0.000144 10.779120\n", 437 | "1 for 0.010939 0.007197 1.519856\n", 438 | "2 unlimited 0.000698 0.000062 11.318076\n", 439 | "3 money 0.000078 0.000802 0.096736\n", 440 | "4 call 0.018929 0.003311 
5.717620\n", 441 | "5 now 0.010784 0.003989 2.703114\n", 442 | "Spam Score: -38.192756947863074\n", 443 | "Non-Spam Score: -41.98513617615185\n" 444 | ] 445 | }, 446 | { 447 | "data": { 448 | "text/plain": [ 449 | "True" 450 | ] 451 | }, 452 | "execution_count": 14, 453 | "metadata": {}, 454 | "output_type": "execute_result" 455 | } 456 | ], 457 | "source": [ 458 | "predict_text('offer for unlimited money call now'.split(), verbose=True)" 459 | ] 460 | }, 461 | { 462 | "cell_type": "code", 463 | "execution_count": 15, 464 | "metadata": {}, 465 | "outputs": [ 466 | { 467 | "name": "stdout", 468 | "output_type": "stream", 469 | "text": [ 470 | " word spam_prob non_spam_prob ratio\n", 471 | "0 are 0.004422 0.006005 0.736450\n", 472 | "1 you 0.016447 0.025992 0.632762\n", 473 | "2 at 0.001396 0.005778 0.241667\n", 474 | "3 class 0.000155 0.000658 0.235793\n", 475 | "4 yet 0.000078 0.000576 0.134739\n", 476 | "Spam Score: -36.318752270659445\n", 477 | "Non-Spam Score: -28.85333457585457\n" 478 | ] 479 | }, 480 | { 481 | "data": { 482 | "text/plain": [ 483 | "False" 484 | ] 485 | }, 486 | "execution_count": 15, 487 | "metadata": {}, 488 | "output_type": "execute_result" 489 | } 490 | ], 491 | "source": [ 492 | "predict_text('are you at class yet'.split(), verbose=True)" 493 | ] 494 | }, 495 | { 496 | "cell_type": "code", 497 | "execution_count": 16, 498 | "metadata": {}, 499 | "outputs": [], 500 | "source": [ 501 | "predictions = test_spam_df.text.apply(lambda t: predict_text(t.split()))" 502 | ] 503 | }, 504 | { 505 | "cell_type": "code", 506 | "execution_count": 17, 507 | "metadata": {}, 508 | "outputs": [ 509 | { 510 | "name": "stdout", 511 | "output_type": "stream", 512 | "text": [ 513 | "Fraction Spam Correctly Detected: 0.9234449760765551\n" 514 | ] 515 | } 516 | ], 517 | "source": [ 518 | "frac_spam_messages_correctly_detected = np.sum((predictions == True) & (test_spam_df.spam == True)) / np.sum(test_spam_df.spam == True)\n", 519 | "print('Fraction Spam Correctly 
Detected: %s'%frac_spam_messages_correctly_detected)" 520 | ] 521 | }, 522 | { 523 | "cell_type": "code", 524 | "execution_count": 18, 525 | "metadata": {}, 526 | "outputs": [ 527 | { 528 | "name": "stdout", 529 | "output_type": "stream", 530 | "text": [ 531 | "Fraction Valid Messages Sent to Spam: 0.02323991797676008\n" 532 | ] 533 | } 534 | ], 535 | "source": [ 536 | "frac_valid_sent_to_spam = np.sum((predictions == True) & (test_spam_df.spam == False)) / np.sum(test_spam_df.spam == False)\n", 537 | "print('Fraction Valid Messages Sent to Spam: %s'%frac_valid_sent_to_spam)" 538 | ] 539 | }, 540 | { 541 | "cell_type": "code", 542 | "execution_count": null, 543 | "metadata": {}, 544 | "outputs": [], 545 | "source": [] 546 | } 547 | ], 548 | "metadata": { 549 | "kernelspec": { 550 | "display_name": "Python 3", 551 | "language": "python", 552 | "name": "python3" 553 | }, 554 | "language_info": { 555 | "codemirror_mode": { 556 | "name": "ipython", 557 | "version": 3 558 | }, 559 | "file_extension": ".py", 560 | "mimetype": "text/x-python", 561 | "name": "python", 562 | "nbconvert_exporter": "python", 563 | "pygments_lexer": "ipython3", 564 | "version": "3.7.7" 565 | } 566 | }, 567 | "nbformat": 4, 568 | "nbformat_minor": 4 569 | } 570 | -------------------------------------------------------------------------------- /UnsubscribeBot.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "#Imports\n", 10 | "\n", 11 | "from __future__ import print_function\n", 12 | "\n", 13 | "from random import random\n", 14 | "from datetime import datetime\n", 15 | "import sys\n", 16 | "import re\n", 17 | "import webbrowser\n", 18 | "import time\n", 19 | "\n", 20 | "import os\n", 21 | "import pickle\n", 22 | "import google.oauth2.credentials\n", 23 | "from email.mime.text import MIMEText\n", 24 | "import base64\n", 
25 | "\n", 26 | "from googleapiclient.discovery import build\n", 27 | "from googleapiclient.errors import HttpError\n", 28 | "from google_auth_oauthlib.flow import InstalledAppFlow\n", 29 | "from google.auth.transport.requests import Request\n", 30 | "from google.auth.transport.requests import Request\n", 31 | "from google.oauth2.credentials import Credentials" 32 | ] 33 | }, 34 | { 35 | "cell_type": "code", 36 | "execution_count": null, 37 | "metadata": {}, 38 | "outputs": [], 39 | "source": [ 40 | "#register chrome webbrowser\n", 41 | "webbrowser.register('chrome', None, webbrowser.BackgroundBrowser(\"C://Program Files (x86)//Google//Chrome//Application//chrome.exe\"))" 42 | ] 43 | }, 44 | { 45 | "cell_type": "code", 46 | "execution_count": null, 47 | "metadata": {}, 48 | "outputs": [], 49 | "source": [ 50 | "#read-only access to our Gmail messages is all this bot needs\n", 51 | "SCOPES = [\n", 52 | " 'https://www.googleapis.com/auth/gmail.readonly'\n", 53 | "]" 54 | ] 55 | }, 56 | { 57 | "cell_type": "code", 58 | "execution_count": null, 59 | "metadata": {}, 60 | "outputs": [], 61 | "source": [ 62 | "def get_authenticated_gmail_service():\n", 63 | " \"\"\"\n", 64 | " This function uses the token.json file to get an authenticated object used to talk to Gmail\n", 65 | " \"\"\"\n", 66 | "\n", 67 | " creds = None\n", 68 | " API_SERVICE_NAME = 'gmail'\n", 69 | " API_VERSION = 'v1'\n", 70 | " \n", 71 | " if os.path.exists('token.json'):\n", 72 | " creds = Credentials.from_authorized_user_file('token.json', SCOPES)\n", 73 | " \n", 74 | " if not creds or not creds.valid:\n", 75 | " if creds and creds.expired and creds.refresh_token:\n", 76 | " creds.refresh(Request())\n", 77 | " else:\n", 78 | " flow = InstalledAppFlow.from_client_secrets_file(\n", 79 | " 'credentials.json', \n", 80 | " SCOPES\n", 81 | " )\n", 82 | " creds = flow.run_local_server(port=0)\n", 83 | " # Save the credentials as JSON so the token.json check above finds them on the next run\n", 84 | " with open('token.json', 'w') as token:\n", 85 | " token.write(creds.to_json())\n", 86 | "\n", 87 | " return build(API_SERVICE_NAME, API_VERSION, credentials=creds)" 88 | ] 89 | }, 90 | { 91 | "cell_type": "code", 92 | "execution_count": null, 93 | "metadata": {}, 94 | "outputs": [], 95 | "source": [ 96 | "#get the service\n", 97 | "service = get_authenticated_gmail_service() " 98 | ] 99 | }, 100 | { 101 | "cell_type": "code", 102 | "execution_count": null, 103 | "metadata": {}, 104 | "outputs": [], 105 | "source": [ 106 | "#get me last 3mo of messages which are in promotions folder and \n", 107 | "#which contain \"unsubscribe\" as an option\n", 108 | "result = service.users().messages().list(\n", 109 | " userId='me', \n", 110 | " q='\"unsubscribe\" AND category:promotions AND newer_than:90d'\n", 111 | ").execute()" 112 | ] 113 | }, 114 | { 115 | "cell_type": "code", 116 | "execution_count": null, 117 | "metadata": { 118 | "scrolled": false 119 | }, 120 | "outputs": [], 121 | "source": [ 122 | "#number of links we've opened\n", 123 | "curr_open = 0\n", 124 | "\n", 125 | "#avoid opening duplicate links\n", 126 | "seen_urls = set()\n", 127 | "\n", 128 | "#for each message (an empty result has no 'messages' key)...\n", 129 | "for message in result.get('messages', []):\n", 130 | " #get the id\n", 131 | " message_id = message.get('id')\n", 132 | " \n", 133 | " #get all info about that message\n", 134 | " message_info = service.users().messages().get(userId='me', id=message_id).execute()\n", 135 | " \n", 136 | " #for each header...\n", 137 | " for h in message_info.get('payload').get('headers'):\n", 138 | " #look out for the List-Unsubscribe header\n", 139 | " if h.get('name') == 'List-Unsubscribe':\n", 140 | " #get this url\n", 141 | " url = re.findall(r'\\<(.*?)\\>', h.get('value'))[0]\n", 142 | " base = url #fallback dedupe key when no known extension matches\n", 143 | " #find first url with any of these extensions\n", 144 | " for ext in ['.com', '.edu', '.net', '.gov']:\n", 145 | " base_url = re.findall(f'(.*?{re.escape(ext)})', url)\n", 146 | " if len(base_url) > 0:\n", 147 | " base = base_url[0]\n", 148 | " break\n", 149 | " \n", 150 | " #open the url in chrome\n", 151 | " 
if url[:4] == 'http' and base not in seen_urls:\n", 152 | " webbrowser.get('chrome').open(url)\n", 153 | " seen_urls.add(base)\n", 154 | " curr_open += 1\n", 155 | "\n", 156 | " #can open max of 3 at a time so we don't spam the browser\n", 157 | " if curr_open == 3:\n", 158 | " answer = input('opened 3 tabs. type \"yes\" to continue. type \"no\" to stop. ')\n", 159 | " if answer == \"yes\":\n", 160 | " curr_open = 0\n", 161 | " else:\n", 162 | " break" 163 | ] 164 | }, 165 | { 166 | "cell_type": "code", 167 | "execution_count": null, 168 | "metadata": {}, 169 | "outputs": [], 170 | "source": [] 171 | }, 172 | { 173 | "cell_type": "code", 174 | "execution_count": null, 175 | "metadata": {}, 176 | "outputs": [], 177 | "source": [] 178 | }, 179 | { 180 | "cell_type": "code", 181 | "execution_count": null, 182 | "metadata": {}, 183 | "outputs": [], 184 | "source": [] 185 | } 186 | ], 187 | "metadata": { 188 | "kernelspec": { 189 | "display_name": "Python 3", 190 | "language": "python", 191 | "name": "python3" 192 | }, 193 | "language_info": { 194 | "codemirror_mode": { 195 | "name": "ipython", 196 | "version": 3 197 | }, 198 | "file_extension": ".py", 199 | "mimetype": "text/x-python", 200 | "name": "python", 201 | "nbconvert_exporter": "python", 202 | "pygments_lexer": "ipython3", 203 | "version": "3.7.7" 204 | } 205 | }, 206 | "nbformat": 4, 207 | "nbformat_minor": 4 208 | } 209 | -------------------------------------------------------------------------------- /YouTubeAgent - Driver.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "id": "3c77be67", 7 | "metadata": {}, 8 | "outputs": [], 9 | "source": [ 10 | "from youtube_agent_utils import (\n", 11 | " OPEN_AI_API_KEY,\n", 12 | " get_authenticated_youtube_api,\n", 13 | " get_views_snippet,\n", 14 | " update_video_title,\n", 15 | " get_last_n_videos_with_views,\n", 16 | " get_openai_client,\n", 
17 | " chat,\n", 18 | " get_messages\n", 19 | ")" 20 | ] 21 | }, 22 | { 23 | "cell_type": "code", 24 | "execution_count": null, 25 | "id": "572a2b2d", 26 | "metadata": {}, 27 | "outputs": [], 28 | "source": [ 29 | "client = get_openai_client()" 30 | ] 31 | }, 32 | { 33 | "cell_type": "code", 34 | "execution_count": null, 35 | "id": "3bd37f9a", 36 | "metadata": {}, 37 | "outputs": [], 38 | "source": [ 39 | "youtube = get_authenticated_youtube_api()\n", 40 | "last_n = 20\n", 41 | "last_n_videos = get_last_n_videos_with_views(youtube, last_n)\n", 42 | "VIDEO_ID = 'SET_YOUTUBE_VIDEO_ID_HERE'" 43 | ] 44 | }, 45 | { 46 | "cell_type": "code", 47 | "execution_count": null, 48 | "id": "b7e80938", 49 | "metadata": {}, 50 | "outputs": [], 51 | "source": [ 52 | "from flask import Flask, render_template_string, request\n", 53 | "import json\n", 54 | "\n", 55 | "app = Flask(__name__)\n", 56 | "\n", 57 | "# HTML template\n", 58 | "HTML = \"\"\"\n", 59 | "\n", 60 | "\n", 61 | " \n", 62 | " Flask Text Processing\n", 63 | " \n", 91 | " \n", 92 | " \n", 93 | "

Text Processor

\n", 94 | "
\n", 95 | " \n", 96 | " \n", 97 | " \n", 98 | "
\n", 99 | " {% if new_title and reasoning and url%}\n", 100 | "

New Title:

\n", 101 | "

{{ new_title }}

\n", 102 | "

Reasoning:

\n", 103 | "

{{ reasoning }}

\n", 104 | "

Video URL:

\n", 105 | "

{{ url }}

\n", 106 | " {% endif %}\n", 107 | " \n", 108 | "\n", 109 | "\"\"\"" 110 | ] 111 | }, 112 | { 113 | "cell_type": "code", 114 | "execution_count": null, 115 | "id": "c0e0dd26", 116 | "metadata": {}, 117 | "outputs": [], 118 | "source": [ 119 | "@app.route(\"/\", methods=[\"GET\", \"POST\"])\n", 120 | "def home():\n", 121 | " desc = \"i am making a video on martingales\"\n", 122 | " output = None\n", 123 | " new_title = None\n", 124 | " reasoning = None\n", 125 | " url = None\n", 126 | " if request.method == \"POST\":\n", 127 | " # Get the input text from the form\n", 128 | " desc = request.form.get(\"desc\")\n", 129 | " # Call your function f with the input text\n", 130 | " messages = get_messages(last_n_videos, desc)\n", 131 | " output = chat(client, messages)\n", 132 | " output = json.loads(output)\n", 133 | " new_title = output['new_title']\n", 134 | " reasoning = output['reasoning']\n", 135 | " update_video_title(youtube, VIDEO_ID, new_title)\n", 136 | " url = f'https://www.youtube.com/watch?v={VIDEO_ID}'\n", 137 | " # Render the page with the current_title and output\n", 138 | " return render_template_string(HTML, desc=desc, new_title=new_title, reasoning=reasoning, url=url)\n", 139 | "\n", 140 | "if __name__ == \"__main__\":\n", 141 | " app.run(host=\"0.0.0.0\", port=5001)" 142 | ] 143 | }, 144 | { 145 | "cell_type": "code", 146 | "execution_count": null, 147 | "id": "ffa43871", 148 | "metadata": {}, 149 | "outputs": [], 150 | "source": [] 151 | } 152 | ], 153 | "metadata": { 154 | "kernelspec": { 155 | "display_name": "Python 3 (ipykernel)", 156 | "language": "python", 157 | "name": "python3" 158 | }, 159 | "language_info": { 160 | "codemirror_mode": { 161 | "name": "ipython", 162 | "version": 3 163 | }, 164 | "file_extension": ".py", 165 | "mimetype": "text/x-python", 166 | "name": "python", 167 | "nbconvert_exporter": "python", 168 | "pygments_lexer": "ipython3", 169 | "version": "3.9.13" 170 | } 171 | }, 172 | "nbformat": 4, 173 | "nbformat_minor": 5 174 | 
} 175 | -------------------------------------------------------------------------------- /fish.csv: -------------------------------------------------------------------------------- 1 | length,weight,type 2 | 8.61838093097377,7.848030155527202,tuna 3 | 2.5220458519064666,8.762507114511225,tuna 4 | 2.035690396609505,7.101971486616695,tuna 5 | 6.695007803469322,4.9539323163306905,salmon 6 | 9.834069442655217,7.908453150379013,tuna 7 | 5.825596978904972,4.379918143796759,tuna 8 | 0.903503285616254,0.19049640876650176,tuna 9 | 6.585882012007895,0.7330121724269822,salmon 10 | 4.713703234269383,0.7603483258106902,salmon 11 | 1.7006764892445585,2.515498567973005,tuna 12 | 4.2960675292027934,8.975219469270025,salmon 13 | 1.0787286858722611,4.916033452175656,tuna 14 | 6.0338495366594085,5.104907041636415,salmon 15 | 7.3043997937470575,5.593652182181923,tuna 16 | 1.2340792813167922,2.1565615541830683,tuna 17 | 6.929397432897288,8.441940609708556,tuna 18 | 9.0112758712544,0.07974997845336618,salmon 19 | 3.768340409513324,1.67249539961277,salmon 20 | 9.59873506057171,2.274392158059013,salmon 21 | 0.7797263037420199,6.7648398939995715,tuna 22 | 4.944697649629248,3.8554835086854,salmon 23 | 7.179219824595374,3.2038430618266034,salmon 24 | 5.393076591233015,0.23266380035032674,salmon 25 | 2.722230086826837,5.284084477986447,tuna 26 | 0.9378201508037298,2.575116194999957,tuna 27 | 5.006217342870363,4.716443006733946,tuna 28 | 6.049061380386817,6.03996014832049,salmon 29 | 9.256297755985036,3.31333096727822,salmon 30 | 6.208459382346439,7.981338142328081,tuna 31 | 5.650078870657326,6.862045424809363,tuna 32 | 1.1372831658996632,3.7005309279064247,tuna 33 | 8.532210940169753,6.594492818499802,tuna 34 | 2.090439124223882,1.3387424359149769,tuna 35 | 8.009731229932374,9.584625706392893,tuna 36 | 4.416185744693878,2.2038058756970886,salmon 37 | 4.8412530511639496,1.4446183464261877,salmon 38 | 4.121144532030703,1.3305612057265992,salmon 39 | 9.212326047188569,3.1506512699177147,salmon 
40 | 2.9316973616878403,8.90198766933183,tuna 41 | 7.941095934252679,5.0774937472823725,tuna 42 | 5.127781745463607,4.854670906190943,tuna 43 | 2.55728096584108,8.041530914519946,tuna 44 | 0.4737064061477736,1.436225946549332,tuna 45 | 3.6049378225655855,0.7643757817863517,salmon 46 | 7.615697535214528,7.355028935812999,tuna 47 | 6.525532887775773,0.6571414376014317,salmon 48 | 8.508103317401499,8.362265959590857,tuna 49 | 8.877108246528369,1.6509343014192568,salmon 50 | 0.7593519696369244,3.17161094368371,tuna 51 | 1.9112549458496544,3.564887558122849,tuna 52 | 4.816652526078854,9.923334963110856,salmon 53 | 6.6279241924992895,9.013021963405649,tuna 54 | 2.542103651778599,1.753844373947604,tuna 55 | 8.364243680502506,8.466427824825951,tuna 56 | 4.642332657268433,2.3360557603054932,salmon 57 | 2.568864583224712,2.8740884793653536,tuna 58 | 5.247147342603377,0.928975248188384,salmon 59 | 6.071650261984285,0.26938623086380065,salmon 60 | 7.9836913957357085,1.944765337134172,salmon 61 | 4.857567541405051,8.38948488585803,salmon 62 | 9.411066269746179,4.288695171401482,tuna 63 | 9.400684442392432,0.09540465639656492,salmon 64 | 0.3846554809153269,0.6835599245261614,tuna 65 | 9.943213043508292,0.5642105862473634,salmon 66 | 8.227493693537772,5.352803956819933,tuna 67 | 8.868332824705162,7.310504319223868,tuna 68 | 8.768391921740088,1.9850026008502841,salmon 69 | 1.2238316130255422,1.3896067995862849,tuna 70 | 2.54382754755702,0.196040448710737,tuna 71 | 0.7349296738410283,5.951490286292441,tuna 72 | 7.050772779040444,1.1057377514197977,salmon 73 | 1.5131863509804588,5.879111000757227,tuna 74 | 2.59608902939253,9.549490008962467,tuna 75 | 4.660287567732258,7.63809363159825,salmon 76 | 3.4617930300397064,5.167379057060017,tuna 77 | 5.016603607911812,8.905080341921193,tuna 78 | 4.165364613378195,5.266291303051527,tuna 79 | 4.638144344026407,4.152609477191503,tuna 80 | 0.9423014536239682,7.399150717348016,tuna 81 | 7.18434426436739,4.1541630680322275,tuna 82 | 
8.952799948249883,4.569736417241899,tuna 83 | 9.31988367740768,8.292350506174246,tuna 84 | 6.906213455458436,7.202738324847712,tuna 85 | 5.322326965132027,4.4640955705897465,tuna 86 | 6.080007786412118,9.622395393187043,tuna 87 | 2.653827684847295,2.2727409836404706,tuna 88 | 6.530518781025638,2.58570000176223,salmon 89 | 5.812002624265259,2.8651278464764163,salmon 90 | 3.3223424703828597,9.833876846052037,salmon 91 | 8.690956613034858,2.7496885762274745,salmon 92 | 9.718266214527238,4.980111723287091,tuna 93 | 0.7593390383398246,8.130925278777156,tuna 94 | 5.460406753503786,9.066464051557578,tuna 95 | 4.161305269287391,1.4847596027799326,salmon 96 | 4.296755997995123,7.990513705692128,salmon 97 | 6.949272270598612,0.23699776297617106,salmon 98 | 9.348163415342572,6.88960583257645,tuna 99 | 2.9160895853184265,1.2462438848294377,tuna 100 | 8.956621919098186,1.966543407795014,salmon 101 | 8.240011762106082,9.640394238256002,tuna 102 | 9.122630362085424,3.762983439729109,salmon 103 | 3.9266312357014255,5.452709888482644,tuna 104 | 5.781505547070206,5.289945543810653,tuna 105 | 1.9625935386079367,0.6624195966514335,tuna 106 | 7.895540758063228,6.484985170181289,tuna 107 | 8.471086818031559,4.255339949444489,tuna 108 | 1.929279533134316,7.947647453639746,tuna 109 | 1.982799695442864,6.118409539142848,tuna 110 | 6.448858192798209,9.016061221101744,tuna 111 | 1.9544484368492647,3.3943473235422594,tuna 112 | 4.934202174692216,1.2380456711778245,salmon 113 | 9.544497887331907,9.75190898264452,tuna 114 | 7.815901215593424,9.277642080017166,tuna 115 | 7.5284711190706375,2.75698879739383,salmon 116 | 8.561270018484965,3.57373184186536,salmon 117 | 4.488836775474347,5.8162055981467224,tuna 118 | 1.450779490912104,4.218956669313489,tuna 119 | 5.128659289245124,0.4711261384130516,salmon 120 | 7.894376107520522,9.208820224450509,tuna 121 | 2.396756559784201,2.067677202409055,tuna 122 | 2.4668285470500733,1.5010312622108857,tuna 123 | 1.203388103945542,0.3629026386133127,tuna 124 | 
0.2628858225109509,6.631139859898528,tuna 125 | 1.7779384735396409,4.825397783493325,tuna 126 | 8.062886580024777,0.3404466914750881,salmon 127 | 6.19293190893772,5.0642024836796615,salmon 128 | 9.222661558240633,3.089187282922712,salmon 129 | 0.020612676953580825,5.350227687899217,tuna 130 | 9.320511705922234,8.173658921621929,tuna 131 | 3.3480246700826166,3.5406886531843726,salmon 132 | 0.006545662636546767,9.907586679874546,tuna 133 | 3.2626928270528452,5.878282139148246,tuna 134 | 5.044448306480466,9.11773370366416,tuna 135 | 4.477510373947345,6.666272302370802,tuna 136 | 5.822375704539416,0.28452198004285445,salmon 137 | 3.552107439877629,3.853155959138274,salmon 138 | 9.689447801720384,9.130070837010539,tuna 139 | 2.807188885972576,5.976845677491827,tuna 140 | 1.6631024710244169,4.8398063851306174,tuna 141 | 9.053114751371725,2.5433363082937754,salmon 142 | 3.3933560575413035,0.6847582685517549,salmon 143 | 5.97166067127568,2.407310609407779,salmon 144 | 1.5509832267826285,5.016227306658958,tuna 145 | 7.19065649420372,3.895313863117448,salmon 146 | 5.429037812062735,3.1371569722876287,salmon 147 | 3.6612518901465982,0.9296887903071038,salmon 148 | 0.2745434386776757,3.200546779755038,tuna 149 | 9.00948432955258,2.9645912361643885,salmon 150 | 3.7999625802146135,2.4116649906551757,salmon 151 | 8.297345032328677,4.35006311445432,tuna 152 | 6.657725954330107,0.8647348832414836,salmon 153 | 5.865848186007595,6.218246927605216,tuna 154 | 2.506287338092652,1.0998986043125314,tuna 155 | 7.89745927913918,5.1641123847781785,tuna 156 | 2.0881464616429057,7.558345050626568,tuna 157 | 8.614960311880097,1.4630259744061258,salmon 158 | 6.77787131834362,7.034927233121269,tuna 159 | 0.5171290677736429,5.144014938749304,tuna 160 | 7.287640867031128,1.3919508017008686,salmon 161 | 3.714485165893864,3.920680827509553,salmon 162 | 4.4179198106634585,7.8688691666827975,salmon 163 | 2.081958511445288,1.3037019512807946,tuna 164 | 0.2592452738648843,1.6614079344038946,tuna 165 | 
8.800880065496667,7.238084777699338,tuna 166 | 7.545472452574358,3.6510916051567257,salmon 167 | 7.960591803122853,5.66143480531737,tuna 168 | 4.332045886510324,2.951087956297456,salmon 169 | 7.745321306102184,3.859463706701373,salmon 170 | 7.868052575175713,5.525714914832776,tuna 171 | 2.2639849752691643,2.6487218176520075,tuna 172 | 6.509616463509076,4.640435707324777,salmon 173 | 6.471925246872412,6.00380833170544,salmon 174 | 3.731909868123777,6.2728492135930525,tuna 175 | 1.1072373155096416,4.619100039113987,tuna 176 | 2.011359126166096,8.12406326634969,tuna 177 | 7.8276131197058065,2.565913061622505,salmon 178 | 9.757928481622583,1.9839576446257927,salmon 179 | 2.810709428930659,1.128033659073412,tuna 180 | 0.5951448293008388,9.034801870004843,tuna 181 | 8.924080896256486,9.582338809259317,tuna 182 | 1.69575295036412,3.9400791302315135,tuna 183 | 8.263097362117362,4.097150091379731,tuna 184 | 6.588690592037464,7.091688265227263,tuna 185 | 8.725194279704139,4.656156166137121,tuna 186 | 3.0075171530263467,2.2694190867579147,salmon 187 | 9.943314408311153,6.4195537329788435,tuna 188 | 4.725364705759297,4.452075266206334,tuna 189 | 7.65861384403427,1.0992014552541862,salmon 190 | 0.8688533554334887,3.5773176713646624,tuna 191 | 7.0136333670484525,8.296496537112727,tuna 192 | 2.17060233509214,5.663455581531959,tuna 193 | 5.045253485256131,1.3988342101136728,salmon 194 | 7.824254220430458,2.3923433385130632,salmon 195 | 3.5180136741742336,6.1590927528039785,tuna 196 | 5.400091438144744,4.588735472819913,tuna 197 | 7.933019237851406,1.5548545170059302,salmon 198 | 2.692830028910179,3.6821729121442366,tuna 199 | 4.488270774365553,0.17442152063577132,salmon 200 | 2.320533113956503,7.124770128204478,tuna 201 | 4.748170093176438,9.966937675345664,salmon 202 | 3.099526541711106,8.362631363403842,salmon 203 | 0.9258623337995332,9.46049556334433,tuna 204 | 8.903756774678289,2.8834768871665206,salmon 205 | 6.000443258841674,7.4292944147605535,tuna 206 | 
3.3439069423588696,8.792248391849974,salmon 207 | 1.1170112176339528,3.633924570889752,tuna 208 | 5.250633859412028,6.312779066220482,tuna 209 | 9.99334782175023,5.536532141656823,tuna 210 | 8.00132290837515,0.7542539798831549,salmon 211 | 8.334859511382827,8.021699334112588,tuna 212 | 6.428889810495738,2.3864531375892017,salmon 213 | 3.7844272017315514,5.782382752400332,tuna 214 | 6.29870258074729,6.393791241735619,salmon 215 | 1.981045272137792,2.6199614078410955,tuna 216 | 6.689904066319281,3.4529814463009325,salmon 217 | 7.032855235817922,8.611235707076046,tuna 218 | 7.976437637075803,8.095738315965805,tuna 219 | 5.761089170566459,3.1981634520679187,salmon 220 | 0.3581432206419499,5.701341855641775,tuna 221 | 8.462794775281383,0.06093882142623741,salmon 222 | 4.345721749737371,2.242471861029941,salmon 223 | 7.34466960514883,2.2133112408467017,salmon 224 | 1.7881807110353476,3.3467302838551527,tuna 225 | 7.346680384464784,6.5072265812417776,tuna 226 | 3.053616808836741,1.4618163167991416,salmon 227 | 9.402153277100094,1.161405989736467,salmon 228 | 7.767198058417982,4.360535994118624,tuna 229 | 0.9795356597840844,4.195457003936065,tuna 230 | 3.517267003555075,0.3403916642708571,salmon 231 | 8.410926376592629,8.564192712199619,tuna 232 | 8.332971144796014,2.613026354761958,salmon 233 | 5.1496549930693,7.476723571325492,tuna 234 | 9.705991153810276,6.901097193888023,tuna 235 | 0.06800804455299181,4.873046068089819,tuna 236 | 6.307099177088639,7.173557555577519,tuna 237 | 9.441960945487814,5.640842202652276,tuna 238 | 5.180968415802536,7.845667091696281,tuna 239 | 6.953332096189008,6.137138915474367,salmon 240 | 4.029553642024951,4.9475377962766505,tuna 241 | 4.8812551118261895,8.169978625961816,salmon 242 | 1.7005964962850049,4.677519255507923,tuna 243 | 8.819796859178606,6.058430087690202,tuna 244 | 3.091047524641357,9.769258049265963,salmon 245 | 8.25572654711742,9.591362601629351,tuna 246 | 6.592049336432941,2.4779486843012,salmon 247 | 
5.373784452375292,0.4365593228892661,salmon 248 | 1.1343064717932885,5.252208781612653,tuna 249 | 9.81984816808049,1.6381949895368908,salmon 250 | 9.77254588297191,7.047326490691414,tuna 251 | 4.369437652302638,2.8176279052537057,salmon 252 | 9.806374942639222,7.433297288201616,tuna 253 | 2.4015498282655745,4.667137603192928,tuna 254 | 8.1111727083543,6.821240474801041,tuna 255 | 0.7382548719324644,1.37410777227698,tuna 256 | 8.542609273850298,9.252642097839287,tuna 257 | 8.409210560142293,7.742495503186513,tuna 258 | 5.3653481580107805,7.278471300699296,tuna 259 | 4.971515104247262,9.057443387836935,salmon 260 | 2.7429151362188344,6.911959680250328,tuna 261 | 6.923338112699707,3.2319961854621737,salmon 262 | 4.830003970168226,8.49033188110255,salmon 263 | 6.724969139123932,6.630079000536041,salmon 264 | 1.274057573586885,8.430941403528879,tuna 265 | 8.631196775562946,7.3570848765774715,tuna 266 | 3.0899328630679634,9.631423276419133,salmon 267 | 5.179824960843095,6.077746515436325,tuna 268 | 6.883385502660804,5.176627711650447,salmon 269 | 2.2782016821029236,1.236103117684885,tuna 270 | 4.4617092597445565,4.4892032017098895,tuna 271 | 2.9690113044199866,0.13918678446695587,tuna 272 | 1.698672215788315,4.1167883101263545,tuna 273 | 3.395152786329978,0.0012601995773187102,salmon 274 | 2.9197409449568634,7.766282078769735,tuna 275 | 5.804772249289101,3.1498011410347475,salmon 276 | 5.120985161283908,5.189205577914039,tuna 277 | 2.959767801570204,7.19041573475641,tuna 278 | 6.3173389783747425,7.9442368255401075,tuna 279 | 4.695907271526608,2.5471137153286154,salmon 280 | 9.923769399966664,8.130527592013904,tuna 281 | 7.014641415189903,4.271552948547671,tuna 282 | 4.541557507074467,1.0424971339590705,salmon 283 | 9.902666820961732,3.4345569189594647,salmon 284 | 7.334003197877287,1.3924858110060367,salmon 285 | 8.947136244337553,7.395143622147517,tuna 286 | 4.05452485331481,6.172983338674773,tuna 287 | 6.7546656587378076,2.5827459245220794,salmon 288 | 
2.242522463752352,4.909352249664281,tuna 289 | 4.954005382903945,6.72905771265205,tuna 290 | 9.161974661172977,5.526516962486116,tuna 291 | 0.024427224335781258,7.522048162679793,tuna 292 | 4.534660408756811,3.652351687013036,salmon 293 | 9.633403790033409,2.8225247876691366,salmon 294 | 4.3759749935332435,0.7056342227119761,salmon 295 | 9.87627905737068,9.262280560649367,tuna 296 | 0.5546609530039037,4.171487717760036,tuna 297 | 6.500305049544273,0.7022832697547043,salmon 298 | 8.613818949912291,5.230722436625451,tuna 299 | 5.399944220776131,4.990716138791634,tuna 300 | 4.897129698880806,1.3725469991174757,salmon 301 | 1.4001415711880427,4.244408516119003,tuna 302 | 0.3406281871978467,3.428814183609772,tuna 303 | 2.8680058815111136,2.5989046153339777,tuna 304 | 2.88298132774934,4.007595198519873,tuna 305 | 2.633734390157584,8.78637088569919,tuna 306 | 3.190415368531464,0.9021278281871714,salmon 307 | 4.753289048935633,6.326714781848296,tuna 308 | 8.081350165940902,2.7878615865238965,salmon 309 | 1.9979198359964376,5.0728270070079935,tuna 310 | 1.632483344491139,9.43799649506739,tuna 311 | 2.4137293465232594,0.5199320194430213,tuna 312 | 9.105933927283195,2.0894903745428763,salmon 313 | 4.629573810477113,2.1936091828695647,salmon 314 | 4.089085582859656,4.28582021222823,tuna 315 | 6.0334856073207215,3.3703554808294816,salmon 316 | 0.2691903712379695,5.8389883654906365,tuna 317 | 7.011852138516513,6.797921311310277,tuna 318 | 2.5572445607463266,8.209612063185196,tuna 319 | 3.2226597001869317,9.672532886867472,salmon 320 | 0.034902123136253405,0.4527978254799692,tuna 321 | 5.732304937902009,0.4633566196469074,salmon 322 | 5.072862730146205,4.125818576781302,tuna 323 | 6.629153927216097,9.576255399568673,tuna 324 | 7.0067008258168375,7.345985561170156,tuna 325 | 0.1320029638619158,7.4003539586294735,tuna 326 | 2.8523821463464607,0.2015342931630948,tuna 327 | 8.333844279239809,4.996488918243457,tuna 328 | 7.429345115067872,6.654292318351778,tuna 329 | 
1.3331540327413294,1.0706292064911915,tuna 330 | 4.203849929893268,8.847158843771703,salmon 331 | 2.5655330044215505,5.323336702153584,tuna 332 | 3.995362587476949,1.5700557934907702,salmon 333 | 9.83379375378565,2.111382548593772,salmon 334 | 8.960516557587349,5.305431921856067,tuna 335 | 8.875488550802867,7.4138253966279555,tuna 336 | 0.13954933109912027,8.512462125843921,tuna 337 | 8.879845812579884,8.799065966963868,tuna 338 | 6.825886080425082,8.10659877233536,tuna 339 | 7.531739240834299,2.313680414206185,salmon 340 | 7.88282353952046,9.325286851594443,tuna 341 | 8.872369772264264,9.768526459476801,tuna 342 | 9.535571084068053,6.363050244181043,tuna 343 | 5.366179012091641,7.263149557051912,tuna 344 | 3.6700519039337616,6.851385328750296,tuna 345 | 0.9354692958228628,9.417511879887444,tuna 346 | 4.314356029348969,4.323252583362047,tuna 347 | 0.9958667004762434,5.589009208371958,tuna 348 | 9.368405835889783,3.0752858461669907,salmon 349 | 0.6616851252969513,8.610230068705787,tuna 350 | 3.008879131610924,9.234832633964595,salmon 351 | 8.383362036102033,5.935729937664656,tuna 352 | 1.9479398439309423,6.2649926906856646,tuna 353 | 7.412326354954443,0.9641834760696012,salmon 354 | 2.9413225817938886,4.662967409887196,tuna 355 | 4.047135916790453,4.5753610780973615,tuna 356 | 1.5612476999065927,9.730338950342155,tuna 357 | 1.1544936416516682,5.613662760280901,tuna 358 | 9.9297486371105,8.820990223995873,tuna 359 | 2.0597864159923693,3.395730688990425,tuna 360 | 5.0134414760737265,1.8854876331909896,salmon 361 | 1.6757620614311215,5.9304019986100815,tuna 362 | 6.070782708356083,4.851552285086403,salmon 363 | 2.7652922891160703,5.099202405536286,tuna 364 | 3.7324471298092687,5.542108026673462,tuna 365 | 4.538481417383902,2.576417202140492,salmon 366 | 7.646101860663508,3.1547808133249067,salmon 367 | 7.966373614074992,6.638398142371483,tuna 368 | 3.2922888320787025,9.775131403376587,salmon 369 | 8.302167079104656,6.019009088676093,tuna 370 | 
8.033936188132692,1.9783726261543482,salmon 371 | 1.727030480747925,1.9591949812669247,tuna 372 | 2.1896506808786165,5.393300078320491,tuna 373 | 8.148539936672032,1.420982148014952,salmon 374 | 4.381713828241446,7.439105609273268,salmon 375 | 6.8451615308168,7.149252664026692,tuna 376 | 4.4583226301992775,0.8711715592773217,salmon 377 | 8.507215481001616,1.1406113009433283,salmon 378 | 8.87450118048584,7.890067665835144,tuna 379 | 9.081825125379467,7.197178320520452,tuna 380 | 4.943166341334175,3.0375676678533385,salmon 381 | 6.476778157426293,6.423742393371875,salmon 382 | 1.385622341117566,8.347129320769234,tuna 383 | 8.820103701614949,9.847405704848883,tuna 384 | 5.062460476279878,9.877969349483397,tuna 385 | 2.2651718976392137,6.2064837030753175,tuna 386 | 8.359981799983858,3.500495299247308,salmon 387 | 2.3178994813819074,4.1163423800181365,tuna 388 | 1.5802467157661015,7.252683007106786,tuna 389 | 5.803647632114652,0.14190641776190227,salmon 390 | 2.06945293764629,1.796378576479306,tuna 391 | 3.5865144617133438,5.423153226735907,tuna 392 | 9.213813608173933,8.98213065398316,tuna 393 | 7.381046929558788,0.7347530860113993,salmon 394 | 1.9254616146688917,4.4872115940932495,tuna 395 | 3.1941536350079405,8.096196798521433,salmon 396 | 0.9768673664143191,3.8163889576347207,tuna 397 | 0.2949390328410162,4.289024077840057,tuna 398 | 1.6216551714796668,0.6834573722134439,tuna 399 | 3.5433415378529265,6.178749411475835,tuna 400 | 3.66466776850215,2.8826951615256133,salmon 401 | 7.1969463601315145,1.86002552862862,salmon 402 | 0.166709408774931,5.530075372927735,tuna 403 | 4.6985508605636594,0.6472707203221784,salmon 404 | 5.713374198918794,3.6787956714693415,salmon 405 | 7.599373360739031,7.908872930082889,tuna 406 | 4.675391251636125,2.2203788892057865,salmon 407 | 4.148930924762395,0.7400595189381087,salmon 408 | 6.8551338373303095,9.059096620545251,tuna 409 | 1.869688748062115,5.763657468409371,tuna 410 | 7.585686331268313,8.904534436635823,tuna 411 | 
8.337025522547753,6.1944933208929776,tuna 412 | 8.065745183379981,8.887095559809103,tuna 413 | 7.462246252701826,9.256695260088811,tuna 414 | 2.167958259209799,3.3964572269109667,tuna 415 | 9.781157072838768,9.44955578145509,tuna 416 | 3.120701149222601,2.322155843960344,salmon 417 | 6.530464894019223,6.4879142418498255,salmon 418 | 6.406727761435343,5.114775840290488,salmon 419 | 1.0164837089599166,6.807661263045187,tuna 420 | 2.348165305385842,8.831274544563811,tuna 421 | 6.4357893089643525,3.75863684175658,salmon 422 | 3.2625386477375002,9.24924457241873,salmon 423 | 3.9811296514440384,3.212674897117581,salmon 424 | 0.6407921674901318,2.9898514792416666,tuna 425 | 0.6020492204159378,0.0720348070876442,tuna 426 | 8.136712707670377,6.925851409668088,tuna 427 | 4.043083737624434,7.045468112292258,salmon 428 | 5.02761964227802,0.9625808033302474,salmon 429 | 5.8918423811888445,4.275113957921053,tuna 430 | 0.6257078769775148,4.0974025222098245,tuna 431 | 7.225061277487093,4.676458110779032,tuna 432 | 9.161289100657251,0.6272269205340597,salmon 433 | 3.920931469059453,6.459769152263275,tuna 434 | 0.4569424296700431,6.938959550774579,tuna 435 | 8.099755347725397,4.68313590329671,tuna 436 | 0.07169857213637942,2.952497723066294,tuna 437 | 9.765588936894847,7.775362108860892,tuna 438 | 9.88735156891696,1.3126737312186487,salmon 439 | 9.38983613235122,9.142152383000063,tuna 440 | 6.004339261072436,0.7816095389709277,salmon 441 | 6.493351447532773,3.8743097628869014,salmon 442 | 1.8206072632893608,6.6419463199928215,tuna 443 | 2.9096569706806816,1.0455976192142646,tuna 444 | 1.3443196875861696,3.181975720377287,tuna 445 | 8.183930159048293,1.0272092514830011,salmon 446 | 5.354424828196463,0.5699590858958459,salmon 447 | 9.504242455095108,0.3771062894675237,salmon 448 | 5.292652517809398,5.104341953795912,tuna 449 | 2.5252996117528417,3.121612377541946,tuna 450 | 5.609889212367516,0.05088387832308472,salmon 451 | 7.998963621686647,3.8996680467860467,salmon 452 | 
5.455283761070424,2.0309646503129852,salmon 453 | 1.290321559962384,5.335593404377498,tuna 454 | 0.8387012827392437,4.743387115780061,tuna 455 | 6.7721457303892665,7.413487810096169,tuna 456 | 2.6343878811521946,6.144273970223822,tuna 457 | 1.6396827332378816,2.381487144130987,tuna 458 | 3.827407799543283,2.8050438966707745,salmon 459 | 0.2328791982416356,5.330070010193833,tuna 460 | 6.820746145718356,6.602164727965372,salmon 461 | 7.30420548148482,9.618576414063881,tuna 462 | 8.115393501864931,7.862384684017322,tuna 463 | 6.684028655433828,4.995714290442273,salmon 464 | 1.0171924906542271,8.265467914239178,tuna 465 | 7.953218133933087,9.79606977964677,tuna 466 | 5.596442309711031,3.5909877476643377,salmon 467 | 7.053493616993678,4.749504807579111,tuna 468 | 2.5160773655758453,6.583372237826853,tuna 469 | 2.7989706452929597,0.7086179666867009,tuna 470 | 7.124176446649004,1.855986690880741,salmon 471 | 8.102686226426773,4.7525919022458885,tuna 472 | 1.195398711767529,0.29059472982300383,tuna 473 | 3.519610151014096,2.3584175738615096,salmon 474 | 6.588968949210835,0.2791892811781893,salmon 475 | 9.327576448485082,5.677123762839375,tuna 476 | 6.482568173408035,6.588567173104867,salmon 477 | 7.5451604072051905,2.4524042877284207,salmon 478 | 9.629610699102948,6.494146256856129,tuna 479 | 5.572298386111674,9.624232484484095,tuna 480 | 4.030105122494506,7.222937441551203,salmon 481 | 0.12025874967902972,0.7418846675931978,tuna 482 | 1.9422081999567609,1.5233338628082949,tuna 483 | 8.003486282192396,3.2423258341751815,salmon 484 | 4.753232178150268,8.943747235800439,salmon 485 | 2.8502016098676366,7.4068021070290975,tuna 486 | 1.6697921076756872,3.630889439039376,tuna 487 | 4.672835020454689,1.44785932150707,salmon 488 | 5.18238269800751,1.2934300136203123,salmon 489 | 3.7909743718701203,0.3275853320298039,salmon 490 | 8.484036269722793,5.594621738322132,tuna 491 | 4.6429179315682445,0.7487198543742679,salmon 492 | 0.4380403351898599,8.455345015698153,tuna 493 | 
4.620883344201443,7.050047351263081,salmon 494 | 6.943003664371737,5.411045335070785,salmon 495 | 3.0731183575644208,3.0286818920591028,salmon 496 | 2.9845124349190746,4.599090797192509,tuna 497 | 8.793541780239376,2.5971310731969046,salmon 498 | 2.662474482176877,3.215577116050916,tuna 499 | 0.2962015953370101,2.751312896970948,tuna 500 | 1.977565983760452,1.1063444568670855,tuna 501 | 3.735675378443173,5.346339789992276,tuna 502 | 6.538543692912973,2.6894190805362594,salmon 503 | 2.4107871059296104,3.4940816448091603,tuna 504 | 3.2007459511931287,8.0501154457415,salmon 505 | 2.5737409201131687,9.5866473422633,tuna 506 | 6.58135255750488,3.9878265620437214,salmon 507 | 7.123715369284028,9.320685553171439,tuna 508 | 7.780085185873222,8.387369946271791,tuna 509 | 4.8935656334149815,4.919973151160706,tuna 510 | 0.5923200988423505,4.836983979416126,tuna 511 | 5.022492986198269,8.63023389733587,tuna 512 | 6.676148648973857,1.1824477141519607,salmon 513 | 1.8062224257232649,9.615300640125954,tuna 514 | 0.4776735876621474,6.87999012991058,tuna 515 | 5.365853224991533,5.206423488543527,tuna 516 | 5.696003278683692,0.12307614012504575,salmon 517 | 5.072100646723351,7.339926690987964,tuna 518 | 1.4794522194898083,5.835214380738418,tuna 519 | 0.5236413401599371,7.92724128417428,tuna 520 | 9.094159343483707,9.001292050238769,tuna 521 | 2.7479869775014256,4.507873475146113,tuna 522 | 0.7842390338479688,2.717143519388116,tuna 523 | 7.835246306838902,8.638758267306006,tuna 524 | 0.05167883766738046,4.220486903685858,tuna 525 | 2.4399254109841406,7.823696379064405,tuna 526 | 2.9800829960128485,9.920552026065982,tuna 527 | 2.529165346691924,9.793013145798762,tuna 528 | 4.691269812202976,2.3883060822043207,salmon 529 | 2.59276793098828,7.5833706693079215,tuna 530 | 1.9080009714316948,4.021823289890504,tuna 531 | 2.239759132847329,2.771827244809141,tuna 532 | 4.270318981668799,4.484170788939276,tuna 533 | 2.396653798498316,7.487727524344143,tuna 534 | 
8.911947939689316,1.819448933939689,salmon 535 | 5.044625016350861,2.6476973976226126,salmon 536 | 5.384294572275727,8.39972837009199,tuna 537 | 3.8047681314923776,5.107043706387692,tuna 538 | 1.9747733931571176,0.6457829098363943,tuna 539 | 8.121370627755388,2.4464837350159554,salmon 540 | 4.893120770358039,8.091804274634613,salmon 541 | 7.321456898219672,5.8568791287277415,tuna 542 | 2.20482175544082,1.561971055201522,tuna 543 | 5.649349419075084,2.8867797836517486,salmon 544 | 9.079077002552511,8.081463745277658,tuna 545 | 0.5353732108567577,7.1057735926812695,tuna 546 | 9.482103563482012,6.509754977991822,tuna 547 | 5.203425256161895,1.4710239583316909,salmon 548 | 0.3422310487818392,6.195017317469977,tuna 549 | 2.8841006112070584,0.3887112425278727,tuna 550 | 0.4994986584088623,0.6425384864033001,tuna 551 | 1.5271055762317975,8.366318064046675,tuna 552 | 4.115991949573733,2.7234153750180825,salmon 553 | 4.071149452726043,9.435989070056204,salmon 554 | 0.5321383524208412,0.027758058171949436,tuna 555 | 8.044676535626909,3.9874109980498362,salmon 556 | 3.9247757598755264,7.502553604973874,salmon 557 | 1.8567317677098103,1.6508675377284399,tuna 558 | 3.0958079650522183,3.8847260086630806,salmon 559 | 6.044286784300921,4.635887539228274,salmon 560 | 2.8361358347560133,7.700582601487169,tuna 561 | 6.713020588959853,9.318083428143687,tuna 562 | 1.209594860479406,1.7403906070278408,tuna 563 | 1.2783342055582003,2.1055213502902244,tuna 564 | 9.150597025679604,6.451852245175412,tuna 565 | 0.5642355270308874,8.423451976324936,tuna 566 | 3.469928197911748,9.627982218358824,salmon 567 | 8.077039536407325,5.307879636220916,tuna 568 | 5.515518999152333,7.925217250475132,tuna 569 | 6.254802256084603,0.1320328802491144,salmon 570 | 9.925518009740804,4.203813762161709,tuna 571 | 4.4837208905275325,1.6868825963841338,salmon 572 | 2.7260963389382438,0.3823738168775237,tuna 573 | 0.8408296543941629,0.6821919593910519,tuna 574 | 9.702770268610573,2.020202197638948,salmon 575 | 
0.7872897640576759,5.090780437588835,tuna 576 | 4.9907441466904245,7.2774285257286015,salmon 577 | 9.836197817924754,7.3947215693737185,tuna 578 | 3.607526245025644,3.1840457608187203,salmon 579 | 6.220224134042058,9.995240826338387,tuna 580 | 3.768163029052424,3.982878157008313,salmon 581 | 2.541590055469713,7.045529470790751,tuna 582 | 6.435638572444381,7.13192481058651,tuna 583 | 0.9542774289665934,0.29991237373496515,tuna 584 | 1.610238831823394,1.4134582425170283,tuna 585 | 6.756216962480344,5.283849754558624,salmon 586 | 1.7797026374148373,8.122438750441509,tuna 587 | 2.433407702034613,7.597050151812207,tuna 588 | 3.3936955945196923,7.792710849087888,salmon 589 | 2.4422109204641007,1.7734755441981875,tuna 590 | 1.1051246223225208,2.7454142973955173,tuna 591 | 8.342742465609835,2.4730966939903665,salmon 592 | 6.433036505650625,9.346609688141408,tuna 593 | 4.9318820940795005,2.8856443152170765,salmon 594 | 4.777071225352857,6.227062871436429,tuna 595 | 3.3451403035904783,0.5285400370768911,salmon 596 | 1.6801687667297538,9.278714792849437,tuna 597 | 3.122781054683026,6.260986986674117,tuna 598 | 9.58887951080894,4.372317812918725,tuna 599 | 7.693315456984098,1.3426821021423119,salmon 600 | 2.2728462395259355,2.68336381051689,tuna 601 | 5.28684898514561,3.9824694067516586,salmon 602 | 4.941393146315912,5.35817091587086,tuna 603 | 8.596572643295836,8.466298944448255,tuna 604 | 5.383348229957305,6.60462036439991,tuna 605 | 3.8076958750346304,4.157020330069027,tuna 606 | 3.421604994094141,2.6202826423538217,salmon 607 | 1.7346634393895854,6.632933955728174,tuna 608 | 0.6818634765439413,5.525847894835584,tuna 609 | 7.966421111788362,4.9292207885648285,tuna 610 | 9.378447671895778,0.5135821228455917,salmon 611 | 3.5983743255106337,3.94811324078135,salmon 612 | 2.6631567072166566,4.740067990146465,tuna 613 | 0.5685490415286643,2.7595290969292097,tuna 614 | 4.185434758974669,2.5131608927553195,salmon 615 | 8.457769305663547,0.5715682970033342,salmon 616 | 
1.0807590129629108,8.962787361971289,tuna 617 | 3.824810765705504,7.295138846800205,salmon 618 | 3.1842760397632954,9.952192242824724,salmon 619 | 1.5374427330929274,9.410364768922602,tuna 620 | 7.239388700855237,1.796382309976382,salmon 621 | 1.2318579712983468,4.09246540392238,tuna 622 | 5.416252344205214,5.5857094178726205,tuna 623 | 2.318259966077083,2.212950899590086,tuna 624 | 9.70040763779866,2.3097084991857963,salmon 625 | 5.258608464206733,4.27905223185612,tuna 626 | 9.41136202109834,7.174055397121763,tuna 627 | 3.678322225831802,2.5295695887982417,salmon 628 | 5.973738854778071,1.370968769354609,salmon 629 | 9.126936768200018,7.700588822762837,tuna 630 | 8.97269427553878,7.360892277633908,tuna 631 | 3.661742732520095,0.4641037238502455,salmon 632 | 0.38168062030830896,9.333940363164622,tuna 633 | 4.304698607842473,2.875964295856938,salmon 634 | 4.3361430491499515,2.6746680113014087,salmon 635 | 6.443431329537972,3.768457801583287,salmon 636 | 6.553754807982514,7.931830632385777,tuna 637 | 2.4387178495279893,9.062028444725026,tuna 638 | 6.552202625190829,1.1418518188113191,salmon 639 | 7.420316883648957,6.7678047822108445,tuna 640 | 2.2411223476729045,9.014311104348304,tuna 641 | 1.0884806683359185,3.303585173865838,tuna 642 | 0.3639077725568962,7.45836895068109,tuna 643 | 8.8232790416284,2.131598490231841,salmon 644 | 0.3892733333586473,6.550205766721303,tuna 645 | 9.781533580258609,1.1353306914025152,salmon 646 | 2.5950347560973186,2.089262045804018,tuna 647 | 0.007275015407069985,6.787698999470407,tuna 648 | 4.464949607378935,1.724922527312589,salmon 649 | 8.734278063447963,6.272815249932452,tuna 650 | 4.000250815508179,3.165505745219828,salmon 651 | 8.380624821531969,0.7593325751599178,salmon 652 | 9.290224668129712,1.9714788978949167,salmon 653 | 0.13722834669051331,8.375323873675343,tuna 654 | 7.216209940887468,9.551337349343221,tuna 655 | 4.863541255183828,2.2608524388373863,salmon 656 | 0.9732993208158148,2.600421807439957,tuna 657 | 
3.792017061458147,7.716280472715964,salmon 658 | 9.978152362958287,8.650158989552681,tuna 659 | 5.103232288658024,0.8491727086544787,salmon 660 | 2.186402443286028,4.105800321682544,tuna 661 | 3.3199835328727003,9.39821196318316,salmon 662 | 8.274573508131803,8.154985845440246,tuna 663 | 6.3617369953916025,9.423721900455341,tuna 664 | 8.785543336050496,7.479124340922589,tuna 665 | 1.378188050910324,0.5159162748673285,tuna 666 | 7.870826469517407,1.0017780865197634,salmon 667 | 1.2807394729417099,0.1697697833049161,tuna 668 | 7.868095230170673,1.4558260175713211,salmon 669 | 9.511787230005334,7.564332248227108,tuna 670 | 1.2162514266796265,1.6770730983257487,tuna 671 | 0.4045276865473624,7.93654023091328,tuna 672 | 2.118123035457041,7.705364660233354,tuna 673 | 6.0537496615459085,8.863981279625326,tuna 674 | 4.584470837826785,1.1057843090078012,salmon 675 | 2.5451051397765525,2.518968842304253,tuna 676 | 8.4374714133214,0.7823769015610982,salmon 677 | 0.3263835986556873,2.4758958878906028,tuna 678 | 0.3724774031294509,6.479349152179104,tuna 679 | 2.6337642661187957,3.2637854387878606,tuna 680 | 5.461836899200118,4.025588309292317,tuna 681 | 5.72511554097038,0.05334328170747527,salmon 682 | 9.759781730259407,1.7840068295824365,salmon 683 | 5.227812425689749,7.559360000525338,tuna 684 | 7.6602674599654685,1.7946365657973196,salmon 685 | 4.5524277067409935,0.15447854267887862,salmon 686 | 1.1318098372403074,9.17269865893935,tuna 687 | 9.20387451123376,3.5076020956810505,salmon 688 | 5.734926400014567,9.278851662410148,tuna 689 | 4.091963883197996,1.0992945919137087,salmon 690 | 3.4026546697182183,4.11382688374597,tuna 691 | 6.859676760351538,7.67123539694599,tuna 692 | 1.3774429895322926,6.794807684562574,tuna 693 | 4.8460326158825335,8.096224851060695,salmon 694 | 6.9714380465198325,2.1963799862653244,salmon 695 | 7.729281475291666,8.628220220522895,tuna 696 | 6.440216534101403,1.5396400889358195,salmon 697 | 5.970936760551843,4.9542331706270835,tuna 698 | 
2.7413678716002594,0.9867641093994659,tuna 699 | 0.5306531134415993,6.420845836576654,tuna 700 | 4.705107665736946,8.159954690170903,salmon 701 | 0.4044825279490316,5.346117487338241,tuna 702 | 8.765569811765557,9.66032357949005,tuna 703 | 4.863830498692781,5.737890378177931,tuna 704 | 0.6682248536698965,2.2589039977339915,tuna 705 | 6.1829014687182,5.066953130213983,salmon 706 | 5.4523804920948615,3.4246571907195857,salmon 707 | 9.238504725926058,4.497634888590404,tuna 708 | 3.7868808196846313,7.769323386111065,salmon 709 | 1.4638616321511444,3.804183929590846,tuna 710 | 2.9117640163307272,6.782363224593303,tuna 711 | 8.031076246913607,0.3110807447091102,salmon 712 | 4.390475711873877,9.059382545094168,salmon 713 | 0.3210772717318744,0.04672306768055079,tuna 714 | 0.4164057394459197,6.119779345697597,tuna 715 | 4.84520902020333,0.3845043376935564,salmon 716 | 1.0433981327557584,2.281415561730574,tuna 717 | 1.7185369030578437,0.4446610770824389,tuna 718 | 6.5098319007271686,7.612153282484527,tuna 719 | 9.834715374349374,5.6664755828875855,tuna 720 | 7.127056642676696,1.8409127867743005,salmon 721 | 7.30894992683499,0.4615171950557861,salmon 722 | 5.809621413171003,4.327171120968662,tuna 723 | 5.536526280061435,6.707120963276143,tuna 724 | 5.416584202868222,6.166101502878753,tuna 725 | 0.3926583087935676,6.771848592165719,tuna 726 | 9.208557709626847,1.4162289228997638,salmon 727 | 5.0156334970900005,9.309674291000738,tuna 728 | 1.1806875077260637,0.7868910453514188,tuna 729 | 7.178372617664094,5.240051159812787,tuna 730 | 7.485000069735178,2.2320427559071767,salmon 731 | 6.382968480563482,5.365234865777296,salmon 732 | 7.512723478634513,6.324405687646857,tuna 733 | 7.556232238741938,5.9402535189906684,tuna 734 | 9.024660884975065,4.73960154437378,tuna 735 | 7.806319431746585,1.7198839851948795,salmon 736 | 1.890337481184644,4.28314699523996,tuna 737 | 2.2725709501369384,2.6016382166559824,tuna 738 | 7.733043710467357,3.6922757338400025,salmon 739 | 
9.475298597818467,9.171424322213532,tuna 740 | 5.983412623472997,9.745110399428931,tuna 741 | 9.79568428786758,1.7879579513070354,salmon 742 | 1.6103823627087732,6.559547825034082,tuna 743 | 1.994838322547381,7.546824760300965,tuna 744 | 1.9209154872243683,7.220049602679357,tuna 745 | 4.9853207973582085,8.923947863708527,salmon 746 | 3.7273857542282673,4.8587551007404315,tuna 747 | 7.417877695781348,6.574580260454786,tuna 748 | 5.0701512613089275,6.410910295527179,tuna 749 | 9.146669551965367,8.99431521056407,tuna 750 | 7.924139663124557,0.2130085703279383,salmon 751 | 8.064188092350982,7.67456364117733,tuna 752 | 5.79248972099617,3.8896211302524386,salmon 753 | 6.699045981317905,1.0486617147111454,salmon 754 | 3.4155688072313306,4.262136421432585,tuna 755 | 8.287186014293637,7.400654666292898,tuna 756 | 7.8970005335290825,7.1249924792653285,tuna 757 | 7.838666038458554,3.4037167513955566,salmon 758 | 5.352540027164563,9.82281009654071,tuna 759 | 7.210991276882672,6.0625297706631125,tuna 760 | 7.031989933731035,0.7778009127017638,salmon 761 | 8.033587938567056,7.747793269443194,tuna 762 | 7.628244271806303,3.723696499135647,salmon 763 | 7.897766349316948,1.212938231933396,salmon 764 | 2.197791834132268,9.587993603479287,tuna 765 | 7.7552964491963605,1.6153419123579815,salmon 766 | 3.4670907356558667,4.201836052001884,tuna 767 | 4.1534670298345215,9.564273780826168,salmon 768 | 7.479647920494102,0.3672582121377643,salmon 769 | 5.152336765603549,1.9189014976088548,salmon 770 | 8.101241971234856,0.8131704017451158,salmon 771 | 3.8020416486880575,6.546428359968494,tuna 772 | 5.711212946338671,9.532247173590097,tuna 773 | 9.645188519532583,9.371085991195542,tuna 774 | 7.290632491521205,9.301889854549524,tuna 775 | 4.610269991763274,6.343874906889192,tuna 776 | 4.3954426709571095,6.785871236569159,tuna 777 | 9.899743707521539,6.2978908455799925,tuna 778 | 9.244745435443859,1.1613242104876031,salmon 779 | 6.798822768644018,2.0145720371481057,salmon 780 | 
1.7427283249072811,3.6525833461619,tuna 781 | 7.019128757042004,6.896555886424665,tuna 782 | 0.6887638145740282,5.962626782656951,tuna 783 | 9.587143721677633,3.36601577924419,salmon 784 | 7.472564686048101,3.0983866143861984,salmon 785 | 2.5596211273823184,8.997667092828499,tuna 786 | 9.90750666911372,5.706302381920807,tuna 787 | 2.6616813577360108,8.357787037076577,tuna 788 | 7.995099245135496,5.918559192531885,tuna 789 | 7.0864882080337726,2.694700899843032,salmon 790 | 5.212741243623503,6.604038258325526,tuna 791 | 3.651956394472412,2.0852898569765377,salmon 792 | 6.6099321837469205,6.512598722597379,salmon 793 | 4.8420758858804405,1.0654203768457091,salmon 794 | 1.4323491775448471,6.688211704202884,tuna 795 | 4.268824509122392,3.4092224514491343,salmon 796 | 8.684521517242615,3.0502611435038296,salmon 797 | 7.2058947371169815,0.50874714881743,salmon 798 | 9.523426300513243,6.058915580766469,tuna 799 | 5.7510191412414375,2.153067095422201,salmon 800 | 5.889221774352009,9.040809852414473,tuna 801 | 4.974916522556788,0.11402684794985185,salmon 802 | 6.244642318293992,3.483803939446233,salmon 803 | 4.2631205518359305,9.119612266120015,salmon 804 | 3.995367916933061,3.728080728091172,salmon 805 | 0.25235534237784685,4.411712320595663,tuna 806 | 6.436320650130794,8.433081603491022,tuna 807 | 3.642341016819792,2.4413287025976373,salmon 808 | 9.089120307808367,2.455152508324045,salmon 809 | 1.1718647057687004,5.336646741068702,tuna 810 | 2.933698700085575,9.126072258110627,tuna 811 | 2.001243485752351,9.43760286147998,tuna 812 | 4.321250186174633,6.189818617126424,tuna 813 | 5.8927568357957325,0.742009163961953,salmon 814 | 7.666596574973352,9.370748412839193,tuna 815 | 9.584517150921316,0.15959513959054486,salmon 816 | 8.896106535705444,5.457763872922658,tuna 817 | 1.0063231517003537,3.9549029724615794,tuna 818 | 2.591736638100557,4.1574001764311275,tuna 819 | 5.9133701602436295,1.8680800442613832,salmon 820 | 3.0826146468587274,7.326475983843848,salmon 821 | 
9.021618958789553,0.5387603441225586,salmon 822 | 8.153629092156518,8.742430456316693,tuna 823 | 4.867380197411162,9.085021012718792,salmon 824 | 5.736115851802548,1.9183227131045355,salmon 825 | 0.9817225404484164,5.466119683561433,tuna 826 | 7.9396224254758945,8.268633278340253,tuna 827 | 9.660534315080511,3.4919176306621647,salmon 828 | 2.112524027606037,1.0090512782666738,tuna 829 | 0.41413604384218816,0.7394440894716869,tuna 830 | 3.5923902011325293,5.294225793838341,tuna 831 | 1.8729421968329416,7.033741291794533,tuna 832 | 5.3345024092701845,9.103166655289057,tuna 833 | 8.06666959060832,1.5465998766448177,salmon 834 | 1.236881209401678,6.196453976730139,tuna 835 | 3.089466760602939,9.697853354094462,salmon 836 | 3.3442204874224486,1.9454056783779918,salmon 837 | 0.4581053933155188,6.2858170047367725,tuna 838 | 4.947125610506607,1.3309661396706884,salmon 839 | 4.461627326284513,8.533311784458501,salmon 840 | 4.110253031885188,8.572128999465834,salmon 841 | 9.102336071591902,0.32930304175882386,salmon 842 | 9.67046757341702,4.7191728400367,tuna 843 | 4.534194982632132,3.8224295799412413,salmon 844 | 9.450351573833824,2.0683799196774046,salmon 845 | 9.59278574326165,1.5817690274754768,salmon 846 | 9.319608441072225,8.110504377469438,tuna 847 | 4.318462434622309,7.705407462717942,salmon 848 | 9.891954937596203,3.4281995159767518,salmon 849 | 4.067935689342976,0.7311951978799947,salmon 850 | 7.615380292712772,9.766490844351448,tuna 851 | 9.678477051396404,2.633689171649504,salmon 852 | 5.267919543691229,1.4711604871756911,salmon 853 | 2.130671176272875,6.581855404645848,tuna 854 | 9.95233317452123,6.6154809966010175,tuna 855 | 8.774356283772821,2.6902741801636307,salmon 856 | 7.301657686243727,2.0015313513269644,salmon 857 | 5.176613942324277,0.9971567370835444,salmon 858 | 4.63508408872496,1.8046791622227576,salmon 859 | 4.0434630656687895,2.7340637007725412,salmon 860 | 0.8629490392604944,3.893780730277116,tuna 861 | 0.504056533966849,2.6547437205318003,tuna 
862 | 5.565400587512423,4.897531858427705,tuna 863 | 4.662669353124231,4.271805800475777,tuna 864 | 7.899157785588268,2.5126136169026125,salmon 865 | 0.1681389587753057,9.418670836063429,tuna 866 | 1.2692837190363226,2.3390805193602366,tuna 867 | 8.647007576594461,7.0254674515819415,tuna 868 | 1.9100213461595967,8.636575296693842,tuna 869 | 5.32396998616775,3.975196106093607,salmon 870 | 6.048591472252277,0.8718075325553598,salmon 871 | 4.58669625017941,8.960699254294736,salmon 872 | 6.8001954983315445,1.5016220924029888,salmon 873 | 3.692590093125694,0.1230523782385673,salmon 874 | 4.146502376509055,0.1880284045810432,salmon 875 | 5.587250079630389,1.7798632962817142,salmon 876 | 9.981740621606932,2.6326755652010214,salmon 877 | 7.676609041547373,2.6744939786944624,salmon 878 | 7.363168091834423,4.464504405010485,tuna 879 | 9.395225482863461,6.475630797273518,tuna 880 | 1.987371703110728,1.4260070878923925,tuna 881 | 3.346214661150484,8.295466525177376,salmon 882 | 0.7423761920066074,0.8842154513655542,tuna 883 | 0.18924029260236064,5.3945757785278765,tuna 884 | 6.15120823598502,9.629433276948019,tuna 885 | 1.6858422285593566,5.674514372799831,tuna 886 | 8.295058516044383,2.080588567633644,salmon 887 | 2.526417287877295,2.178462683479859,tuna 888 | 9.703202042453317,5.005652312639186,tuna 889 | 9.59853564145001,2.094420930306482,salmon 890 | 9.407382764905883,1.0356465529303671,salmon 891 | 9.520930767295011,2.0908253670888852,salmon 892 | 9.331116577975976,8.52844955093219,tuna 893 | 7.236767840052543,3.9070316111963086,salmon 894 | 6.7750090317825595,5.3134529748336865,salmon 895 | 4.153823056927395,8.078490714826515,salmon 896 | 0.9186233194277638,6.412478322681435,tuna 897 | 6.3461818414764055,5.877560516012554,salmon 898 | 1.8296217301498765,5.8194823052630245,tuna 899 | 0.3060901997437293,1.3189898551556367,tuna 900 | 2.6190467718099866,6.957014460242259,tuna 901 | 0.4672342064973456,4.194922234009331,tuna 902 | 2.1853357362972967,9.97311202442671,tuna 903 | 
4.212530257721475,1.559826379823741,salmon 904 | 8.172969307732586,6.983937928515397,tuna 905 | 2.9822903606146656,6.197104005299432,tuna 906 | 0.9488513902838936,3.399982786426431,tuna 907 | 9.077796288063286,1.4351165941570916,salmon 908 | 7.860478784674151,3.734351304573426,salmon 909 | 8.361971085782322,2.645255666372206,salmon 910 | 3.4958359071195377,5.881083877817838,tuna 911 | 9.241818714505456,0.2772721289255864,salmon 912 | 2.3205462081334383,6.682619534151563,tuna 913 | 1.3941318113106782,6.667793566424751,tuna 914 | 3.730341630276504,2.8858805816517252,salmon 915 | 7.307152603912492,9.19611326895692,tuna 916 | 2.6184867466186645,6.202678560323695,tuna 917 | 0.3557054732588616,3.9116534085765564,tuna 918 | 5.872383785977871,8.431149691684482,tuna 919 | 4.252922040492609,3.0149898775702635,salmon 920 | 2.555298854982774,2.9709376780645425,tuna 921 | 8.083149053864066,1.9203162774784943,salmon 922 | 0.08239123555240703,0.2834107816118858,tuna 923 | 9.197993172478094,6.031727156264646,tuna 924 | 5.8433267580476675,6.537574437360277,tuna 925 | 6.1523257366326645,1.9326842360528032,salmon 926 | 7.850834498495599,4.6623719677115165,tuna 927 | 3.884219260861851,0.4947475469369844,salmon 928 | 6.8443126251322965,2.1678073857340365,salmon 929 | 6.510122415077647,7.253555854894622,tuna 930 | 5.575473023794238,5.696286508680448,tuna 931 | 5.8332861162499166,1.7365415754927351,salmon 932 | 6.4749506160125545,9.451180016332179,tuna 933 | 1.0212136635599425,9.83044283236326,tuna 934 | 8.514639887122986,6.357427551995393,tuna 935 | 3.204663220528209,3.3247405096988425,salmon 936 | 0.7721108968772306,6.718977301283743,tuna 937 | 3.1606092520579265,0.8123727301099704,salmon 938 | 8.63563116752694,0.7657371749879458,salmon 939 | 2.5845688084525253,1.7138510524255768,tuna 940 | 6.804117146724213,4.090198546278547,salmon 941 | 0.5589572048230851,1.3034566177065288,tuna 942 | 6.579002315724701,9.806313867142176,tuna 943 | 9.68392140412208,5.824206440515352,tuna 944 | 
0.32139699845969205,1.1973664045059829,tuna 945 | 2.2052777155637453,0.32205413795796,tuna 946 | 3.5722526481942687,6.254800612475543,tuna 947 | 7.003118664734718,9.629990113595897,tuna 948 | 3.73040267418898,4.460588382815021,tuna 949 | 2.8234908384148127,3.0483528078276287,tuna 950 | 7.500351241912976,3.1882170336865654,salmon 951 | 7.244734655477675,9.022917245635147,tuna 952 | 1.6601020931880173,7.115425248243414,tuna 953 | 5.337092554866766,8.697620700946395,tuna 954 | 6.180227884040692,3.4237010731391693,salmon 955 | 4.456728777504474,7.522115442126434,salmon 956 | 6.546829783358882,4.250192299906956,salmon 957 | 8.329125194712526,0.5083911563397181,salmon 958 | 2.6297087349498383,9.933969732942707,tuna 959 | 7.402735798784211,0.5074454371312664,salmon 960 | 3.729292214256387,8.028555060545349,salmon 961 | 2.94188414211342,8.690466842623463,tuna 962 | 4.302076176922057,2.662476158030552,salmon 963 | 4.954407323004525,9.488850430852033,salmon 964 | 6.28541813401768,4.334185177104303,salmon 965 | 5.325128271089513,6.038614258204719,tuna 966 | 4.436190285341421,1.407899930663592,salmon 967 | 5.6335792465954855,8.397232116348832,tuna 968 | 3.609386435178792,4.0651283093223,tuna 969 | 5.412303898462605,5.06719399601079,tuna 970 | 7.577185499191503,4.6373986364354405,tuna 971 | 1.865404656486564,4.404326343132524,tuna 972 | 9.158933362782188,3.4830970752916324,salmon 973 | 1.842533600998424,1.2154389595146842,tuna 974 | 1.2376640589546497,6.691547906469934,tuna 975 | 6.308117622466495,7.774548344225707,tuna 976 | 4.086064254432918,9.445545953695717,salmon 977 | 2.4108266826491054,8.75507706797304,tuna 978 | 2.1286072454671343,1.7616869662332413,tuna 979 | 1.3469263285129207,8.389179551108542,tuna 980 | 6.213851567175722,3.218297671437077,salmon 981 | 0.011585073641443566,9.069538236085867,tuna 982 | 8.752632054858678,1.261065198986705,salmon 983 | 1.6617446371762843,1.7496918862049768,tuna 984 | 0.7576784259806746,4.649472301162535,tuna 985 | 
1.258904669736376,7.001206765608058,tuna 986 | 5.646949480288553,1.1505178617000789,salmon 987 | 7.048066969095743,4.086777020245998,tuna 988 | 2.6958528452726447,5.819445933764235,tuna 989 | 2.1486746445753124,5.404471949007653,tuna 990 | 5.193689482586604,2.1637380037019915,salmon 991 | 5.818983112157264,5.550505299369439,tuna 992 | 0.0987859556194748,3.9903691123129614,tuna 993 | 4.422371368725061,3.5613133044784133,salmon 994 | 9.671252444263118,7.233976683698076,tuna 995 | 9.379131474447153,1.6613633165616526,salmon 996 | 1.9291945361103613,6.8290688927452186,tuna 997 | 3.710185941296072,0.9566117873115421,salmon 998 | 3.749595557592703,9.557449502198773,salmon 999 | 2.4100874968814634,3.3557808966095823,tuna 1000 | 0.9907730003344428,3.8122102799437796,tuna 1001 | 7.519777921959507,5.3928172750898105,tuna 1002 | -------------------------------------------------------------------------------- /stopwords.txt: -------------------------------------------------------------------------------- 1 | i 2 | me 3 | my 4 | myself 5 | we 6 | our 7 | ours 8 | ourselves 9 | you 10 | your 11 | yours 12 | yourself 13 | yourselves 14 | he 15 | him 16 | his 17 | himself 18 | she 19 | her 20 | hers 21 | herself 22 | it 23 | its 24 | itself 25 | they 26 | them 27 | their 28 | theirs 29 | themselves 30 | what 31 | which 32 | who 33 | whom 34 | this 35 | that 36 | these 37 | those 38 | am 39 | is 40 | are 41 | was 42 | were 43 | be 44 | been 45 | being 46 | have 47 | has 48 | had 49 | having 50 | do 51 | does 52 | did 53 | doing 54 | a 55 | an 56 | the 57 | and 58 | but 59 | if 60 | or 61 | because 62 | as 63 | until 64 | while 65 | of 66 | at 67 | by 68 | for 69 | with 70 | about 71 | against 72 | between 73 | into 74 | through 75 | during 76 | before 77 | after 78 | above 79 | below 80 | to 81 | from 82 | up 83 | down 84 | in 85 | out 86 | on 87 | off 88 | over 89 | under 90 | again 91 | further 92 | then 93 | once 94 | here 95 | there 96 | when 97 | where 98 | why 99 | how 100 | 
all 101 | any 102 | both 103 | each 104 | few 105 | more 106 | most 107 | other 108 | some 109 | such 110 | no 111 | nor 112 | not 113 | only 114 | own 115 | same 116 | so 117 | than 118 | too 119 | very 120 | s 121 | t 122 | can 123 | will 124 | just 125 | don 126 | should 127 | now 128 | -------------------------------------------------------------------------------- /training_text.txt: -------------------------------------------------------------------------------- 1 | Today we will be learning about the fundamentals of data science and statistics. Data Science and statistics are hot and growing fields with alternative names of machine learning, artificial intelligence, big data, etc. I'm really excited to talk to you about data science and statistics because data science and statistics have long been a passions of mine. I didn't used to be very good at data science and statistics but after studying data science and statistics for a long time, I got better and better at it until I became a data science and statistics expert. I'm really excited to talk to you about data science and statistics, thanks for listening to me talk about data science and statistics. 
-------------------------------------------------------------------------------- /youtube_agent_utils.py: -------------------------------------------------------------------------------- 1 | # imports 2 | from google_auth_oauthlib.flow import InstalledAppFlow 3 | from googleapiclient.discovery import build 4 | import os 5 | import google.oauth2.credentials 6 | from googleapiclient.http import MediaFileUpload 7 | import urllib.request 8 | import matplotlib.pyplot as plt 9 | import numpy as np 10 | from datetime import datetime, timezone, timedelta 11 | import pytz 12 | import pandas as pd 13 | from time import sleep 14 | import seaborn as sns 15 | from openai import OpenAI 16 | import pandas as pd 17 | import re 18 | from datetime import datetime 19 | 20 | OPEN_AI_API_KEY = "SET_OPEN_AI_API_KEY_HERE" 21 | CHANNEL_ID = 'SET_YOUR_YOUTUBE_CHANNEL_ID_HERE' 22 | 23 | def get_authenticated_youtube_api(): 24 | flow = InstalledAppFlow.from_client_secrets_file( 25 | '/path/to/your/client_secret.json', 26 | scopes=['https://www.googleapis.com/auth/youtube'] 27 | ) 28 | 29 | # If credentials don't exist, open a web browser to authenticate 30 | if not os.path.exists('credentials.json'): 31 | credentials = flow.run_local_server(port=0) 32 | with open('credentials.json', 'w') as credentials_file: 33 | credentials_file.write(credentials.to_json()) 34 | else: 35 | credentials = google.oauth2.credentials.Credentials.from_authorized_user_file('credentials.json') 36 | 37 | youtube = build("youtube", "v3", credentials=credentials) 38 | return youtube 39 | 40 | def get_views_snippet(youtube, video_id): 41 | video_info = youtube.videos().list( 42 | id=video_id, 43 | part='snippet,statistics' 44 | ).execute() 45 | views = video_info['items'][0]['statistics']['viewCount'] 46 | snippet = video_info['items'][0]['snippet'] 47 | return int(views), snippet 48 | 49 | def update_video_title(youtube, video_id, new_title): 50 | views, snippet = get_views_snippet(youtube, video_id) 51 | 
snippet['title'] = new_title 52 | youtube.videos().update( 53 | part="snippet", 54 | body={ 55 | "id": f"{video_id}", 56 | "snippet": snippet 57 | } 58 | ).execute() 59 | 60 | def get_last_n_videos_with_views(youtube, n): 61 | """ 62 | Fetch the last n videos from a YouTube channel with their view counts. 63 | 64 | Args: 65 | youtube: Authenticated YouTube API client. 66 | 67 | Returns: 68 | A list of dictionaries containing video titles, URLs, and view counts. 69 | """ 70 | try: 71 | # Step 1: Fetch the last n videos using `search.list` 72 | search_request = youtube.search().list( 73 | part="snippet", 74 | channelId=CHANNEL_ID, 75 | maxResults=n, 76 | order="date", 77 | type="video" 78 | ) 79 | search_response = search_request.execute() 80 | 81 | # Step 2: Extract video IDs 82 | video_ids = [item["id"]["videoId"] for item in search_response.get("items", [])] 83 | 84 | if not video_ids: 85 | print("No videos found for the specified channel.") 86 | return [] 87 | 88 | # Step 3: Fetch video statistics using `videos.list` 89 | videos_request = youtube.videos().list( 90 | part="snippet,statistics", 91 | id=",".join(video_ids) 92 | ) 93 | videos_response = videos_request.execute() 94 | 95 | # Step 4: Process the response 96 | videos = [] 97 | for item in videos_response.get("items", []): 98 | title = item["snippet"]["title"] 99 | publishedAt = datetime.strptime(item["snippet"]["publishedAt"], '%Y-%m-%dT%H:%M:%SZ') 100 | utc_datetime = publishedAt.replace(tzinfo=timezone.utc) 101 | pdt_datetime = utc_datetime.astimezone(pytz.timezone('America/Los_Angeles')) 102 | delta = (datetime.now(pytz.timezone('America/Los_Angeles')) - pdt_datetime).total_seconds() 103 | video_id = item["id"] 104 | view_count = int(item["statistics"].get("viewCount", "0")) 105 | like_count = int(item["statistics"].get("likeCount", "0")) 106 | dislike_count = int(item["statistics"].get("dislikeCount", "0")) 107 | comment_count = int(item["statistics"].get("commentCount", "0")) 108 | views_per_day = 
view_count / (delta / 3600 / 24) 109 | videos.append({ 110 | "title": title, 111 | "publishedAt": pdt_datetime, 112 | "publishedDaysAgo": delta / 3600 / 24, 113 | "url": f"https://www.youtube.com/watch?v={video_id}", 114 | "views": view_count, 115 | "views_per_day": views_per_day, 116 | "likes": like_count, 117 | "dislikes": dislike_count, 118 | "like_dislike_ratio": like_count / dislike_count if dislike_count else None, 119 | "comments": comment_count 120 | 121 | }) 122 | 123 | return videos 124 | 125 | except Exception as e: 126 | print(f"An error occurred: {e}") 127 | return [] 128 | 129 | def get_openai_client(): 130 | client = OpenAI(api_key=OPEN_AI_API_KEY) 131 | return client 132 | 133 | def chat(client, messages): 134 | completion = client.chat.completions.create( 135 | model="gpt-4o", 136 | store=True, 137 | messages=messages, 138 | temperature=0 139 | ) 140 | return completion.choices[0].message.content 141 | 142 | 143 | def get_messages(last_n_videos, user_input): 144 | last_n = len(last_n_videos) 145 | messages = [] 146 | messages.append( 147 | { 148 | 'role': 'system', 149 | 'content': 150 | f""" 151 | You are an assistant who is an expert in generating video titles for YouTube videos which are likely to get lots of engagement. 152 | You will be provided below with information about the last {last_n} videos posted by the YouTube channel ritvikmath. 153 | These videos will be in the VIDEO_DATA section at the end of this prompt. 154 | This channel focusses on data science, statistics, and mathematics educational videos. 
155 | Each item in the provided list below has the following schema: 156 | - title: the title of the video 157 | - publishedAt: the datetime when this video was first published 158 | - url: the url of the video 159 | - views: the current number of views of the video 160 | - views_per_day: the number of views this video got per day so far 161 | - likes: the number of likes the video got 162 | - dislikes: the number of dislikes the video got 163 | - like_dislike_ratio: the ratio of number of likes to number of dislikes 164 | - comments: the number of comments the video got 165 | The user will provide a description of what the new video is about. 166 | Your job is to use the strongly-performing videos from the provided data to suggest a strong title for this new video. 167 | By "strong", we mean a video title that is more likely to get engagement. 168 | Please output the new title as well as your reasoning in the following json format: 169 | {{ 170 | new_title: the suggested new title, 171 | reasoning: the reasoning for this new title 172 | }} 173 | The reasoning should reference one or more videos provided in the VIDEO_DATA section. 174 | The reasoning should be 75 words or fewer. 175 | Return the output as raw JSON without any Markdown formatting or additional text. 176 | VIDEO_DATA: 177 | {last_n_videos} 178 | """ 179 | } 180 | ) 181 | messages.append({"role": "user", "content": user_input}) 182 | return messages 183 | 184 | --------------------------------------------------------------------------------