├── .gitignore
├── AFINN-111.txt
├── AFINN-README.txt
├── BDA_Senti_ipython
    ├── .ipynb_checkpoints
    │   ├── Input_Output_Functions-checkpoint.ipynb
    │   └── SentiWordNet_reviewClassification-checkpoint.ipynb
    ├── Input_Output_Functions.ipynb
    ├── SentiWordNet_3.0.0_20130122.txt
    ├── SentiWordNet_reviewClassification.ipynb
    ├── good1.txt
    ├── les.csv
    ├── les.txt
    ├── rev_bad_1.txt
    ├── rev_bad_2.txt
    ├── rev_good_b.txt
    ├── rev_good_s.txt
    ├── rev_good_t1.txt
    └── rev_nutralbad_2.txt
├── DeriveTweetSentimentEasy.py
├── DocumentSentimentClassification.py
├── ExtractTweet.py
├── NewTermSentimentInference.py
├── README.md
├── SentiWordNet_3.0.0_20130122.txt
├── SentiWordnet.py
├── config.json
├── review_bad.txt
└── review_good.txt


/.gitignore:
--------------------------------------------------------------------------------
1 | .project
2 | .pydevproject
3 |
--------------------------------------------------------------------------------
/AFINN-111.txt:
--------------------------------------------------------------------------------
1 | abandon -2 2 | abandoned -2 3 | abandons -2 4 | abducted -2 5 | abduction -2 6 | abductions -2 7 | abhor -3 8 | abhorred -3 9 | abhorrent -3 10 | abhors -3 11 | abilities 2 12 | ability 2 13 | aboard 1 14 | absentee -1 15 | absentees -1 16 | absolve 2 17 | absolved 2 18 | absolves 2 19 | absolving 2 20 | absorbed 1 21 | abuse -3 22 | abused -3 23 | abuses -3 24 | abusive -3 25 | accept 1 26 | accepted 1 27 | accepting 1 28 | accepts 1 29 | accident -2 30 | accidental -2 31 | accidentally -2 32 | accidents -2 33 | accomplish 2 34 | accomplished 2 35 | accomplishes 2 36 | accusation -2 37 | accusations -2 38 | accuse -2 39 | accused -2 40 | accuses -2 41 | accusing -2 42 | ache -2 43 | achievable 1 44 | aching -2 45 | acquit 2 46 | acquits 2 47 | acquitted 2 48 | acquitting 2 49 | acrimonious -3 50 | active 1 51 | adequate 1 52 | admire 3 53 | admired 3 54 | admires 3 55 | admiring 3 56 | admit -1 57 | admits -1 58 | admitted -1 59 | admonish -2 60 | 
admonished -2 61 | adopt 1 62 | adopts 1 63 | adorable 3 64 | adore 3 65 | adored 3 66 | adores 3 67 | advanced 1 68 | advantage 2 69 | advantages 2 70 | adventure 2 71 | adventures 2 72 | adventurous 2 73 | affected -1 74 | affection 3 75 | affectionate 3 76 | afflicted -1 77 | affronted -1 78 | afraid -2 79 | aggravate -2 80 | aggravated -2 81 | aggravates -2 82 | aggravating -2 83 | aggression -2 84 | aggressions -2 85 | aggressive -2 86 | aghast -2 87 | agog 2 88 | agonise -3 89 | agonised -3 90 | agonises -3 91 | agonising -3 92 | agonize -3 93 | agonized -3 94 | agonizes -3 95 | agonizing -3 96 | agree 1 97 | agreeable 2 98 | agreed 1 99 | agreement 1 100 | agrees 1 101 | alarm -2 102 | alarmed -2 103 | alarmist -2 104 | alarmists -2 105 | alas -1 106 | alert -1 107 | alienation -2 108 | alive 1 109 | allergic -2 110 | allow 1 111 | alone -2 112 | amaze 2 113 | amazed 2 114 | amazes 2 115 | amazing 4 116 | ambitious 2 117 | ambivalent -1 118 | amuse 3 119 | amused 3 120 | amusement 3 121 | amusements 3 122 | anger -3 123 | angers -3 124 | angry -3 125 | anguish -3 126 | anguished -3 127 | animosity -2 128 | annoy -2 129 | annoyance -2 130 | annoyed -2 131 | annoying -2 132 | annoys -2 133 | antagonistic -2 134 | anti -1 135 | anticipation 1 136 | anxiety -2 137 | anxious -2 138 | apathetic -3 139 | apathy -3 140 | apeshit -3 141 | apocalyptic -2 142 | apologise -1 143 | apologised -1 144 | apologises -1 145 | apologising -1 146 | apologize -1 147 | apologized -1 148 | apologizes -1 149 | apologizing -1 150 | apology -1 151 | appalled -2 152 | appalling -2 153 | appease 2 154 | appeased 2 155 | appeases 2 156 | appeasing 2 157 | applaud 2 158 | applauded 2 159 | applauding 2 160 | applauds 2 161 | applause 2 162 | appreciate 2 163 | appreciated 2 164 | appreciates 2 165 | appreciating 2 166 | appreciation 2 167 | apprehensive -2 168 | approval 2 169 | approved 2 170 | approves 2 171 | ardent 1 172 | arrest -2 173 | arrested -3 174 | arrests -2 175 | arrogant 
-2 176 | ashame -2 177 | ashamed -2 178 | ass -4 179 | assassination -3 180 | assassinations -3 181 | asset 2 182 | assets 2 183 | assfucking -4 184 | asshole -4 185 | astonished 2 186 | astound 3 187 | astounded 3 188 | astounding 3 189 | astoundingly 3 190 | astounds 3 191 | attack -1 192 | attacked -1 193 | attacking -1 194 | attacks -1 195 | attract 1 196 | attracted 1 197 | attracting 2 198 | attraction 2 199 | attractions 2 200 | attracts 1 201 | audacious 3 202 | authority 1 203 | avert -1 204 | averted -1 205 | averts -1 206 | avid 2 207 | avoid -1 208 | avoided -1 209 | avoids -1 210 | await -1 211 | awaited -1 212 | awaits -1 213 | award 3 214 | awarded 3 215 | awards 3 216 | awesome 4 217 | awful -3 218 | awkward -2 219 | axe -1 220 | axed -1 221 | backed 1 222 | backing 2 223 | backs 1 224 | bad -3 225 | badass -3 226 | badly -3 227 | bailout -2 228 | bamboozle -2 229 | bamboozled -2 230 | bamboozles -2 231 | ban -2 232 | banish -1 233 | bankrupt -3 234 | bankster -3 235 | banned -2 236 | bargain 2 237 | barrier -2 238 | bastard -5 239 | bastards -5 240 | battle -1 241 | battles -1 242 | beaten -2 243 | beatific 3 244 | beating -1 245 | beauties 3 246 | beautiful 3 247 | beautifully 3 248 | beautify 3 249 | belittle -2 250 | belittled -2 251 | beloved 3 252 | benefit 2 253 | benefits 2 254 | benefitted 2 255 | benefitting 2 256 | bereave -2 257 | bereaved -2 258 | bereaves -2 259 | bereaving -2 260 | best 3 261 | betray -3 262 | betrayal -3 263 | betrayed -3 264 | betraying -3 265 | betrays -3 266 | better 2 267 | bias -1 268 | biased -2 269 | big 1 270 | bitch -5 271 | bitches -5 272 | bitter -2 273 | bitterly -2 274 | bizarre -2 275 | blah -2 276 | blame -2 277 | blamed -2 278 | blames -2 279 | blaming -2 280 | bless 2 281 | blesses 2 282 | blessing 3 283 | blind -1 284 | bliss 3 285 | blissful 3 286 | blithe 2 287 | block -1 288 | blockbuster 3 289 | blocked -1 290 | blocking -1 291 | blocks -1 292 | bloody -3 293 | blurry -2 294 | boastful -2 295 | 
bold 2 296 | boldly 2 297 | bomb -1 298 | boost 1 299 | boosted 1 300 | boosting 1 301 | boosts 1 302 | bore -2 303 | bored -2 304 | boring -3 305 | bother -2 306 | bothered -2 307 | bothers -2 308 | bothersome -2 309 | boycott -2 310 | boycotted -2 311 | boycotting -2 312 | boycotts -2 313 | brainwashing -3 314 | brave 2 315 | breakthrough 3 316 | breathtaking 5 317 | bribe -3 318 | bright 1 319 | brightest 2 320 | brightness 1 321 | brilliant 4 322 | brisk 2 323 | broke -1 324 | broken -1 325 | brooding -2 326 | bullied -2 327 | bullshit -4 328 | bully -2 329 | bullying -2 330 | bummer -2 331 | buoyant 2 332 | burden -2 333 | burdened -2 334 | burdening -2 335 | burdens -2 336 | calm 2 337 | calmed 2 338 | calming 2 339 | calms 2 340 | can't stand -3 341 | cancel -1 342 | cancelled -1 343 | cancelling -1 344 | cancels -1 345 | cancer -1 346 | capable 1 347 | captivated 3 348 | care 2 349 | carefree 1 350 | careful 2 351 | carefully 2 352 | careless -2 353 | cares 2 354 | cashing in -2 355 | casualty -2 356 | catastrophe -3 357 | catastrophic -4 358 | cautious -1 359 | celebrate 3 360 | celebrated 3 361 | celebrates 3 362 | celebrating 3 363 | censor -2 364 | censored -2 365 | censors -2 366 | certain 1 367 | chagrin -2 368 | chagrined -2 369 | challenge -1 370 | chance 2 371 | chances 2 372 | chaos -2 373 | chaotic -2 374 | charged -3 375 | charges -2 376 | charm 3 377 | charming 3 378 | charmless -3 379 | chastise -3 380 | chastised -3 381 | chastises -3 382 | chastising -3 383 | cheat -3 384 | cheated -3 385 | cheater -3 386 | cheaters -3 387 | cheats -3 388 | cheer 2 389 | cheered 2 390 | cheerful 2 391 | cheering 2 392 | cheerless -2 393 | cheers 2 394 | cheery 3 395 | cherish 2 396 | cherished 2 397 | cherishes 2 398 | cherishing 2 399 | chic 2 400 | childish -2 401 | chilling -1 402 | choke -2 403 | choked -2 404 | chokes -2 405 | choking -2 406 | clarifies 2 407 | clarity 2 408 | clash -2 409 | classy 3 410 | clean 2 411 | cleaner 2 412 | clear 1 413 | 
cleared 1 414 | clearly 1 415 | clears 1 416 | clever 2 417 | clouded -1 418 | clueless -2 419 | cock -5 420 | cocksucker -5 421 | cocksuckers -5 422 | cocky -2 423 | coerced -2 424 | collapse -2 425 | collapsed -2 426 | collapses -2 427 | collapsing -2 428 | collide -1 429 | collides -1 430 | colliding -1 431 | collision -2 432 | collisions -2 433 | colluding -3 434 | combat -1 435 | combats -1 436 | comedy 1 437 | comfort 2 438 | comfortable 2 439 | comforting 2 440 | comforts 2 441 | commend 2 442 | commended 2 443 | commit 1 444 | commitment 2 445 | commits 1 446 | committed 1 447 | committing 1 448 | compassionate 2 449 | compelled 1 450 | competent 2 451 | competitive 2 452 | complacent -2 453 | complain -2 454 | complained -2 455 | complains -2 456 | comprehensive 2 457 | conciliate 2 458 | conciliated 2 459 | conciliates 2 460 | conciliating 2 461 | condemn -2 462 | condemnation -2 463 | condemned -2 464 | condemns -2 465 | confidence 2 466 | confident 2 467 | conflict -2 468 | conflicting -2 469 | conflictive -2 470 | conflicts -2 471 | confuse -2 472 | confused -2 473 | confusing -2 474 | congrats 2 475 | congratulate 2 476 | congratulation 2 477 | congratulations 2 478 | consent 2 479 | consents 2 480 | consolable 2 481 | conspiracy -3 482 | constrained -2 483 | contagion -2 484 | contagions -2 485 | contagious -1 486 | contempt -2 487 | contemptuous -2 488 | contemptuously -2 489 | contend -1 490 | contender -1 491 | contending -1 492 | contentious -2 493 | contestable -2 494 | controversial -2 495 | controversially -2 496 | convince 1 497 | convinced 1 498 | convinces 1 499 | convivial 2 500 | cool 1 501 | cool stuff 3 502 | cornered -2 503 | corpse -1 504 | costly -2 505 | courage 2 506 | courageous 2 507 | courteous 2 508 | courtesy 2 509 | cover-up -3 510 | coward -2 511 | cowardly -2 512 | coziness 2 513 | cramp -1 514 | crap -3 515 | crash -2 516 | crazier -2 517 | craziest -2 518 | crazy -2 519 | creative 2 520 | crestfallen -2 521 | cried -2 522 
| cries -2 523 | crime -3 524 | criminal -3 525 | criminals -3 526 | crisis -3 527 | critic -2 528 | criticism -2 529 | criticize -2 530 | criticized -2 531 | criticizes -2 532 | criticizing -2 533 | critics -2 534 | cruel -3 535 | cruelty -3 536 | crush -1 537 | crushed -2 538 | crushes -1 539 | crushing -1 540 | cry -1 541 | crying -2 542 | cunt -5 543 | curious 1 544 | curse -1 545 | cut -1 546 | cute 2 547 | cuts -1 548 | cutting -1 549 | cynic -2 550 | cynical -2 551 | cynicism -2 552 | damage -3 553 | damages -3 554 | damn -4 555 | damned -4 556 | damnit -4 557 | danger -2 558 | daredevil 2 559 | daring 2 560 | darkest -2 561 | darkness -1 562 | dauntless 2 563 | dead -3 564 | deadlock -2 565 | deafening -1 566 | dear 2 567 | dearly 3 568 | death -2 569 | debonair 2 570 | debt -2 571 | deceit -3 572 | deceitful -3 573 | deceive -3 574 | deceived -3 575 | deceives -3 576 | deceiving -3 577 | deception -3 578 | decisive 1 579 | dedicated 2 580 | defeated -2 581 | defect -3 582 | defects -3 583 | defender 2 584 | defenders 2 585 | defenseless -2 586 | defer -1 587 | deferring -1 588 | defiant -1 589 | deficit -2 590 | degrade -2 591 | degraded -2 592 | degrades -2 593 | dehumanize -2 594 | dehumanized -2 595 | dehumanizes -2 596 | dehumanizing -2 597 | deject -2 598 | dejected -2 599 | dejecting -2 600 | dejects -2 601 | delay -1 602 | delayed -1 603 | delight 3 604 | delighted 3 605 | delighting 3 606 | delights 3 607 | demand -1 608 | demanded -1 609 | demanding -1 610 | demands -1 611 | demonstration -1 612 | demoralized -2 613 | denied -2 614 | denier -2 615 | deniers -2 616 | denies -2 617 | denounce -2 618 | denounces -2 619 | deny -2 620 | denying -2 621 | depressed -2 622 | depressing -2 623 | derail -2 624 | derailed -2 625 | derails -2 626 | deride -2 627 | derided -2 628 | derides -2 629 | deriding -2 630 | derision -2 631 | desirable 2 632 | desire 1 633 | desired 2 634 | desirous 2 635 | despair -3 636 | despairing -3 637 | despairs -3 638 | 
desperate -3 639 | desperately -3 640 | despondent -3 641 | destroy -3 642 | destroyed -3 643 | destroying -3 644 | destroys -3 645 | destruction -3 646 | destructive -3 647 | detached -1 648 | detain -2 649 | detained -2 650 | detention -2 651 | determined 2 652 | devastate -2 653 | devastated -2 654 | devastating -2 655 | devoted 3 656 | diamond 1 657 | dick -4 658 | dickhead -4 659 | die -3 660 | died -3 661 | difficult -1 662 | diffident -2 663 | dilemma -1 664 | dipshit -3 665 | dire -3 666 | direful -3 667 | dirt -2 668 | dirtier -2 669 | dirtiest -2 670 | dirty -2 671 | disabling -1 672 | disadvantage -2 673 | disadvantaged -2 674 | disappear -1 675 | disappeared -1 676 | disappears -1 677 | disappoint -2 678 | disappointed -2 679 | disappointing -2 680 | disappointment -2 681 | disappointments -2 682 | disappoints -2 683 | disaster -2 684 | disasters -2 685 | disastrous -3 686 | disbelieve -2 687 | discard -1 688 | discarded -1 689 | discarding -1 690 | discards -1 691 | disconsolate -2 692 | disconsolation -2 693 | discontented -2 694 | discord -2 695 | discounted -1 696 | discouraged -2 697 | discredited -2 698 | disdain -2 699 | disgrace -2 700 | disgraced -2 701 | disguise -1 702 | disguised -1 703 | disguises -1 704 | disguising -1 705 | disgust -3 706 | disgusted -3 707 | disgusting -3 708 | disheartened -2 709 | dishonest -2 710 | disillusioned -2 711 | disinclined -2 712 | disjointed -2 713 | dislike -2 714 | dismal -2 715 | dismayed -2 716 | disorder -2 717 | disorganized -2 718 | disoriented -2 719 | disparage -2 720 | disparaged -2 721 | disparages -2 722 | disparaging -2 723 | displeased -2 724 | dispute -2 725 | disputed -2 726 | disputes -2 727 | disputing -2 728 | disqualified -2 729 | disquiet -2 730 | disregard -2 731 | disregarded -2 732 | disregarding -2 733 | disregards -2 734 | disrespect -2 735 | disrespected -2 736 | disruption -2 737 | disruptions -2 738 | disruptive -2 739 | dissatisfied -2 740 | distort -2 741 | distorted -2 742 | 
distorting -2 743 | distorts -2 744 | distract -2 745 | distracted -2 746 | distraction -2 747 | distracts -2 748 | distress -2 749 | distressed -2 750 | distresses -2 751 | distressing -2 752 | distrust -3 753 | distrustful -3 754 | disturb -2 755 | disturbed -2 756 | disturbing -2 757 | disturbs -2 758 | dithering -2 759 | dizzy -1 760 | dodging -2 761 | dodgy -2 762 | does not work -3 763 | dolorous -2 764 | dont like -2 765 | doom -2 766 | doomed -2 767 | doubt -1 768 | doubted -1 769 | doubtful -1 770 | doubting -1 771 | doubts -1 772 | douche -3 773 | douchebag -3 774 | downcast -2 775 | downhearted -2 776 | downside -2 777 | drag -1 778 | dragged -1 779 | drags -1 780 | drained -2 781 | dread -2 782 | dreaded -2 783 | dreadful -3 784 | dreading -2 785 | dream 1 786 | dreams 1 787 | dreary -2 788 | droopy -2 789 | drop -1 790 | drown -2 791 | drowned -2 792 | drowns -2 793 | drunk -2 794 | dubious -2 795 | dud -2 796 | dull -2 797 | dumb -3 798 | dumbass -3 799 | dump -1 800 | dumped -2 801 | dumps -1 802 | dupe -2 803 | duped -2 804 | dysfunction -2 805 | eager 2 806 | earnest 2 807 | ease 2 808 | easy 1 809 | ecstatic 4 810 | eerie -2 811 | eery -2 812 | effective 2 813 | effectively 2 814 | elated 3 815 | elation 3 816 | elegant 2 817 | elegantly 2 818 | embarrass -2 819 | embarrassed -2 820 | embarrasses -2 821 | embarrassing -2 822 | embarrassment -2 823 | embittered -2 824 | embrace 1 825 | emergency -2 826 | empathetic 2 827 | emptiness -1 828 | empty -1 829 | enchanted 2 830 | encourage 2 831 | encouraged 2 832 | encouragement 2 833 | encourages 2 834 | endorse 2 835 | endorsed 2 836 | endorsement 2 837 | endorses 2 838 | enemies -2 839 | enemy -2 840 | energetic 2 841 | engage 1 842 | engages 1 843 | engrossed 1 844 | enjoy 2 845 | enjoying 2 846 | enjoys 2 847 | enlighten 2 848 | enlightened 2 849 | enlightening 2 850 | enlightens 2 851 | ennui -2 852 | enrage -2 853 | enraged -2 854 | enrages -2 855 | enraging -2 856 | enrapture 3 857 | enslave -2 
858 | enslaved -2 859 | enslaves -2 860 | ensure 1 861 | ensuring 1 862 | enterprising 1 863 | entertaining 2 864 | enthral 3 865 | enthusiastic 3 866 | entitled 1 867 | entrusted 2 868 | envies -1 869 | envious -2 870 | envy -1 871 | envying -1 872 | erroneous -2 873 | error -2 874 | errors -2 875 | escape -1 876 | escapes -1 877 | escaping -1 878 | esteemed 2 879 | ethical 2 880 | euphoria 3 881 | euphoric 4 882 | eviction -1 883 | evil -3 884 | exaggerate -2 885 | exaggerated -2 886 | exaggerates -2 887 | exaggerating -2 888 | exasperated 2 889 | excellence 3 890 | excellent 3 891 | excite 3 892 | excited 3 893 | excitement 3 894 | exciting 3 895 | exclude -1 896 | excluded -2 897 | exclusion -1 898 | exclusive 2 899 | excuse -1 900 | exempt -1 901 | exhausted -2 902 | exhilarated 3 903 | exhilarates 3 904 | exhilarating 3 905 | exonerate 2 906 | exonerated 2 907 | exonerates 2 908 | exonerating 2 909 | expand 1 910 | expands 1 911 | expel -2 912 | expelled -2 913 | expelling -2 914 | expels -2 915 | exploit -2 916 | exploited -2 917 | exploiting -2 918 | exploits -2 919 | exploration 1 920 | explorations 1 921 | expose -1 922 | exposed -1 923 | exposes -1 924 | exposing -1 925 | extend 1 926 | extends 1 927 | exuberant 4 928 | exultant 3 929 | exultantly 3 930 | fabulous 4 931 | fad -2 932 | fag -3 933 | faggot -3 934 | faggots -3 935 | fail -2 936 | failed -2 937 | failing -2 938 | fails -2 939 | failure -2 940 | failures -2 941 | fainthearted -2 942 | fair 2 943 | faith 1 944 | faithful 3 945 | fake -3 946 | fakes -3 947 | faking -3 948 | fallen -2 949 | falling -1 950 | falsified -3 951 | falsify -3 952 | fame 1 953 | fan 3 954 | fantastic 4 955 | farce -1 956 | fascinate 3 957 | fascinated 3 958 | fascinates 3 959 | fascinating 3 960 | fascist -2 961 | fascists -2 962 | fatalities -3 963 | fatality -3 964 | fatigue -2 965 | fatigued -2 966 | fatigues -2 967 | fatiguing -2 968 | favor 2 969 | favored 2 970 | favorite 2 971 | favorited 2 972 | favorites 2 973 
| favors 2 974 | fear -2 975 | fearful -2 976 | fearing -2 977 | fearless 2 978 | fearsome -2 979 | fed up -3 980 | feeble -2 981 | feeling 1 982 | felonies -3 983 | felony -3 984 | fervent 2 985 | fervid 2 986 | festive 2 987 | fiasco -3 988 | fidgety -2 989 | fight -1 990 | fine 2 991 | fire -2 992 | fired -2 993 | firing -2 994 | fit 1 995 | fitness 1 996 | flagship 2 997 | flees -1 998 | flop -2 999 | flops -2 1000 | flu -2 1001 | flustered -2 1002 | focused 2 1003 | fond 2 1004 | fondness 2 1005 | fool -2 1006 | foolish -2 1007 | fools -2 1008 | forced -1 1009 | foreclosure -2 1010 | foreclosures -2 1011 | forget -1 1012 | forgetful -2 1013 | forgive 1 1014 | forgiving 1 1015 | forgotten -1 1016 | fortunate 2 1017 | frantic -1 1018 | fraud -4 1019 | frauds -4 1020 | fraudster -4 1021 | fraudsters -4 1022 | fraudulence -4 1023 | fraudulent -4 1024 | free 1 1025 | freedom 2 1026 | frenzy -3 1027 | fresh 1 1028 | friendly 2 1029 | fright -2 1030 | frightened -2 1031 | frightening -3 1032 | frikin -2 1033 | frisky 2 1034 | frowning -1 1035 | frustrate -2 1036 | frustrated -2 1037 | frustrates -2 1038 | frustrating -2 1039 | frustration -2 1040 | ftw 3 1041 | fuck -4 1042 | fucked -4 1043 | fucker -4 1044 | fuckers -4 1045 | fuckface -4 1046 | fuckhead -4 1047 | fucking -4 1048 | fucktard -4 1049 | fud -3 1050 | fuked -4 1051 | fuking -4 1052 | fulfill 2 1053 | fulfilled 2 1054 | fulfills 2 1055 | fuming -2 1056 | fun 4 1057 | funeral -1 1058 | funerals -1 1059 | funky 2 1060 | funnier 4 1061 | funny 4 1062 | furious -3 1063 | futile 2 1064 | gag -2 1065 | gagged -2 1066 | gain 2 1067 | gained 2 1068 | gaining 2 1069 | gains 2 1070 | gallant 3 1071 | gallantly 3 1072 | gallantry 3 1073 | generous 2 1074 | genial 3 1075 | ghost -1 1076 | giddy -2 1077 | gift 2 1078 | glad 3 1079 | glamorous 3 1080 | glamourous 3 1081 | glee 3 1082 | gleeful 3 1083 | gloom -1 1084 | gloomy -2 1085 | glorious 2 1086 | glory 2 1087 | glum -2 1088 | god 1 1089 | goddamn -3 1090 | 
godsend 4 1091 | good 3 1092 | goodness 3 1093 | grace 1 1094 | gracious 3 1095 | grand 3 1096 | grant 1 1097 | granted 1 1098 | granting 1 1099 | grants 1 1100 | grateful 3 1101 | gratification 2 1102 | grave -2 1103 | gray -1 1104 | great 3 1105 | greater 3 1106 | greatest 3 1107 | greed -3 1108 | greedy -2 1109 | green wash -3 1110 | green washing -3 1111 | greenwash -3 1112 | greenwasher -3 1113 | greenwashers -3 1114 | greenwashing -3 1115 | greet 1 1116 | greeted 1 1117 | greeting 1 1118 | greetings 2 1119 | greets 1 1120 | grey -1 1121 | grief -2 1122 | grieved -2 1123 | gross -2 1124 | growing 1 1125 | growth 2 1126 | guarantee 1 1127 | guilt -3 1128 | guilty -3 1129 | gullibility -2 1130 | gullible -2 1131 | gun -1 1132 | ha 2 1133 | hacked -1 1134 | haha 3 1135 | hahaha 3 1136 | hahahah 3 1137 | hail 2 1138 | hailed 2 1139 | hapless -2 1140 | haplessness -2 1141 | happiness 3 1142 | happy 3 1143 | hard -1 1144 | hardier 2 1145 | hardship -2 1146 | hardy 2 1147 | harm -2 1148 | harmed -2 1149 | harmful -2 1150 | harming -2 1151 | harms -2 1152 | harried -2 1153 | harsh -2 1154 | harsher -2 1155 | harshest -2 1156 | hate -3 1157 | hated -3 1158 | haters -3 1159 | hates -3 1160 | hating -3 1161 | haunt -1 1162 | haunted -2 1163 | haunting 1 1164 | haunts -1 1165 | havoc -2 1166 | healthy 2 1167 | heartbreaking -3 1168 | heartbroken -3 1169 | heartfelt 3 1170 | heaven 2 1171 | heavenly 4 1172 | heavyhearted -2 1173 | hell -4 1174 | help 2 1175 | helpful 2 1176 | helping 2 1177 | helpless -2 1178 | helps 2 1179 | hero 2 1180 | heroes 2 1181 | heroic 3 1182 | hesitant -2 1183 | hesitate -2 1184 | hid -1 1185 | hide -1 1186 | hides -1 1187 | hiding -1 1188 | highlight 2 1189 | hilarious 2 1190 | hindrance -2 1191 | hoax -2 1192 | homesick -2 1193 | honest 2 1194 | honor 2 1195 | honored 2 1196 | honoring 2 1197 | honour 2 1198 | honoured 2 1199 | honouring 2 1200 | hooligan -2 1201 | hooliganism -2 1202 | hooligans -2 1203 | hope 2 1204 | hopeful 2 1205 | 
hopefully 2 1206 | hopeless -2 1207 | hopelessness -2 1208 | hopes 2 1209 | hoping 2 1210 | horrendous -3 1211 | horrible -3 1212 | horrific -3 1213 | horrified -3 1214 | hostile -2 1215 | huckster -2 1216 | hug 2 1217 | huge 1 1218 | hugs 2 1219 | humerous 3 1220 | humiliated -3 1221 | humiliation -3 1222 | humor 2 1223 | humorous 2 1224 | humour 2 1225 | humourous 2 1226 | hunger -2 1227 | hurrah 5 1228 | hurt -2 1229 | hurting -2 1230 | hurts -2 1231 | hypocritical -2 1232 | hysteria -3 1233 | hysterical -3 1234 | hysterics -3 1235 | idiot -3 1236 | idiotic -3 1237 | ignorance -2 1238 | ignorant -2 1239 | ignore -1 1240 | ignored -2 1241 | ignores -1 1242 | ill -2 1243 | illegal -3 1244 | illiteracy -2 1245 | illness -2 1246 | illnesses -2 1247 | imbecile -3 1248 | immobilized -1 1249 | immortal 2 1250 | immune 1 1251 | impatient -2 1252 | imperfect -2 1253 | importance 2 1254 | important 2 1255 | impose -1 1256 | imposed -1 1257 | imposes -1 1258 | imposing -1 1259 | impotent -2 1260 | impress 3 1261 | impressed 3 1262 | impresses 3 1263 | impressive 3 1264 | imprisoned -2 1265 | improve 2 1266 | improved 2 1267 | improvement 2 1268 | improves 2 1269 | improving 2 1270 | inability -2 1271 | inaction -2 1272 | inadequate -2 1273 | incapable -2 1274 | incapacitated -2 1275 | incensed -2 1276 | incompetence -2 1277 | incompetent -2 1278 | inconsiderate -2 1279 | inconvenience -2 1280 | inconvenient -2 1281 | increase 1 1282 | increased 1 1283 | indecisive -2 1284 | indestructible 2 1285 | indifference -2 1286 | indifferent -2 1287 | indignant -2 1288 | indignation -2 1289 | indoctrinate -2 1290 | indoctrinated -2 1291 | indoctrinates -2 1292 | indoctrinating -2 1293 | ineffective -2 1294 | ineffectively -2 1295 | infatuated 2 1296 | infatuation 2 1297 | infected -2 1298 | inferior -2 1299 | inflamed -2 1300 | influential 2 1301 | infringement -2 1302 | infuriate -2 1303 | infuriated -2 1304 | infuriates -2 1305 | infuriating -2 1306 | inhibit -1 1307 | injured -2 
1308 | injury -2 1309 | injustice -2 1310 | innovate 1 1311 | innovates 1 1312 | innovation 1 1313 | innovative 2 1314 | inquisition -2 1315 | inquisitive 2 1316 | insane -2 1317 | insanity -2 1318 | insecure -2 1319 | insensitive -2 1320 | insensitivity -2 1321 | insignificant -2 1322 | insipid -2 1323 | inspiration 2 1324 | inspirational 2 1325 | inspire 2 1326 | inspired 2 1327 | inspires 2 1328 | inspiring 3 1329 | insult -2 1330 | insulted -2 1331 | insulting -2 1332 | insults -2 1333 | intact 2 1334 | integrity 2 1335 | intelligent 2 1336 | intense 1 1337 | interest 1 1338 | interested 2 1339 | interesting 2 1340 | interests 1 1341 | interrogated -2 1342 | interrupt -2 1343 | interrupted -2 1344 | interrupting -2 1345 | interruption -2 1346 | interrupts -2 1347 | intimidate -2 1348 | intimidated -2 1349 | intimidates -2 1350 | intimidating -2 1351 | intimidation -2 1352 | intricate 2 1353 | intrigues 1 1354 | invincible 2 1355 | invite 1 1356 | inviting 1 1357 | invulnerable 2 1358 | irate -3 1359 | ironic -1 1360 | irony -1 1361 | irrational -1 1362 | irresistible 2 1363 | irresolute -2 1364 | irresponsible 2 1365 | irreversible -1 1366 | irritate -3 1367 | irritated -3 1368 | irritating -3 1369 | isolated -1 1370 | itchy -2 1371 | jackass -4 1372 | jackasses -4 1373 | jailed -2 1374 | jaunty 2 1375 | jealous -2 1376 | jeopardy -2 1377 | jerk -3 1378 | jesus 1 1379 | jewel 1 1380 | jewels 1 1381 | jocular 2 1382 | join 1 1383 | joke 2 1384 | jokes 2 1385 | jolly 2 1386 | jovial 2 1387 | joy 3 1388 | joyful 3 1389 | joyfully 3 1390 | joyless -2 1391 | joyous 3 1392 | jubilant 3 1393 | jumpy -1 1394 | justice 2 1395 | justifiably 2 1396 | justified 2 1397 | keen 1 1398 | kill -3 1399 | killed -3 1400 | killing -3 1401 | kills -3 1402 | kind 2 1403 | kinder 2 1404 | kiss 2 1405 | kudos 3 1406 | lack -2 1407 | lackadaisical -2 1408 | lag -1 1409 | lagged -2 1410 | lagging -2 1411 | lags -2 1412 | lame -2 1413 | landmark 2 1414 | laugh 1 1415 | laughed 1 1416 | 
laughing 1 1417 | laughs 1 1418 | laughting 1 1419 | launched 1 1420 | lawl 3 1421 | lawsuit -2 1422 | lawsuits -2 1423 | lazy -1 1424 | leak -1 1425 | leaked -1 1426 | leave -1 1427 | legal 1 1428 | legally 1 1429 | lenient 1 1430 | lethargic -2 1431 | lethargy -2 1432 | liar -3 1433 | liars -3 1434 | libelous -2 1435 | lied -2 1436 | lifesaver 4 1437 | lighthearted 1 1438 | like 2 1439 | liked 2 1440 | likes 2 1441 | limitation -1 1442 | limited -1 1443 | limits -1 1444 | litigation -1 1445 | litigious -2 1446 | lively 2 1447 | livid -2 1448 | lmao 4 1449 | lmfao 4 1450 | loathe -3 1451 | loathed -3 1452 | loathes -3 1453 | loathing -3 1454 | lobby -2 1455 | lobbying -2 1456 | lol 3 1457 | lonely -2 1458 | lonesome -2 1459 | longing -1 1460 | loom -1 1461 | loomed -1 1462 | looming -1 1463 | looms -1 1464 | loose -3 1465 | looses -3 1466 | loser -3 1467 | losing -3 1468 | loss -3 1469 | lost -3 1470 | lovable 3 1471 | love 3 1472 | loved 3 1473 | lovelies 3 1474 | lovely 3 1475 | loving 2 1476 | lowest -1 1477 | loyal 3 1478 | loyalty 3 1479 | luck 3 1480 | luckily 3 1481 | lucky 3 1482 | lugubrious -2 1483 | lunatic -3 1484 | lunatics -3 1485 | lurk -1 1486 | lurking -1 1487 | lurks -1 1488 | mad -3 1489 | maddening -3 1490 | made-up -1 1491 | madly -3 1492 | madness -3 1493 | mandatory -1 1494 | manipulated -1 1495 | manipulating -1 1496 | manipulation -1 1497 | marvel 3 1498 | marvelous 3 1499 | marvels 3 1500 | masterpiece 4 1501 | masterpieces 4 1502 | matter 1 1503 | matters 1 1504 | mature 2 1505 | meaningful 2 1506 | meaningless -2 1507 | medal 3 1508 | mediocrity -3 1509 | meditative 1 1510 | melancholy -2 1511 | menace -2 1512 | menaced -2 1513 | mercy 2 1514 | merry 3 1515 | mess -2 1516 | messed -2 1517 | messing up -2 1518 | methodical 2 1519 | mindless -2 1520 | miracle 4 1521 | mirth 3 1522 | mirthful 3 1523 | mirthfully 3 1524 | misbehave -2 1525 | misbehaved -2 1526 | misbehaves -2 1527 | misbehaving -2 1528 | mischief -1 1529 | mischiefs -1 1530 
| miserable -3 1531 | misery -2 1532 | misgiving -2 1533 | misinformation -2 1534 | misinformed -2 1535 | misinterpreted -2 1536 | misleading -3 1537 | misread -1 1538 | misreporting -2 1539 | misrepresentation -2 1540 | miss -2 1541 | missed -2 1542 | missing -2 1543 | mistake -2 1544 | mistaken -2 1545 | mistakes -2 1546 | mistaking -2 1547 | misunderstand -2 1548 | misunderstanding -2 1549 | misunderstands -2 1550 | misunderstood -2 1551 | moan -2 1552 | moaned -2 1553 | moaning -2 1554 | moans -2 1555 | mock -2 1556 | mocked -2 1557 | mocking -2 1558 | mocks -2 1559 | mongering -2 1560 | monopolize -2 1561 | monopolized -2 1562 | monopolizes -2 1563 | monopolizing -2 1564 | moody -1 1565 | mope -1 1566 | moping -1 1567 | moron -3 1568 | motherfucker -5 1569 | motherfucking -5 1570 | motivate 1 1571 | motivated 2 1572 | motivating 2 1573 | motivation 1 1574 | mourn -2 1575 | mourned -2 1576 | mournful -2 1577 | mourning -2 1578 | mourns -2 1579 | mumpish -2 1580 | murder -2 1581 | murderer -2 1582 | murdering -3 1583 | murderous -3 1584 | murders -2 1585 | myth -1 1586 | n00b -2 1587 | naive -2 1588 | nasty -3 1589 | natural 1 1590 | naïve -2 1591 | needy -2 1592 | negative -2 1593 | negativity -2 1594 | neglect -2 1595 | neglected -2 1596 | neglecting -2 1597 | neglects -2 1598 | nerves -1 1599 | nervous -2 1600 | nervously -2 1601 | nice 3 1602 | nifty 2 1603 | niggas -5 1604 | nigger -5 1605 | no -1 1606 | no fun -3 1607 | noble 2 1608 | noisy -1 1609 | nonsense -2 1610 | noob -2 1611 | nosey -2 1612 | not good -2 1613 | not working -3 1614 | notorious -2 1615 | novel 2 1616 | numb -1 1617 | nuts -3 1618 | obliterate -2 1619 | obliterated -2 1620 | obnoxious -3 1621 | obscene -2 1622 | obsessed 2 1623 | obsolete -2 1624 | obstacle -2 1625 | obstacles -2 1626 | obstinate -2 1627 | odd -2 1628 | offend -2 1629 | offended -2 1630 | offender -2 1631 | offending -2 1632 | offends -2 1633 | offline -1 1634 | oks 2 1635 | ominous 3 1636 | once-in-a-lifetime 3 1637 | 
opportunities 2 1638 | opportunity 2 1639 | oppressed -2 1640 | oppressive -2 1641 | optimism 2 1642 | optimistic 2 1643 | optionless -2 1644 | outcry -2 1645 | outmaneuvered -2 1646 | outrage -3 1647 | outraged -3 1648 | outreach 2 1649 | outstanding 5 1650 | overjoyed 4 1651 | overload -1 1652 | overlooked -1 1653 | overreact -2 1654 | overreacted -2 1655 | overreaction -2 1656 | overreacts -2 1657 | oversell -2 1658 | overselling -2 1659 | oversells -2 1660 | oversimplification -2 1661 | oversimplified -2 1662 | oversimplifies -2 1663 | oversimplify -2 1664 | overstatement -2 1665 | overstatements -2 1666 | overweight -1 1667 | oxymoron -1 1668 | pain -2 1669 | pained -2 1670 | panic -3 1671 | panicked -3 1672 | panics -3 1673 | paradise 3 1674 | paradox -1 1675 | pardon 2 1676 | pardoned 2 1677 | pardoning 2 1678 | pardons 2 1679 | parley -1 1680 | passionate 2 1681 | passive -1 1682 | passively -1 1683 | pathetic -2 1684 | pay -1 1685 | peace 2 1686 | peaceful 2 1687 | peacefully 2 1688 | penalty -2 1689 | pensive -1 1690 | perfect 3 1691 | perfected 2 1692 | perfectly 3 1693 | perfects 2 1694 | peril -2 1695 | perjury -3 1696 | perpetrator -2 1697 | perpetrators -2 1698 | perplexed -2 1699 | persecute -2 1700 | persecuted -2 1701 | persecutes -2 1702 | persecuting -2 1703 | perturbed -2 1704 | pesky -2 1705 | pessimism -2 1706 | pessimistic -2 1707 | petrified -2 1708 | phobic -2 1709 | picturesque 2 1710 | pileup -1 1711 | pique -2 1712 | piqued -2 1713 | piss -4 1714 | pissed -4 1715 | pissing -3 1716 | piteous -2 1717 | pitied -1 1718 | pity -2 1719 | playful 2 1720 | pleasant 3 1721 | please 1 1722 | pleased 3 1723 | pleasure 3 1724 | poised -2 1725 | poison -2 1726 | poisoned -2 1727 | poisons -2 1728 | pollute -2 1729 | polluted -2 1730 | polluter -2 1731 | polluters -2 1732 | pollutes -2 1733 | poor -2 1734 | poorer -2 1735 | poorest -2 1736 | popular 3 1737 | positive 2 1738 | positively 2 1739 | possessive -2 1740 | postpone -1 1741 | postponed -1 
1742 | postpones -1 1743 | postponing -1 1744 | poverty -1 1745 | powerful 2 1746 | powerless -2 1747 | praise 3 1748 | praised 3 1749 | praises 3 1750 | praising 3 1751 | pray 1 1752 | praying 1 1753 | prays 1 1754 | prblm -2 1755 | prblms -2 1756 | prepared 1 1757 | pressure -1 1758 | pressured -2 1759 | pretend -1 1760 | pretending -1 1761 | pretends -1 1762 | pretty 1 1763 | prevent -1 1764 | prevented -1 1765 | preventing -1 1766 | prevents -1 1767 | prick -5 1768 | prison -2 1769 | prisoner -2 1770 | prisoners -2 1771 | privileged 2 1772 | proactive 2 1773 | problem -2 1774 | problems -2 1775 | profiteer -2 1776 | progress 2 1777 | prominent 2 1778 | promise 1 1779 | promised 1 1780 | promises 1 1781 | promote 1 1782 | promoted 1 1783 | promotes 1 1784 | promoting 1 1785 | propaganda -2 1786 | prosecute -1 1787 | prosecuted -2 1788 | prosecutes -1 1789 | prosecution -1 1790 | prospect 1 1791 | prospects 1 1792 | prosperous 3 1793 | protect 1 1794 | protected 1 1795 | protects 1 1796 | protest -2 1797 | protesters -2 1798 | protesting -2 1799 | protests -2 1800 | proud 2 1801 | proudly 2 1802 | provoke -1 1803 | provoked -1 1804 | provokes -1 1805 | provoking -1 1806 | pseudoscience -3 1807 | punish -2 1808 | punished -2 1809 | punishes -2 1810 | punitive -2 1811 | pushy -1 1812 | puzzled -2 1813 | quaking -2 1814 | questionable -2 1815 | questioned -1 1816 | questioning -1 1817 | racism -3 1818 | racist -3 1819 | racists -3 1820 | rage -2 1821 | rageful -2 1822 | rainy -1 1823 | rant -3 1824 | ranter -3 1825 | ranters -3 1826 | rants -3 1827 | rape -4 1828 | rapist -4 1829 | rapture 2 1830 | raptured 2 1831 | raptures 2 1832 | rapturous 4 1833 | rash -2 1834 | ratified 2 1835 | reach 1 1836 | reached 1 1837 | reaches 1 1838 | reaching 1 1839 | reassure 1 1840 | reassured 1 1841 | reassures 1 1842 | reassuring 2 1843 | rebellion -2 1844 | recession -2 1845 | reckless -2 1846 | recommend 2 1847 | recommended 2 1848 | recommends 2 1849 | redeemed 2 1850 | refuse 
-2 1851 | refused -2 1852 | refusing -2 1853 | regret -2 1854 | regretful -2 1855 | regrets -2 1856 | regretted -2 1857 | regretting -2 1858 | reject -1 1859 | rejected -1 1860 | rejecting -1 1861 | rejects -1 1862 | rejoice 4 1863 | rejoiced 4 1864 | rejoices 4 1865 | rejoicing 4 1866 | relaxed 2 1867 | relentless -1 1868 | reliant 2 1869 | relieve 1 1870 | relieved 2 1871 | relieves 1 1872 | relieving 2 1873 | relishing 2 1874 | remarkable 2 1875 | remorse -2 1876 | repulse -1 1877 | repulsed -2 1878 | rescue 2 1879 | rescued 2 1880 | rescues 2 1881 | resentful -2 1882 | resign -1 1883 | resigned -1 1884 | resigning -1 1885 | resigns -1 1886 | resolute 2 1887 | resolve 2 1888 | resolved 2 1889 | resolves 2 1890 | resolving 2 1891 | respected 2 1892 | responsible 2 1893 | responsive 2 1894 | restful 2 1895 | restless -2 1896 | restore 1 1897 | restored 1 1898 | restores 1 1899 | restoring 1 1900 | restrict -2 1901 | restricted -2 1902 | restricting -2 1903 | restriction -2 1904 | restricts -2 1905 | retained -1 1906 | retard -2 1907 | retarded -2 1908 | retreat -1 1909 | revenge -2 1910 | revengeful -2 1911 | revered 2 1912 | revive 2 1913 | revives 2 1914 | reward 2 1915 | rewarded 2 1916 | rewarding 2 1917 | rewards 2 1918 | rich 2 1919 | ridiculous -3 1920 | rig -1 1921 | rigged -1 1922 | right direction 3 1923 | rigorous 3 1924 | rigorously 3 1925 | riot -2 1926 | riots -2 1927 | risk -2 1928 | risks -2 1929 | rob -2 1930 | robber -2 1931 | robed -2 1932 | robing -2 1933 | robs -2 1934 | robust 2 1935 | rofl 4 1936 | roflcopter 4 1937 | roflmao 4 1938 | romance 2 1939 | rotfl 4 1940 | rotflmfao 4 1941 | rotflol 4 1942 | ruin -2 1943 | ruined -2 1944 | ruining -2 1945 | ruins -2 1946 | sabotage -2 1947 | sad -2 1948 | sadden -2 1949 | saddened -2 1950 | sadly -2 1951 | safe 1 1952 | safely 1 1953 | safety 1 1954 | salient 1 1955 | sappy -1 1956 | sarcastic -2 1957 | satisfied 2 1958 | save 2 1959 | saved 2 1960 | scam -2 1961 | scams -2 1962 | scandal -3 1963 | 
scandalous -3 1964 | scandals -3 1965 | scapegoat -2 1966 | scapegoats -2 1967 | scare -2 1968 | scared -2 1969 | scary -2 1970 | sceptical -2 1971 | scold -2 1972 | scoop 3 1973 | scorn -2 1974 | scornful -2 1975 | scream -2 1976 | screamed -2 1977 | screaming -2 1978 | screams -2 1979 | screwed -2 1980 | screwed up -3 1981 | scumbag -4 1982 | secure 2 1983 | secured 2 1984 | secures 2 1985 | sedition -2 1986 | seditious -2 1987 | seduced -1 1988 | self-confident 2 1989 | self-deluded -2 1990 | selfish -3 1991 | selfishness -3 1992 | sentence -2 1993 | sentenced -2 1994 | sentences -2 1995 | sentencing -2 1996 | serene 2 1997 | severe -2 1998 | sexy 3 1999 | shaky -2 2000 | shame -2 2001 | shamed -2 2002 | shameful -2 2003 | share 1 2004 | shared 1 2005 | shares 1 2006 | shattered -2 2007 | shit -4 2008 | shithead -4 2009 | shitty -3 2010 | shock -2 2011 | shocked -2 2012 | shocking -2 2013 | shocks -2 2014 | shoot -1 2015 | short-sighted -2 2016 | short-sightedness -2 2017 | shortage -2 2018 | shortages -2 2019 | shrew -4 2020 | shy -1 2021 | sick -2 2022 | sigh -2 2023 | significance 1 2024 | significant 1 2025 | silencing -1 2026 | silly -1 2027 | sincere 2 2028 | sincerely 2 2029 | sincerest 2 2030 | sincerity 2 2031 | sinful -3 2032 | singleminded -2 2033 | skeptic -2 2034 | skeptical -2 2035 | skepticism -2 2036 | skeptics -2 2037 | slam -2 2038 | slash -2 2039 | slashed -2 2040 | slashes -2 2041 | slashing -2 2042 | slavery -3 2043 | sleeplessness -2 2044 | slick 2 2045 | slicker 2 2046 | slickest 2 2047 | sluggish -2 2048 | slut -5 2049 | smart 1 2050 | smarter 2 2051 | smartest 2 2052 | smear -2 2053 | smile 2 2054 | smiled 2 2055 | smiles 2 2056 | smiling 2 2057 | smog -2 2058 | sneaky -1 2059 | snub -2 2060 | snubbed -2 2061 | snubbing -2 2062 | snubs -2 2063 | sobering 1 2064 | solemn -1 2065 | solid 2 2066 | solidarity 2 2067 | solution 1 2068 | solutions 1 2069 | solve 1 2070 | solved 1 2071 | solves 1 2072 | solving 1 2073 | somber -2 2074 | some 
kind 0 2075 | son-of-a-bitch -5 2076 | soothe 3 2077 | soothed 3 2078 | soothing 3 2079 | sophisticated 2 2080 | sore -1 2081 | sorrow -2 2082 | sorrowful -2 2083 | sorry -1 2084 | spam -2 2085 | spammer -3 2086 | spammers -3 2087 | spamming -2 2088 | spark 1 2089 | sparkle 3 2090 | sparkles 3 2091 | sparkling 3 2092 | speculative -2 2093 | spirit 1 2094 | spirited 2 2095 | spiritless -2 2096 | spiteful -2 2097 | splendid 3 2098 | sprightly 2 2099 | squelched -1 2100 | stab -2 2101 | stabbed -2 2102 | stable 2 2103 | stabs -2 2104 | stall -2 2105 | stalled -2 2106 | stalling -2 2107 | stamina 2 2108 | stampede -2 2109 | startled -2 2110 | starve -2 2111 | starved -2 2112 | starves -2 2113 | starving -2 2114 | steadfast 2 2115 | steal -2 2116 | steals -2 2117 | stereotype -2 2118 | stereotyped -2 2119 | stifled -1 2120 | stimulate 1 2121 | stimulated 1 2122 | stimulates 1 2123 | stimulating 2 2124 | stingy -2 2125 | stolen -2 2126 | stop -1 2127 | stopped -1 2128 | stopping -1 2129 | stops -1 2130 | stout 2 2131 | straight 1 2132 | strange -1 2133 | strangely -1 2134 | strangled -2 2135 | strength 2 2136 | strengthen 2 2137 | strengthened 2 2138 | strengthening 2 2139 | strengthens 2 2140 | stressed -2 2141 | stressor -2 2142 | stressors -2 2143 | stricken -2 2144 | strike -1 2145 | strikers -2 2146 | strikes -1 2147 | strong 2 2148 | stronger 2 2149 | strongest 2 2150 | struck -1 2151 | struggle -2 2152 | struggled -2 2153 | struggles -2 2154 | struggling -2 2155 | stubborn -2 2156 | stuck -2 2157 | stunned -2 2158 | stunning 4 2159 | stupid -2 2160 | stupidly -2 2161 | suave 2 2162 | substantial 1 2163 | substantially 1 2164 | subversive -2 2165 | success 2 2166 | successful 3 2167 | suck -3 2168 | sucks -3 2169 | suffer -2 2170 | suffering -2 2171 | suffers -2 2172 | suicidal -2 2173 | suicide -2 2174 | suing -2 2175 | sulking -2 2176 | sulky -2 2177 | sullen -2 2178 | sunshine 2 2179 | super 3 2180 | superb 5 2181 | superior 2 2182 | support 2 2183 | supported 2 
2184 | supporter 1 2185 | supporters 1 2186 | supporting 1 2187 | supportive 2 2188 | supports 2 2189 | survived 2 2190 | surviving 2 2191 | survivor 2 2192 | suspect -1 2193 | suspected -1 2194 | suspecting -1 2195 | suspects -1 2196 | suspend -1 2197 | suspended -1 2198 | suspicious -2 2199 | swear -2 2200 | swearing -2 2201 | swears -2 2202 | sweet 2 2203 | swift 2 2204 | swiftly 2 2205 | swindle -3 2206 | swindles -3 2207 | swindling -3 2208 | sympathetic 2 2209 | sympathy 2 2210 | tard -2 2211 | tears -2 2212 | tender 2 2213 | tense -2 2214 | tension -1 2215 | terrible -3 2216 | terribly -3 2217 | terrific 4 2218 | terrified -3 2219 | terror -3 2220 | terrorize -3 2221 | terrorized -3 2222 | terrorizes -3 2223 | thank 2 2224 | thankful 2 2225 | thanks 2 2226 | thorny -2 2227 | thoughtful 2 2228 | thoughtless -2 2229 | threat -2 2230 | threaten -2 2231 | threatened -2 2232 | threatening -2 2233 | threatens -2 2234 | threats -2 2235 | thrilled 5 2236 | thwart -2 2237 | thwarted -2 2238 | thwarting -2 2239 | thwarts -2 2240 | timid -2 2241 | timorous -2 2242 | tired -2 2243 | tits -2 2244 | tolerant 2 2245 | toothless -2 2246 | top 2 2247 | tops 2 2248 | torn -2 2249 | torture -4 2250 | tortured -4 2251 | tortures -4 2252 | torturing -4 2253 | totalitarian -2 2254 | totalitarianism -2 2255 | tout -2 2256 | touted -2 2257 | touting -2 2258 | touts -2 2259 | tragedy -2 2260 | tragic -2 2261 | tranquil 2 2262 | trap -1 2263 | trapped -2 2264 | trauma -3 2265 | traumatic -3 2266 | travesty -2 2267 | treason -3 2268 | treasonous -3 2269 | treasure 2 2270 | treasures 2 2271 | trembling -2 2272 | tremulous -2 2273 | tricked -2 2274 | trickery -2 2275 | triumph 4 2276 | triumphant 4 2277 | trouble -2 2278 | troubled -2 2279 | troubles -2 2280 | true 2 2281 | trust 1 2282 | trusted 2 2283 | tumor -2 2284 | twat -5 2285 | ugly -3 2286 | unacceptable -2 2287 | unappreciated -2 2288 | unapproved -2 2289 | unaware -2 2290 | unbelievable -1 2291 | unbelieving -1 2292 | 
unbiased 2 2293 | uncertain -1 2294 | unclear -1 2295 | uncomfortable -2 2296 | unconcerned -2 2297 | unconfirmed -1 2298 | unconvinced -1 2299 | uncredited -1 2300 | undecided -1 2301 | underestimate -1 2302 | underestimated -1 2303 | underestimates -1 2304 | underestimating -1 2305 | undermine -2 2306 | undermined -2 2307 | undermines -2 2308 | undermining -2 2309 | undeserving -2 2310 | undesirable -2 2311 | uneasy -2 2312 | unemployment -2 2313 | unequal -1 2314 | unequaled 2 2315 | unethical -2 2316 | unfair -2 2317 | unfocused -2 2318 | unfulfilled -2 2319 | unhappy -2 2320 | unhealthy -2 2321 | unified 1 2322 | unimpressed -2 2323 | unintelligent -2 2324 | united 1 2325 | unjust -2 2326 | unlovable -2 2327 | unloved -2 2328 | unmatched 1 2329 | unmotivated -2 2330 | unprofessional -2 2331 | unresearched -2 2332 | unsatisfied -2 2333 | unsecured -2 2334 | unsettled -1 2335 | unsophisticated -2 2336 | unstable -2 2337 | unstoppable 2 2338 | unsupported -2 2339 | unsure -1 2340 | untarnished 2 2341 | unwanted -2 2342 | unworthy -2 2343 | upset -2 2344 | upsets -2 2345 | upsetting -2 2346 | uptight -2 2347 | urgent -1 2348 | useful 2 2349 | usefulness 2 2350 | useless -2 2351 | uselessness -2 2352 | vague -2 2353 | validate 1 2354 | validated 1 2355 | validates 1 2356 | validating 1 2357 | verdict -1 2358 | verdicts -1 2359 | vested 1 2360 | vexation -2 2361 | vexing -2 2362 | vibrant 3 2363 | vicious -2 2364 | victim -3 2365 | victimize -3 2366 | victimized -3 2367 | victimizes -3 2368 | victimizing -3 2369 | victims -3 2370 | vigilant 3 2371 | vile -3 2372 | vindicate 2 2373 | vindicated 2 2374 | vindicates 2 2375 | vindicating 2 2376 | violate -2 2377 | violated -2 2378 | violates -2 2379 | violating -2 2380 | violence -3 2381 | violent -3 2382 | virtuous 2 2383 | virulent -2 2384 | vision 1 2385 | visionary 3 2386 | visioning 1 2387 | visions 1 2388 | vitality 3 2389 | vitamin 1 2390 | vitriolic -3 2391 | vivacious 3 2392 | vociferous -1 2393 | vulnerability 
-2 2394 | vulnerable -2 2395 | walkout -2 2396 | walkouts -2 2397 | wanker -3 2398 | want 1 2399 | war -2 2400 | warfare -2 2401 | warm 1 2402 | warmth 2 2403 | warn -2 2404 | warned -2 2405 | warning -3 2406 | warnings -3 2407 | warns -2 2408 | waste -1 2409 | wasted -2 2410 | wasting -2 2411 | wavering -1 2412 | weak -2 2413 | weakness -2 2414 | wealth 3 2415 | wealthy 2 2416 | weary -2 2417 | weep -2 2418 | weeping -2 2419 | weird -2 2420 | welcome 2 2421 | welcomed 2 2422 | welcomes 2 2423 | whimsical 1 2424 | whitewash -3 2425 | whore -4 2426 | wicked -2 2427 | widowed -1 2428 | willingness 2 2429 | win 4 2430 | winner 4 2431 | winning 4 2432 | wins 4 2433 | winwin 3 2434 | wish 1 2435 | wishes 1 2436 | wishing 1 2437 | withdrawal -3 2438 | woebegone -2 2439 | woeful -3 2440 | won 3 2441 | wonderful 4 2442 | woo 3 2443 | woohoo 3 2444 | wooo 4 2445 | woow 4 2446 | worn -1 2447 | worried -3 2448 | worry -3 2449 | worrying -3 2450 | worse -3 2451 | worsen -3 2452 | worsened -3 2453 | worsening -3 2454 | worsens -3 2455 | worshiped 3 2456 | worst -3 2457 | worth 2 2458 | worthless -2 2459 | worthy 2 2460 | wow 4 2461 | wowow 4 2462 | wowww 4 2463 | wrathful -3 2464 | wreck -2 2465 | wrong -2 2466 | wronged -2 2467 | wtf -4 2468 | yeah 1 2469 | yearning 1 2470 | yeees 2 2471 | yes 1 2472 | youthful 2 2473 | yucky -2 2474 | yummy 3 2475 | zealot -2 2476 | zealots -2 2477 | zealous 2 -------------------------------------------------------------------------------- /AFINN-README.txt: -------------------------------------------------------------------------------- 1 | AFINN is a list of English words rated for valence with an integer 2 | between minus five (negative) and plus five (positive). The words have 3 | been manually labeled by Finn Årup Nielsen in 2009-2011. The file 4 | is tab-separated. There are two versions: 5 | 6 | AFINN-111: Newest version with 2477 words and phrases. 7 | 8 | AFINN-96: 1468 unique words and phrases on 1480 lines. 
Note that there 9 | are 1480 lines, as some words are listed twice. The word list is not 10 | entirely in alphabetical order. 11 | 12 | An evaluation of the word list is available in: 13 | 14 | 15 | The list was used in: 16 | 17 | Lars Kai Hansen, Adam Arvidsson, Finn Årup Nielsen, Elanor Colleoni, 18 | Michael Etter, "Good Friends, Bad News - Affect and Virality in 19 | Twitter", The 2011 International Workshop on Social Computing, 20 | Network, and Services (SocialComNet 2011). 21 | 22 | 23 | This database of words is copyright protected and distributed under 24 | "Open Database License (ODbL) v1.0" 25 | http://www.opendatacommons.org/licenses/odbl/1.0/ or a similar 26 | copyleft license. 27 | -------------------------------------------------------------------------------- /BDA_Senti_ipython/.ipynb_checkpoints/Input_Output_Functions-checkpoint.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "metadata": { 3 | "name": "" 4 | }, 5 | "nbformat": 3, 6 | "nbformat_minor": 0, 7 | "worksheets": [ 8 | { 9 | "cells": [ 10 | { 11 | "cell_type": "code", 12 | "collapsed": false, 13 | "input": [ 14 | "import sys\n", 15 | "import csv\n", 16 | "import nltk\n", 17 | "from nltk.corpus import wordnet\n", 18 | "import re\n", 19 | "import codecs\n", 20 | "import pprint\n", 21 | "import nltk\n", 22 | "import requests\n", 23 | "import math\n", 24 | "import urllib2\n", 25 | "\n", 26 | "from apiclient.discovery import build" 27 | ], 28 | "language": "python", 29 | "metadata": {}, 30 | "outputs": [], 31 | "prompt_number": 1 32 | }, 33 | { 34 | "cell_type": "code", 35 | "collapsed": false, 36 | "input": [ 37 | "class SentiWordNetCorpusReader:\n", 38 | " def __init__(self, filename):\n", 39 | " \"\"\"\n", 40 | " Argument:\n", 41 | " filename -- the name of the text file containing the\n", 42 | " SentiWordNet database\n", 43 | " \"\"\" \n", 44 | " self.filename = filename\n", 45 | " self.db = {}\n", 46 | " self.parse_src_file()\n", 47 | 
"\n", 48 | " def parse_src_file(self):\n", 49 | " lines = codecs.open(self.filename, \"r\", \"utf8\").read().splitlines()\n", 50 | " lines = filter((lambda x : not re.search(r\"^\\s*#\", x)), lines)\n", 51 | " for i, line in enumerate(lines):\n", 52 | " fields = re.split(r\"\\t+\", line)\n", 53 | " fields = map(unicode.strip, fields)\n", 54 | " try: \n", 55 | " pos, offset, pos_score, neg_score, synset_terms, gloss = fields\n", 56 | " except:\n", 57 | " sys.stderr.write(\"Line %s formatted incorrectly: %s\\n\" % (i, line))\n", 58 | " if pos and offset:\n", 59 | " offset = int(offset)\n", 60 | " self.db[(pos, offset)] = (float(pos_score), float(neg_score))\n", 61 | "\n", 62 | " def senti_synset(self, *vals): \n", 63 | " if tuple(vals) in self.db:\n", 64 | " pos_score, neg_score = self.db[tuple(vals)]\n", 65 | " pos, offset = vals\n", 66 | " synset = wordnet._synset_from_pos_and_offset(pos, offset)\n", 67 | " return SentiSynset(pos_score, neg_score, synset)\n", 68 | " else:\n", 69 | " synset = wordnet.synset(vals[0])\n", 70 | " pos = synset.pos\n", 71 | " offset = synset.offset\n", 72 | " if (pos, offset) in self.db:\n", 73 | " pos_score, neg_score = self.db[(pos, offset)]\n", 74 | " return SentiSynset(pos_score, neg_score, synset)\n", 75 | " else:\n", 76 | " return None\n", 77 | "\n", 78 | " def senti_synsets(self, string, pos=None):\n", 79 | " sentis = []\n", 80 | " synset_list = wordnet.synsets(string, pos)\n", 81 | " for synset in synset_list:\n", 82 | " sentis.append(self.senti_synset(synset.name))\n", 83 | " sentis = filter(lambda x : x, sentis)\n", 84 | " return sentis\n", 85 | "\n", 86 | " def all_senti_synsets(self):\n", 87 | " for key, fields in self.db.iteritems():\n", 88 | " pos, offset = key\n", 89 | " pos_score, neg_score = fields\n", 90 | " synset = wordnet._synset_from_pos_and_offset(pos, offset)\n", 91 | " yield SentiSynset(pos_score, neg_score, synset)\n", 92 | "\n", 93 | "class SentiSynset:\n", 94 | " def __init__(self, pos_score, neg_score, 
synset):\n", 95 | " self.pos_score = pos_score\n", 96 | " self.neg_score = neg_score\n", 97 | " self.obj_score = 1.0 - (self.pos_score + self.neg_score)\n", 98 | " self.synset = synset\n", 99 | "\n", 100 | " def __str__(self):\n", 101 | " \"\"\"Prints just the Pos/Neg scores for now.\"\"\"\n", 102 | " s = \"\"\n", 103 | " s += self.synset.name + \"\\t\"\n", 104 | " s += \"PosScore: %s\\t\" % self.pos_score \n", 105 | " s += \"NegScore: %s\" % self.neg_score\n", 106 | " return s\n", 107 | "\n", 108 | " def __repr__(self):\n", 109 | " return \"Senti\" + repr(self.synset)\n", 110 | "\n", 111 | "def tweet_dict(twitterData): \n", 112 | " ''' (file path) -> list of strings\n", 113 | " Reads the given .csv file and returns a list\n", 114 | " holding the first column of every row.\n", 115 | " '''\n", 116 | " twitter_list_dict = [] \n", 117 | " twitterfile = open(twitterData)\n", 118 | " twitterreader = csv.reader(twitterfile)\n", 119 | " for line in twitterreader:\n", 120 | " twitter_list_dict.append(line[0])\n", 121 | " return twitter_list_dict\n", 122 | "\n", 123 | "# Return True if a string is a stopword\n", 124 | "def is_stopword(string):\n", 125 | " if string.lower() in nltk.corpus.stopwords.words('english'):\n", 126 | " return True\n", 127 | " else:\n", 128 | " return False\n", 129 | "\n", 130 | "# Return True if a string is punctuation (contains no letters or digits)\n", 131 | "def is_punctuation(string):\n", 132 | " for char in string:\n", 133 | " if char.isalpha() or char.isdigit():\n", 134 | " return False\n", 135 | " return True\n", 136 | "\n", 137 | "# Translation from NLTK to WordNet (word tags) (code)\n", 138 | "def wordnet_pos_code(tag):\n", 139 | " if tag.startswith('NN'):\n", 140 | " return wordnet.NOUN\n", 141 | " elif tag.startswith('VB'):\n", 142 | " return wordnet.VERB\n", 143 | " elif tag.startswith('JJ'):\n", 144 | " return wordnet.ADJ\n", 145 | " elif tag.startswith('RB'):\n", 146 | " return wordnet.ADV\n", 147 | " else:\n", 148 | " return ''\n", 149 | "\n", 150 | " \n", 151 | "# 
Translation from NLTK to WordNet (word tags) (label)\n", 152 | "def wordnet_pos_label(tag):\n", 153 | " if tag.startswith('NN'):\n", 154 | " return \"Noun\"\n", 155 | " elif tag.startswith('VB'):\n", 156 | " return \"Verb\"\n", 157 | " elif tag.startswith('JJ'):\n", 158 | " return \"Adjective\"\n", 159 | " elif tag.startswith('RB'):\n", 160 | " return \"Adverb\"\n", 161 | " else:\n", 162 | " return tag\n", 163 | " \n", 164 | "\n", 165 | "\"\"\" input -> a sentence \n", 166 | " output -> the sentence, with each word enriched with -> lemma, wordnet_pos, wordnet_definitions \n", 167 | "\n", 168 | "\"\"\"\n", 169 | "def wordnet_definitions(sentence):\n", 170 | " wnl = nltk.WordNetLemmatizer()\n", 171 | " for token in sentence:\n", 172 | " word = token['word']\n", 173 | " wn_pos = wordnet_pos_code(token['pos'])\n", 174 | " if is_punctuation(word):\n", 175 | " token['punct'] = True\n", 176 | " elif is_stopword(word):\n", 177 | " pass\n", 178 | " elif len(wordnet.synsets(word, wn_pos)) > 0:\n", 179 | " token['wn_lemma'] = wnl.lemmatize(word.lower())\n", 180 | " token['wn_pos'] = wordnet_pos_label(token['pos'])\n", 181 | " defs = [sense.definition for sense in wordnet.synsets(word, wn_pos)]\n", 182 | " token['wn_def'] = \"; \\n\".join(defs) \n", 183 | " else:\n", 184 | " pass\n", 185 | " return sentence\n", 186 | "\n", 187 | "\n", 188 | "# Tokenization\n", 189 | "\n", 190 | "def tag_tweet(tweet): \n", 191 | " sents = nltk.sent_tokenize(tweet)\n", 192 | " sentence = []\n", 193 | " for sent in sents:\n", 194 | " tokens = nltk.word_tokenize(sent)\n", 195 | " tag_tuples = nltk.pos_tag(tokens)\n", 196 | " for (string, tag) in tag_tuples:\n", 197 | " token = {'word':string, 'pos':tag} \n", 198 | " sentence.append(token) \n", 199 | " return sentence\n", 200 | "\n", 201 | "\n", 202 | "\n", 203 | "# Word-sense disambiguation (WSD)\n", 204 | "\n", 205 | "def word_sense_disambiguate(word, wn_pos, tweet):\n", 206 | " senses = wordnet.synsets(word, wn_pos)\n", 207 | " if len(senses) > 0:\n", 208 | " cfd = 
nltk.ConditionalFreqDist(\n", 209 | " (sense, def_word)\n", 210 | " for sense in senses\n", 211 | " for def_word in sense.definition.split()\n", 212 | " if def_word in tweet)\n", 213 | " best_sense = senses[0] # start with first sense\n", 214 | " for sense in senses:\n", 215 | " try:\n", 216 | " if cfd[sense].max() > cfd[best_sense].max():\n", 217 | " best_sense = sense\n", 218 | " except: \n", 219 | " pass \n", 220 | " return best_sense\n", 221 | " else:\n", 222 | " return None\n", 223 | " \n" 224 | ], 225 | "language": "python", 226 | "metadata": {}, 227 | "outputs": [], 228 | "prompt_number": 2 229 | }, 230 | { 231 | "cell_type": "code", 232 | "collapsed": false, 233 | "input": [ 234 | "sentiment = SentiWordNetCorpusReader(\"SentiWordNet_3.0.0_20130122.txt\")" 235 | ], 236 | "language": "python", 237 | "metadata": {}, 238 | "outputs": [], 239 | "prompt_number": 3 240 | }, 241 | { 242 | "cell_type": "code", 243 | "collapsed": false, 244 | "input": [ 245 | "\n", 246 | "revfile = open(\"les.txt\")\n", 247 | "\n", 248 | "review = revfile.read()\n", 249 | "\n", 250 | "review" 251 | ], 252 | "language": "python", 253 | "metadata": {}, 254 | "outputs": [ 255 | { 256 | "metadata": {}, 257 | "output_type": "pyout", 258 | "prompt_number": 4, 259 | "text": [ 260 | "\"This is an example of great writing. The style, pace, characters, descriptions, settings, action--all of it. It's great. I read this book in a literature class and really enjoyed it. It is perhaps the best classic I have read besides Tolkien's The Hobbit. Treasure Island manages to stand alone as the go-to book on pirates, sea voyages, and young men seeking adventure. Before anyone decides to read children's or YA literature, they must first read Treasure Island to get an idea of what a good book should contain.\\nQuestion: Why haven't they made a modern movie of this? 
It would be great cinema.\"" 261 | ] 262 | } 263 | ], 264 | "prompt_number": 4 265 | }, 266 | { 267 | "cell_type": "code", 268 | "collapsed": false, 269 | "input": [ 270 | "a = tag_tweet(review)\n", 271 | "\n", 272 | "a" 273 | ], 274 | "language": "python", 275 | "metadata": {}, 276 | "outputs": [ 277 | { 278 | "metadata": {}, 279 | "output_type": "pyout", 280 | "prompt_number": 5, 281 | "text": [ 282 | "[{'pos': 'DT', 'word': 'This'},\n", 283 | " {'pos': 'VBZ', 'word': 'is'},\n", 284 | " {'pos': 'DT', 'word': 'an'},\n", 285 | " {'pos': 'NN', 'word': 'example'},\n", 286 | " {'pos': 'IN', 'word': 'of'},\n", 287 | " {'pos': 'JJ', 'word': 'great'},\n", 288 | " {'pos': 'NN', 'word': 'writing'},\n", 289 | " {'pos': '.', 'word': '.'},\n", 290 | " {'pos': 'DT', 'word': 'The'},\n", 291 | " {'pos': 'NN', 'word': 'style'},\n", 292 | " {'pos': ',', 'word': ','},\n", 293 | " {'pos': 'NN', 'word': 'pace'},\n", 294 | " {'pos': ',', 'word': ','},\n", 295 | " {'pos': 'NNS', 'word': 'characters'},\n", 296 | " {'pos': ',', 'word': ','},\n", 297 | " {'pos': 'NNS', 'word': 'descriptions'},\n", 298 | " {'pos': ',', 'word': ','},\n", 299 | " {'pos': 'NNS', 'word': 'settings'},\n", 300 | " {'pos': ',', 'word': ','},\n", 301 | " {'pos': 'NN', 'word': 'action'},\n", 302 | " {'pos': ':', 'word': '--'},\n", 303 | " {'pos': 'DT', 'word': 'all'},\n", 304 | " {'pos': 'IN', 'word': 'of'},\n", 305 | " {'pos': 'PRP', 'word': 'it'},\n", 306 | " {'pos': '.', 'word': '.'},\n", 307 | " {'pos': 'PRP', 'word': 'It'},\n", 308 | " {'pos': 'VBZ', 'word': \"'s\"},\n", 309 | " {'pos': 'JJ', 'word': 'great'},\n", 310 | " {'pos': '.', 'word': '.'},\n", 311 | " {'pos': 'PRP', 'word': 'I'},\n", 312 | " {'pos': 'VBP', 'word': 'read'},\n", 313 | " {'pos': 'DT', 'word': 'this'},\n", 314 | " {'pos': 'NN', 'word': 'book'},\n", 315 | " {'pos': 'IN', 'word': 'in'},\n", 316 | " {'pos': 'DT', 'word': 'a'},\n", 317 | " {'pos': 'NN', 'word': 'literature'},\n", 318 | " {'pos': 'NN', 'word': 'class'},\n", 319 | " {'pos': 
'CC', 'word': 'and'},\n", 320 | " {'pos': 'RB', 'word': 'really'},\n", 321 | " {'pos': 'VBN', 'word': 'enjoyed'},\n", 322 | " {'pos': 'PRP', 'word': 'it'},\n", 323 | " {'pos': '.', 'word': '.'},\n", 324 | " {'pos': 'PRP', 'word': 'It'},\n", 325 | " {'pos': 'VBZ', 'word': 'is'},\n", 326 | " {'pos': 'RB', 'word': 'perhaps'},\n", 327 | " {'pos': 'DT', 'word': 'the'},\n", 328 | " {'pos': 'JJS', 'word': 'best'},\n", 329 | " {'pos': 'JJ', 'word': 'classic'},\n", 330 | " {'pos': 'PRP', 'word': 'I'},\n", 331 | " {'pos': 'VBP', 'word': 'have'},\n", 332 | " {'pos': 'VBN', 'word': 'read'},\n", 333 | " {'pos': 'NNS', 'word': 'besides'},\n", 334 | " {'pos': 'NNP', 'word': 'Tolkien'},\n", 335 | " {'pos': 'POS', 'word': \"'s\"},\n", 336 | " {'pos': 'NNP', 'word': 'The'},\n", 337 | " {'pos': 'NNP', 'word': 'Hobbit'},\n", 338 | " {'pos': '.', 'word': '.'},\n", 339 | " {'pos': 'NN', 'word': 'Treasure'},\n", 340 | " {'pos': 'NNP', 'word': 'Island'},\n", 341 | " {'pos': 'NNS', 'word': 'manages'},\n", 342 | " {'pos': 'TO', 'word': 'to'},\n", 343 | " {'pos': 'VB', 'word': 'stand'},\n", 344 | " {'pos': 'RB', 'word': 'alone'},\n", 345 | " {'pos': 'IN', 'word': 'as'},\n", 346 | " {'pos': 'DT', 'word': 'the'},\n", 347 | " {'pos': 'NNP', 'word': 'go-to'},\n", 348 | " {'pos': 'NN', 'word': 'book'},\n", 349 | " {'pos': 'IN', 'word': 'on'},\n", 350 | " {'pos': 'NNS', 'word': 'pirates'},\n", 351 | " {'pos': ',', 'word': ','},\n", 352 | " {'pos': 'NN', 'word': 'sea'},\n", 353 | " {'pos': 'NNS', 'word': 'voyages'},\n", 354 | " {'pos': ',', 'word': ','},\n", 355 | " {'pos': 'CC', 'word': 'and'},\n", 356 | " {'pos': 'JJ', 'word': 'young'},\n", 357 | " {'pos': 'NNS', 'word': 'men'},\n", 358 | " {'pos': 'VBG', 'word': 'seeking'},\n", 359 | " {'pos': 'NN', 'word': 'adventure'},\n", 360 | " {'pos': '.', 'word': '.'},\n", 361 | " {'pos': 'IN', 'word': 'Before'},\n", 362 | " {'pos': 'NN', 'word': 'anyone'},\n", 363 | " {'pos': 'NNS', 'word': 'decides'},\n", 364 | " {'pos': 'TO', 'word': 'to'},\n", 365 | " 
{'pos': 'VB', 'word': 'read'},\n", 366 | " {'pos': 'NNS', 'word': 'children'},\n", 367 | " {'pos': 'POS', 'word': \"'s\"},\n", 368 | " {'pos': 'CC', 'word': 'or'},\n", 369 | " {'pos': 'NNP', 'word': 'YA'},\n", 370 | " {'pos': 'NN', 'word': 'literature'},\n", 371 | " {'pos': ',', 'word': ','},\n", 372 | " {'pos': 'PRP', 'word': 'they'},\n", 373 | " {'pos': 'MD', 'word': 'must'},\n", 374 | " {'pos': 'RB', 'word': 'first'},\n", 375 | " {'pos': 'VB', 'word': 'read'},\n", 376 | " {'pos': 'NNP', 'word': 'Treasure'},\n", 377 | " {'pos': 'NNP', 'word': 'Island'},\n", 378 | " {'pos': 'TO', 'word': 'to'},\n", 379 | " {'pos': 'VB', 'word': 'get'},\n", 380 | " {'pos': 'DT', 'word': 'an'},\n", 381 | " {'pos': 'NN', 'word': 'idea'},\n", 382 | " {'pos': 'IN', 'word': 'of'},\n", 383 | " {'pos': 'WP', 'word': 'what'},\n", 384 | " {'pos': 'DT', 'word': 'a'},\n", 385 | " {'pos': 'JJ', 'word': 'good'},\n", 386 | " {'pos': 'NN', 'word': 'book'},\n", 387 | " {'pos': 'MD', 'word': 'should'},\n", 388 | " {'pos': 'VB', 'word': 'contain'},\n", 389 | " {'pos': '.', 'word': '.'},\n", 390 | " {'pos': 'NN', 'word': 'Question'},\n", 391 | " {'pos': ':', 'word': ':'},\n", 392 | " {'pos': 'WRB', 'word': 'Why'},\n", 393 | " {'pos': 'VBP', 'word': 'have'},\n", 394 | " {'pos': 'RB', 'word': \"n't\"},\n", 395 | " {'pos': 'PRP', 'word': 'they'},\n", 396 | " {'pos': 'VBD', 'word': 'made'},\n", 397 | " {'pos': 'DT', 'word': 'a'},\n", 398 | " {'pos': 'JJ', 'word': 'modern'},\n", 399 | " {'pos': 'NN', 'word': 'movie'},\n", 400 | " {'pos': 'IN', 'word': 'of'},\n", 401 | " {'pos': 'DT', 'word': 'this'},\n", 402 | " {'pos': '.', 'word': '?'},\n", 403 | " {'pos': 'PRP', 'word': 'It'},\n", 404 | " {'pos': 'MD', 'word': 'would'},\n", 405 | " {'pos': 'VB', 'word': 'be'},\n", 406 | " {'pos': 'JJ', 'word': 'great'},\n", 407 | " {'pos': 'NN', 'word': 'cinema'},\n", 408 | " {'pos': '.', 'word': '.'}]" 409 | ] 410 | } 411 | ], 412 | "prompt_number": 5 413 | }, 414 | { 415 | "cell_type": "code", 416 | "collapsed": 
false, 417 | "input": [ 418 | "a=wordnet_definitions(a)\n", 419 | "a" 420 | ], 421 | "language": "python", 422 | "metadata": {}, 423 | "outputs": [ 424 | { 425 | "metadata": {}, 426 | "output_type": "pyout", 427 | "prompt_number": 6, 428 | "text": [ 429 | "[{'pos': 'DT', 'word': 'This'},\n", 430 | " {'pos': 'VBZ', 'word': 'is'},\n", 431 | " {'pos': 'DT', 'word': 'an'},\n", 432 | " {'pos': 'NN',\n", 433 | " 'wn_def': 'an item of information that is typical of a class or group; \\na representative form or pattern; \\nsomething to be imitated; \\npunishment intended as a warning to others; \\nan occurrence of something; \\na task performed or problem solved in order to develop skill or understanding',\n", 434 | " 'wn_lemma': 'example',\n", 435 | " 'wn_pos': 'Noun',\n", 436 | " 'word': 'example'},\n", 437 | " {'pos': 'IN', 'word': 'of'},\n", 438 | " {'pos': 'JJ',\n", 439 | " 'wn_def': 'relatively large in size or number or extent; larger than others of its kind; \\nof major significance or importance; \\nremarkable or out of the ordinary in degree or magnitude or effect; \\nvery good; \\nuppercase; \\nin an advanced stage of pregnancy',\n", 440 | " 'wn_lemma': 'great',\n", 441 | " 'wn_pos': 'Adjective',\n", 442 | " 'word': 'great'},\n", 443 | " {'pos': 'NN',\n", 444 | " 'wn_def': 'the act of creating written works; \\nthe work of a writer; anything expressed in letters of the alphabet (especially when considered from the point of view of style and effect); \\n(usually plural) the collected work of an author; \\nletters or symbols that are written or imprinted on a surface to represent the sounds or words of a language; \\nthe activity of putting something in written form',\n", 445 | " 'wn_lemma': 'writing',\n", 446 | " 'wn_pos': 'Noun',\n", 447 | " 'word': 'writing'},\n", 448 | " {'pos': '.', 'punct': True, 'word': '.'},\n", 449 | " {'pos': 'DT', 'word': 'The'},\n", 450 | " {'pos': 'NN',\n", 451 | " 'wn_def': 'how something is done or how it happens; \\na way of 
expressing something (in language or art or music etc.) that is characteristic of a particular person or group of people or period; \\na particular kind (as to appearance); \\nthe popular taste at a given time; \\n(botany) the narrow elongated part of the pistil between the ovary and the stigma; \\neditorial directions to be followed in spelling and punctuation and capitalization and typographical display; \\ndistinctive and stylish elegance; \\na pointed tool for writing or drawing or engraving; \\na slender bristlelike or tubular process',\n", 452 | " 'wn_lemma': 'style',\n", 453 | " 'wn_pos': 'Noun',\n", 454 | " 'word': 'style'},\n", 455 | " {'pos': ',', 'punct': True, 'word': ','},\n", 456 | " {'pos': 'NN',\n", 457 | " 'wn_def': 'the rate of moving (especially walking or running); \\nthe distance covered by a step; \\nthe relative speed of progress or change; \\na step in walking or running; \\nthe rate of some repeating event; \\na unit of length equal to 3 feet; defined as 91.44 centimeters; originally taken to be the average length of a stride',\n", 458 | " 'wn_lemma': 'pace',\n", 459 | " 'wn_pos': 'Noun',\n", 460 | " 'word': 'pace'},\n", 461 | " {'pos': ',', 'punct': True, 'word': ','},\n", 462 | " {'pos': 'NNS',\n", 463 | " 'wn_def': \"an imaginary person represented in a work of fiction (play or film or story); \\na characteristic property that defines the apparent individual nature of something; \\nthe inherent complex of attributes that determines a persons moral and ethical actions and reactions; \\nan actor's portrayal of someone in a play; \\na person of a specified kind (usually with many eccentricities); \\ngood repute; \\na formal recommendation by a former employer to a potential future employer describing the person's qualifications and dependability; \\na written symbol that is used to represent speech; \\n(genetics) an attribute (structural or functional) that is determined by a gene or group of genes\",\n", 464 | " 'wn_lemma': 
'character',\n", 465 | " 'wn_pos': 'Noun',\n", 466 | " 'word': 'characters'},\n", 467 | " {'pos': ',', 'punct': True, 'word': ','},\n", 468 | " {'pos': 'NNS',\n", 469 | " 'wn_def': 'a statement that represents something in words; \\nthe act of describing something; \\nsort or variety',\n", 470 | " 'wn_lemma': 'description',\n", 471 | " 'wn_pos': 'Noun',\n", 472 | " 'word': 'descriptions'},\n", 473 | " {'pos': ',', 'punct': True, 'word': ','},\n", 474 | " {'pos': 'NNS',\n", 475 | " 'wn_def': 'the context and environment in which something is set; \\nthe state of the environment in which a situation exists; \\narrangement of scenery and properties to represent the place where a play or movie is enacted; \\nthe set of facts or circumstances that surround a situation or event; \\nthe physical position of something; \\na table service for one person; \\na mounting consisting of a piece of metal (as in a ring or other jewelry) that holds a gem in place',\n", 476 | " 'wn_lemma': 'setting',\n", 477 | " 'wn_pos': 'Noun',\n", 478 | " 'word': 'settings'},\n", 479 | " {'pos': ',', 'punct': True, 'word': ','},\n", 480 | " {'pos': 'NN',\n", 481 | " 'wn_def': 'something done (usually as opposed to something said); \\nthe state of being active; \\na military engagement; \\na process existing in or produced by nature (rather than by the intent of human beings); \\nthe series of events that form a plot; \\nthe trait of being active and energetic and forceful; \\nthe operating part that transmits power to a mechanism; \\na judicial proceeding brought by one party against another; one party prosecutes another for a wrong done or for protection of a right or for prevention of a wrong; \\nan act by a government body or supranational organization; \\nthe most important or interesting work or activity in a specific area or field',\n", 482 | " 'wn_lemma': 'action',\n", 483 | " 'wn_pos': 'Noun',\n", 484 | " 'word': 'action'},\n", 485 | " {'pos': ':', 'punct': True, 'word': '--'},\n", 486 | 
" {'pos': 'DT', 'word': 'all'},\n", 487 | " {'pos': 'IN', 'word': 'of'},\n", 488 | " {'pos': 'PRP', 'word': 'it'},\n", 489 | " {'pos': '.', 'punct': True, 'word': '.'},\n", 490 | " {'pos': 'PRP', 'word': 'It'},\n", 491 | " {'pos': 'VBZ', 'word': \"'s\"},\n", 492 | " {'pos': 'JJ',\n", 493 | " 'wn_def': 'relatively large in size or number or extent; larger than others of its kind; \\nof major significance or importance; \\nremarkable or out of the ordinary in degree or magnitude or effect; \\nvery good; \\nuppercase; \\nin an advanced stage of pregnancy',\n", 494 | " 'wn_lemma': 'great',\n", 495 | " 'wn_pos': 'Adjective',\n", 496 | " 'word': 'great'},\n", 497 | " {'pos': '.', 'punct': True, 'word': '.'},\n", 498 | " {'pos': 'PRP', 'word': 'I'},\n", 499 | " {'pos': 'VBP',\n", 500 | " 'wn_def': 'interpret something that is written or printed; \\nhave or contain a certain wording or form; \\nlook at, interpret, and say out loud something that is written or printed; \\nobtain data from magnetic tapes; \\ninterpret the significance of, as of palms, tea leaves, intestines, the sky; also of human behavior; \\ninterpret something in a certain way; convey a particular meaning or impression; \\nbe a student of a certain subject; \\nindicate a certain reading; of gauges and instruments; \\naudition for a stage role by reading parts of a role; \\nto hear and understand; \\nmake sense of a language',\n", 501 | " 'wn_lemma': 'read',\n", 502 | " 'wn_pos': 'Verb',\n", 503 | " 'word': 'read'},\n", 504 | " {'pos': 'DT', 'word': 'this'},\n", 505 | " {'pos': 'NN',\n", 506 | " 'wn_def': 'a written work or composition that has been published (printed on pages bound together); \\nphysical objects consisting of a number of pages bound together; \\na compilation of the known facts regarding something or someone; \\na written version of a play or other dramatic composition; used in preparing for a performance; \\na record in which commercial accounts are recorded; \\na collection of playing 
cards satisfying the rules of a card game; \\na collection of rules or prescribed standards on the basis of which decisions are made; \\nthe sacred writings of Islam revealed by God to the prophet Muhammad during his life at Mecca and Medina; \\nthe sacred writings of the Christian religions; \\na major division of a long written composition; \\na number of sheets (ticket or stamps etc.) bound together on one edge',\n", 507 | " 'wn_lemma': 'book',\n", 508 | " 'wn_pos': 'Noun',\n", 509 | " 'word': 'book'},\n", 510 | " {'pos': 'IN', 'word': 'in'},\n", 511 | " {'pos': 'DT', 'word': 'a'},\n", 512 | " {'pos': 'NN',\n", 513 | " 'wn_def': 'creative writing of recognized artistic value; \\nthe humanistic study of a body of literature; \\npublished writings in a particular style on a particular subject; \\nthe profession or art of a writer',\n", 514 | " 'wn_lemma': 'literature',\n", 515 | " 'wn_pos': 'Noun',\n", 516 | " 'word': 'literature'},\n", 517 | " {'pos': 'NN',\n", 518 | " 'wn_def': 'a collection of things sharing a common attribute; \\na body of students who are taught together; \\npeople having the same social, economic, or educational status; \\neducation imparted in a series of lessons or meetings; \\na league ranked by quality; \\na body of students who graduate together; \\n(biology) a taxonomic group containing one or more orders; \\nelegance in dress or behavior',\n", 519 | " 'wn_lemma': 'class',\n", 520 | " 'wn_pos': 'Noun',\n", 521 | " 'word': 'class'},\n", 522 | " {'pos': 'CC', 'word': 'and'},\n", 523 | " {'pos': 'RB',\n", 524 | " 'wn_def': \"in accordance with truth or fact or reality; \\nin actual fact; \\nin fact (used as intensifiers or sentence modifiers); \\nused as intensifiers; `real' is sometimes used informally for `really'; `rattling' is informal\",\n", 525 | " 'wn_lemma': 'really',\n", 526 | " 'wn_pos': 'Adverb',\n", 527 | " 'word': 'really'},\n", 528 | " {'pos': 'VBN',\n", 529 | " 'wn_def': \"derive or receive pleasure from; get enjoyment 
from; take pleasure in; \\nhave benefit from; \\nget pleasure from; \\nhave for one's benefit; \\ntake delight in\",\n", 530 | " 'wn_lemma': 'enjoyed',\n", 531 | " 'wn_pos': 'Verb',\n", 532 | " 'word': 'enjoyed'},\n", 533 | " {'pos': 'PRP', 'word': 'it'},\n", 534 | " {'pos': '.', 'punct': True, 'word': '.'},\n", 535 | " {'pos': 'PRP', 'word': 'It'},\n", 536 | " {'pos': 'VBZ', 'word': 'is'},\n", 537 | " {'pos': 'RB',\n", 538 | " 'wn_def': 'by chance',\n", 539 | " 'wn_lemma': 'perhaps',\n", 540 | " 'wn_pos': 'Adverb',\n", 541 | " 'word': 'perhaps'},\n", 542 | " {'pos': 'DT', 'word': 'the'},\n", 543 | " {'pos': 'JJS',\n", 544 | " 'wn_def': \"(superlative of `good') having the most positive qualities; \\n(comparative and superlative of `well') wiser or more advantageous and hence advisable; \\nhaving desirable or positive qualities especially those suitable for a thing specified; \\nhaving the normally expected amount; \\nmorally admirable; \\ndeserving of esteem and respect; \\npromoting or enhancing well-being; \\nagreeable or pleasing; \\nof moral excellence; \\nhaving or showing knowledge and skill and aptitude; \\nthorough; \\nwith or in a close or intimate relationship; \\nfinancially sound; \\nmost suitable or right for a particular purpose; \\nresulting favorably; \\nexerting force or influence; \\ncapable of pleasing; \\nappealing to the mind; \\nin excellent physical condition; \\ntending to promote physical well-being; beneficial to health; \\nnot forged; \\nnot left to spoil; \\ngenerally admired\",\n", 545 | " 'wn_lemma': 'best',\n", 546 | " 'wn_pos': 'Adjective',\n", 547 | " 'word': 'best'},\n", 548 | " {'pos': 'JJ',\n", 549 | " 'wn_def': 'of recognized authority or excellence; \\nof or relating to the most highly developed stage of an earlier civilisation and its culture; \\nof or pertaining to or characteristic of the ancient Greek and Roman cultures',\n", 550 | " 'wn_lemma': 'classic',\n", 551 | " 'wn_pos': 'Adjective',\n", 552 | " 'word': 
'classic'},\n", 553 | " {'pos': 'PRP', 'word': 'I'},\n", 554 | " {'pos': 'VBP', 'word': 'have'},\n", 555 | " {'pos': 'VBN',\n", 556 | " 'wn_def': 'interpret something that is written or printed; \\nhave or contain a certain wording or form; \\nlook at, interpret, and say out loud something that is written or printed; \\nobtain data from magnetic tapes; \\ninterpret the significance of, as of palms, tea leaves, intestines, the sky; also of human behavior; \\ninterpret something in a certain way; convey a particular meaning or impression; \\nbe a student of a certain subject; \\nindicate a certain reading; of gauges and instruments; \\naudition for a stage role by reading parts of a role; \\nto hear and understand; \\nmake sense of a language',\n", 557 | " 'wn_lemma': 'read',\n", 558 | " 'wn_pos': 'Verb',\n", 559 | " 'word': 'read'},\n", 560 | " {'pos': 'NNS', 'word': 'besides'},\n", 561 | " {'pos': 'NNP',\n", 562 | " 'wn_def': 'British philologist and writer of fantasies (born in South Africa) (1892-1973)',\n", 563 | " 'wn_lemma': 'tolkien',\n", 564 | " 'wn_pos': 'Noun',\n", 565 | " 'word': 'Tolkien'},\n", 566 | " {'pos': 'POS', 'word': \"'s\"},\n", 567 | " {'pos': 'NNP', 'word': 'The'},\n", 568 | " {'pos': 'NNP',\n", 569 | " 'wn_def': 'an imaginary being similar to a person but smaller and with hairy feet; invented by J.R.R. 
Tolkien',\n", 570 | " 'wn_lemma': 'hobbit',\n", 571 | " 'wn_pos': 'Noun',\n", 572 | " 'word': 'Hobbit'},\n", 573 | " {'pos': '.', 'punct': True, 'word': '.'},\n", 574 | " {'pos': 'NN',\n", 575 | " 'wn_def': 'accumulated wealth in the form of money or jewels etc.; \\nart highly prized for its beauty or perfection; \\nany possession that is highly valued by its owner; \\na collection of precious things',\n", 576 | " 'wn_lemma': 'treasure',\n", 577 | " 'wn_pos': 'Noun',\n", 578 | " 'word': 'Treasure'},\n", 579 | " {'pos': 'NNP',\n", 580 | " 'wn_def': 'a land mass (smaller than a continent) that is surrounded by water; \\na zone or area resembling an island',\n", 581 | " 'wn_lemma': 'island',\n", 582 | " 'wn_pos': 'Noun',\n", 583 | " 'word': 'Island'},\n", 584 | " {'pos': 'NNS', 'word': 'manages'},\n", 585 | " {'pos': 'TO', 'word': 'to'},\n", 586 | " {'pos': 'VB',\n", 587 | " 'wn_def': \"be standing; be upright; \\nbe in some specified state or condition; \\noccupy a place or location, also metaphorically; \\nhold one's ground; maintain a position; be steadfast or upright; \\nput up with something or somebody unpleasant; \\nhave or maintain a position or stand on an issue; \\nremain inactive or immobile; \\nbe in effect; be or remain in force; \\nbe tall; have a height of; copula; \\nput into an upright position; \\nwithstand the force of something; \\nbe available for stud services\",\n", 588 | " 'wn_lemma': 'stand',\n", 589 | " 'wn_pos': 'Verb',\n", 590 | " 'word': 'stand'},\n", 591 | " {'pos': 'RB',\n", 592 | " 'wn_def': 'without any others being included or involved; \\nwithout anybody else or anything else',\n", 593 | " 'wn_lemma': 'alone',\n", 594 | " 'wn_pos': 'Adverb',\n", 595 | " 'word': 'alone'},\n", 596 | " {'pos': 'IN', 'word': 'as'},\n", 597 | " {'pos': 'DT', 'word': 'the'},\n", 598 | " {'pos': 'NNP', 'word': 'go-to'},\n", 599 | " {'pos': 'NN',\n", 600 | " 'wn_def': 'a written work or composition that has been published (printed on pages bound together); 
\\nphysical objects consisting of a number of pages bound together; \\na compilation of the known facts regarding something or someone; \\na written version of a play or other dramatic composition; used in preparing for a performance; \\na record in which commercial accounts are recorded; \\na collection of playing cards satisfying the rules of a card game; \\na collection of rules or prescribed standards on the basis of which decisions are made; \\nthe sacred writings of Islam revealed by God to the prophet Muhammad during his life at Mecca and Medina; \\nthe sacred writings of the Christian religions; \\na major division of a long written composition; \\na number of sheets (ticket or stamps etc.) bound together on one edge',\n", 601 | " 'wn_lemma': 'book',\n", 602 | " 'wn_pos': 'Noun',\n", 603 | " 'word': 'book'},\n", 604 | " {'pos': 'IN', 'word': 'on'},\n", 605 | " {'pos': 'NNS',\n", 606 | " 'wn_def': \"someone who uses another person's words or ideas as if they were his own; \\nsomeone who robs at sea or plunders the land from the sea without having a commission from any sovereign nation; \\na ship that is manned by pirates\",\n", 607 | " 'wn_lemma': 'pirate',\n", 608 | " 'wn_pos': 'Noun',\n", 609 | " 'word': 'pirates'},\n", 610 | " {'pos': ',', 'punct': True, 'word': ','},\n", 611 | " {'pos': 'NN',\n", 612 | " 'wn_def': 'a division of an ocean or a large body of salt water partially enclosed by land; \\nanything apparently limitless in quantity or volume; \\nturbulent water with swells of considerable size',\n", 613 | " 'wn_lemma': 'sea',\n", 614 | " 'wn_pos': 'Noun',\n", 615 | " 'word': 'sea'},\n", 616 | " {'pos': 'NNS',\n", 617 | " 'wn_def': 'an act of traveling by water; \\na journey to some distant place',\n", 618 | " 'wn_lemma': 'voyage',\n", 619 | " 'wn_pos': 'Noun',\n", 620 | " 'word': 'voyages'},\n", 621 | " {'pos': ',', 'punct': True, 'word': ','},\n", 622 | " {'pos': 'CC', 'word': 'and'},\n", 623 | " {'pos': 'JJ',\n", 624 | " 'wn_def': '(used of 
living things especially persons) in an early period of life or development or growth; \\n(of crops) harvested at an early stage of development; before complete maturity; \\nsuggestive of youth; vigorous and fresh; \\nbeing in its early stage; \\nnot tried or tested by experience',\n", 625 | " 'wn_lemma': 'young',\n", 626 | " 'wn_pos': 'Adjective',\n", 627 | " 'word': 'young'},\n", 628 | " {'pos': 'NNS',\n", 629 | " 'wn_def': 'the force of workers available; \\nan adult person who is male (as opposed to a woman); \\nsomeone who serves in the armed forces; a member of a military force; \\nthe generic use of the word to refer to any human being; \\nany living or extinct member of the family Hominidae characterized by superior intelligence, articulate speech, and erect carriage; \\na male subordinate; \\nan adult male person who has a manly character (virile and courageous competent); \\na manservant who acts as a personal attendant to his employer; \\na male person who plays a significant role (husband or lover or boyfriend) in the life of a particular woman; \\none of the British Isles in the Irish Sea; \\ngame equipment consisting of an object used in playing certain board games; \\nall of the living human inhabitants of the earth',\n", 630 | " 'wn_lemma': 'men',\n", 631 | " 'wn_pos': 'Noun',\n", 632 | " 'word': 'men'},\n", 633 | " {'pos': 'VBG',\n", 634 | " 'wn_def': 'try to get or reach; \\ntry to locate or discover, or try to establish the existence of; \\nmake an effort or attempt; \\ngo to or towards; \\ninquire for',\n", 635 | " 'wn_lemma': 'seeking',\n", 636 | " 'wn_pos': 'Verb',\n", 637 | " 'word': 'seeking'},\n", 638 | " {'pos': 'NN',\n", 639 | " 'wn_def': 'a wild and exciting undertaking (not necessarily lawful)',\n", 640 | " 'wn_lemma': 'adventure',\n", 641 | " 'wn_pos': 'Noun',\n", 642 | " 'word': 'adventure'},\n", 643 | " {'pos': '.', 'punct': True, 'word': '.'},\n", 644 | " {'pos': 'IN', 'word': 'Before'},\n", 645 | " {'pos': 'NN', 'word': 
'anyone'},\n", 646 | " {'pos': 'NNS', 'word': 'decides'},\n", 647 | " {'pos': 'TO', 'word': 'to'},\n", 648 | " {'pos': 'VB',\n", 649 | " 'wn_def': 'interpret something that is written or printed; \\nhave or contain a certain wording or form; \\nlook at, interpret, and say out loud something that is written or printed; \\nobtain data from magnetic tapes; \\ninterpret the significance of, as of palms, tea leaves, intestines, the sky; also of human behavior; \\ninterpret something in a certain way; convey a particular meaning or impression; \\nbe a student of a certain subject; \\nindicate a certain reading; of gauges and instruments; \\naudition for a stage role by reading parts of a role; \\nto hear and understand; \\nmake sense of a language',\n", 650 | " 'wn_lemma': 'read',\n", 651 | " 'wn_pos': 'Verb',\n", 652 | " 'word': 'read'},\n", 653 | " {'pos': 'NNS',\n", 654 | " 'wn_def': 'a young person of either sex; \\na human offspring (son or daughter) of any age; \\nan immature childish person; \\na member of a clan or tribe',\n", 655 | " 'wn_lemma': 'child',\n", 656 | " 'wn_pos': 'Noun',\n", 657 | " 'word': 'children'},\n", 658 | " {'pos': 'POS', 'word': \"'s\"},\n", 659 | " {'pos': 'CC', 'word': 'or'},\n", 660 | " {'pos': 'NNP', 'word': 'YA'},\n", 661 | " {'pos': 'NN',\n", 662 | " 'wn_def': 'creative writing of recognized artistic value; \\nthe humanistic study of a body of literature; \\npublished writings in a particular style on a particular subject; \\nthe profession or art of a writer',\n", 663 | " 'wn_lemma': 'literature',\n", 664 | " 'wn_pos': 'Noun',\n", 665 | " 'word': 'literature'},\n", 666 | " {'pos': ',', 'punct': True, 'word': ','},\n", 667 | " {'pos': 'PRP', 'word': 'they'},\n", 668 | " {'pos': 'MD', 'word': 'must'},\n", 669 | " {'pos': 'RB',\n", 670 | " 'wn_def': 'before anything else; \\nthe initial time; \\nbefore another in time, space, or importance; \\nprominently forward',\n", 671 | " 'wn_lemma': 'first',\n", 672 | " 'wn_pos': 'Adverb',\n", 673 
| " 'word': 'first'},\n", 674 | " {'pos': 'VB',\n", 675 | " 'wn_def': 'interpret something that is written or printed; \\nhave or contain a certain wording or form; \\nlook at, interpret, and say out loud something that is written or printed; \\nobtain data from magnetic tapes; \\ninterpret the significance of, as of palms, tea leaves, intestines, the sky; also of human behavior; \\ninterpret something in a certain way; convey a particular meaning or impression; \\nbe a student of a certain subject; \\nindicate a certain reading; of gauges and instruments; \\naudition for a stage role by reading parts of a role; \\nto hear and understand; \\nmake sense of a language',\n", 676 | " 'wn_lemma': 'read',\n", 677 | " 'wn_pos': 'Verb',\n", 678 | " 'word': 'read'},\n", 679 | " {'pos': 'NNP',\n", 680 | " 'wn_def': 'accumulated wealth in the form of money or jewels etc.; \\nart highly prized for its beauty or perfection; \\nany possession that is highly valued by its owner; \\na collection of precious things',\n", 681 | " 'wn_lemma': 'treasure',\n", 682 | " 'wn_pos': 'Noun',\n", 683 | " 'word': 'Treasure'},\n", 684 | " {'pos': 'NNP',\n", 685 | " 'wn_def': 'a land mass (smaller than a continent) that is surrounded by water; \\na zone or area resembling an island',\n", 686 | " 'wn_lemma': 'island',\n", 687 | " 'wn_pos': 'Noun',\n", 688 | " 'word': 'Island'},\n", 689 | " {'pos': 'TO', 'word': 'to'},\n", 690 | " {'pos': 'VB',\n", 691 | " 'wn_def': 'come into the possession of something concrete or abstract; \\nenter or assume a certain state or condition; \\ncause to move; cause to be in a certain position or condition; \\nreceive a specified treatment (abstract); \\nreach a destination; arrive by movement or progress; \\ngo or come after and bring or take back; \\ngo through (mental or physical states or experiences); \\ntake vengeance on or get even; \\nachieve a point or goal; \\ncause to do; cause to act in a specified manner; \\nsucceed in catching or seizing, especially 
after a chase; \\ncome to have or undergo a change of (physical features and attributes); \\nbe stricken by an illness, fall victim to an illness; \\ncommunicate with a place or person; establish communication with, as if by telephone; \\ngive certain properties to something; \\nmove into a desired direction of discourse; \\ngrasp with the mind or develop an understanding of; \\nattract and fix; \\nreach with a blow or hit in a particular spot; \\nreach by calculation; \\nacquire as a result of some effort or action; \\npurchase; \\nperceive by hearing; \\nsuffer from the receipt of; \\nreceive as a retribution or punishment; \\nleave immediately; used usually in the imperative form; \\nreach and board; \\nirritate; \\nevoke an emotional response; \\napprehend and reproduce accurately; \\nearn or achieve a base by being walked by the pitcher; \\novercome or destroy; \\nbe a mystery or bewildering to; \\ntake the first step or steps in carrying out an action; \\nundergo (as of injuries and illnesses); \\nmake children',\n", 692 | " 'wn_lemma': 'get',\n", 693 | " 'wn_pos': 'Verb',\n", 694 | " 'word': 'get'},\n", 695 | " {'pos': 'DT', 'word': 'an'},\n", 696 | " {'pos': 'NN',\n", 697 | " 'wn_def': 'the content of cognition; the main thing you are thinking about; \\nyour intention; what you intend to do; \\na personal view; \\nan approximate calculation of quantity or degree or worth; \\n(music) melodic subject of a musical composition',\n", 698 | " 'wn_lemma': 'idea',\n", 699 | " 'wn_pos': 'Noun',\n", 700 | " 'word': 'idea'},\n", 701 | " {'pos': 'IN', 'word': 'of'},\n", 702 | " {'pos': 'WP', 'word': 'what'},\n", 703 | " {'pos': 'DT', 'word': 'a'},\n", 704 | " {'pos': 'JJ',\n", 705 | " 'wn_def': 'having desirable or positive qualities especially those suitable for a thing specified; \\nhaving the normally expected amount; \\nmorally admirable; \\ndeserving of esteem and respect; \\npromoting or enhancing well-being; \\nagreeable or pleasing; \\nof moral excellence; 
\\nhaving or showing knowledge and skill and aptitude; \\nthorough; \\nwith or in a close or intimate relationship; \\nfinancially sound; \\nmost suitable or right for a particular purpose; \\nresulting favorably; \\nexerting force or influence; \\ncapable of pleasing; \\nappealing to the mind; \\nin excellent physical condition; \\ntending to promote physical well-being; beneficial to health; \\nnot forged; \\nnot left to spoil; \\ngenerally admired',\n", 706 | " 'wn_lemma': 'good',\n", 707 | " 'wn_pos': 'Adjective',\n", 708 | " 'word': 'good'},\n", 709 | " {'pos': 'NN',\n", 710 | " 'wn_def': 'a written work or composition that has been published (printed on pages bound together); \\nphysical objects consisting of a number of pages bound together; \\na compilation of the known facts regarding something or someone; \\na written version of a play or other dramatic composition; used in preparing for a performance; \\na record in which commercial accounts are recorded; \\na collection of playing cards satisfying the rules of a card game; \\na collection of rules or prescribed standards on the basis of which decisions are made; \\nthe sacred writings of Islam revealed by God to the prophet Muhammad during his life at Mecca and Medina; \\nthe sacred writings of the Christian religions; \\na major division of a long written composition; \\na number of sheets (ticket or stamps etc.) 
bound together on one edge',\n", 711 | " 'wn_lemma': 'book',\n", 712 | " 'wn_pos': 'Noun',\n", 713 | " 'word': 'book'},\n", 714 | " {'pos': 'MD', 'word': 'should'},\n", 715 | " {'pos': 'VB',\n", 716 | " 'wn_def': 'include or contain; have as a component; \\ncontain or hold; have within; \\nlessen the intensity of; temper; hold in restraint; hold or keep within limits; \\nbe divisible by; \\nbe capable of holding or containing; \\nhold back, as of a danger or an enemy; check the expansion or influence of',\n", 717 | " 'wn_lemma': 'contain',\n", 718 | " 'wn_pos': 'Verb',\n", 719 | " 'word': 'contain'},\n", 720 | " {'pos': '.', 'punct': True, 'word': '.'},\n", 721 | " {'pos': 'NN',\n", 722 | " 'wn_def': 'an instance of questioning; \\nthe subject matter at issue; \\na sentence of inquiry that asks for a reply; \\nuncertainty about the truth or factuality or existence of something; \\na formal proposal for action made to a deliberative assembly for discussion and vote; \\nan informal reference to a marriage proposal',\n", 723 | " 'wn_lemma': 'question',\n", 724 | " 'wn_pos': 'Noun',\n", 725 | " 'word': 'Question'},\n", 726 | " {'pos': ':', 'punct': True, 'word': ':'},\n", 727 | " {'pos': 'WRB', 'word': 'Why'},\n", 728 | " {'pos': 'VBP', 'word': 'have'},\n", 729 | " {'pos': 'RB', 'word': \"n't\"},\n", 730 | " {'pos': 'PRP', 'word': 'they'},\n", 731 | " {'pos': 'VBD',\n", 732 | " 'wn_def': 'engage in; \\ngive certain properties to something; \\nmake or cause to be or to become; \\ncause to do; cause to act in a specified manner; \\ngive rise to; cause to happen or occur, not always intentionally; \\ncreate or manufacture a man-made product; \\nmake, formulate, or derive in the mind; \\ncompel or make somebody or something to act in a certain way; \\ncreate by artistic means; \\nearn on some commercial or business transaction; earn as salary or wages; \\ncreate or design, often in a certain way; \\nto compose or represent:\"This wall forms the background of the stage 
setting\"; \\nreach a goal, e.g., \"make the first team\"; \\nbe or be capable of being changed or made into; \\nmake by shaping or bringing together constituents; \\nperform or carry out; \\nmake by combining materials and parts; \\nchange from one form into another; \\nact in a certain way so as to acquire; \\ncharge with a function; charge to be; \\nachieve a point or goal; \\nreach a destination, either real or abstract; \\ninstitute, enact, or establish; \\ncarry out or commit; \\nform by assembling individuals or constituents; \\norganize or be responsible for; \\nput in order or neaten; \\nhead into a specified direction; \\nhave a bowel movement; \\nundergo fabrication or creation; \\nbe suitable for; \\nadd up to; \\namount to; \\nconstitute the essence of; \\nappear to begin an activity; \\nproceed along a path; \\nreach in time; \\ngather and light the materials for; \\nprepare for eating by applying heat; \\ninduce to have sex; \\nassure the success of; \\nrepresent fictitiously, as in a play, or pretend to be or act like; \\nconsider as being; \\ncalculate as being; \\ncause to be enjoyable or pleasurable; \\nfavor the development of; \\ndevelop into; \\nbehave in a certain way; \\neliminate urine',\n", 733 | " 'wn_lemma': 'made',\n", 734 | " 'wn_pos': 'Verb',\n", 735 | " 'word': 'made'},\n", 736 | " {'pos': 'DT', 'word': 'a'},\n", 737 | " {'pos': 'JJ',\n", 738 | " 'wn_def': 'belonging to the modern era; since the Middle Ages; \\nrelating to a recently developed fashion or style; ; \\ncharacteristic of present-day art and music and literature and architecture; \\nahead of the times; \\nused of a living language; being the current stage in its development',\n", 739 | " 'wn_lemma': 'modern',\n", 740 | " 'wn_pos': 'Adjective',\n", 741 | " 'word': 'modern'},\n", 742 | " {'pos': 'NN',\n", 743 | " 'wn_def': 'a form of entertainment that enacts a story by sound and a sequence of images giving the illusion of continuous movement',\n", 744 | " 'wn_lemma': 
'movie',\n", 745 | " 'wn_pos': 'Noun',\n", 746 | " 'word': 'movie'},\n", 747 | " {'pos': 'IN', 'word': 'of'},\n", 748 | " {'pos': 'DT', 'word': 'this'},\n", 749 | " {'pos': '.', 'punct': True, 'word': '?'},\n", 750 | " {'pos': 'PRP', 'word': 'It'},\n", 751 | " {'pos': 'MD', 'word': 'would'},\n", 752 | " {'pos': 'VB', 'word': 'be'},\n", 753 | " {'pos': 'JJ',\n", 754 | " 'wn_def': 'relatively large in size or number or extent; larger than others of its kind; \\nof major significance or importance; \\nremarkable or out of the ordinary in degree or magnitude or effect; \\nvery good; \\nuppercase; \\nin an advanced stage of pregnancy',\n", 755 | " 'wn_lemma': 'great',\n", 756 | " 'wn_pos': 'Adjective',\n", 757 | " 'word': 'great'},\n", 758 | " {'pos': 'NN',\n", 759 | " 'wn_def': 'a medium that disseminates moving pictures; \\na theater where films are shown',\n", 760 | " 'wn_lemma': 'cinema',\n", 761 | " 'wn_pos': 'Noun',\n", 762 | " 'word': 'cinema'},\n", 763 | " {'pos': '.', 'punct': True, 'word': '.'}]" 764 | ] 765 | } 766 | ], 767 | "prompt_number": 6 768 | }, 769 | { 770 | "cell_type": "code", 771 | "collapsed": false, 772 | "input": [ 773 | "\n", 774 | "a[39]\n" 775 | ], 776 | "language": "python", 777 | "metadata": {}, 778 | "outputs": [ 779 | { 780 | "metadata": {}, 781 | "output_type": "pyout", 782 | "prompt_number": 7, 783 | "text": [ 784 | "{'pos': 'VBN',\n", 785 | " 'wn_def': \"derive or receive pleasure from; get enjoyment from; take pleasure in; \\nhave benefit from; \\nget pleasure from; \\nhave for one's benefit; \\ntake delight in\",\n", 786 | " 'wn_lemma': 'enjoyed',\n", 787 | " 'wn_pos': 'Verb',\n", 788 | " 'word': 'enjoyed'}" 789 | ] 790 | } 791 | ], 792 | "prompt_number": 7 793 | }, 794 | { 795 | "cell_type": "code", 796 | "collapsed": false, 797 | "input": [ 798 | "sense = word_sense_disambiguate(a[39]['word'], wordnet_pos_code(a[39]['pos']), review)\n", 799 | "sense" 800 | ], 801 | "language": "python", 802 | "metadata": {}, 803 | "outputs": [ 804 
| { 805 | "metadata": {}, 806 | "output_type": "pyout", 807 | "prompt_number": 8, 808 | "text": [ 809 | "Synset('enjoy.v.01')" 810 | ] 811 | } 812 | ], 813 | "prompt_number": 8 814 | }, 815 | { 816 | "cell_type": "code", 817 | "collapsed": false, 818 | "input": [ 819 | "wordnet.synset('enjoy.v.01').definition" 820 | ], 821 | "language": "python", 822 | "metadata": {}, 823 | "outputs": [ 824 | { 825 | "metadata": {}, 826 | "output_type": "pyout", 827 | "prompt_number": 9, 828 | "text": [ 829 | "'derive or receive pleasure from; get enjoyment from; take pleasure in'" 830 | ] 831 | } 832 | ], 833 | "prompt_number": 9 834 | }, 835 | { 836 | "cell_type": "code", 837 | "collapsed": false, 838 | "input": [ 839 | "sent = sentiment.senti_synset(sense.name)\n", 840 | "sent.pos_score\n", 841 | "\n" 842 | ], 843 | "language": "python", 844 | "metadata": {}, 845 | "outputs": [ 846 | { 847 | "metadata": {}, 848 | "output_type": "pyout", 849 | "prompt_number": 10, 850 | "text": [ 851 | "0.375" 852 | ] 853 | } 854 | ], 855 | "prompt_number": 10 856 | }, 857 | { 858 | "cell_type": "code", 859 | "collapsed": false, 860 | "input": [ 861 | "sent.neg_score" 862 | ], 863 | "language": "python", 864 | "metadata": {}, 865 | "outputs": [ 866 | { 867 | "metadata": {}, 868 | "output_type": "pyout", 869 | "prompt_number": 14, 870 | "text": [ 871 | "0.0" 872 | ] 873 | } 874 | ], 875 | "prompt_number": 14 876 | }, 877 | { 878 | "cell_type": "code", 879 | "collapsed": false, 880 | "input": [ 881 | "sent.obj_score" 882 | ], 883 | "language": "python", 884 | "metadata": {}, 885 | "outputs": [ 886 | { 887 | "metadata": {}, 888 | "output_type": "pyout", 889 | "prompt_number": 15, 890 | "text": [ 891 | "0.625" 892 | ] 893 | } 894 | ], 895 | "prompt_number": 15 896 | }, 897 | { 898 | "cell_type": "code", 899 | "collapsed": false, 900 | "input": [ 901 | "API_KEY_BING=''\n", 902 | "API_KEY_GOOGLE=''\n", 903 | "\n", 904 | "class GoogleApi:\n", 905 | " def __init__(self):\n", 906 | " self.service = 
build(\"customsearch\", \"v1\", developerKey=API_KEY_GOOGLE)\n", 907 | "\n", 908 | "\n", 909 | " def count(self,query):\n", 910 | " res = self.service.cse().list(\n", 911 | " q=query,\n", 912 | " cx='017576662512468239146:omuauf_lfve',\n", 913 | " ).execute()\n", 914 | " if 'nextPage' in res['queries']:\n", 915 | " return float(res['queries']['nextPage'][0]['totalResults'])\n", 916 | " else:\n", 917 | " return float(res['queries']['request'][0]['totalResults'])\n", 918 | "\n", 919 | "def request_bing(query, **params):\n", 920 | " URL_BING = 'https://api.datamarket.azure.com/Bing/Search/v1/Composite?Sources=%(source)s&Query=%(query)s&$top=50&$format=json'\n", 921 | " url = URL_BING % {'source': urllib2.quote(\"'web'\"),\n", 922 | " 'query': urllib2.quote(\"'\"+query+\"'\")}\n", 923 | " r = requests.get(url, auth=('', API_KEY_BING))\n", 924 | " return float(r.json()['d']['results'][0]['WebTotal'])" 925 | ], 926 | "language": "python", 927 | "metadata": {}, 928 | "outputs": [], 929 | "prompt_number": 11 930 | }, 931 | { 932 | "cell_type": "code", 933 | "collapsed": false, 934 | "input": [ 935 | "g = GoogleApi()\n", 936 | "res = g.count(\"aaa\")\n", 937 | "res" 938 | ], 939 | "language": "python", 940 | "metadata": {}, 941 | "outputs": [ 942 | { 943 | "metadata": {}, 944 | "output_type": "pyout", 945 | "prompt_number": 12, 946 | "text": [ 947 | "62700000.0" 948 | ] 949 | } 950 | ], 951 | "prompt_number": 12 952 | }, 953 | { 954 | "cell_type": "code", 955 | "collapsed": false, 956 | "input": [ 957 | "res = request_bing(\"prova\")\n", 958 | "res" 959 | ], 960 | "language": "python", 961 | "metadata": {}, 962 | "outputs": [ 963 | { 964 | "metadata": {}, 965 | "output_type": "pyout", 966 | "prompt_number": 19, 967 | "text": [ 968 | "55600000.0" 969 | ] 970 | } 971 | ], 972 | "prompt_number": 19 973 | } 974 | ], 975 | "metadata": {} 976 | } 977 | ] 978 | } -------------------------------------------------------------------------------- 
/BDA_Senti_ipython/.ipynb_checkpoints/SentiWordNet_reviewClassification-checkpoint.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "metadata": { 3 | "name": "" 4 | }, 5 | "nbformat": 3, 6 | "nbformat_minor": 0, 7 | "worksheets": [ 8 | { 9 | "cells": [ 10 | { 11 | "cell_type": "code", 12 | "collapsed": false, 13 | "input": [ 14 | "import sys\n", 15 | "import csv\n", 16 | "import nltk\n", 17 | "from nltk.corpus import wordnet\n", 18 | "import re\n", 19 | "import codecs\n", 20 | "import pprint\n", 21 | "\n", 22 | "class SentiWordNetCorpusReader:\n", 23 | " def __init__(self, filename):\n", 24 | " \"\"\"\n", 25 | " Argument:\n", 26 | " filename -- the name of the text file containing the\n", 27 | " SentiWordNet database\n", 28 | " \"\"\" \n", 29 | " self.filename = filename\n", 30 | " self.db = {}\n", 31 | " self.parse_src_file()\n", 32 | "\n", 33 | " def parse_src_file(self):\n", 34 | " lines = codecs.open(self.filename, \"r\", \"utf8\").read().splitlines()\n", 35 | " lines = filter((lambda x : not re.search(r\"^\\s*#\", x)), lines)\n", 36 | " for i, line in enumerate(lines):\n", 37 | " fields = re.split(r\"\\t+\", line)\n", 38 | " fields = map(unicode.strip, fields)\n", 39 | " try: \n", 40 | " pos, offset, pos_score, neg_score, synset_terms, gloss = fields\n", 41 | " except:\n", 42 | " sys.stderr.write(\"Line %s formatted incorrectly: %s\\n\" % (i, line))\n", 43 | " if pos and offset:\n", 44 | " offset = int(offset)\n", 45 | " self.db[(pos, offset)] = (float(pos_score), float(neg_score))\n", 46 | "\n", 47 | " def senti_synset(self, *vals): \n", 48 | " if tuple(vals) in self.db:\n", 49 | " pos_score, neg_score = self.db[tuple(vals)]\n", 50 | " pos, offset = vals\n", 51 | " synset = wordnet._synset_from_pos_and_offset(pos, offset)\n", 52 | " return SentiSynset(pos_score, neg_score, synset)\n", 53 | " else:\n", 54 | " synset = wordnet.synset(vals[0])\n", 55 | " pos = synset.pos\n", 56 | " offset = synset.offset\n", 57 | 
" if (pos, offset) in self.db:\n", 58 | " pos_score, neg_score = self.db[(pos, offset)]\n", 59 | " return SentiSynset(pos_score, neg_score, synset)\n", 60 | " else:\n", 61 | " return None\n", 62 | "\n", 63 | " def senti_synsets(self, string, pos=None):\n", 64 | " sentis = []\n", 65 | " synset_list = wordnet.synsets(string, pos)\n", 66 | " for synset in synset_list:\n", 67 | " sentis.append(self.senti_synset(synset.name))\n", 68 | " sentis = filter(lambda x : x, sentis)\n", 69 | " return sentis\n", 70 | "\n", 71 | " def all_senti_synsets(self):\n", 72 | " for key, fields in self.db.iteritems():\n", 73 | " pos, offset = key\n", 74 | " pos_score, neg_score = fields\n", 75 | " synset = wordnet._synset_from_pos_and_offset(pos, offset)\n", 76 | " yield SentiSynset(pos_score, neg_score, synset)\n", 77 | "\n", 78 | "class SentiSynset:\n", 79 | " def __init__(self, pos_score, neg_score, synset):\n", 80 | " self.pos_score = pos_score\n", 81 | " self.neg_score = neg_score\n", 82 | " self.obj_score = 1.0 - (self.pos_score + self.neg_score)\n", 83 | " self.synset = synset\n", 84 | "\n", 85 | " def __str__(self):\n", 86 | " \"\"\"Prints just the Pos/Neg scores for now.\"\"\"\n", 87 | " s = \"\"\n", 88 | " s += self.synset.name + \"\\t\"\n", 89 | " s += \"PosScore: %s\\t\" % self.pos_score \n", 90 | " s += \"NegScore: %s\" % self.neg_score\n", 91 | " return s\n", 92 | "\n", 93 | " def __repr__(self):\n", 94 | " return \"Senti\" + repr(self.synset)\n", 95 | "\n", 96 | "def tweet_dict(twitterData): \n", 97 | " ''' (file) -> list of dictionaries\n", 98 | " This method should take your .csv\n", 99 | " file and create a list of dictionaries.\n", 100 | " '''\n", 101 | " twitter_list_dict = [] \n", 102 | " twitterfile = open(twitterData)\n", 103 | " twitterreader = csv.reader(twitterfile)\n", 104 | " for line in twitterreader:\n", 105 | " twitter_list_dict.append(line[0])\n", 106 | " return twitter_list_dict\n", 107 | "\n", 108 | "# return true if a string ia a stopword\n", 109 | "def 
is_stopword(string):\n", 110 | " if string.lower() in nltk.corpus.stopwords.words('english'):\n", 111 | " return True\n", 112 | " else:\n", 113 | " return False\n", 114 | "\n", 115 | "# return true if a string is punctuation\n", 116 | "def is_punctuation(string):\n", 117 | " for char in string:\n", 118 | " if char.isalpha() or char.isdigit():\n", 119 | " return False\n", 120 | " return True\n", 121 | "\n", 122 | "# Translate an nltk POS tag to a WordNet POS code\n", 123 | "def wordnet_pos_code(tag):\n", 124 | " if tag.startswith('NN'):\n", 125 | " return wordnet.NOUN\n", 126 | " elif tag.startswith('VB'):\n", 127 | " return wordnet.VERB\n", 128 | " elif tag.startswith('JJ'):\n", 129 | " return wordnet.ADJ\n", 130 | " elif tag.startswith('RB'):\n", 131 | " return wordnet.ADV\n", 132 | " else:\n", 133 | " return ''\n", 134 | "\n", 135 | " \n", 136 | "# Translate an nltk POS tag to a readable label\n", 137 | "def wordnet_pos_label(tag):\n", 138 | " if tag.startswith('NN'):\n", 139 | " return \"Noun\"\n", 140 | " elif tag.startswith('VB'):\n", 141 | " return \"Verb\"\n", 142 | " elif tag.startswith('JJ'):\n", 143 | " return \"Adjective\"\n", 144 | " elif tag.startswith('RB'):\n", 145 | " return \"Adverb\"\n", 146 | " else:\n", 147 | " return tag\n", 148 | " \n", 149 | "\n", 150 | "\"\"\" input -> a sentence \n", 151 | " output -> the sentence with each word enriched with -> lemma, wordnet_pos, wordnet_definitions \n", 152 | "\n", 153 | "\"\"\"\n", 154 | "def wordnet_definitions(sentence):\n", 155 | " wnl = nltk.WordNetLemmatizer()\n", 156 | " for token in sentence:\n", 157 | " word = token['word']\n", 158 | " wn_pos = wordnet_pos_code(token['pos'])\n", 159 | " if is_punctuation(word):\n", 160 | " token['punct'] = True\n", 161 | " elif is_stopword(word):\n", 162 | " pass\n", 163 | " elif len(wordnet.synsets(word, wn_pos)) > 0:\n", 164 | " token['wn_lemma'] = wnl.lemmatize(word.lower())\n", 165 | " token['wn_pos'] = wordnet_pos_label(token['pos'])\n", 166 | 
" defs = [sense.definition for sense in wordnet.synsets(word, wn_pos)]\n", 167 | " token['wn_def'] = \"; \\n\".join(defs) \n", 168 | " else:\n", 169 | " pass\n", 170 | " return sentence\n", 171 | "\n", 172 | "\n", 173 | "#Tokenization\n", 174 | "\n", 175 | "def tag_tweet(tweet): \n", 176 | " sents = nltk.sent_tokenize(tweet)\n", 177 | " sentence = []\n", 178 | " for sent in sents:\n", 179 | " tokens = nltk.word_tokenize(sent)\n", 180 | " tag_tuples = nltk.pos_tag(tokens)\n", 181 | " for (string, tag) in tag_tuples:\n", 182 | " token = {'word':string, 'pos':tag} \n", 183 | " sentence.append(token) \n", 184 | " return sentence\n", 185 | "\n", 186 | "\n", 187 | "\n", 188 | "# WSD\n", 189 | "\n", 190 | "def word_sense_disambiguate(word, wn_pos, tweet):\n", 191 | " senses = wordnet.synsets(word, wn_pos)\n", 192 | " if len(senses) >0:\n", 193 | " cfd = nltk.ConditionalFreqDist(\n", 194 | " (sense, def_word)\n", 195 | " for sense in senses\n", 196 | " for def_word in sense.definition.split()\n", 197 | " if def_word in tweet)\n", 198 | " best_sense = senses[0] # start with first sense\n", 199 | " for sense in senses:\n", 200 | " try:\n", 201 | " if cfd[sense].max() > cfd[best_sense].max():\n", 202 | " best_sense = sense\n", 203 | " except: \n", 204 | " pass \n", 205 | " return best_sense\n", 206 | " else:\n", 207 | " return None" 208 | ], 209 | "language": "python", 210 | "metadata": {}, 211 | "outputs": [], 212 | "prompt_number": 1 213 | }, 214 | { 215 | "cell_type": "code", 216 | "collapsed": false, 217 | "input": [ 218 | "sentiment = SentiWordNetCorpusReader(\"SentiWordNet_3.0.0_20130122.txt\")" 219 | ], 220 | "language": "python", 221 | "metadata": {}, 222 | "outputs": [], 223 | "prompt_number": 2 224 | }, 225 | { 226 | "cell_type": "code", 227 | "collapsed": false, 228 | "input": [ 229 | "revfile = open(\"rev_bad_2.txt\")\n", 230 | "\n", 231 | "review = revfile.read()\n", 232 | "\n", 233 | "review" 234 | ], 235 | "language": "python", 236 | "metadata": {}, 237 | 
"outputs": [ 238 | { 239 | "metadata": {}, 240 | "output_type": "pyout", 241 | "prompt_number": 20, 242 | "text": [ 243 | "'The company sent me 2 of this product and they both were not able to charge or sync my idevices. This Apple product is very high tech it actually has a chip that verifies the phone and the cable. The customer service was outstanding when dealing with the replacement of the product. I would buy from this seller again just not an apple product that has such high tech specifications. You get what you pay for.'" 244 | ] 245 | } 246 | ], 247 | "prompt_number": 20 248 | }, 249 | { 250 | "cell_type": "code", 251 | "collapsed": false, 252 | "input": [ 253 | "a = wordnet_definitions(tag_tweet(review))\n", 254 | "obj_score = 0 # object score \n", 255 | "pos_score=0 # positive score\n", 256 | "neg_score=0 #negative score\n", 257 | "pos_score_tre=0\n", 258 | "neg_score_tre=0\n", 259 | "threshold = 0.75\n", 260 | "count = 0\n", 261 | "count_tre = 0\n", 262 | "\n", 263 | "\"\"\"\n", 264 | "Conversion from plain text to SentiWordnet scores\n", 265 | "\"\"\"\n", 266 | " \n", 267 | "for word in a:\n", 268 | " if 'punct' not in word :\n", 269 | " sense = word_sense_disambiguate(word['word'], wordnet_pos_code(word['pos']), review)\n", 270 | " \n", 271 | " if sense is not None:\n", 272 | " sent = sentiment.senti_synset(sense.name)\n", 273 | " # Extraction of the scores\n", 274 | " if sent is not None and sent.obj_score <> 1:\n", 275 | " obj_score = obj_score + float(sent.obj_score)\n", 276 | " pos_score = pos_score + float(sent.pos_score)\n", 277 | " neg_score = neg_score + float(sent.neg_score)\n", 278 | " count=count+1\n", 279 | " print str(sent.pos_score)+ \" - \"+str(sent.neg_score)+ \" - \"+ str(sent.obj_score)+\" - \"+sent.synset.name\n", 280 | " if sent.obj_score < threshold:\n", 281 | " pos_score_tre = pos_score_tre + float(sent.pos_score)\n", 282 | " neg_score_tre = neg_score_tre + float(sent.neg_score)\n", 283 | " count_tre=count_tre+1\n", 284 | "print 
review\n", 285 | "\n", 286 | "#Evaluation by different methods\n", 287 | "\n", 288 | "avg_pos_score=0\n", 289 | "avg_neg_score=0\n", 290 | "avg_pos_score_tre=0\n", 291 | "avg_neg_score_tre=0\n", 292 | "\n", 293 | "#2\n", 294 | "\n", 295 | "if count <> 0:\n", 296 | " \n", 297 | " avg_pos_score=pos_score/count\n", 298 | " avg_neg_score=neg_score/count\n", 299 | "\n", 300 | "#3\n", 301 | "\n", 302 | "if count_tre <> 0:\n", 303 | " avg_pos_score_tre=pos_score_tre/count_tre\n", 304 | " avg_neg_score_tre=neg_score_tre/count_tre\n", 305 | "\n", 306 | "#print results\n", 307 | "#1\n", 308 | "print \"pos_total : \"+str(pos_score)+\" - neg_ total: \"+str(neg_score)+\" - count : \"+str(count)+\" -> \"+(\" positivo \" if pos_score > neg_score else (\"negativo\" if pos_score < neg_score else \"neutro\"))\n", 309 | "#2\n", 310 | "print \"(AVG) pos : \"+str(avg_pos_score)+\" - (AVG) neg : \"+str(avg_neg_score)+\" -> \"+(\" positivo \" if avg_pos_score > avg_neg_score else (\"negativo\" if avg_pos_score < avg_neg_score else \"neutro\"))\n", 311 | "#3\n", 312 | "if count_tre > 0:\n", 313 | " print \"(AVG_TRE) pos : \"+str(avg_pos_score_tre)+\" - (AVG_TRE) neg : \"+str(avg_neg_score_tre)+\" -> \"+(\" positivo \" if avg_pos_score_tre > avg_neg_score_tre else (\"negativo\" if avg_pos_score_tre < avg_neg_score_tre else \"neutro\"))\n", 314 | "print \"\"" 315 | ], 316 | "language": "python", 317 | "metadata": {}, 318 | "outputs": [ 319 | { 320 | "output_type": "stream", 321 | "stream": "stdout", 322 | "text": [ 323 | "0.25 - 0.125 - 0.625 - be.v.01\n", 324 | "0.0 - 0.625 - 0.375 - not.r.01\n", 325 | "0.25 - 0.125 - 0.625 - be.v.01\n", 326 | "0.25 - 0.25 - 0.5 - very.r.01\n", 327 | "0.125 - 0.0 - 0.875 - actually.r.02\n", 328 | "0.125 - 0.0 - 0.875 - affirm.v.02\n", 329 | "0.25 - 0.125 - 0.625 - be.v.01\n", 330 | "0.125 - 0.0 - 0.875 - barely.r.01\n", 331 | "0.0 - 0.625 - 0.375 - not.r.01\n", 332 | "The company sent me 2 of this product and they both were not able to charge or sync my 
idevices. This Apple product is very high tech it actually has a chip that verifies the phone and the cable. The customer service was outstanding when dealing with the replacement of the product. I would buy from this seller again just not an apple product that has such high tech specifications. You get what you pay for.\n", 333 | "pos_total : 1.375 - neg_ total: 1.875 - count : 9 -> negativo\n", 334 | "(AVG) pos : 0.152777777778 - (AVG) neg : 0.208333333333 -> negativo\n", 335 | "(AVG_TRE) pos : 0.166666666667 - (AVG_TRE) neg : 0.3125 -> negativo\n", 336 | "\n" 337 | ] 338 | } 339 | ], 340 | "prompt_number": 21 341 | } 342 | ], 343 | "metadata": {} 344 | } 345 | ] 346 | } -------------------------------------------------------------------------------- /BDA_Senti_ipython/Input_Output_Functions.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "metadata": { 3 | "name": "" 4 | }, 5 | "nbformat": 3, 6 | "nbformat_minor": 0, 7 | "worksheets": [ 8 | { 9 | "cells": [ 10 | { 11 | "cell_type": "code", 12 | "collapsed": false, 13 | "input": [ 14 | "import sys\n", 15 | "import csv\n", 16 | "import nltk\n", 17 | "from nltk.corpus import wordnet\n", 18 | "import re\n", 19 | "import codecs\n", 20 | "import pprint\n", 21 | "import nltk\n", 22 | "import requests\n", 23 | "import math\n", 24 | "import urllib2\n", 25 | "\n", 26 | "from apiclient.discovery import build" 27 | ], 28 | "language": "python", 29 | "metadata": {}, 30 | "outputs": [], 31 | "prompt_number": 1 32 | }, 33 | { 34 | "cell_type": "code", 35 | "collapsed": false, 36 | "input": [ 37 | "class SentiWordNetCorpusReader:\n", 38 | " def __init__(self, filename):\n", 39 | " \"\"\"\n", 40 | " Argument:\n", 41 | " filename -- the name of the text file containing the\n", 42 | " SentiWordNet database\n", 43 | " \"\"\" \n", 44 | " self.filename = filename\n", 45 | " self.db = {}\n", 46 | " self.parse_src_file()\n", 47 | "\n", 48 | " def parse_src_file(self):\n", 49 | " lines = 
codecs.open(self.filename, \"r\", \"utf8\").read().splitlines()\n", 50 | " lines = filter((lambda x : not re.search(r\"^\\s*#\", x)), lines)\n", 51 | " for i, line in enumerate(lines):\n", 52 | " fields = re.split(r\"\\t+\", line)\n", 53 | " fields = map(unicode.strip, fields)\n", 54 | " try: \n", 55 | " pos, offset, pos_score, neg_score, synset_terms, gloss = fields\n", 56 | " except:\n", 57 | " sys.stderr.write(\"Line %s formatted incorrectly: %s\\n\" % (i, line))\n", 58 | " if pos and offset:\n", 59 | " offset = int(offset)\n", 60 | " self.db[(pos, offset)] = (float(pos_score), float(neg_score))\n", 61 | "\n", 62 | " def senti_synset(self, *vals): \n", 63 | " if tuple(vals) in self.db:\n", 64 | " pos_score, neg_score = self.db[tuple(vals)]\n", 65 | " pos, offset = vals\n", 66 | " synset = wordnet._synset_from_pos_and_offset(pos, offset)\n", 67 | " return SentiSynset(pos_score, neg_score, synset)\n", 68 | " else:\n", 69 | " synset = wordnet.synset(vals[0])\n", 70 | " pos = synset.pos\n", 71 | " offset = synset.offset\n", 72 | " if (pos, offset) in self.db:\n", 73 | " pos_score, neg_score = self.db[(pos, offset)]\n", 74 | " return SentiSynset(pos_score, neg_score, synset)\n", 75 | " else:\n", 76 | " return None\n", 77 | "\n", 78 | " def senti_synsets(self, string, pos=None):\n", 79 | " sentis = []\n", 80 | " synset_list = wordnet.synsets(string, pos)\n", 81 | " for synset in synset_list:\n", 82 | " sentis.append(self.senti_synset(synset.name))\n", 83 | " sentis = filter(lambda x : x, sentis)\n", 84 | " return sentis\n", 85 | "\n", 86 | " def all_senti_synsets(self):\n", 87 | " for key, fields in self.db.iteritems():\n", 88 | " pos, offset = key\n", 89 | " pos_score, neg_score = fields\n", 90 | " synset = wordnet._synset_from_pos_and_offset(pos, offset)\n", 91 | " yield SentiSynset(pos_score, neg_score, synset)\n", 92 | "\n", 93 | "class SentiSynset:\n", 94 | " def __init__(self, pos_score, neg_score, synset):\n", 95 | " self.pos_score = pos_score\n", 96 | " 
self.neg_score = neg_score\n", 97 | " self.obj_score = 1.0 - (self.pos_score + self.neg_score)\n", 98 | " self.synset = synset\n", 99 | "\n", 100 | " def __str__(self):\n", 101 | " \"\"\"Prints just the Pos/Neg scores for now.\"\"\"\n", 102 | " s = \"\"\n", 103 | " s += self.synset.name + \"\\t\"\n", 104 | " s += \"PosScore: %s\\t\" % self.pos_score \n", 105 | " s += \"NegScore: %s\" % self.neg_score\n", 106 | " return s\n", 107 | "\n", 108 | " def __repr__(self):\n", 109 | " return \"Senti\" + repr(self.synset)\n", 110 | "\n", 111 | "def tweet_dict(twitterData): \n", 112 | " ''' (file) -> list of strings\n", 113 | " Read your .csv file and return\n", 114 | " the first column of each row as a list.\n", 115 | " '''\n", 116 | " twitter_list_dict = [] \n", 117 | " twitterfile = open(twitterData)\n", 118 | " twitterreader = csv.reader(twitterfile)\n", 119 | " for line in twitterreader:\n", 120 | " twitter_list_dict.append(line[0])\n", 121 | " return twitter_list_dict\n", 122 | "\n", 123 | "# return true if a string is a stopword\n", 124 | "def is_stopword(string):\n", 125 | " if string.lower() in nltk.corpus.stopwords.words('english'):\n", 126 | " return True\n", 127 | " else:\n", 128 | " return False\n", 129 | "\n", 130 | "# return true if a string is punctuation\n", 131 | "def is_punctuation(string):\n", 132 | " for char in string:\n", 133 | " if char.isalpha() or char.isdigit():\n", 134 | " return False\n", 135 | " return True\n", 136 | "\n", 137 | "# Translate an nltk POS tag to a WordNet POS code\n", 138 | "def wordnet_pos_code(tag):\n", 139 | " if tag.startswith('NN'):\n", 140 | " return wordnet.NOUN\n", 141 | " elif tag.startswith('VB'):\n", 142 | " return wordnet.VERB\n", 143 | " elif tag.startswith('JJ'):\n", 144 | " return wordnet.ADJ\n", 145 | " elif tag.startswith('RB'):\n", 146 | " return wordnet.ADV\n", 147 | " else:\n", 148 | " return ''\n", 149 | "\n", 150 | " \n", 151 | "# Translate an nltk POS tag to a readable label\n", 152 | "def 
wordnet_pos_label(tag):\n", 153 | " if tag.startswith('NN'):\n", 154 | " return \"Noun\"\n", 155 | " elif tag.startswith('VB'):\n", 156 | " return \"Verb\"\n", 157 | " elif tag.startswith('JJ'):\n", 158 | " return \"Adjective\"\n", 159 | " elif tag.startswith('RB'):\n", 160 | " return \"Adverb\"\n", 161 | " else:\n", 162 | " return tag\n", 163 | " \n", 164 | "\n", 165 | "\"\"\" input -> a sentence \n", 166 | " output -> the sentence with each word enriched with -> lemma, wordnet_pos, wordnet_definitions \n", 167 | "\n", 168 | "\"\"\"\n", 169 | "def wordnet_definitions(sentence):\n", 170 | " wnl = nltk.WordNetLemmatizer()\n", 171 | " for token in sentence:\n", 172 | " word = token['word']\n", 173 | " wn_pos = wordnet_pos_code(token['pos'])\n", 174 | " if is_punctuation(word):\n", 175 | " token['punct'] = True\n", 176 | " elif is_stopword(word):\n", 177 | " pass\n", 178 | " elif len(wordnet.synsets(word, wn_pos)) > 0:\n", 179 | " token['wn_lemma'] = wnl.lemmatize(word.lower())\n", 180 | " token['wn_pos'] = wordnet_pos_label(token['pos'])\n", 181 | " defs = [sense.definition for sense in wordnet.synsets(word, wn_pos)]\n", 182 | " token['wn_def'] = \"; \\n\".join(defs) \n", 183 | " else:\n", 184 | " pass\n", 185 | " return sentence\n", 186 | "\n", 187 | "\n", 188 | "#Tokenization\n", 189 | "\n", 190 | "def tag_tweet(tweet): \n", 191 | " sents = nltk.sent_tokenize(tweet)\n", 192 | " sentence = []\n", 193 | " for sent in sents:\n", 194 | " tokens = nltk.word_tokenize(sent)\n", 195 | " tag_tuples = nltk.pos_tag(tokens)\n", 196 | " for (string, tag) in tag_tuples:\n", 197 | " token = {'word':string, 'pos':tag} \n", 198 | " sentence.append(token) \n", 199 | " return sentence\n", 200 | "\n", 201 | "\n", 202 | "\n", 203 | "# Word sense disambiguation (WSD)\n", 204 | "\n", 205 | "def word_sense_disambiguate(word, wn_pos, tweet):\n", 206 | " senses = wordnet.synsets(word, wn_pos)\n", 207 | " if len(senses) >0:\n", 208 | " cfd = nltk.ConditionalFreqDist(\n", 209 | " (sense, def_word)\n", 210 | " for sense 
in senses\n", 211 | " for def_word in sense.definition.split()\n", 212 | " if def_word in tweet)\n", 213 | " best_sense = senses[0] # start with first sense\n", 214 | " for sense in senses:\n", 215 | " try:\n", 216 | " if cfd[sense].max() > cfd[best_sense].max():\n", 217 | " best_sense = sense\n", 218 | " except: \n", 219 | " pass \n", 220 | " return best_sense\n", 221 | " else:\n", 222 | " return None\n", 223 | " \n" 224 | ], 225 | "language": "python", 226 | "metadata": {}, 227 | "outputs": [], 228 | "prompt_number": 2 229 | }, 230 | { 231 | "cell_type": "code", 232 | "collapsed": false, 233 | "input": [ 234 | "sentiment = SentiWordNetCorpusReader(\"SentiWordNet_3.0.0_20130122.txt\")" 235 | ], 236 | "language": "python", 237 | "metadata": {}, 238 | "outputs": [], 239 | "prompt_number": 3 240 | }, 241 | { 242 | "cell_type": "code", 243 | "collapsed": false, 244 | "input": [ 245 | "\n", 246 | "revfile = open(\"les.txt\")\n", 247 | "\n", 248 | "review = revfile.read()\n", 249 | "\n", 250 | "review" 251 | ], 252 | "language": "python", 253 | "metadata": {}, 254 | "outputs": [ 255 | { 256 | "metadata": {}, 257 | "output_type": "pyout", 258 | "prompt_number": 4, 259 | "text": [ 260 | "\"This is an example of great writing. The style, pace, characters, descriptions, settings, action--all of it. It's great. I read this book in a literature class and really enjoyed it. It is perhaps the best classic I have read besides Tolkien's The Hobbit. Treasure Island manages to stand alone as the go-to book on pirates, sea voyages, and young men seeking adventure. Before anyone decides to read children's or YA literature, they must first read Treasure Island to get an idea of what a good book should contain.\\nQuestion: Why haven't they made a modern movie of this? 
It would be great cinema.\"" 261 | ] 262 | } 263 | ], 264 | "prompt_number": 4 265 | }, 266 | { 267 | "cell_type": "code", 268 | "collapsed": false, 269 | "input": [ 270 | "a = tag_tweet(review)\n", 271 | "\n", 272 | "a" 273 | ], 274 | "language": "python", 275 | "metadata": {}, 276 | "outputs": [ 277 | { 278 | "metadata": {}, 279 | "output_type": "pyout", 280 | "prompt_number": 5, 281 | "text": [ 282 | "[{'pos': 'DT', 'word': 'This'},\n", 283 | " {'pos': 'VBZ', 'word': 'is'},\n", 284 | " {'pos': 'DT', 'word': 'an'},\n", 285 | " {'pos': 'NN', 'word': 'example'},\n", 286 | " {'pos': 'IN', 'word': 'of'},\n", 287 | " {'pos': 'JJ', 'word': 'great'},\n", 288 | " {'pos': 'NN', 'word': 'writing'},\n", 289 | " {'pos': '.', 'word': '.'},\n", 290 | " {'pos': 'DT', 'word': 'The'},\n", 291 | " {'pos': 'NN', 'word': 'style'},\n", 292 | " {'pos': ',', 'word': ','},\n", 293 | " {'pos': 'NN', 'word': 'pace'},\n", 294 | " {'pos': ',', 'word': ','},\n", 295 | " {'pos': 'NNS', 'word': 'characters'},\n", 296 | " {'pos': ',', 'word': ','},\n", 297 | " {'pos': 'NNS', 'word': 'descriptions'},\n", 298 | " {'pos': ',', 'word': ','},\n", 299 | " {'pos': 'NNS', 'word': 'settings'},\n", 300 | " {'pos': ',', 'word': ','},\n", 301 | " {'pos': 'NN', 'word': 'action'},\n", 302 | " {'pos': ':', 'word': '--'},\n", 303 | " {'pos': 'DT', 'word': 'all'},\n", 304 | " {'pos': 'IN', 'word': 'of'},\n", 305 | " {'pos': 'PRP', 'word': 'it'},\n", 306 | " {'pos': '.', 'word': '.'},\n", 307 | " {'pos': 'PRP', 'word': 'It'},\n", 308 | " {'pos': 'VBZ', 'word': \"'s\"},\n", 309 | " {'pos': 'JJ', 'word': 'great'},\n", 310 | " {'pos': '.', 'word': '.'},\n", 311 | " {'pos': 'PRP', 'word': 'I'},\n", 312 | " {'pos': 'VBP', 'word': 'read'},\n", 313 | " {'pos': 'DT', 'word': 'this'},\n", 314 | " {'pos': 'NN', 'word': 'book'},\n", 315 | " {'pos': 'IN', 'word': 'in'},\n", 316 | " {'pos': 'DT', 'word': 'a'},\n", 317 | " {'pos': 'NN', 'word': 'literature'},\n", 318 | " {'pos': 'NN', 'word': 'class'},\n", 319 | " {'pos': 
'CC', 'word': 'and'},\n", 320 | " {'pos': 'RB', 'word': 'really'},\n", 321 | " {'pos': 'VBN', 'word': 'enjoyed'},\n", 322 | " {'pos': 'PRP', 'word': 'it'},\n", 323 | " {'pos': '.', 'word': '.'},\n", 324 | " {'pos': 'PRP', 'word': 'It'},\n", 325 | " {'pos': 'VBZ', 'word': 'is'},\n", 326 | " {'pos': 'RB', 'word': 'perhaps'},\n", 327 | " {'pos': 'DT', 'word': 'the'},\n", 328 | " {'pos': 'JJS', 'word': 'best'},\n", 329 | " {'pos': 'JJ', 'word': 'classic'},\n", 330 | " {'pos': 'PRP', 'word': 'I'},\n", 331 | " {'pos': 'VBP', 'word': 'have'},\n", 332 | " {'pos': 'VBN', 'word': 'read'},\n", 333 | " {'pos': 'NNS', 'word': 'besides'},\n", 334 | " {'pos': 'NNP', 'word': 'Tolkien'},\n", 335 | " {'pos': 'POS', 'word': \"'s\"},\n", 336 | " {'pos': 'NNP', 'word': 'The'},\n", 337 | " {'pos': 'NNP', 'word': 'Hobbit'},\n", 338 | " {'pos': '.', 'word': '.'},\n", 339 | " {'pos': 'NN', 'word': 'Treasure'},\n", 340 | " {'pos': 'NNP', 'word': 'Island'},\n", 341 | " {'pos': 'NNS', 'word': 'manages'},\n", 342 | " {'pos': 'TO', 'word': 'to'},\n", 343 | " {'pos': 'VB', 'word': 'stand'},\n", 344 | " {'pos': 'RB', 'word': 'alone'},\n", 345 | " {'pos': 'IN', 'word': 'as'},\n", 346 | " {'pos': 'DT', 'word': 'the'},\n", 347 | " {'pos': 'NNP', 'word': 'go-to'},\n", 348 | " {'pos': 'NN', 'word': 'book'},\n", 349 | " {'pos': 'IN', 'word': 'on'},\n", 350 | " {'pos': 'NNS', 'word': 'pirates'},\n", 351 | " {'pos': ',', 'word': ','},\n", 352 | " {'pos': 'NN', 'word': 'sea'},\n", 353 | " {'pos': 'NNS', 'word': 'voyages'},\n", 354 | " {'pos': ',', 'word': ','},\n", 355 | " {'pos': 'CC', 'word': 'and'},\n", 356 | " {'pos': 'JJ', 'word': 'young'},\n", 357 | " {'pos': 'NNS', 'word': 'men'},\n", 358 | " {'pos': 'VBG', 'word': 'seeking'},\n", 359 | " {'pos': 'NN', 'word': 'adventure'},\n", 360 | " {'pos': '.', 'word': '.'},\n", 361 | " {'pos': 'IN', 'word': 'Before'},\n", 362 | " {'pos': 'NN', 'word': 'anyone'},\n", 363 | " {'pos': 'NNS', 'word': 'decides'},\n", 364 | " {'pos': 'TO', 'word': 'to'},\n", 365 | " 
{'pos': 'VB', 'word': 'read'},\n", 366 | " {'pos': 'NNS', 'word': 'children'},\n", 367 | " {'pos': 'POS', 'word': \"'s\"},\n", 368 | " {'pos': 'CC', 'word': 'or'},\n", 369 | " {'pos': 'NNP', 'word': 'YA'},\n", 370 | " {'pos': 'NN', 'word': 'literature'},\n", 371 | " {'pos': ',', 'word': ','},\n", 372 | " {'pos': 'PRP', 'word': 'they'},\n", 373 | " {'pos': 'MD', 'word': 'must'},\n", 374 | " {'pos': 'RB', 'word': 'first'},\n", 375 | " {'pos': 'VB', 'word': 'read'},\n", 376 | " {'pos': 'NNP', 'word': 'Treasure'},\n", 377 | " {'pos': 'NNP', 'word': 'Island'},\n", 378 | " {'pos': 'TO', 'word': 'to'},\n", 379 | " {'pos': 'VB', 'word': 'get'},\n", 380 | " {'pos': 'DT', 'word': 'an'},\n", 381 | " {'pos': 'NN', 'word': 'idea'},\n", 382 | " {'pos': 'IN', 'word': 'of'},\n", 383 | " {'pos': 'WP', 'word': 'what'},\n", 384 | " {'pos': 'DT', 'word': 'a'},\n", 385 | " {'pos': 'JJ', 'word': 'good'},\n", 386 | " {'pos': 'NN', 'word': 'book'},\n", 387 | " {'pos': 'MD', 'word': 'should'},\n", 388 | " {'pos': 'VB', 'word': 'contain'},\n", 389 | " {'pos': '.', 'word': '.'},\n", 390 | " {'pos': 'NN', 'word': 'Question'},\n", 391 | " {'pos': ':', 'word': ':'},\n", 392 | " {'pos': 'WRB', 'word': 'Why'},\n", 393 | " {'pos': 'VBP', 'word': 'have'},\n", 394 | " {'pos': 'RB', 'word': \"n't\"},\n", 395 | " {'pos': 'PRP', 'word': 'they'},\n", 396 | " {'pos': 'VBD', 'word': 'made'},\n", 397 | " {'pos': 'DT', 'word': 'a'},\n", 398 | " {'pos': 'JJ', 'word': 'modern'},\n", 399 | " {'pos': 'NN', 'word': 'movie'},\n", 400 | " {'pos': 'IN', 'word': 'of'},\n", 401 | " {'pos': 'DT', 'word': 'this'},\n", 402 | " {'pos': '.', 'word': '?'},\n", 403 | " {'pos': 'PRP', 'word': 'It'},\n", 404 | " {'pos': 'MD', 'word': 'would'},\n", 405 | " {'pos': 'VB', 'word': 'be'},\n", 406 | " {'pos': 'JJ', 'word': 'great'},\n", 407 | " {'pos': 'NN', 'word': 'cinema'},\n", 408 | " {'pos': '.', 'word': '.'}]" 409 | ] 410 | } 411 | ], 412 | "prompt_number": 5 413 | }, 414 | { 415 | "cell_type": "code", 416 | "collapsed": 
false, 417 | "input": [ 418 | "a=wordnet_definitions(a)\n", 419 | "a" 420 | ], 421 | "language": "python", 422 | "metadata": {}, 423 | "outputs": [ 424 | { 425 | "metadata": {}, 426 | "output_type": "pyout", 427 | "prompt_number": 6, 428 | "text": [ 429 | "[{'pos': 'DT', 'word': 'This'},\n", 430 | " {'pos': 'VBZ', 'word': 'is'},\n", 431 | " {'pos': 'DT', 'word': 'an'},\n", 432 | " {'pos': 'NN',\n", 433 | " 'wn_def': 'an item of information that is typical of a class or group; \\na representative form or pattern; \\nsomething to be imitated; \\npunishment intended as a warning to others; \\nan occurrence of something; \\na task performed or problem solved in order to develop skill or understanding',\n", 434 | " 'wn_lemma': 'example',\n", 435 | " 'wn_pos': 'Noun',\n", 436 | " 'word': 'example'},\n", 437 | " {'pos': 'IN', 'word': 'of'},\n", 438 | " {'pos': 'JJ',\n", 439 | " 'wn_def': 'relatively large in size or number or extent; larger than others of its kind; \\nof major significance or importance; \\nremarkable or out of the ordinary in degree or magnitude or effect; \\nvery good; \\nuppercase; \\nin an advanced stage of pregnancy',\n", 440 | " 'wn_lemma': 'great',\n", 441 | " 'wn_pos': 'Adjective',\n", 442 | " 'word': 'great'},\n", 443 | " {'pos': 'NN',\n", 444 | " 'wn_def': 'the act of creating written works; \\nthe work of a writer; anything expressed in letters of the alphabet (especially when considered from the point of view of style and effect); \\n(usually plural) the collected work of an author; \\nletters or symbols that are written or imprinted on a surface to represent the sounds or words of a language; \\nthe activity of putting something in written form',\n", 445 | " 'wn_lemma': 'writing',\n", 446 | " 'wn_pos': 'Noun',\n", 447 | " 'word': 'writing'},\n", 448 | " {'pos': '.', 'punct': True, 'word': '.'},\n", 449 | " {'pos': 'DT', 'word': 'The'},\n", 450 | " {'pos': 'NN',\n", 451 | " 'wn_def': 'how something is done or how it happens; \\na way of 
expressing something (in language or art or music etc.) that is characteristic of a particular person or group of people or period; \\na particular kind (as to appearance); \\nthe popular taste at a given time; \\n(botany) the narrow elongated part of the pistil between the ovary and the stigma; \\neditorial directions to be followed in spelling and punctuation and capitalization and typographical display; \\ndistinctive and stylish elegance; \\na pointed tool for writing or drawing or engraving; \\na slender bristlelike or tubular process',\n", 452 | " 'wn_lemma': 'style',\n", 453 | " 'wn_pos': 'Noun',\n", 454 | " 'word': 'style'},\n", 455 | " {'pos': ',', 'punct': True, 'word': ','},\n", 456 | " {'pos': 'NN',\n", 457 | " 'wn_def': 'the rate of moving (especially walking or running); \\nthe distance covered by a step; \\nthe relative speed of progress or change; \\na step in walking or running; \\nthe rate of some repeating event; \\na unit of length equal to 3 feet; defined as 91.44 centimeters; originally taken to be the average length of a stride',\n", 458 | " 'wn_lemma': 'pace',\n", 459 | " 'wn_pos': 'Noun',\n", 460 | " 'word': 'pace'},\n", 461 | " {'pos': ',', 'punct': True, 'word': ','},\n", 462 | " {'pos': 'NNS',\n", 463 | " 'wn_def': \"an imaginary person represented in a work of fiction (play or film or story); \\na characteristic property that defines the apparent individual nature of something; \\nthe inherent complex of attributes that determines a persons moral and ethical actions and reactions; \\nan actor's portrayal of someone in a play; \\na person of a specified kind (usually with many eccentricities); \\ngood repute; \\na formal recommendation by a former employer to a potential future employer describing the person's qualifications and dependability; \\na written symbol that is used to represent speech; \\n(genetics) an attribute (structural or functional) that is determined by a gene or group of genes\",\n", 464 | " 'wn_lemma': 
'character',\n", 465 | " 'wn_pos': 'Noun',\n", 466 | " 'word': 'characters'},\n", 467 | " {'pos': ',', 'punct': True, 'word': ','},\n", 468 | " {'pos': 'NNS',\n", 469 | " 'wn_def': 'a statement that represents something in words; \\nthe act of describing something; \\nsort or variety',\n", 470 | " 'wn_lemma': 'description',\n", 471 | " 'wn_pos': 'Noun',\n", 472 | " 'word': 'descriptions'},\n", 473 | " {'pos': ',', 'punct': True, 'word': ','},\n", 474 | " {'pos': 'NNS',\n", 475 | " 'wn_def': 'the context and environment in which something is set; \\nthe state of the environment in which a situation exists; \\narrangement of scenery and properties to represent the place where a play or movie is enacted; \\nthe set of facts or circumstances that surround a situation or event; \\nthe physical position of something; \\na table service for one person; \\na mounting consisting of a piece of metal (as in a ring or other jewelry) that holds a gem in place',\n", 476 | " 'wn_lemma': 'setting',\n", 477 | " 'wn_pos': 'Noun',\n", 478 | " 'word': 'settings'},\n", 479 | " {'pos': ',', 'punct': True, 'word': ','},\n", 480 | " {'pos': 'NN',\n", 481 | " 'wn_def': 'something done (usually as opposed to something said); \\nthe state of being active; \\na military engagement; \\na process existing in or produced by nature (rather than by the intent of human beings); \\nthe series of events that form a plot; \\nthe trait of being active and energetic and forceful; \\nthe operating part that transmits power to a mechanism; \\na judicial proceeding brought by one party against another; one party prosecutes another for a wrong done or for protection of a right or for prevention of a wrong; \\nan act by a government body or supranational organization; \\nthe most important or interesting work or activity in a specific area or field',\n", 482 | " 'wn_lemma': 'action',\n", 483 | " 'wn_pos': 'Noun',\n", 484 | " 'word': 'action'},\n", 485 | " {'pos': ':', 'punct': True, 'word': '--'},\n", 486 | 
" {'pos': 'DT', 'word': 'all'},\n", 487 | " {'pos': 'IN', 'word': 'of'},\n", 488 | " {'pos': 'PRP', 'word': 'it'},\n", 489 | " {'pos': '.', 'punct': True, 'word': '.'},\n", 490 | " {'pos': 'PRP', 'word': 'It'},\n", 491 | " {'pos': 'VBZ', 'word': \"'s\"},\n", 492 | " {'pos': 'JJ',\n", 493 | " 'wn_def': 'relatively large in size or number or extent; larger than others of its kind; \\nof major significance or importance; \\nremarkable or out of the ordinary in degree or magnitude or effect; \\nvery good; \\nuppercase; \\nin an advanced stage of pregnancy',\n", 494 | " 'wn_lemma': 'great',\n", 495 | " 'wn_pos': 'Adjective',\n", 496 | " 'word': 'great'},\n", 497 | " {'pos': '.', 'punct': True, 'word': '.'},\n", 498 | " {'pos': 'PRP', 'word': 'I'},\n", 499 | " {'pos': 'VBP',\n", 500 | " 'wn_def': 'interpret something that is written or printed; \\nhave or contain a certain wording or form; \\nlook at, interpret, and say out loud something that is written or printed; \\nobtain data from magnetic tapes; \\ninterpret the significance of, as of palms, tea leaves, intestines, the sky; also of human behavior; \\ninterpret something in a certain way; convey a particular meaning or impression; \\nbe a student of a certain subject; \\nindicate a certain reading; of gauges and instruments; \\naudition for a stage role by reading parts of a role; \\nto hear and understand; \\nmake sense of a language',\n", 501 | " 'wn_lemma': 'read',\n", 502 | " 'wn_pos': 'Verb',\n", 503 | " 'word': 'read'},\n", 504 | " {'pos': 'DT', 'word': 'this'},\n", 505 | " {'pos': 'NN',\n", 506 | " 'wn_def': 'a written work or composition that has been published (printed on pages bound together); \\nphysical objects consisting of a number of pages bound together; \\na compilation of the known facts regarding something or someone; \\na written version of a play or other dramatic composition; used in preparing for a performance; \\na record in which commercial accounts are recorded; \\na collection of playing 
cards satisfying the rules of a card game; \\na collection of rules or prescribed standards on the basis of which decisions are made; \\nthe sacred writings of Islam revealed by God to the prophet Muhammad during his life at Mecca and Medina; \\nthe sacred writings of the Christian religions; \\na major division of a long written composition; \\na number of sheets (ticket or stamps etc.) bound together on one edge',\n", 507 | " 'wn_lemma': 'book',\n", 508 | " 'wn_pos': 'Noun',\n", 509 | " 'word': 'book'},\n", 510 | " {'pos': 'IN', 'word': 'in'},\n", 511 | " {'pos': 'DT', 'word': 'a'},\n", 512 | " {'pos': 'NN',\n", 513 | " 'wn_def': 'creative writing of recognized artistic value; \\nthe humanistic study of a body of literature; \\npublished writings in a particular style on a particular subject; \\nthe profession or art of a writer',\n", 514 | " 'wn_lemma': 'literature',\n", 515 | " 'wn_pos': 'Noun',\n", 516 | " 'word': 'literature'},\n", 517 | " {'pos': 'NN',\n", 518 | " 'wn_def': 'a collection of things sharing a common attribute; \\na body of students who are taught together; \\npeople having the same social, economic, or educational status; \\neducation imparted in a series of lessons or meetings; \\na league ranked by quality; \\na body of students who graduate together; \\n(biology) a taxonomic group containing one or more orders; \\nelegance in dress or behavior',\n", 519 | " 'wn_lemma': 'class',\n", 520 | " 'wn_pos': 'Noun',\n", 521 | " 'word': 'class'},\n", 522 | " {'pos': 'CC', 'word': 'and'},\n", 523 | " {'pos': 'RB',\n", 524 | " 'wn_def': \"in accordance with truth or fact or reality; \\nin actual fact; \\nin fact (used as intensifiers or sentence modifiers); \\nused as intensifiers; `real' is sometimes used informally for `really'; `rattling' is informal\",\n", 525 | " 'wn_lemma': 'really',\n", 526 | " 'wn_pos': 'Adverb',\n", 527 | " 'word': 'really'},\n", 528 | " {'pos': 'VBN',\n", 529 | " 'wn_def': \"derive or receive pleasure from; get enjoyment 
from; take pleasure in; \\nhave benefit from; \\nget pleasure from; \\nhave for one's benefit; \\ntake delight in\",\n", 530 | " 'wn_lemma': 'enjoyed',\n", 531 | " 'wn_pos': 'Verb',\n", 532 | " 'word': 'enjoyed'},\n", 533 | " {'pos': 'PRP', 'word': 'it'},\n", 534 | " {'pos': '.', 'punct': True, 'word': '.'},\n", 535 | " {'pos': 'PRP', 'word': 'It'},\n", 536 | " {'pos': 'VBZ', 'word': 'is'},\n", 537 | " {'pos': 'RB',\n", 538 | " 'wn_def': 'by chance',\n", 539 | " 'wn_lemma': 'perhaps',\n", 540 | " 'wn_pos': 'Adverb',\n", 541 | " 'word': 'perhaps'},\n", 542 | " {'pos': 'DT', 'word': 'the'},\n", 543 | " {'pos': 'JJS',\n", 544 | " 'wn_def': \"(superlative of `good') having the most positive qualities; \\n(comparative and superlative of `well') wiser or more advantageous and hence advisable; \\nhaving desirable or positive qualities especially those suitable for a thing specified; \\nhaving the normally expected amount; \\nmorally admirable; \\ndeserving of esteem and respect; \\npromoting or enhancing well-being; \\nagreeable or pleasing; \\nof moral excellence; \\nhaving or showing knowledge and skill and aptitude; \\nthorough; \\nwith or in a close or intimate relationship; \\nfinancially sound; \\nmost suitable or right for a particular purpose; \\nresulting favorably; \\nexerting force or influence; \\ncapable of pleasing; \\nappealing to the mind; \\nin excellent physical condition; \\ntending to promote physical well-being; beneficial to health; \\nnot forged; \\nnot left to spoil; \\ngenerally admired\",\n", 545 | " 'wn_lemma': 'best',\n", 546 | " 'wn_pos': 'Adjective',\n", 547 | " 'word': 'best'},\n", 548 | " {'pos': 'JJ',\n", 549 | " 'wn_def': 'of recognized authority or excellence; \\nof or relating to the most highly developed stage of an earlier civilisation and its culture; \\nof or pertaining to or characteristic of the ancient Greek and Roman cultures',\n", 550 | " 'wn_lemma': 'classic',\n", 551 | " 'wn_pos': 'Adjective',\n", 552 | " 'word': 
'classic'},\n", 553 | " {'pos': 'PRP', 'word': 'I'},\n", 554 | " {'pos': 'VBP', 'word': 'have'},\n", 555 | " {'pos': 'VBN',\n", 556 | " 'wn_def': 'interpret something that is written or printed; \\nhave or contain a certain wording or form; \\nlook at, interpret, and say out loud something that is written or printed; \\nobtain data from magnetic tapes; \\ninterpret the significance of, as of palms, tea leaves, intestines, the sky; also of human behavior; \\ninterpret something in a certain way; convey a particular meaning or impression; \\nbe a student of a certain subject; \\nindicate a certain reading; of gauges and instruments; \\naudition for a stage role by reading parts of a role; \\nto hear and understand; \\nmake sense of a language',\n", 557 | " 'wn_lemma': 'read',\n", 558 | " 'wn_pos': 'Verb',\n", 559 | " 'word': 'read'},\n", 560 | " {'pos': 'NNS', 'word': 'besides'},\n", 561 | " {'pos': 'NNP',\n", 562 | " 'wn_def': 'British philologist and writer of fantasies (born in South Africa) (1892-1973)',\n", 563 | " 'wn_lemma': 'tolkien',\n", 564 | " 'wn_pos': 'Noun',\n", 565 | " 'word': 'Tolkien'},\n", 566 | " {'pos': 'POS', 'word': \"'s\"},\n", 567 | " {'pos': 'NNP', 'word': 'The'},\n", 568 | " {'pos': 'NNP',\n", 569 | " 'wn_def': 'an imaginary being similar to a person but smaller and with hairy feet; invented by J.R.R. 
Tolkien',\n", 570 | " 'wn_lemma': 'hobbit',\n", 571 | " 'wn_pos': 'Noun',\n", 572 | " 'word': 'Hobbit'},\n", 573 | " {'pos': '.', 'punct': True, 'word': '.'},\n", 574 | " {'pos': 'NN',\n", 575 | " 'wn_def': 'accumulated wealth in the form of money or jewels etc.; \\nart highly prized for its beauty or perfection; \\nany possession that is highly valued by its owner; \\na collection of precious things',\n", 576 | " 'wn_lemma': 'treasure',\n", 577 | " 'wn_pos': 'Noun',\n", 578 | " 'word': 'Treasure'},\n", 579 | " {'pos': 'NNP',\n", 580 | " 'wn_def': 'a land mass (smaller than a continent) that is surrounded by water; \\na zone or area resembling an island',\n", 581 | " 'wn_lemma': 'island',\n", 582 | " 'wn_pos': 'Noun',\n", 583 | " 'word': 'Island'},\n", 584 | " {'pos': 'NNS', 'word': 'manages'},\n", 585 | " {'pos': 'TO', 'word': 'to'},\n", 586 | " {'pos': 'VB',\n", 587 | " 'wn_def': \"be standing; be upright; \\nbe in some specified state or condition; \\noccupy a place or location, also metaphorically; \\nhold one's ground; maintain a position; be steadfast or upright; \\nput up with something or somebody unpleasant; \\nhave or maintain a position or stand on an issue; \\nremain inactive or immobile; \\nbe in effect; be or remain in force; \\nbe tall; have a height of; copula; \\nput into an upright position; \\nwithstand the force of something; \\nbe available for stud services\",\n", 588 | " 'wn_lemma': 'stand',\n", 589 | " 'wn_pos': 'Verb',\n", 590 | " 'word': 'stand'},\n", 591 | " {'pos': 'RB',\n", 592 | " 'wn_def': 'without any others being included or involved; \\nwithout anybody else or anything else',\n", 593 | " 'wn_lemma': 'alone',\n", 594 | " 'wn_pos': 'Adverb',\n", 595 | " 'word': 'alone'},\n", 596 | " {'pos': 'IN', 'word': 'as'},\n", 597 | " {'pos': 'DT', 'word': 'the'},\n", 598 | " {'pos': 'NNP', 'word': 'go-to'},\n", 599 | " {'pos': 'NN',\n", 600 | " 'wn_def': 'a written work or composition that has been published (printed on pages bound together); 
\\nphysical objects consisting of a number of pages bound together; \\na compilation of the known facts regarding something or someone; \\na written version of a play or other dramatic composition; used in preparing for a performance; \\na record in which commercial accounts are recorded; \\na collection of playing cards satisfying the rules of a card game; \\na collection of rules or prescribed standards on the basis of which decisions are made; \\nthe sacred writings of Islam revealed by God to the prophet Muhammad during his life at Mecca and Medina; \\nthe sacred writings of the Christian religions; \\na major division of a long written composition; \\na number of sheets (ticket or stamps etc.) bound together on one edge',\n", 601 | " 'wn_lemma': 'book',\n", 602 | " 'wn_pos': 'Noun',\n", 603 | " 'word': 'book'},\n", 604 | " {'pos': 'IN', 'word': 'on'},\n", 605 | " {'pos': 'NNS',\n", 606 | " 'wn_def': \"someone who uses another person's words or ideas as if they were his own; \\nsomeone who robs at sea or plunders the land from the sea without having a commission from any sovereign nation; \\na ship that is manned by pirates\",\n", 607 | " 'wn_lemma': 'pirate',\n", 608 | " 'wn_pos': 'Noun',\n", 609 | " 'word': 'pirates'},\n", 610 | " {'pos': ',', 'punct': True, 'word': ','},\n", 611 | " {'pos': 'NN',\n", 612 | " 'wn_def': 'a division of an ocean or a large body of salt water partially enclosed by land; \\nanything apparently limitless in quantity or volume; \\nturbulent water with swells of considerable size',\n", 613 | " 'wn_lemma': 'sea',\n", 614 | " 'wn_pos': 'Noun',\n", 615 | " 'word': 'sea'},\n", 616 | " {'pos': 'NNS',\n", 617 | " 'wn_def': 'an act of traveling by water; \\na journey to some distant place',\n", 618 | " 'wn_lemma': 'voyage',\n", 619 | " 'wn_pos': 'Noun',\n", 620 | " 'word': 'voyages'},\n", 621 | " {'pos': ',', 'punct': True, 'word': ','},\n", 622 | " {'pos': 'CC', 'word': 'and'},\n", 623 | " {'pos': 'JJ',\n", 624 | " 'wn_def': '(used of 
living things especially persons) in an early period of life or development or growth; \\n(of crops) harvested at an early stage of development; before complete maturity; \\nsuggestive of youth; vigorous and fresh; \\nbeing in its early stage; \\nnot tried or tested by experience',\n", 625 | " 'wn_lemma': 'young',\n", 626 | " 'wn_pos': 'Adjective',\n", 627 | " 'word': 'young'},\n", 628 | " {'pos': 'NNS',\n", 629 | " 'wn_def': 'the force of workers available; \\nan adult person who is male (as opposed to a woman); \\nsomeone who serves in the armed forces; a member of a military force; \\nthe generic use of the word to refer to any human being; \\nany living or extinct member of the family Hominidae characterized by superior intelligence, articulate speech, and erect carriage; \\na male subordinate; \\nan adult male person who has a manly character (virile and courageous competent); \\na manservant who acts as a personal attendant to his employer; \\na male person who plays a significant role (husband or lover or boyfriend) in the life of a particular woman; \\none of the British Isles in the Irish Sea; \\ngame equipment consisting of an object used in playing certain board games; \\nall of the living human inhabitants of the earth',\n", 630 | " 'wn_lemma': 'men',\n", 631 | " 'wn_pos': 'Noun',\n", 632 | " 'word': 'men'},\n", 633 | " {'pos': 'VBG',\n", 634 | " 'wn_def': 'try to get or reach; \\ntry to locate or discover, or try to establish the existence of; \\nmake an effort or attempt; \\ngo to or towards; \\ninquire for',\n", 635 | " 'wn_lemma': 'seeking',\n", 636 | " 'wn_pos': 'Verb',\n", 637 | " 'word': 'seeking'},\n", 638 | " {'pos': 'NN',\n", 639 | " 'wn_def': 'a wild and exciting undertaking (not necessarily lawful)',\n", 640 | " 'wn_lemma': 'adventure',\n", 641 | " 'wn_pos': 'Noun',\n", 642 | " 'word': 'adventure'},\n", 643 | " {'pos': '.', 'punct': True, 'word': '.'},\n", 644 | " {'pos': 'IN', 'word': 'Before'},\n", 645 | " {'pos': 'NN', 'word': 
'anyone'},\n", 646 | " {'pos': 'NNS', 'word': 'decides'},\n", 647 | " {'pos': 'TO', 'word': 'to'},\n", 648 | " {'pos': 'VB',\n", 649 | " 'wn_def': 'interpret something that is written or printed; \\nhave or contain a certain wording or form; \\nlook at, interpret, and say out loud something that is written or printed; \\nobtain data from magnetic tapes; \\ninterpret the significance of, as of palms, tea leaves, intestines, the sky; also of human behavior; \\ninterpret something in a certain way; convey a particular meaning or impression; \\nbe a student of a certain subject; \\nindicate a certain reading; of gauges and instruments; \\naudition for a stage role by reading parts of a role; \\nto hear and understand; \\nmake sense of a language',\n", 650 | " 'wn_lemma': 'read',\n", 651 | " 'wn_pos': 'Verb',\n", 652 | " 'word': 'read'},\n", 653 | " {'pos': 'NNS',\n", 654 | " 'wn_def': 'a young person of either sex; \\na human offspring (son or daughter) of any age; \\nan immature childish person; \\na member of a clan or tribe',\n", 655 | " 'wn_lemma': 'child',\n", 656 | " 'wn_pos': 'Noun',\n", 657 | " 'word': 'children'},\n", 658 | " {'pos': 'POS', 'word': \"'s\"},\n", 659 | " {'pos': 'CC', 'word': 'or'},\n", 660 | " {'pos': 'NNP', 'word': 'YA'},\n", 661 | " {'pos': 'NN',\n", 662 | " 'wn_def': 'creative writing of recognized artistic value; \\nthe humanistic study of a body of literature; \\npublished writings in a particular style on a particular subject; \\nthe profession or art of a writer',\n", 663 | " 'wn_lemma': 'literature',\n", 664 | " 'wn_pos': 'Noun',\n", 665 | " 'word': 'literature'},\n", 666 | " {'pos': ',', 'punct': True, 'word': ','},\n", 667 | " {'pos': 'PRP', 'word': 'they'},\n", 668 | " {'pos': 'MD', 'word': 'must'},\n", 669 | " {'pos': 'RB',\n", 670 | " 'wn_def': 'before anything else; \\nthe initial time; \\nbefore another in time, space, or importance; \\nprominently forward',\n", 671 | " 'wn_lemma': 'first',\n", 672 | " 'wn_pos': 'Adverb',\n", 673 
| " 'word': 'first'},\n", 674 | " {'pos': 'VB',\n", 675 | " 'wn_def': 'interpret something that is written or printed; \\nhave or contain a certain wording or form; \\nlook at, interpret, and say out loud something that is written or printed; \\nobtain data from magnetic tapes; \\ninterpret the significance of, as of palms, tea leaves, intestines, the sky; also of human behavior; \\ninterpret something in a certain way; convey a particular meaning or impression; \\nbe a student of a certain subject; \\nindicate a certain reading; of gauges and instruments; \\naudition for a stage role by reading parts of a role; \\nto hear and understand; \\nmake sense of a language',\n", 676 | " 'wn_lemma': 'read',\n", 677 | " 'wn_pos': 'Verb',\n", 678 | " 'word': 'read'},\n", 679 | " {'pos': 'NNP',\n", 680 | " 'wn_def': 'accumulated wealth in the form of money or jewels etc.; \\nart highly prized for its beauty or perfection; \\nany possession that is highly valued by its owner; \\na collection of precious things',\n", 681 | " 'wn_lemma': 'treasure',\n", 682 | " 'wn_pos': 'Noun',\n", 683 | " 'word': 'Treasure'},\n", 684 | " {'pos': 'NNP',\n", 685 | " 'wn_def': 'a land mass (smaller than a continent) that is surrounded by water; \\na zone or area resembling an island',\n", 686 | " 'wn_lemma': 'island',\n", 687 | " 'wn_pos': 'Noun',\n", 688 | " 'word': 'Island'},\n", 689 | " {'pos': 'TO', 'word': 'to'},\n", 690 | " {'pos': 'VB',\n", 691 | " 'wn_def': 'come into the possession of something concrete or abstract; \\nenter or assume a certain state or condition; \\ncause to move; cause to be in a certain position or condition; \\nreceive a specified treatment (abstract); \\nreach a destination; arrive by movement or progress; \\ngo or come after and bring or take back; \\ngo through (mental or physical states or experiences); \\ntake vengeance on or get even; \\nachieve a point or goal; \\ncause to do; cause to act in a specified manner; \\nsucceed in catching or seizing, especially 
after a chase; \\ncome to have or undergo a change of (physical features and attributes); \\nbe stricken by an illness, fall victim to an illness; \\ncommunicate with a place or person; establish communication with, as if by telephone; \\ngive certain properties to something; \\nmove into a desired direction of discourse; \\ngrasp with the mind or develop an understanding of; \\nattract and fix; \\nreach with a blow or hit in a particular spot; \\nreach by calculation; \\nacquire as a result of some effort or action; \\npurchase; \\nperceive by hearing; \\nsuffer from the receipt of; \\nreceive as a retribution or punishment; \\nleave immediately; used usually in the imperative form; \\nreach and board; \\nirritate; \\nevoke an emotional response; \\napprehend and reproduce accurately; \\nearn or achieve a base by being walked by the pitcher; \\novercome or destroy; \\nbe a mystery or bewildering to; \\ntake the first step or steps in carrying out an action; \\nundergo (as of injuries and illnesses); \\nmake children',\n", 692 | " 'wn_lemma': 'get',\n", 693 | " 'wn_pos': 'Verb',\n", 694 | " 'word': 'get'},\n", 695 | " {'pos': 'DT', 'word': 'an'},\n", 696 | " {'pos': 'NN',\n", 697 | " 'wn_def': 'the content of cognition; the main thing you are thinking about; \\nyour intention; what you intend to do; \\na personal view; \\nan approximate calculation of quantity or degree or worth; \\n(music) melodic subject of a musical composition',\n", 698 | " 'wn_lemma': 'idea',\n", 699 | " 'wn_pos': 'Noun',\n", 700 | " 'word': 'idea'},\n", 701 | " {'pos': 'IN', 'word': 'of'},\n", 702 | " {'pos': 'WP', 'word': 'what'},\n", 703 | " {'pos': 'DT', 'word': 'a'},\n", 704 | " {'pos': 'JJ',\n", 705 | " 'wn_def': 'having desirable or positive qualities especially those suitable for a thing specified; \\nhaving the normally expected amount; \\nmorally admirable; \\ndeserving of esteem and respect; \\npromoting or enhancing well-being; \\nagreeable or pleasing; \\nof moral excellence; 
\\nhaving or showing knowledge and skill and aptitude; \\nthorough; \\nwith or in a close or intimate relationship; \\nfinancially sound; \\nmost suitable or right for a particular purpose; \\nresulting favorably; \\nexerting force or influence; \\ncapable of pleasing; \\nappealing to the mind; \\nin excellent physical condition; \\ntending to promote physical well-being; beneficial to health; \\nnot forged; \\nnot left to spoil; \\ngenerally admired',\n", 706 | " 'wn_lemma': 'good',\n", 707 | " 'wn_pos': 'Adjective',\n", 708 | " 'word': 'good'},\n", 709 | " {'pos': 'NN',\n", 710 | " 'wn_def': 'a written work or composition that has been published (printed on pages bound together); \\nphysical objects consisting of a number of pages bound together; \\na compilation of the known facts regarding something or someone; \\na written version of a play or other dramatic composition; used in preparing for a performance; \\na record in which commercial accounts are recorded; \\na collection of playing cards satisfying the rules of a card game; \\na collection of rules or prescribed standards on the basis of which decisions are made; \\nthe sacred writings of Islam revealed by God to the prophet Muhammad during his life at Mecca and Medina; \\nthe sacred writings of the Christian religions; \\na major division of a long written composition; \\na number of sheets (ticket or stamps etc.) 
bound together on one edge',\n", 711 | " 'wn_lemma': 'book',\n", 712 | " 'wn_pos': 'Noun',\n", 713 | " 'word': 'book'},\n", 714 | " {'pos': 'MD', 'word': 'should'},\n", 715 | " {'pos': 'VB',\n", 716 | " 'wn_def': 'include or contain; have as a component; \\ncontain or hold; have within; \\nlessen the intensity of; temper; hold in restraint; hold or keep within limits; \\nbe divisible by; \\nbe capable of holding or containing; \\nhold back, as of a danger or an enemy; check the expansion or influence of',\n", 717 | " 'wn_lemma': 'contain',\n", 718 | " 'wn_pos': 'Verb',\n", 719 | " 'word': 'contain'},\n", 720 | " {'pos': '.', 'punct': True, 'word': '.'},\n", 721 | " {'pos': 'NN',\n", 722 | " 'wn_def': 'an instance of questioning; \\nthe subject matter at issue; \\na sentence of inquiry that asks for a reply; \\nuncertainty about the truth or factuality or existence of something; \\na formal proposal for action made to a deliberative assembly for discussion and vote; \\nan informal reference to a marriage proposal',\n", 723 | " 'wn_lemma': 'question',\n", 724 | " 'wn_pos': 'Noun',\n", 725 | " 'word': 'Question'},\n", 726 | " {'pos': ':', 'punct': True, 'word': ':'},\n", 727 | " {'pos': 'WRB', 'word': 'Why'},\n", 728 | " {'pos': 'VBP', 'word': 'have'},\n", 729 | " {'pos': 'RB', 'word': \"n't\"},\n", 730 | " {'pos': 'PRP', 'word': 'they'},\n", 731 | " {'pos': 'VBD',\n", 732 | " 'wn_def': 'engage in; \\ngive certain properties to something; \\nmake or cause to be or to become; \\ncause to do; cause to act in a specified manner; \\ngive rise to; cause to happen or occur, not always intentionally; \\ncreate or manufacture a man-made product; \\nmake, formulate, or derive in the mind; \\ncompel or make somebody or something to act in a certain way; \\ncreate by artistic means; \\nearn on some commercial or business transaction; earn as salary or wages; \\ncreate or design, often in a certain way; \\nto compose or represent:\"This wall forms the background of the stage 
setting\"; \\nreach a goal, e.g., \"make the first team\"; \\nbe or be capable of being changed or made into; \\nmake by shaping or bringing together constituents; \\nperform or carry out; \\nmake by combining materials and parts; \\nchange from one form into another; \\nact in a certain way so as to acquire; \\ncharge with a function; charge to be; \\nachieve a point or goal; \\nreach a destination, either real or abstract; \\ninstitute, enact, or establish; \\ncarry out or commit; \\nform by assembling individuals or constituents; \\norganize or be responsible for; \\nput in order or neaten; \\nhead into a specified direction; \\nhave a bowel movement; \\nundergo fabrication or creation; \\nbe suitable for; \\nadd up to; \\namount to; \\nconstitute the essence of; \\nappear to begin an activity; \\nproceed along a path; \\nreach in time; \\ngather and light the materials for; \\nprepare for eating by applying heat; \\ninduce to have sex; \\nassure the success of; \\nrepresent fictitiously, as in a play, or pretend to be or act like; \\nconsider as being; \\ncalculate as being; \\ncause to be enjoyable or pleasurable; \\nfavor the development of; \\ndevelop into; \\nbehave in a certain way; \\neliminate urine',\n", 733 | " 'wn_lemma': 'made',\n", 734 | " 'wn_pos': 'Verb',\n", 735 | " 'word': 'made'},\n", 736 | " {'pos': 'DT', 'word': 'a'},\n", 737 | " {'pos': 'JJ',\n", 738 | " 'wn_def': 'belonging to the modern era; since the Middle Ages; \\nrelating to a recently developed fashion or style; ; \\ncharacteristic of present-day art and music and literature and architecture; \\nahead of the times; \\nused of a living language; being the current stage in its development',\n", 739 | " 'wn_lemma': 'modern',\n", 740 | " 'wn_pos': 'Adjective',\n", 741 | " 'word': 'modern'},\n", 742 | " {'pos': 'NN',\n", 743 | " 'wn_def': 'a form of entertainment that enacts a story by sound and a sequence of images giving the illusion of continuous movement',\n", 744 | " 'wn_lemma': 
'movie',\n", 745 | " 'wn_pos': 'Noun',\n", 746 | " 'word': 'movie'},\n", 747 | " {'pos': 'IN', 'word': 'of'},\n", 748 | " {'pos': 'DT', 'word': 'this'},\n", 749 | " {'pos': '.', 'punct': True, 'word': '?'},\n", 750 | " {'pos': 'PRP', 'word': 'It'},\n", 751 | " {'pos': 'MD', 'word': 'would'},\n", 752 | " {'pos': 'VB', 'word': 'be'},\n", 753 | " {'pos': 'JJ',\n", 754 | " 'wn_def': 'relatively large in size or number or extent; larger than others of its kind; \\nof major significance or importance; \\nremarkable or out of the ordinary in degree or magnitude or effect; \\nvery good; \\nuppercase; \\nin an advanced stage of pregnancy',\n", 755 | " 'wn_lemma': 'great',\n", 756 | " 'wn_pos': 'Adjective',\n", 757 | " 'word': 'great'},\n", 758 | " {'pos': 'NN',\n", 759 | " 'wn_def': 'a medium that disseminates moving pictures; \\na theater where films are shown',\n", 760 | " 'wn_lemma': 'cinema',\n", 761 | " 'wn_pos': 'Noun',\n", 762 | " 'word': 'cinema'},\n", 763 | " {'pos': '.', 'punct': True, 'word': '.'}]" 764 | ] 765 | } 766 | ], 767 | "prompt_number": 6 768 | }, 769 | { 770 | "cell_type": "code", 771 | "collapsed": false, 772 | "input": [ 773 | "\n", 774 | "a[39]\n" 775 | ], 776 | "language": "python", 777 | "metadata": {}, 778 | "outputs": [ 779 | { 780 | "metadata": {}, 781 | "output_type": "pyout", 782 | "prompt_number": 7, 783 | "text": [ 784 | "{'pos': 'VBN',\n", 785 | " 'wn_def': \"derive or receive pleasure from; get enjoyment from; take pleasure in; \\nhave benefit from; \\nget pleasure from; \\nhave for one's benefit; \\ntake delight in\",\n", 786 | " 'wn_lemma': 'enjoyed',\n", 787 | " 'wn_pos': 'Verb',\n", 788 | " 'word': 'enjoyed'}" 789 | ] 790 | } 791 | ], 792 | "prompt_number": 7 793 | }, 794 | { 795 | "cell_type": "code", 796 | "collapsed": false, 797 | "input": [ 798 | "sense = word_sense_disambiguate(a[39]['word'], wordnet_pos_code(a[39]['pos']), review)\n", 799 | "sense" 800 | ], 801 | "language": "python", 802 | "metadata": {}, 803 | "outputs": [ 804 
| { 805 | "metadata": {}, 806 | "output_type": "pyout", 807 | "prompt_number": 8, 808 | "text": [ 809 | "Synset('enjoy.v.01')" 810 | ] 811 | } 812 | ], 813 | "prompt_number": 8 814 | }, 815 | { 816 | "cell_type": "code", 817 | "collapsed": false, 818 | "input": [ 819 | "wordnet.synset('enjoy.v.01').definition" 820 | ], 821 | "language": "python", 822 | "metadata": {}, 823 | "outputs": [ 824 | { 825 | "metadata": {}, 826 | "output_type": "pyout", 827 | "prompt_number": 9, 828 | "text": [ 829 | "'derive or receive pleasure from; get enjoyment from; take pleasure in'" 830 | ] 831 | } 832 | ], 833 | "prompt_number": 9 834 | }, 835 | { 836 | "cell_type": "code", 837 | "collapsed": false, 838 | "input": [ 839 | "sent = sentiment.senti_synset(sense.name)\n", 840 | "sent.pos_score\n", 841 | "\n" 842 | ], 843 | "language": "python", 844 | "metadata": {}, 845 | "outputs": [ 846 | { 847 | "metadata": {}, 848 | "output_type": "pyout", 849 | "prompt_number": 10, 850 | "text": [ 851 | "0.375" 852 | ] 853 | } 854 | ], 855 | "prompt_number": 10 856 | }, 857 | { 858 | "cell_type": "code", 859 | "collapsed": false, 860 | "input": [ 861 | "sent.neg_score" 862 | ], 863 | "language": "python", 864 | "metadata": {}, 865 | "outputs": [ 866 | { 867 | "metadata": {}, 868 | "output_type": "pyout", 869 | "prompt_number": 14, 870 | "text": [ 871 | "0.0" 872 | ] 873 | } 874 | ], 875 | "prompt_number": 14 876 | }, 877 | { 878 | "cell_type": "code", 879 | "collapsed": false, 880 | "input": [ 881 | "sent.obj_score" 882 | ], 883 | "language": "python", 884 | "metadata": {}, 885 | "outputs": [ 886 | { 887 | "metadata": {}, 888 | "output_type": "pyout", 889 | "prompt_number": 15, 890 | "text": [ 891 | "0.625" 892 | ] 893 | } 894 | ], 895 | "prompt_number": 15 896 | }, 897 | { 898 | "cell_type": "code", 899 | "collapsed": false, 900 | "input": [ 901 | "API_KEY_BING=''\n", 902 | "API_KEY_GOOGLE=''\n", 903 | "\n", 904 | "class GoogleApi:\n", 905 | " def __init__(self):\n", 906 | " self.service = 
build(\"customsearch\", \"v1\", developerKey=API_KEY_GOOGLE)\n", 907 | "\n", 908 | "\n", 909 | " def count(self,query):\n", 910 | " res = self.service.cse().list(\n", 911 | " q=query,\n", 912 | " cx='017576662512468239146:omuauf_lfve',\n", 913 | " ).execute()\n", 914 | " if 'nextPage' in res['queries']:\n", 915 | " return float(res['queries']['nextPage'][0]['totalResults'])\n", 916 | " else:\n", 917 | " return float(res['queries']['request'][0]['totalResults'])\n", 918 | "\n", 919 | "def request_bing(query, **params):\n", 920 | " URL_BING = 'https://api.datamarket.azure.com/Bing/Search/v1/Composite?Sources=%(source)s&Query=%(query)s&$top=50&$format=json'\n", 921 | " url = URL_BING % {'source': urllib2.quote(\"'web'\"),\n", 922 | " 'query': urllib2.quote(\"'\"+query+\"'\")}\n", 923 | " r = requests.get(url, auth=('', API_KEY_BING))\n", 924 | " return float(r.json()['d']['results'][0]['WebTotal'])" 925 | ], 926 | "language": "python", 927 | "metadata": {}, 928 | "outputs": [], 929 | "prompt_number": 11 930 | }, 931 | { 932 | "cell_type": "code", 933 | "collapsed": false, 934 | "input": [ 935 | "g = GoogleApi()\n", 936 | "res = g.count(\"aaa\")\n", 937 | "res" 938 | ], 939 | "language": "python", 940 | "metadata": {}, 941 | "outputs": [ 942 | { 943 | "metadata": {}, 944 | "output_type": "pyout", 945 | "prompt_number": 12, 946 | "text": [ 947 | "62700000.0" 948 | ] 949 | } 950 | ], 951 | "prompt_number": 12 952 | }, 953 | { 954 | "cell_type": "code", 955 | "collapsed": false, 956 | "input": [ 957 | "res = request_bing(\"prova\")\n", 958 | "res" 959 | ], 960 | "language": "python", 961 | "metadata": {}, 962 | "outputs": [ 963 | { 964 | "metadata": {}, 965 | "output_type": "pyout", 966 | "prompt_number": 19, 967 | "text": [ 968 | "55600000.0" 969 | ] 970 | } 971 | ], 972 | "prompt_number": 19 973 | } 974 | ], 975 | "metadata": {} 976 | } 977 | ] 978 | } -------------------------------------------------------------------------------- 
/BDA_Senti_ipython/SentiWordNet_reviewClassification.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "metadata": { 3 | "name": "" 4 | }, 5 | "nbformat": 3, 6 | "nbformat_minor": 0, 7 | "worksheets": [ 8 | { 9 | "cells": [ 10 | { 11 | "cell_type": "code", 12 | "collapsed": false, 13 | "input": [ 14 | "import sys\n", 15 | "import csv\n", 16 | "import nltk\n", 17 | "from nltk.corpus import wordnet\n", 18 | "import re\n", 19 | "import codecs\n", 20 | "import pprint\n", 21 | "\n", 22 | "class SentiWordNetCorpusReader:\n", 23 | " def __init__(self, filename):\n", 24 | " \"\"\"\n", 25 | " Argument:\n", 26 | " filename -- the name of the text file containing the\n", 27 | " SentiWordNet database\n", 28 | " \"\"\" \n", 29 | " self.filename = filename\n", 30 | " self.db = {}\n", 31 | " self.parse_src_file()\n", 32 | "\n", 33 | " def parse_src_file(self):\n", 34 | " lines = codecs.open(self.filename, \"r\", \"utf8\").read().splitlines()\n", 35 | " lines = filter((lambda x : not re.search(r\"^\\s*#\", x)), lines)\n", 36 | " for i, line in enumerate(lines):\n", 37 | " fields = re.split(r\"\\t+\", line)\n", 38 | " fields = map(unicode.strip, fields)\n", 39 | " try: \n", 40 | " pos, offset, pos_score, neg_score, synset_terms, gloss = fields\n", 41 | " except:\n", 42 | " sys.stderr.write(\"Line %s formatted incorrectly: %s\\n\" % (i, line))\n", 43 | " if pos and offset:\n", 44 | " offset = int(offset)\n", 45 | " self.db[(pos, offset)] = (float(pos_score), float(neg_score))\n", 46 | "\n", 47 | " def senti_synset(self, *vals): \n", 48 | " if tuple(vals) in self.db:\n", 49 | " pos_score, neg_score = self.db[tuple(vals)]\n", 50 | " pos, offset = vals\n", 51 | " synset = wordnet._synset_from_pos_and_offset(pos, offset)\n", 52 | " return SentiSynset(pos_score, neg_score, synset)\n", 53 | " else:\n", 54 | " synset = wordnet.synset(vals[0])\n", 55 | " pos = synset.pos\n", 56 | " offset = synset.offset\n", 57 | " if (pos, offset) in 
self.db:\n", 58 | " pos_score, neg_score = self.db[(pos, offset)]\n", 59 | " return SentiSynset(pos_score, neg_score, synset)\n", 60 | " else:\n", 61 | " return None\n", 62 | "\n", 63 | " def senti_synsets(self, string, pos=None):\n", 64 | " sentis = []\n", 65 | " synset_list = wordnet.synsets(string, pos)\n", 66 | " for synset in synset_list:\n", 67 | " sentis.append(self.senti_synset(synset.name))\n", 68 | " sentis = filter(lambda x : x, sentis)\n", 69 | " return sentis\n", 70 | "\n", 71 | " def all_senti_synsets(self):\n", 72 | " for key, fields in self.db.iteritems():\n", 73 | " pos, offset = key\n", 74 | " pos_score, neg_score = fields\n", 75 | " synset = wordnet._synset_from_pos_and_offset(pos, offset)\n", 76 | " yield SentiSynset(pos_score, neg_score, synset)\n", 77 | "\n", 78 | "class SentiSynset:\n", 79 | " def __init__(self, pos_score, neg_score, synset):\n", 80 | " self.pos_score = pos_score\n", 81 | " self.neg_score = neg_score\n", 82 | " self.obj_score = 1.0 - (self.pos_score + self.neg_score)\n", 83 | " self.synset = synset\n", 84 | "\n", 85 | " def __str__(self):\n", 86 | " \"\"\"Prints just the Pos/Neg scores for now.\"\"\"\n", 87 | " s = \"\"\n", 88 | " s += self.synset.name + \"\\t\"\n", 89 | " s += \"PosScore: %s\\t\" % self.pos_score \n", 90 | " s += \"NegScore: %s\" % self.neg_score\n", 91 | " return s\n", 92 | "\n", 93 | " def __repr__(self):\n", 94 | " return \"Senti\" + repr(self.synset)\n", 95 | "\n", 96 | "def tweet_dict(twitterData): \n", 97 | " ''' (file) -> list of strings\n", 98 | " Read the .csv file and return the\n", 99 | " first column of each row as a list.\n", 100 | " '''\n", 101 | " twitter_list_dict = [] \n", 102 | " twitterfile = open(twitterData)\n", 103 | " twitterreader = csv.reader(twitterfile)\n", 104 | " for line in twitterreader:\n", 105 | " twitter_list_dict.append(line[0])\n", 106 | " return twitter_list_dict\n", 107 | "\n", 108 | "# return true if a string is a stopword\n", 109 | "def 
is_stopword(string):\n", 110 | " if string.lower() in nltk.corpus.stopwords.words('english'):\n", 111 | " return True\n", 112 | " else:\n", 113 | " return False\n", 114 | "\n", 115 | " # return true if a string is punctation \n", 116 | "def is_punctuation(string):\n", 117 | " for char in string:\n", 118 | " if char.isalpha() or char.isdigit():\n", 119 | " return False\n", 120 | " return True\n", 121 | "\n", 122 | "# Translation from nltk to Wordnet (words tag) (code)\n", 123 | "def wordnet_pos_code(tag):\n", 124 | " if tag.startswith('NN'):\n", 125 | " return wordnet.NOUN\n", 126 | " elif tag.startswith('VB'):\n", 127 | " return wordnet.VERB\n", 128 | " elif tag.startswith('JJ'):\n", 129 | " return wordnet.ADJ\n", 130 | " elif tag.startswith('RB'):\n", 131 | " return wordnet.ADV\n", 132 | " else:\n", 133 | " return ''\n", 134 | "\n", 135 | " \n", 136 | "# Translation from nltk to Wordnet (words tag) (label)\n", 137 | "def wordnet_pos_label(tag):\n", 138 | " if tag.startswith('NN'):\n", 139 | " return \"Noun\"\n", 140 | " elif tag.startswith('VB'):\n", 141 | " return \"Verb\"\n", 142 | " elif tag.startswith('JJ'):\n", 143 | " return \"Adjective\"\n", 144 | " elif tag.startswith('RB'):\n", 145 | " return \"Adverb\"\n", 146 | " else:\n", 147 | " return tag\n", 148 | " \n", 149 | "\n", 150 | "\"\"\" input -> a sentence \n", 151 | " otput -> sentence in which each words is enriched of -> lemma, wordnet_pos, wordnet_definitions \n", 152 | "\n", 153 | "\"\"\"\n", 154 | "def wordnet_definitions(sentence):\n", 155 | " wnl = nltk.WordNetLemmatizer()\n", 156 | " for token in sentence:\n", 157 | " word = token['word']\n", 158 | " wn_pos = wordnet_pos_code(token['pos'])\n", 159 | " if is_punctuation(word):\n", 160 | " token['punct'] = True\n", 161 | " elif is_stopword(word):\n", 162 | " pass\n", 163 | " elif len(wordnet.synsets(word, wn_pos)) > 0:\n", 164 | " token['wn_lemma'] = wnl.lemmatize(word.lower())\n", 165 | " token['wn_pos'] = wordnet_pos_label(token['pos'])\n", 166 | 
" defs = [sense.definition for sense in wordnet.synsets(word, wn_pos)]\n", 167 | " token['wn_def'] = \"; \\n\".join(defs) \n", 168 | " else:\n", 169 | " pass\n", 170 | " return sentence\n", 171 | "\n", 172 | "\n", 173 | "#Tokenization\n", 174 | "\n", 175 | "def tag_tweet(tweet): \n", 176 | " sents = nltk.sent_tokenize(tweet)\n", 177 | " sentence = []\n", 178 | " for sent in sents:\n", 179 | " tokens = nltk.word_tokenize(sent)\n", 180 | " tag_tuples = nltk.pos_tag(tokens)\n", 181 | " for (string, tag) in tag_tuples:\n", 182 | " token = {'word':string, 'pos':tag} \n", 183 | " sentence.append(token) \n", 184 | " return sentence\n", 185 | "\n", 186 | "\n", 187 | "\n", 188 | "# WSD\n", 189 | "\n", 190 | "def word_sense_disambiguate(word, wn_pos, tweet):\n", 191 | " senses = wordnet.synsets(word, wn_pos)\n", 192 | " if len(senses) >0:\n", 193 | " cfd = nltk.ConditionalFreqDist(\n", 194 | " (sense, def_word)\n", 195 | " for sense in senses\n", 196 | " for def_word in sense.definition.split()\n", 197 | " if def_word in tweet)\n", 198 | " best_sense = senses[0] # start with first sense\n", 199 | " for sense in senses:\n", 200 | " try:\n", 201 | " if cfd[sense].max() > cfd[best_sense].max():\n", 202 | " best_sense = sense\n", 203 | " except: \n", 204 | " pass \n", 205 | " return best_sense\n", 206 | " else:\n", 207 | " return None" 208 | ], 209 | "language": "python", 210 | "metadata": {}, 211 | "outputs": [], 212 | "prompt_number": 1 213 | }, 214 | { 215 | "cell_type": "code", 216 | "collapsed": false, 217 | "input": [ 218 | "sentiment = SentiWordNetCorpusReader(\"SentiWordNet_3.0.0_20130122.txt\")" 219 | ], 220 | "language": "python", 221 | "metadata": {}, 222 | "outputs": [], 223 | "prompt_number": 2 224 | }, 225 | { 226 | "cell_type": "code", 227 | "collapsed": false, 228 | "input": [ 229 | "#rev_bad_1.txt\n", 230 | "#rev_bad_2.txt\n", 231 | "#rev_good_b.txt\n", 232 | "#rev_good_s.txt\n", 233 | "#rev_good_t1.txt\n", 234 | "#rev_nutralbad_2.txt\n", 235 | "\n", 236 | 
"revfile = open(\"rev_nutralbad_2.txt\")\n", 237 | "\n", 238 | "review = revfile.read()\n", 239 | "\n", 240 | "review" 241 | ], 242 | "language": "python", 243 | "metadata": {}, 244 | "outputs": [ 245 | { 246 | "metadata": {}, 247 | "output_type": "pyout", 248 | "prompt_number": 4, 249 | "text": [ 250 | "\"I liked the look of this case at Best Buy versus the other they had available at the moment. I purchased this because I needed a case immediately and didn't want to risk carrying my new phone around with one. It seems like a study case and somewhat protective however it does not fit perfectly. It slides a little up and down in the case. The fit is fine left to right. I would probably look elsewhere for a case if I were you.\"" 251 | ] 252 | } 253 | ], 254 | "prompt_number": 4 255 | }, 256 | { 257 | "cell_type": "code", 258 | "collapsed": false, 259 | "input": [ 260 | "review = \"I bought this tablecloth in the taupe color for Thanksgiving dinner entertaining and was a little hesitant of what I would get for such a reasonable price. It washed well and didn't even need pressing after coming out of the dryer. The color worked out great with my gold-trimmed Lenox placesettings and the tablecloth was of a nice weight - not too flimsy yet not too heavy either. 
I'm pleased with this purchase and may order another in a smaller size for use now that the leaf is out of the table!\"" 261 | ], 262 | "language": "python", 263 | "metadata": {}, 264 | "outputs": [], 265 | "prompt_number": 8 266 | }, 267 | { 268 | "cell_type": "code", 269 | "collapsed": false, 270 | "input": [ 271 | "a = wordnet_definitions(tag_tweet(review))\n", 272 | "obj_score = 0 # objectivity score \n", 273 | "pos_score=0 # positive score\n", 274 | "neg_score=0 #negative score\n", 275 | "pos_score_tre=0\n", 276 | "neg_score_tre=0\n", 277 | "threshold = 0.75\n", 278 | "count = 0\n", 279 | "count_tre = 0\n", 280 | "\n", 281 | "\"\"\"\n", 282 | "Conversion from plain text to SentiWordnet scores\n", 283 | "\"\"\"\n", 284 | " \n", 285 | "for word in a:\n", 286 | " if 'punct' not in word :\n", 287 | " sense = word_sense_disambiguate(word['word'], wordnet_pos_code(word['pos']), review)\n", 288 | " \n", 289 | " if sense is not None:\n", 290 | " sent = sentiment.senti_synset(sense.name)\n", 291 | " # Extraction of the scores\n", 292 | " if sent is not None and sent.obj_score <> 1:\n", 293 | " obj_score = obj_score + float(sent.obj_score)\n", 294 | " pos_score = pos_score + float(sent.pos_score)\n", 295 | " neg_score = neg_score + float(sent.neg_score)\n", 296 | " count=count+1\n", 297 | " print str(sent.pos_score)+ \" - \"+str(sent.neg_score)+ \" - \"+ str(sent.obj_score)+\" - \"+sent.synset.name\n", 298 | " if sent.obj_score < threshold:\n", 299 | " pos_score_tre = pos_score_tre + float(sent.pos_score)\n", 300 | " neg_score_tre = neg_score_tre + float(sent.neg_score)\n", 301 | " count_tre=count_tre+1\n", 302 | "print review\n", 303 | "\n", 304 | "#Evaluation by different methods\n", 305 | "\n", 306 | "avg_pos_score=0\n", 307 | "avg_neg_score=0\n", 308 | "avg_pos_score_tre=0\n", 309 | "avg_neg_score_tre=0\n", 310 | "\n", 311 | "#2\n", 312 | "\n", 313 | "if count <> 0:\n", 314 | " \n", 315 | " avg_pos_score=pos_score/count\n", 316 | " avg_neg_score=neg_score/count\n", 317 | 
"\n", 318 | "#3\n", 319 | "\n", 320 | "if count_tre <> 0:\n", 321 | " avg_pos_score_tre=pos_score_tre/count_tre\n", 322 | " avg_neg_score_tre=neg_score_tre/count_tre\n", 323 | "\n", 324 | "#pint results\n", 325 | "#1\n", 326 | "print \"pos_total : \"+str(pos_score)+\" - neg_ total: \"+str(neg_score)+\" - count : \"+str(count)+\" -> \"+(\" positivo \" if pos_score > neg_score else (\"negativo\" if pos_score < neg_score else \"neutro\"))\n", 327 | "#2\n", 328 | "print \"(AVG) pos : \"+str(avg_pos_score)+\" - (AVG) neg : \"+str(avg_neg_score)+\" -> \"+(\" positivo \" if avg_pos_score > avg_neg_score else (\"negativo\" if avg_pos_score < avg_neg_score else \"neutro\"))\n", 329 | "#3\n", 330 | "if count_tre > 0:\n", 331 | " print \"(AVG_TRE) pos : \"+str(avg_pos_score_tre)+\" - (AVG_TRE) neg : \"+str(avg_neg_score_tre)+\" -> \"+(\" positivo \" if avg_pos_score_tre > avg_neg_score_tre else (\"negativo\" if avg_pos_score_tre < avg_neg_score_tre else \"neutro\"))\n", 332 | "print \"\"" 333 | ], 334 | "language": "python", 335 | "metadata": {}, 336 | "outputs": [ 337 | { 338 | "output_type": "stream", 339 | "stream": "stdout", 340 | "text": [ 341 | "0.125 - 0.0 - 0.875 - like.v.05\n", 342 | "0.25 - 0.0 - 0.75 - best.n.01\n", 343 | "0.5 - 0.0 - 0.5 - bargain.n.02\n", 344 | "0.25 - 0.25 - 0.5 - necessitate.v.01\n", 345 | "0.0 - 0.375 - 0.625 - immediately.r.01\n", 346 | "0.25 - 0.0 - 0.75 - desire.v.01\n", 347 | "0.0 - 0.25 - 0.75 - risk.v.01\n", 348 | "0.25 - 0.0 - 0.75 - new.a.06\n", 349 | "0.125 - 0.125 - 0.75 - appear.v.04\n", 350 | "0.125 - 0.0 - 0.875 - study.n.02\n", 351 | "0.125 - 0.0 - 0.875 - protective.a.01\n", 352 | "0.125 - 0.5 - 0.375 - however.r.01\n", 353 | "0.0 - 0.625 - 0.375 - not.r.01\n", 354 | "0.375 - 0.125 - 0.5 - perfectly.r.02\n", 355 | "0.0 - 0.375 - 0.625 - little.r.01\n", 356 | "0.5 - 0.0 - 0.5 - fit.n.03\n", 357 | "0.25 - 0.125 - 0.625 - be.v.01\n", 358 | "0.0 - 0.125 - 0.875 - fine.n.01\n", 359 | "0.25 - 0.0 - 0.75 - right.r.01\n", 360 | "0.25 - 
0.125 - 0.625 - be.v.01\n", 361 | "I liked the look of this case at Best Buy versus the other they had available at the moment. I purchased this because I needed a case immediately and didn't want to risk carrying my new phone around with one. It seems like a study case and somewhat protective however it does not fit perfectly. It slides a little up and down in the case. The fit is fine left to right. I would probably look elsewhere for a case if I were you.\n", 362 | "pos_total : 3.75 - neg_ total: 3.0 - count : 20 -> positivo \n", 363 | "(AVG) pos : 0.1875 - (AVG) neg : 0.15 -> positivo \n", 364 | "(AVG_TRE) pos : 0.225 - (AVG_TRE) neg : 0.25 -> negativo\n", 365 | "\n" 366 | ] 367 | } 368 | ], 369 | "prompt_number": 5 370 | } 371 | ], 372 | "metadata": {} 373 | } 374 | ] 375 | } -------------------------------------------------------------------------------- /BDA_Senti_ipython/good1.txt: -------------------------------------------------------------------------------- 1 | Like others, I was part of the funding campaign. Great idea, but either the execution or the technology is flawed. The product does NOT operate as advertised. I was willing to forgive that the product does not even have the capability to work with my new model android phone, and instead paired the sticker to my wife's iphone. The app installed, I was able to pair the sticker etc. Sounds good right? At least I got further than some of the other reviewers. 2 | 3 | Unfortunately, even in completely clear, no obstacle, line of sight, the location function was very poor - 5-10 feet at best - any further and it couldn't locate the sticker at all. When it did connect, it gave inaccurate information as to the distance. 4 | 5 | This produce is NOT ready for release. 
If you buy now, you are funding the second stage R&D and purchasing a product that has promise, but currently simply doesn't function -------------------------------------------------------------------------------- /BDA_Senti_ipython/les.csv: -------------------------------------------------------------------------------- 1 | This is the third book of Bukowski's that i have read (the first two were "Post Office" and "Hollywood") and thus far it is my favorite. This book is composed of a series of short passages, 87 total. This book is mostly about Henry Chinaski (meaning, for the most part, Charles Bukowski) drinking, having sex with women who drink, and moving from job to job. I dont know how many jobs Chinaski has in this book, but he often holds them only long enough for a single one-page section. If there is any unity in terms of story and plot in this book, it is found in the women, such as Jan and Laura, who manage to stay in Chinaski's life for a few jobs; the women serve to string together the sections. More significant than any plot are the various interwoven themes that Bukowski deals with, such as futility, solitary existence, and death (all themes that might lead us to link Bukowski with existentialist philosophy). These ideas (among others) are all related, and also related to the ways in which they are expressed, namely, through alcohol, cheap sex, disgust towards humanity, and peacefulness in the strangest situations-- and of course, Henry Chinaski's inability to hold a job or even have any desire to do so. On one hand, this book is a quick and light read; on the other hand, a close read that keeps in mind the interplay between the different themes involved truly exposes the genius of Bukowski. 2 | -------------------------------------------------------------------------------- /BDA_Senti_ipython/les.txt: -------------------------------------------------------------------------------- 1 | This is an example of great writing. 
The style, pace, characters, descriptions, settings, action--all of it. It's great. I read this book in a literature class and really enjoyed it. It is perhaps the best classic I have read besides Tolkien's The Hobbit. Treasure Island manages to stand alone as the go-to book on pirates, sea voyages, and young men seeking adventure. Before anyone decides to read children's or YA literature, they must first read Treasure Island to get an idea of what a good book should contain. 2 | Question: Why haven't they made a modern movie of this? It would be great cinema. -------------------------------------------------------------------------------- /BDA_Senti_ipython/rev_bad_1.txt: -------------------------------------------------------------------------------- 1 | I also ordered this via the campaign, with the expectation that it would work on my Motorola HD RAZR MAXX phone, which has SW Version 4.1 loaded. However, it does not and only works for limited Android phones. There's a blog written by people who have had mostly bad experiences with the sticker. For some people even if it does work on their phone/iPad, it doesn't work accurately. Also, not all the stickers worked when received. 2 | I was told via the company's Facebook page that they are coming up with an application that would work on my phone. We'll see. 3 | Until then, I would put off ordering this until the bugs are worked out. -------------------------------------------------------------------------------- /BDA_Senti_ipython/rev_bad_2.txt: -------------------------------------------------------------------------------- 1 | The company sent me 2 of this product and they both were not able to charge or sync my idevices. This Apple product is very high tech it actually has a chip that verifies the phone and the cable. The customer service was outstanding when dealing with the replacement of the product. I would buy from this seller again just not an apple product that has such high tech specifications. 
You get what you pay for. -------------------------------------------------------------------------------- /BDA_Senti_ipython/rev_good_b.txt: -------------------------------------------------------------------------------- 1 | as the first book of charles bukowski's that i ever read, "Women" holds a special place in my heart. it is an insane story of henry chinaski and his misunderstandings and communications with women. autobiographical to an extent, this book, and all of bukowski's, are special because they are so graphically and emotionally honest. no one else paints such candid portraits of the human psyche in its most degenerate and politically incorrect situations. no other author can put so much vulgarity into a work and make it sound as natural as bukowski does. everything and every word in his novels have a place and a meaning, making his writing style so refreshingly satisfying, that you can't help but to live vicariously through his beautiful insanity. "women" introduced me to this great american poet/novelist, and it is my belief that this book definitely makes for a proper introduction to his works. -------------------------------------------------------------------------------- /BDA_Senti_ipython/rev_good_s.txt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/linkTDP/BigDataAnalysis_TweetSentiment/c456746625ac80a615db16f9ffb1ad7a6efffdd2/BDA_Senti_ipython/rev_good_s.txt -------------------------------------------------------------------------------- /BDA_Senti_ipython/rev_good_t1.txt: -------------------------------------------------------------------------------- 1 | Treasure Island was written 130 years ago and it remains one of the great adventure tales of all time. I originally read it when I was about ten years old and, fifty years later, I recently re-read it in the Kindle edition. 
The fact that the book brings as much pleasure now as it did then is an indication of how good it really is. Stevenson truly hit the ball out of the park with this one. 2 | 3 | Much has been remarked in many of these critiques about the outdated language Stevenson used. In that regard, I have to say that the Kindle edition that I downloaded lacks one thing that was included in my old printed edition, which was published by MacMillan way back in 1924. The old edition has a set of notes following the text, explaining a lot of the nautical terms and old-fashioned jargon. It even includes the complete lyrics to "A Bottle of Rum". I never found those notes necessary but they might prove useful to some of the younger readers, to whom such language might be unfamiliar. Personally, I think the language is part of what has given this tale it's lasting appeal. In addition, I don't know whether 18th Century pirates really spoke the way Stevenson has them speak in Treasure Island, but there is no doubt that it is the way they will forever be remembered, "...and ye may lay to that, Matey"! -------------------------------------------------------------------------------- /BDA_Senti_ipython/rev_nutralbad_2.txt: -------------------------------------------------------------------------------- 1 | I liked the look of this case at Best Buy versus the other they had available at the moment. I purchased this because I needed a case immediately and didn't want to risk carrying my new phone around with one. It seems like a study case and somewhat protective however it does not fit perfectly. It slides a little up and down in the case. The fit is fine left to right. I would probably look elsewhere for a case if I were you. 
-------------------------------------------------------------------------------- /DeriveTweetSentimentEasy.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import csv 3 | 4 | twitterData = sys.argv[1] #csv file 5 | 6 | def tweet_dict(twitterData): 7 | ''' (file) -> list of str 8 | Read the .csv file and return a list holding 9 | the first column (the tweet text) of each row. 10 | ''' 11 | twitter_list_dict = [] 12 | twitterfile = open(twitterData) 13 | twitterreader = csv.reader(twitterfile) 14 | for line in twitterreader: 15 | twitter_list_dict.append(line[0]) 16 | return twitter_list_dict 17 | 18 | 19 | def sentiment_dict(sentimentData): 20 | ''' (file) -> dictionary 21 | Read the sentiment file and return 22 | a dictionary in the form {word: value} 23 | ''' 24 | afinnfile = open(sentimentData) 25 | scores = {} # initialize an empty dictionary 26 | for line in afinnfile: 27 | term, score = line.split("\t") # The file is tab-delimited. "\t" means "tab character" 28 | scores[term] = float(score) # Convert the score to a float. 29 | 30 | return scores # Return the {term: score} dictionary 31 | 32 | def main(): 33 | tweets = tweet_dict(twitterData) # csv file given on the command line 34 | sentiment = sentiment_dict("AFINN-111.txt") 35 | 36 | """Calculate the sentiment score of each tweet; terms not found in AFINN-111.txt 37 | contribute a score of zero. 
38 | """ 39 | for index in range(len(tweets)): 40 | 41 | tweet_word = tweets[index].split() 42 | sent_score = 0 # sentiment score of the sentence 43 | 44 | 45 | for word in tweet_word: 46 | word = word.rstrip('?:!.,;"!@') 47 | word = word.replace("\n", "") 48 | 49 | if not (word.encode('utf-8', 'ignore') == ""): 50 | if word.encode('utf-8') in sentiment.keys(): 51 | sent_score = sent_score + float(sentiment[word]) 52 | 53 | 54 | 55 | print tweets[index] + " --- "+ str(sent_score) 56 | 57 | if __name__ == '__main__': 58 | main() -------------------------------------------------------------------------------- /DocumentSentimentClassification.py: -------------------------------------------------------------------------------- 1 | import nltk 2 | import requests 3 | import math 4 | import urllib2 5 | 6 | from apiclient.discovery import build 7 | 8 | 9 | 10 | FILE_NAME='review_bad.txt' 11 | API_KEY_BING='' 12 | API_KEY_GOOGLE='' 13 | USE_GOOGLE = False 14 | 15 | """ 16 | Class that implements the search through Google Custom Search (100 free queries per day) 17 | """ 18 | 19 | class GoogleApi: 20 | def __init__(self): 21 | self.service = build("customsearch", "v1", developerKey=API_KEY_GOOGLE) 22 | 23 | 24 | def count(self,query): 25 | res = self.service.cse().list( 26 | q=query, 27 | cx='017576662512468239146:omuauf_lfve', 28 | ).execute() 29 | if 'nextPage' in res['queries']: 30 | return float(res['queries']['nextPage'][0]['totalResults']) 31 | else: 32 | return float(res['queries']['request'][0]['totalResults']) 33 | 34 | """ 35 | Class that implements the matcher through token triples 36 | """ 37 | 38 | class TokenMatcher: 39 | def __init__(self): 40 | self.pattern_anything = [['JJ','NN'],['JJ','NNS'], 41 | ['RB','VB'],['RB','VBD'],['RB','VBN'],['RB','VBG'], 42 | ['RBR','VB'],['RBR','VBD'],['RBR','VBN'],['RBR','VBG'], 43 | ['RBS','VB'],['RBS','VBD'],['RBS','VBN'],['RBS','VBG']] 44 | self.pattern_no_NN_or_NNS= [['RB','JJ'],['RBR','JJ'],['RBS','JJ'],['JJ','JJ'], 45 | 
['NN','JJ'],['NNS','JJ']] 46 | """ 47 | Method that matches the triple patterns 48 | input = sentence (list of tagged tokens) 49 | output = pairs of words that have a match 50 | """ 51 | 52 | def evaluate_phrase(self, sentences): 53 | for index in range(len(sentences)-2): 54 | if self.matcher(sentences[index:index+3]): 55 | yield (sentences[index]['word']+" "+sentences[index+1]['word']) 56 | 57 | 58 | 59 | def matcher(self, triple): 60 | match = False 61 | for test in self.pattern_anything: 62 | if triple[0]['pos'] == test[0] and triple[1]['pos'] == test[1]: 63 | match= True 64 | for test in self.pattern_no_NN_or_NNS: 65 | if not match and triple[0]['pos'] == test[0] and triple[1]['pos'] == test[1] and triple[2]['pos'] <> 'NN' and triple[2]['pos'] <> 'NNS': 66 | match= True 67 | return match 68 | 69 | """ 70 | Method that implements the search through Bing 71 | """ 72 | 73 | def request_bing(query, **params): 74 | URL_BING = 'https://api.datamarket.azure.com/Bing/Search/v1/Composite?Sources=%(source)s&Query=%(query)s&$top=50&$format=json' 75 | url = URL_BING % {'source': urllib2.quote("'web'"), 76 | 'query': urllib2.quote("'"+query+"'")} 77 | r = requests.get(url, auth=('', API_KEY_BING)) 78 | return float(r.json()['d']['results'][0]['WebTotal']) 79 | 80 | 81 | # return True if a word is a stopword 82 | def is_stopword(string): 83 | if string.lower() in nltk.corpus.stopwords.words('english'): 84 | return True 85 | else: 86 | return False 87 | 88 | # return True if a string is punctuation 89 | def is_punctuation(string): 90 | for char in string: 91 | if char.isalpha() or char.isdigit(): 92 | return False 93 | return True 94 | 95 | 96 | #Tokenization 97 | 98 | def tokenizer(tweet): 99 | sents = nltk.sent_tokenize(tweet) 100 | sentence = [] 101 | for sent in sents: 102 | 103 | tokens = nltk.word_tokenize(sent) 104 | tag_tuples = nltk.pos_tag(tokens) 105 | for (string, tag) in tag_tuples: 106 | if not is_punctuation(string): 107 | token = {'word':string, 'pos':tag} 108 | sentence.append(token) 109 | return 
sentence 110 | 111 | """ 112 | input -> plain text 113 | output -> list of phrases 114 | """ 115 | 116 | def list_phrases(textImput): 117 | sent_tokenizer=nltk.data.load('tokenizers/punkt/english.pickle') 118 | text = open(textImput).read() 119 | sents = sent_tokenizer.tokenize(text) 120 | return sents 121 | 122 | 123 | 124 | def main(): 125 | t = TokenMatcher() 126 | if USE_GOOGLE: 127 | g = GoogleApi() 128 | text = list_phrases(FILE_NAME) 129 | 130 | excellent_BING=request_bing("excellent") 131 | poor_BING=request_bing("poor") 132 | if 'g' in locals(): 133 | excellent_GOOGLE=g.count("excellent") 134 | poor_GOOGLE=g.count("poor") 135 | 136 | avg_pmi_BING=0 137 | avg_pmi_GOOGLE=0 138 | count=0 139 | 140 | for phrase in text: 141 | for a in t.evaluate_phrase(tokenizer(phrase.encode('ascii', 'ignore'))): 142 | print a 143 | term1_term2_e = request_bing(a+" excellent") 144 | term1_term2_p = request_bing(a+" poor") 145 | 146 | print "---BING" 147 | 148 | print 'hits excellent : '+str(excellent_BING) 149 | print 'hits poor : '+str(poor_BING) 150 | print 'hits + excellent :'+str(term1_term2_e) 151 | print 'hits + poor :'+str(term1_term2_p) 152 | 153 | if 'accum_ex_bing' not in locals(): 154 | accum_ex_bing=excellent_BING 155 | else: 156 | accum_ex_bing=accum_ex_bing*excellent_BING 157 | if 'accum_po_bing' not in locals(): 158 | accum_po_bing=poor_BING 159 | else: 160 | accum_po_bing=accum_po_bing*poor_BING 161 | if 'accum_tex_bing' not in locals(): 162 | accum_tex_bing=term1_term2_e 163 | else: 164 | accum_tex_bing=accum_tex_bing*term1_term2_e 165 | if 'accum_tpo_bing' not in locals(): 166 | accum_tpo_bing=term1_term2_p 167 | else: 168 | accum_tpo_bing=accum_tpo_bing*term1_term2_p 169 | 170 | 171 | count = count +1 172 | 173 | 174 | print '' 175 | 176 | if 'g' in locals(): 177 | term1_term2_e = g.count(a+" excellent") 178 | term1_term2_p = g.count(a+" poor") 179 | 180 | 181 | print "---GOOGLE" 182 | 183 | print 'hits excellent : '+str(excellent_GOOGLE) 184 | print 'hits 
poor : '+str(poor_GOOGLE) 185 | print 'hits + excellent :'+str(term1_term2_e) 186 | print 'hits + poor :'+str(term1_term2_p) 187 | 188 | if 'accum_ex_google' not in locals(): 189 | accum_ex_google=excellent_GOOGLE 190 | else: 191 | accum_ex_google=accum_ex_google*excellent_GOOGLE 192 | if 'accum_po_google' not in locals(): 193 | accum_po_google=poor_GOOGLE 194 | else: 195 | accum_po_google=accum_po_google*poor_GOOGLE 196 | if 'accum_tex_google' not in locals(): 197 | accum_tex_google=term1_term2_e 198 | else: 199 | accum_tex_google=accum_tex_google*term1_term2_e 200 | if 'accum_tpo_google' not in locals(): 201 | accum_tpo_google=term1_term2_p 202 | else: 203 | accum_tpo_google=accum_tpo_google*term1_term2_p 204 | 205 | print '' 206 | 207 | 208 | 209 | print 'BING sentence text : '+str(math.log((accum_tex_bing*accum_po_bing)/(accum_ex_bing*accum_tpo_bing),2)) 210 | if 'g' in locals(): 211 | print 'GOOGLE sentence text : '+str(math.log((accum_tex_google*accum_po_google)/(accum_ex_google*accum_tpo_google),2)) 212 | 213 | 214 | if __name__ == '__main__': 215 | main() -------------------------------------------------------------------------------- /ExtractTweet.py: -------------------------------------------------------------------------------- 1 | from tweepy.streaming import StreamListener 2 | from tweepy import OAuthHandler 3 | from tweepy import Stream 4 | import json 5 | import csv 6 | from collections import namedtuple 7 | 8 | 9 | TWITTER_CONFIGS = 'config.json' 10 | 11 | 12 | def get_twitter_configs(): 13 | #load configuration file 14 | config = json.load(open(TWITTER_CONFIGS, 'r')) 15 | 16 | 17 | twitter_configs = namedtuple( 18 | 'TwitterConfigs', 19 | 'consumer_key, consumer_secret, access_token, access_token_secret,file_name,count') 20 | 21 | # Go to http://dev.twitter.com and create an app. 22 | # The consumer key and secret will be generated for you. 
23 | twitter_configs.consumer_key = config["consumer_key"] 24 | twitter_configs.consumer_secret = config["consumer_secret"] 25 | 26 | # After the step above, you will be redirected to your app's page. 27 | # Create an access token under the "Your access token" section. 28 | twitter_configs.access_token = config["access_token"] 29 | twitter_configs.access_token_secret = config["access_token_secret"] 30 | 31 | twitter_configs.file_name = config["file_name"] 32 | 33 | twitter_configs.count = config["count"] 34 | 35 | twitter_configs.filter = config["filter"] 36 | 37 | return twitter_configs 38 | 39 | 40 | 41 | class StdOutListener(StreamListener): 42 | """ A listener that handles tweets as they are received from the stream. 43 | This is a basic listener that prints received tweets to stdout and writes them to the csv file. 44 | 45 | """ 46 | def __init__(self,count): 47 | self.count=count 48 | self.index = 1 49 | 50 | 51 | def on_data(self, data): 52 | 53 | a = json.loads(data,encoding='utf-8') 54 | if a['lang'] == 'en' and len(a['text']) > 100: 55 | special=[";",r"\r\n"] 56 | 57 | current=a['text'] 58 | for curSpec in special: 59 | current=current.replace(curSpec,"") # str.replace returns a new string 60 | current=unicode(current.encode('utf-8'), 'ascii', 'ignore') 61 | 62 | self.index=self.index+1 63 | print current 64 | writer.writerow([current]) 65 | 66 | if self.index >= self.count: 67 | return False 68 | else: 69 | return True 70 | 71 | def on_error(self, status): 72 | print status 73 | 74 | 75 | 76 | if __name__ == '__main__': 77 | twitter_configs = get_twitter_configs() 78 | count = twitter_configs.count 79 | l = StdOutListener(count) 80 | with open(twitter_configs.file_name, 'wb') as f: 81 | writer = csv.writer(f,delimiter='\t') 82 | 83 | auth = OAuthHandler(twitter_configs.consumer_key, 84 | twitter_configs.consumer_secret) 85 | auth.set_access_token(twitter_configs.access_token, 86 | twitter_configs.access_token_secret) 87 | 88 | stream = Stream(auth, l) 89 | 90 | if len(twitter_configs.filter) > 0: 91 | 
stream.filter(track=[twitter_configs.filter]) 92 | else: 93 | # filter required 94 | stream.filter(track=["a"]) 95 | -------------------------------------------------------------------------------- /NewTermSentimentInference.py: -------------------------------------------------------------------------------- 1 | import csv 2 | import sys 3 | 4 | twitterData = sys.argv[1] # csv file 5 | 6 | def tweet_dict(twitterData): 7 | ''' (file) -> list of dictionaries 8 | This method should take your csv file 9 | file and create a list of dictionaries. 10 | ''' 11 | twitter_list_dict = [] 12 | twitterfile = open(twitterData) 13 | twitterreader = csv.reader(twitterfile) 14 | for line in twitterreader: 15 | twitter_list_dict.append(line[0]) 16 | return twitter_list_dict 17 | 18 | 19 | def sentiment_dict(sentimentData): 20 | ''' (file) -> dictionary 21 | This method should take your sentiment file 22 | and create a dictionary in the form {word: value} 23 | ''' 24 | afinnfile = open(sentimentData) 25 | scores = {} # initialize an empty dictionary 26 | for line in afinnfile: 27 | term, score = line.split("\t") # The file is tab-delimited. "\t" means "tab character" 28 | scores[term] = float(score) # Convert the score to an integer. 
29 | return scores # Return the {term: score} dictionary 30 | 31 | 32 | def main(): 33 | 34 | tweets = tweet_dict(twitterData) 35 | sentiment = sentiment_dict("AFINN-111.txt") 36 | accum_term = dict() 37 | 38 | """Calculate the sentiment score of each tweet, with unknown terms scored as zero 39 | See -> DeriveTweetSentimentEasy 40 | """ 41 | 42 | for index in range(len(tweets)): 43 | 44 | tweet_word = tweets[index].split() 45 | sent_score = 0 # sentiment of the sentence 46 | term_count = {} 47 | term_list = [] 48 | 49 | for word in tweet_word: 50 | word = word.rstrip('?:!.,;"!@') 51 | word = word.replace("\n", "") 52 | if not (word.encode('utf-8', 'ignore') == ""): 53 | if word.encode('utf-8') in sentiment.keys(): 54 | sent_score = sent_score + float(sentiment[word]) 55 | else: 56 | sent_score = sent_score 57 | accum_term.setdefault(word, []) # keep the scores gathered from earlier tweets 58 | term_list.append(word) #inverted index 59 | if word.encode('utf-8') in term_count.keys(): 60 | term_count[word] = term_count[word] + 1 61 | else: 62 | term_count[word] = 1 63 | 64 | for word in term_list: 65 | accum_term[word].append(sent_score) # assign the sentiment of the tweet to each new word it contains 66 | 67 | """Derive the sentiment of new terms 68 | """ 69 | 70 | for key in accum_term.keys(): 71 | adjusted_score = 0 72 | term_value = 0 73 | total_sum = 0 74 | for score in accum_term[key]: 75 | total_sum = total_sum + score 76 | 77 | """if a word appears in several tweets, it is assigned the average sentiment of the tweets that contain it 78 | """ 79 | 80 | term_value = (total_sum)/len(accum_term[key]) 81 | 82 | adjusted_score = "%.3f" %term_value 83 | print key.encode('utf-8') + " " + adjusted_score 84 | 85 | 86 | 87 | if __name__ == '__main__': 88 | main() 89 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | #Project of Sentiment Analysis for the
course of Big Data Analysis of the Department of Computer Engineering of Modena and Reggio Emilia.# 2 | 3 | 4 | ##Libraries used in the project## 5 | 6 | 7 | 8 | 9 | - **Tweepy** - [https://github.com/tweepy/tweepy](https://github.com/tweepy/tweepy "Tweepy") 10 | 11 | 12 | 13 | - **Natural Language Toolkit** - [http://nltk.org/install.html](http://nltk.org/install.html "NLTK") 14 | 15 | To install the additional modules and data sources (WordNet and other modules are required), run the following commands in a Python shell: 16 | 17 | import nltk 18 | nltk.download() 19 | 20 | Download the following modules: 21 | 22 | - Stopword 23 | - Wordnet 24 | - Wordnet_ic 25 | - All the modules in models 26 | 27 | ##Project usage## 28 | 29 | ###Tweet### 30 | 31 | The script named `ExtractTweet.py` can be used to download tweets into a `csv` file. The script is configured through the file `config.json`. 32 | 33 | The configurable fields are: 34 | 35 | - consumer_key 36 | - consumer_secret 37 | - access_token 38 | - access_token\_secret 39 | 40 | These fields can be retrieved from [https://dev.twitter.com](https://dev.twitter.com) after creating an account and an application. 41 | 42 | - file_name (name of the `csv` output file) 43 | - count (number of tweets to download) 44 | - filter (a word used to filter the downloaded tweets) 45 | 46 | The CSV file produced as output can be passed as the argument of the other three scripts: 47 | 48 | - `DeriveTweetSentimentEasy.py`: This script uses AFINN-111.txt as its vocabulary to assign a sentiment score to a tweet. 49 | - `NewTermSentimentInference.py`: This script tries to assign a sentiment score to the words that are not present in AFINN-111.txt, based on the sentiment scores of the tweets that contain them. 50 | - `SentiWordnet.py`: This script uses SentiWordNet as its vocabulary to assign a sentiment score to a tweet. The scoring metrics and the annotation process are more complex in this script.
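As a rough illustration of the idea behind the two AFINN-based scripts, the sketch below scores a tweet by summing the AFINN values of its words and then gives every unknown word the average score of the tweets that contain it. It is a simplified sketch written for Python 3, not the code of the scripts themselves: the helper names `load_afinn`, `clean`, `tweet_score` and `infer_new_terms` are hypothetical, and lowercasing the words and counting each tweet only once per word are assumptions made for the example.

```python
import io
from collections import defaultdict


def load_afinn(lines):
    """Parse tab-delimited 'term<TAB>score' lines into a {term: score} dict."""
    scores = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        term, score = line.rsplit("\t", 1)
        scores[term] = int(score)
    return scores


def clean(word):
    """Strip surrounding punctuation and lowercase (an assumption of this sketch)."""
    return word.strip('?:!.,;"@').lower()


def tweet_score(tweet, scores):
    """Sum the scores of the known words; unknown words count as zero."""
    return sum(scores.get(clean(w), 0) for w in tweet.split())


def infer_new_terms(tweets, scores):
    """Give each unknown word the mean score of the tweets that contain it."""
    seen = defaultdict(list)
    for tweet in tweets:
        total = tweet_score(tweet, scores)
        # A set is used so that a tweet is counted once per word.
        for word in {clean(w) for w in tweet.split()}:
            if word and word not in scores:
                seen[word].append(total)
    return {word: sum(vals) / len(vals) for word, vals in seen.items()}


# Tiny stand-in for AFINN-111.txt (the three entries are taken from that file).
sample = "amazing\t4\nabuse\t-3\nadmire\t3\n"
scores = load_afinn(io.StringIO(sample))
tweets = ["This phone is amazing!", "I admire the phone", "abuse of the phone"]

print(tweet_score(tweets[0], scores))
print(infer_new_terms(tweets, scores)["phone"])
```

In this toy input the word "phone" is not in the vocabulary, so it receives the mean of the three tweet scores. To try the same functions on the real vocabulary, `load_afinn(open("AFINN-111.txt"))` should work under the same tab-delimited assumption.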
51 | 52 | ###Document Sentiment Classification### 53 | 54 | The script is called `DocumentSentimentClassification.py` and implements a simple method for document sentiment classification. 55 | It is possible to set some configuration parameters at the top of the Python script: 56 | 57 | - FILE_NAME -> name of the .txt file on which you want to run the classification 58 | - API_KEY_BING -> API key for Bing [Bing Dev](http://it.bing.com/dev/en-us/dev-center) 59 | - API_KEY_GOOGLE -> API key for the Custom Search API [Google Api Console](https://cloud.google.com/console) 60 | - USE_GOOGLE -> Enable (True) or disable (False) the use of the Google Custom Search API (100 free queries per day) 61 | 62 | ### ipython ### 63 | 64 | For an interactive example with ipython, go into the folder `BDA_Senti_ipython` and launch the command: 65 | 66 | $ ipython notebook --pylab inline 67 | 68 | ###[Slides](http://www.slideshare.net/faigg/tutotial-of-sentiment-analysis)### 69 | 70 | 71 | 72 | -------------------------------------------------------------------------------- /SentiWordnet.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import csv 3 | import nltk 4 | from nltk.corpus import wordnet 5 | import re 6 | import codecs 7 | 8 | twitterData = sys.argv[1] # tweet input file (.csv) 9 | 10 | 11 | 12 | class SentiWordNetCorpusReader: 13 | def __init__(self, filename): 14 | """ 15 | Argument: 16 | filename -- the name of the text file containing the 17 | SentiWordNet database 18 | """ 19 | self.filename = filename 20 | self.db = {} 21 | self.parse_src_file() 22 | 23 | def parse_src_file(self): 24 | lines = codecs.open(self.filename, "r", "utf8").read().splitlines() 25 | lines = filter((lambda x : not re.search(r"^\s*#", x)), lines) 26 | for i, line in enumerate(lines): 27 | fields = re.split(r"\t+", line) 28 | fields = map(unicode.strip, fields) 29 | try: 30 | pos, offset, pos_score, neg_score, synset_terms, gloss = fields
31 | except: 32 | sys.stderr.write("Line %s formatted incorrectly: %s\n" % (i, line)) 33 | if pos and offset: 34 | offset = int(offset) 35 | self.db[(pos, offset)] = (float(pos_score), float(neg_score)) 36 | 37 | def senti_synset(self, *vals): 38 | if tuple(vals) in self.db: 39 | pos_score, neg_score = self.db[tuple(vals)] 40 | pos, offset = vals 41 | synset = wordnet._synset_from_pos_and_offset(pos, offset) 42 | return SentiSynset(pos_score, neg_score, synset) 43 | else: 44 | synset = wordnet.synset(vals[0]) 45 | pos = synset.pos 46 | offset = synset.offset 47 | if (pos, offset) in self.db: 48 | pos_score, neg_score = self.db[(pos, offset)] 49 | return SentiSynset(pos_score, neg_score, synset) 50 | else: 51 | return None 52 | 53 | def senti_synsets(self, string, pos=None): 54 | sentis = [] 55 | synset_list = wordnet.synsets(string, pos) 56 | for synset in synset_list: 57 | sentis.append(self.senti_synset(synset.name)) 58 | sentis = filter(lambda x : x, sentis) 59 | return sentis 60 | 61 | def all_senti_synsets(self): 62 | for key, fields in self.db.iteritems(): 63 | pos, offset = key 64 | pos_score, neg_score = fields 65 | synset = wordnet._synset_from_pos_and_offset(pos, offset) 66 | yield SentiSynset(pos_score, neg_score, synset) 67 | 68 | ###################################################################### 69 | 70 | class SentiSynset: 71 | def __init__(self, pos_score, neg_score, synset): 72 | self.pos_score = pos_score 73 | self.neg_score = neg_score 74 | self.obj_score = 1.0 - (self.pos_score + self.neg_score) 75 | self.synset = synset 76 | 77 | def __str__(self): 78 | """Prints just the Pos/Neg scores for now.""" 79 | s = "" 80 | s += self.synset.name + "\t" 81 | s += "PosScore: %s\t" % self.pos_score 82 | s += "NegScore: %s" % self.neg_score 83 | return s 84 | 85 | def __repr__(self): 86 | return "Senti" + repr(self.synset) 87 | 88 | 89 | def tweet_dict(twitterData): 90 | ''' (file) -> list of dictionaries 91 | This method should take your .csv 92 | file 
and create a list of dictionaries. 93 | ''' 94 | twitter_list_dict = [] 95 | twitterfile = open(twitterData) 96 | twitterreader = csv.reader(twitterfile) 97 | for line in twitterreader: 98 | twitter_list_dict.append(line[0]) 99 | return twitter_list_dict 100 | 101 | 102 | 103 | 104 | # return true if a string is a stopword 105 | def is_stopword(string): 106 | if string.lower() in nltk.corpus.stopwords.words('english'): 107 | return True 108 | else: 109 | return False 110 | 111 | # return true if a string is punctuation (contains no letters or digits) 112 | def is_punctuation(string): 113 | for char in string: 114 | if char.isalpha() or char.isdigit(): 115 | return False 116 | return True 117 | 118 | # Translation from nltk to Wordnet (words tag) (code) 119 | def wordnet_pos_code(tag): 120 | if tag.startswith('NN'): 121 | return wordnet.NOUN 122 | elif tag.startswith('VB'): 123 | return wordnet.VERB 124 | elif tag.startswith('JJ'): 125 | return wordnet.ADJ 126 | elif tag.startswith('RB'): 127 | return wordnet.ADV 128 | else: 129 | return '' 130 | 131 | # Translation from nltk to Wordnet (words tag) (label) 132 | def wordnet_pos_label(tag): 133 | if tag.startswith('NN'): 134 | return "Noun" 135 | elif tag.startswith('VB'): 136 | return "Verb" 137 | elif tag.startswith('JJ'): 138 | return "Adjective" 139 | elif tag.startswith('RB'): 140 | return "Adverb" 141 | else: 142 | return tag 143 | 144 | """ input -> a sentence 145 | output -> the sentence in which each word is enriched with -> lemma, wordnet_pos, wordnet_definitions 146 | 147 | """ 148 | def wordnet_definitions(sentence): 149 | wnl = nltk.WordNetLemmatizer() 150 | for token in sentence: 151 | word = token['word'] 152 | wn_pos = wordnet_pos_code(token['pos']) 153 | if is_punctuation(word): 154 | token['punct'] = True 155 | elif is_stopword(word): 156 | pass 157 | elif len(wordnet.synsets(word, wn_pos)) > 0: 158 | token['wn_lemma'] = wnl.lemmatize(word.lower()) 159 | token['wn_pos'] = wordnet_pos_label(token['pos']) 160 | defs = [sense.definition for
sense in wordnet.synsets(word, wn_pos)] 161 | token['wn_def'] = "; \n".join(defs) 162 | else: 163 | pass 164 | return sentence 165 | 166 | #Tokenization 167 | 168 | def tag_tweet(tweet): 169 | sents = nltk.sent_tokenize(tweet) 170 | sentence = [] 171 | for sent in sents: 172 | tokens = nltk.word_tokenize(sent) 173 | tag_tuples = nltk.pos_tag(tokens) 174 | for (string, tag) in tag_tuples: 175 | token = {'word':string, 'pos':tag} 176 | sentence.append(token) 177 | return sentence 178 | 179 | 180 | # WSD 181 | 182 | def word_sense_disambiguate(word, wn_pos, tweet): 183 | senses = wordnet.synsets(word, wn_pos) 184 | if len(senses) >0: 185 | cfd = nltk.ConditionalFreqDist( 186 | (sense, def_word) 187 | for sense in senses 188 | for def_word in sense.definition.split() 189 | if def_word in tweet) 190 | best_sense = senses[0] # start with first sense 191 | for sense in senses: 192 | try: 193 | if cfd[sense].max() > cfd[best_sense].max(): 194 | best_sense = sense 195 | except: 196 | pass 197 | return best_sense 198 | else: 199 | return None 200 | 201 | 202 | def main(): 203 | tweets = tweet_dict(twitterData) 204 | sentiment = SentiWordNetCorpusReader("SentiWordNet_3.0.0_20130122.txt") 205 | for index in range(len(tweets)): 206 | a = wordnet_definitions(tag_tweet(tweets[index])) 207 | obj_score = 0 # object score 208 | pos_score=0 # positive score 209 | neg_score=0 #negative score 210 | pos_score_tre=0 211 | neg_score_tre=0 212 | threshold = 0.75 213 | count = 0 214 | count_tre = 0 215 | 216 | """ 217 | Conversion from plain text to SentiWordnet scores 218 | """ 219 | 220 | for word in a: 221 | if 'punct' not in word : 222 | sense = word_sense_disambiguate(word['word'], wordnet_pos_code(word['pos']), tweets[index]) 223 | if sense is not None: 224 | sent = sentiment.senti_synset(sense.name) 225 | # Extraction of the scores 226 | if sent is not None and sent.obj_score <> 1: 227 | obj_score = obj_score + float(sent.obj_score) 228 | pos_score = pos_score + float(sent.pos_score) 
229 | neg_score = neg_score + float(sent.neg_score) 230 | count=count+1 231 | print str(sent.pos_score)+ " - "+str(sent.neg_score)+ " - "+ str(sent.obj_score)+" - "+sent.synset.name 232 | if sent.obj_score < threshold: 233 | pos_score_tre = pos_score_tre + float(sent.pos_score) 234 | neg_score_tre = neg_score_tre + float(sent.neg_score) 235 | count_tre=count_tre+1 236 | print tweets[index] 237 | 238 | #Evaluation by different methods 239 | 240 | avg_pos_score=0 241 | avg_neg_score=0 242 | avg_pos_score_tre=0 243 | avg_neg_score_tre=0 244 | 245 | #2 246 | 247 | if count <> 0: 248 | 249 | avg_pos_score=pos_score/count 250 | avg_neg_score=neg_score/count 251 | 252 | #3 253 | 254 | if count_tre <> 0: 255 | avg_pos_score_tre=pos_score_tre/count_tre 256 | avg_neg_score_tre=neg_score_tre/count_tre 257 | 258 | #print results 259 | #1 260 | print "pos_total : "+str(pos_score)+" - neg_ total: "+str(neg_score)+" - count : "+str(count)+" -> "+(" positivo " if pos_score > neg_score else ("negativo" if pos_score < neg_score else "neutro")) 261 | #2 262 | print "(AVG) pos : "+str(avg_pos_score)+" - (AVG) neg : "+str(avg_neg_score)+" -> "+(" positivo " if avg_pos_score > avg_neg_score else ("negativo" if avg_pos_score < avg_neg_score else "neutro")) 263 | #3 264 | if count_tre > 0: 265 | print "(AVG_TRE) pos : "+str(avg_pos_score_tre)+" - (AVG_TRE) neg : "+str(avg_neg_score_tre)+" -> "+(" positivo " if avg_pos_score_tre > avg_neg_score_tre else ("negativo" if avg_pos_score_tre < avg_neg_score_tre else "neutro")) 266 | print "" 267 | 268 | 269 | if __name__ == '__main__': 270 | main() -------------------------------------------------------------------------------- /config.json: -------------------------------------------------------------------------------- 1 | { 2 | "consumer_key": "", 3 | "consumer_secret": "", 4 | "access_token": "", 5 | "access_token_secret": "", 6 | "file_name": "tweet.csv", 7 | "count": 300, 8 | "filter": "obama" 9 | }
-------------------------------------------------------------------------------- /review_bad.txt: -------------------------------------------------------------------------------- 1 | Please don't waste your money this shaver is not worth it. I have only had this razor for a little over a year and it has all kinds of problems. The motor acts up and sometimes it will work and other times it will quit in the middle of me shaving. The other thing is the foil replacements cannot be bought in stores you have to order them online which is a huge inconvenience. The one set of cutters they sell at the local stores are not the factory blades that they put on the shaver they are worse. They pull your hair and don't shave as close. Once I found this out I ordered the new foil and cutters online and it cost me almost $50 because the only place that had them in stock was Braun. 2 | 3 | To clean this shaver you have to take the foil and cutters of by pressing a button on the side of the shaver. Since I have used this shaver so much and had to clean it the button stopped releasing the foil and cutters, now the only way to get it off is if I pry it off with a screwdriver. This is ridiculous and I should not have to do that. 4 | 5 | As for shaving...Not really worth the money either I get a closer shave using my fusion razor and also with less irritation compared to this. 6 | 7 | My advice to you is don't waste your money on this shaver. I would rather spend the money on the Fusion refills then buy another one of these to break in a little over a year. 8 | -------------------------------------------------------------------------------- /review_good.txt: -------------------------------------------------------------------------------- 1 | What I love about this book is that it is not a formula for "getting rich quick," or another list of what to do to make people "visit" your site. Instead, it is an extensive description of what is really happening in the real world right now. 
2 | 3 | Semantic search is intentionally difficult to game. Of all the ideas in the book, this is the most salient for me. 4 | 5 | Google appears to want to make certain that this is NOT a game, that search is connected to ideas that are honest, original and interconnected to those real people who have taken the time to think things through. The takeaway I get from the book is the best thing to do with your business is to make it really great! 6 | 7 | The intention of Google in all this seems promising and optimistic: they appear to want to ensure that, ongoing, we will have a way of connecting to the best ideas NOT because someone put a million hashtags in the HTML, or because someone PAID for the privilege (Google is very clear about who has paid to be in search results), but because what we find really is what we are looking for (or, actually, BETTER than what we were looking for because semantic search can fill in the blanks and know that we do not yet know that there is something better out there). --------------------------------------------------------------------------------