├── README.md
├── SemEval2019-Task5
    ├── README.md
    ├── datasets
    │   ├── README.md
    │   └── trial
    │   │   ├── README.md
    │   │   ├── trial_en.tsv
    │   │   └── trial_es.tsv
    └── evaluation
    │   ├── README.md
    │   ├── evaluation.py
    │   └── metadata
├── annotation_guidelines.md
├── competition.yaml
├── data.html
├── evaluation.html
├── keyword_set.md
├── logohateval.png
├── overview.html
├── public_trial.zip
├── reference_test_en.zip
├── reference_test_es.zip
├── reference_trial.zip
├── scoring_program.zip
└── terms.html

/README.md:
--------------------------------------------------------------------------------
1 | # hateval
2 | 
3 | Configuration files for the CodaLab project
4 | 
5 | # Data
6 | 
7 | In order to get the HatEval official dataset, please fill in the form on [this page](http://hatespeech.di.unito.it/hateval.html).
8 | 
--------------------------------------------------------------------------------
/SemEval2019-Task5/README.md:
--------------------------------------------------------------------------------
1 | # SemEval2019-Task5: Multilingual Detection of Hate (hatEval) #
2 | 
3 | This is the GitHub repository for SemEval-2019 Task 5.
4 | 
5 | #### Disclaimer ####
6 | Please note that the data are exclusively reserved for the participants of SemEval-2019 Task 5 and are not to be used freely.
7 | They may be distributed upon request (contact the task organizers) and for academic purposes only.
8 | 
9 | The following **datasets and scripts** will be distributed throughout the competition:
10 | * Trial data & evaluation script (20/08/2018)
11 | * Data for training and development & benchmark system (01/10/2018)
12 | * Test data (10/01/2019)
13 | 
14 | Should you have any questions, contact our [Google group](https://groups.google.com/forum/#!forum/semeval2019-task5-hateval).
15 | 
--------------------------------------------------------------------------------
/SemEval2019-Task5/datasets/README.md:
--------------------------------------------------------------------------------
1 | # SemEval-2019 Task 5 - datasets #
2 | 
3 | In this repository you will find the datasets for SemEval-2019 Task 5:
4 | 
5 | * Trial data for English and Spanish (August 20, 2018)
6 | * Training data for taskA-English, taskA-Spanish, taskB-English, taskB-Spanish (October 1, 2018)
7 | * Test data for taskA-English, taskA-Spanish, taskB-English, taskB-Spanish (January 10, 2019)
8 | 
9 | ## Useful links: ##
10 | * CodaLab website of the task
11 | * SemEval-2019 website
12 | * SemEval-2019 Task 5 Google group
13 | 
--------------------------------------------------------------------------------
/SemEval2019-Task5/datasets/trial/README.md:
--------------------------------------------------------------------------------
1 | # Trial data #
2 | 
3 | In this repository you will find example files for SemEval 2019 - Task 5, released during the **Practice** phase. The files are:
4 | 
5 | * *trial_en.tsv*: English tweets concerning hate speech against women and immigrants
6 | * *trial_es.tsv*: Spanish tweets concerning hate speech against women and immigrants
7 | 
8 | Trial data show what the official training data will look like and let participants try uploading a submission to CodaLab.
9 | 
10 | ### File format ###
11 | The data are encoded in the same format, regardless of task and language: a tab-separated file with one line per tweet, containing the following fields:
12 | * a numeric ID that uniquely identifies the tweet within the dataset (*id* column)
13 | * the text of the tweet (*text* column)
14 | * a binary value {1|0} indicating whether hate speech occurs against one of the given targets, women or immigrants (*HS* column)
15 | * if HS occurs (i.e.
the value for the feature at point 2 is 1), a binary value indicating if the target is a generic group of people (0) or a specific individual (1) (*TR* column) 16 | * if HS occurs (i.e. the value for the feature at point 2 is 1), a binary value indicating if the tweeter is aggressive (1) or not (0) (*AG* column) 17 | 18 | 19 | 20 | 21 | 22 | 23 | -------------------------------------------------------------------------------- /SemEval2019-Task5/datasets/trial/trial_en.tsv: -------------------------------------------------------------------------------- 1 | id text HS TR AG 2 | 1 RT @vaintshit: shut the fuck up and come suck my dick " 1 1 1 3 | 2 @ArianasBotch Ok if you fucking said leave block me. But dm me first I'm gonna kick your ass. Shut the fuck up you https://t.co/6BXkfxMEf9 1 1 1 4 | 3 @CyV_SW Wow mo cock got hard. Want to pull your panties down, push you against these door and let you feel my cock deep in your wet puss 😈 1 1 1 5 | 4 Ill kill the bitch (chloe) when your not home — Idgaf if you’re kidding bitch I’ll report your ass ... https://t.co/GlBx3SroY0 1 1 1 6 | 5 ...............................'I get to rape beautiful women and that's why I like working on this project.' 1 0 1 7 | 6 Ovide Mercredi to Paul Schabas, 'Char's a squaw. I attack Indigenous women all the time. No one will notice if you rape her! Rape her!' 1 0 1 8 | 7 lol, chop her head off and rape the bitch https://t.co/ZB8CosmSD8 1 1 1 9 | 8 @HausOfRo I love your ass bitch https://t.co/zvehim1Ts7 1 1 0 10 | 9 @husska89 go rape your whore of a mother 1 1 1 11 | 10 If she lets you put your penis in her vagina with no strings attached, have the common decency to treat her like a queen on her bday 1 0 0 12 | 11 WOW can't believe all these women riding the subway today? Shouldn't these bitches be making sandwiches LOL #ihatefemales.. 1 0 0 13 | 12 @DaniellaMicaela YOU FVCKING CUNT BITCH!!! APOLOGIZE! 
@CNN @cnnbrk @CNNPolitics 1 1 0 14 | 13 @MSaveryWhiteway @MoiraSSnews @HPEschools Fucking cunt where's my belt bitch suck my dick and make me cum on your mouth 1 0 1 15 | 14 #YesAllWomen should stay in the kitchen 1 0 0 16 | 15 @Corter_back no I said hope. I hope you women learn your place! #SitDownInTheKitchen 1 0 0 17 | 16 @femfreq Y'know, you'd probably get a job at Subway because all a cunt like you is good for is making sandwiches 1 0 0 18 | 17 @melstreett don't worry about it and go make me a sandwich woman. #WomenSuck 1 0 0 19 | 18 @PennyRed @TIME This is not rape culture for fucks sake. It’s literally just a scumbag taking advantage of their position. Nothing new under the sun. 1 0 0 20 | 19 I FUCKING TOLD YOU SKANK WHORE SLUTS LISTEN TO MS SERIOUSLY https://t.co/HB12As7Ttr 1 1 0 21 | 20 @JonRiley7 Some women are aggressive, promiscuous whores, but all women must try harder not to make men feel uncomf… https://t.co/DTioxr1ZmO 1 0 0 22 | 21 @lexxynichelle So this bitch blocked me but check her pinned tweet out, she is accusing men of rape on tumblr with… https://t.co/bX0V5V7kQU 1 1 0 23 | 22 This isn't rape, or anything like it. This a crude pass. This is not 'rape culture.' IT'S WHORE CULTURE. https://t.co/lRIK86iLUx 1 0 0 24 | 23 Stop tweeting about football. You're a girl and you opinion doesn't count. #WomenSuck 1 0 0 25 | 24 rape is bad but alota hoes be lying bout that shit women are crazy dawg & will fuck ya life up be careful.. 1 0 0 26 | 25 @EmmaSlutWatson oh you dirty bitch me and my friends are going to fuck all your holes till you can't take no more… https://t.co/zx8MJn5oJb 1 1 1 27 | 26 My sister it's #NoBraDay to raise cancer awareness not for you to hoe 😅 0 0 0 28 | 27 I love my religious brothers and sisters, but @realDonaldTrump, FUCK YOU, YOU'RE NOT EVEN A REAL THEOCRAT YOU FAT USLESS BITCH. 
0 0 0 29 | 28 @HanSimpkin Wait till you've got gin tomorrow you won't have it then cock 0 0 0 30 | 29 There's interesting insight to uncover and understand if you want to go down that rabbit hole. 0 0 0 31 | 30 @chiefrepublic1 @TandtheB LOL... they didn't applaud a rape threat.. .how long are you going to ride that shit? 0 0 0 32 | 31 Hey ladies and men, rape, and sexual assault, etc of any sort is not acceptable in society. For those of you who commit these acts, pls die. 0 0 0 33 | 32 We don't know about you , k.inne bare fan ho ap bh cricket k. @fahadmustafa26 @rummanraees15 @realshoaibmalik @babarazam258 https://t.co/2S7e4deOcn 0 0 0 34 | 33 @scottygb PS: so happy we kicked your ass off the continent in 1776. Can't imagine living in a socialist rat hole emulating 1984. 0 0 0 35 | 34 @realDonaldTrump Dont you mean killing Americans. You lowlife piece of garbage i hope you rot in hell you son of a bitch 0 0 0 36 | 35 @realDonaldTrump Trump, the only thing you worship is Putin’s dick and money. 0 0 0 37 | 36 how small a cock do you have to have to park sideways in the Tesco car park?? Flexing your Ford Focus lol 0 0 0 38 | 37 @TMZ I don't want to blame the victim but this makes it difficult. Are you a ho just lookin for money or are you a victim looking for justice? 0 0 0 39 | 38 @MrStephenHowson you had a great chance to show everyone that you’re not a cunt on tv tonight and you fucked it up. 0 0 0 40 | 39 @sugayiffer Buy more RAM, girl :P 0 0 0 41 | 40 @glaiveXD WHERE THE FUCK AM I U STUPID CUNT 0 0 0 42 | 41 PLAY MUSICAL CARS ALL YOU WANT...IM LIKE YOU...GOTTA KILL ME TO MAKE ME STOP...YOU HAVE NO DICK https://t.co/Qb7nO1HjuI 0 0 0 43 | 42 @LiamSeaward8 @_KramerPeter Bitch you don't talk about my fucking kids. My fucking kids eat better meals than you'll ever eat in your life, you see that? 0 0 0 44 | 43 @PrideofGypsies a bunch of people who can’t separate a character you play, from the real world, are moaning that you “glamourise rape”, ugh! 
0 0 0 45 | 44 @realDonaldTrump You are a dumb cunt twit! When people start dying you should be tried for murder you asshole 0 0 0 46 | 45 If you're fearful to go down rabbit hole, then those who spread oppression, have you right where they want you- 100% controlled. - #Cagsil 0 0 0 47 | 46 Silent about harassment? NOPE. Silent about sexual assault? NOPE. Silent about rape? FUCK NO. You can't shut me up on a GOOD day. 0 0 0 48 | 47 @TheMelissaMilf U should be sucking and gagging on my cock Melissa and me Fucking that hot Booty and stunning sweet Pussy tonight Love 😚😚😚 0 0 0 49 | 48 @meanwhile_0 eats entire bags of dicks. Like, seriously. I've seen him eat an entire bag of disembodied cocks. He's a ravenous cunt m8 0 0 0 50 | 49 @_phriss Dick????😭😭.. Are you seriois now 😒 0 0 0 51 | 50 Someone asks you to catch a rare fish with like two days left in the season and it's like dude can you not be such a dick 0 0 0 52 | 51 Now the USA, the EU and Russia can close their borders for illegal migration of fake refugees. The migrants have to assimilate and to work, or have to go home. I'm also expecting the deislamisation of Europe. It doesn't work that way! 1 0 0 53 | 52 LMAOOOOOOOO!!! Stuff I Say... Some stuff be so damn Funny...Especially when it comes to Illegal Immigrants who are full fledged CRIMINALS Coming to America & I said "Fall off the Boat & Fall all the way, as in a GO BACK HOME In even that". Funny & yet Truthful lolll lol dude smh https://t.co/hAwd3GNHDc 1 0 1 54 | 53 The German Government Pays for 3 Week Vacation for Refugees to Go Home - https://t.co/zVw9e2CU4Q https://t.co/VlYsYYOy41 1 0 0 55 | 54 #sendthemhome why does Twitter keep deleting this? 1 0 1 56 | 55 Mexico’s not sending their best. A lot of them are drunks and thieves engaged in constant criminality. #StopTheInvasion #DeportThemAll #NoAmnesty #BuildTheWall https://t.co/hsWyOIdZ3B 1 0 1 57 | 56 Go back and take a vacation to that land you fled in terror for your life from. 
https://t.co/1jvXiJAVms 1 0 0 58 | 57 Danish #Muslims Insist On Wearing A #Niqab (or FaceVeil) & Still Desire To Live As A Non-Person Ghost. Muslims Are Hopelessly Unable To Assimilate. #SendThemBack. @victororban @realDonaldTrump @GiuseppeConteIT @Peggynoonannyc https://t.co/nd4XAzn5ns 1 0 0 59 | 58 The German Government Pays for 3 Week Vacation for Refugees to Go Home | Armstrong Economics https://t.co/bEo7b2ENPV 1 0 0 60 | 59 last night outside friend's flat in Saxony-Anhalt Germany' she watched a fight between some German nationalists and Africans immigrants. she heard "heil hitler" "go home" "i"'ll kill you" from the Germans. From the Africans? "I don't want trouble but I will fight if you touch me 1 0 1 61 | 60 American taxpayers are billed $8 BILLION EVERY YEAR for REFUGEES ???????? 😲🤬That’s ABSURD & should be illegal! When did we vote for that?????????!!! #NoMoreRefugees https://t.co/mR8PVNUBqT 1 0 0 62 | 61 Burundian refugees should go home! https://t.co/it0c7LmfXg https://t.co/LB5NzdQFmw 1 0 0 63 | 62 Illegal Alien Awaiting Deportation For Child Rape, Now Caught Producing Child Porn https://t.co/t7FLU2CjEy 1 1 1 64 | 63 GERAMNY: African rapefugee climbs into house, steel expensive goods, rook a knife and abuse a girl of the family. https://t.co/fVXhWMYcE8 1 0 0 65 | 64 No benefits to any immigrants !! Come here & make it on your own, or go back home !! @POTUS https://t.co/1gRTNP5H3L 1 0 1 66 | 65 No, I am not paying for his wall. It is a waste of money cause it will not stop illegal immigrants. Most come on a visa and just never go home. https://t.co/HkiMyqHrEa 1 0 0 67 | 66 DC stay tuned, antifascists, queer people, black and brown people are mobilizing against hate in Washington, DC Help spread the word and tell fash not to come https://t.co/UrtZBYD1RH https://t.co/lNMJPE6kZD 1 0 0 68 | 67 @marnard808 @a_dub_alva @emrazz @Prof_Tweeper @TomiLahren So you cant stand your own kind either you gotta breed with something else. 
1 0 1 69 | 68 The German Government Pays for 3 Week Vacation for Refugees to Go Home | Armstrong Economics https://t.co/z9x41OoWeq 1 0 0 70 | 69 * Sweden. The Alternativ for Sverige: “Syria is now ready to receive close to 900.000 Syrian refugees in the coming months. It is now time for all 150,000 Syrians who came to Sweden since the outbreak of war in 2011 to go home!” https://t.co/hccQmsQ9Vl #v4 #visegrad https://t.co/v70bVzAjIW 1 0 0 71 | 70 anyone who thinks we should not welcome refugees should read this. @ilctr @tylercowen @IIRGMU https://t.co/iLGt7NwdM8 1 0 0 72 | 71 How the hell is this not illegal? #StopTheInvasion #MuslimIlegals https://t.co/oHf855ZC2Q 1 0 0 73 | 72 Illegal aliens waiting for first class #MSC container-shippping in @PortofAntwerp #Antwerpen #StopImmigration https://t.co/t32BDR9RlO 1 0 1 74 | 73 Got an IDEAL. If we stopped giving illegal immigrants Food Stamp, Welfare would they go home. And if they are illegally here, why should they get any of this? Also, has anyone has the amount of our tax dollars go to these folks? Just asking! 😎🎩 1 0 1 75 | 74 @1bZhEFkl9MhwosZ @KAFTA78 @CanEmbSA Of course not!This man 🇨🇦 wanted to free wrote something Saudi Arabia did not like!But your country is full of slaves! Because you do not know the truth of the Bible you will never be free! We let people be free & choose for themselves, but apostasy in Islam is a death sentence 1 1 0 76 | 75 Sick barstewards! This is what happens when we put up the refugees welcome signs! They not only rape our wives or girlfriends, our daughters but our ruddy mothers too!! https://t.co/XAYLr6FjNk 1 0 1 77 | 76 Hundreds of migrants armed with sticks and homemade flamethrowers broke through the border fence in Ceuta, according to the Spanish Civil Guard. Over 100 migrants and 15 border agents were injured in the fight. Well the Spanish are saying "refugees welcome" tourists are not... 
0 0 0 78 | 77 Boy, 16, fights for his life after being stabbed on London street https://t.co/PYRrmaN3tz 0 0 0 79 | 78 And why should he now buckle to their demands that he swear undying trust for these A******* ????? THEY are the enemy of this nation! #Treason https://t.co/YBuCPtPp7e 0 0 0 80 | 79 The German Government Pays for 3 Week Vacation for Refugees to Go Home -INCLUDING airfare BACK to the very countries that claim they are fleeing because it is unsafe. | Armstrong Economics https://t.co/6vNlFcwQa9 0 0 0 81 | 80 Wrong. Doug Ford simply wants the federal government to foot the bill because the Liberal government was not prepared and budgeted to welcome the surge in refugees. Nobody is strictly against immigration but it must be orderly otherwise taxpayers pay through their nose. https://t.co/qscloK5CnZ 0 0 0 82 | 81 American Chiropractic Association Members Advocate for Veterans Health, Student Debt Relief https://t.co/F17fk9hxuY 0 0 0 83 | 82 MSM agreement includes covering for HRC https://t.co/kM5MOoHNt4 0 0 0 84 | 83 @11head_shot I am not forcing anyone. I am just putting out my view. People believe it or not is their prerogative.Even if a single person agree with me, it is ok with me. Thanks, 0 0 0 85 | 84 Huge show of solidarity & powerful speeches at #refugeeswelcome rally - we will not allow @SercoGroup to throw vulnerable families on to streets - say it loud & say it clear, refugees are welcome here! #peoplemakeglasgow @AbdulBostani2 @ChrisStephens @alison_phipps https://t.co/H6sVocxG3L 0 0 0 86 | 85 Tajikistani Charged With Beating New York University Professor To Death https://t.co/U2zEwQ19Av Another animal.../ 0 0 0 87 | 86 @djf510 You mean it's not true? 0 0 0 88 | 87 Child Prostitution: Feds Break Up Mexico-to-Queens Sex Trafficking Ring https://t.co/4Jd6lIN3jy via @WNYC 0 0 0 89 | 88 Home Office guilty of "a serious breach of the duty of candour and cooperation" regarding children entitled to enter UK. Where did these children go? 
Many disappeared, which means they are lost or trafficked. How can we have come to this? @AmberRuddHR https://t.co/JjCttbKZKl 0 0 0 90 | 89 [NEW POST] In Switzerland, people have managed to get a referendum changing the financial system with the same promises of Karl Marx that will end all financial crisis to come forever. It is scheduled for June 10th, 2018 after they got 100,000 signatures https://t.co/Gt4QtwEp1o 0 0 0 91 | 90 #DACA program should be fully restarted, federal judge rules. https://t.co/Ca1IRV6LgO #EndDACA #SendThemBack 0 0 0 92 | 91 The Plague of One-Dimensional Analysis https://t.co/SkolQoFlph https://t.co/WViBMj8Kao 0 0 0 93 | 92 @IDS_MP P.S ..Why do you not pay unemployed people who do endless hours of voluntary work they do that to give something to the community 0 0 0 94 | 93 @Redbaiternz Seriously, Amy and Cindy are BFFs, I know that for sure. Hmm, mmm. 0 0 0 95 | 94 This Open Letter From Detained Immigrant Parents To The American People Is Soul-Crushing https://t.co/LcKPxH8y76 0 0 0 96 | 95 The German Government Pays for 3 Week Vacation for Refugees to Go Home | Armstrong Economics https://t.co/h6QeFqP6Vs 0 0 0 97 | 96 Russia has said that within the coming months, nearly 900,000 Syrian refugees will be able to go home. https://t.co/lHdkiiDfJK https://t.co/jIf5mqQ0N7 0 0 0 98 | 97 Orthodox Church attacked in hate crime against Christians. The Church of Michael the Archangel in Grozny was attacked by 4 armed jihadists this evening service. Worshippers and law enforcement officers killed.... https://t.co/rezUd1MOoL 0 0 0 99 | 98 UK MPs probe unpublished UN ‘sex-for-food’ findings: report https://t.co/k6Z2fHVFXU 0 0 0 100 | 99 \"Large areas have been evacuated and people were moved to shelters to wait for the waters to subside.\" https://t.co/bk0yibqpPs 0 0 0 101 | 100 what a f**king political muppet. 
https://t.co/dYUVH5B4Oo 0 0 0 102 | -------------------------------------------------------------------------------- /SemEval2019-Task5/datasets/trial/trial_es.tsv: -------------------------------------------------------------------------------- 1 | id text HS TR AG 2 | 101 @ian_delaCalva @IrantzuVarela @pikaramagazine Oye por qué no molestas a tu puta madre? 0 0 0 3 | 102 NINGUNA MUJER ES 'PUTA' ❗❗❗ https://t.co/cV0CQM5RIg 0 0 0 4 | 103 Editar, además de complicado, es lo que hace de algo que merezca la pena o no, porque puedes haber grabado la puta hostia de planos, que si no los montas bien, con un sentido y los maquillas atractivamente, se va a la mierda. 0 0 0 5 | 104 Bien joder una puta alegría!! Te lo mereces pequeña #OTGala9 0 0 0 6 | 105 Todo política... no se levanta la sesión y estos hijos de puta mandan a los peones que están en la plaza y provocar violencia... 15 policías heridos pero no pasa nada... si esos que están ahí son el pueblo... DIOS MIO el pueblo es mi vieja que se rompe el lomo laburando no ellos. 0 0 0 7 | 106 Casuso callate de una puta vez https://t.co/tWmjW1UFMM 0 0 0 8 | 107 @BedMM @NayBetania Tú eres la culpable, perra >:v 1 1 1 9 | 108 Yo no puedo creer que las propias mujeres naturalicen una violación o acoso y lo justifiquen con que la mujer es la que “provoca”. Que mente de mierda la puta madre 0 0 0 10 | 109 @DalasReview @jk_rowling Hijo de puta muerete asqueroso no te mereces esa mierda 0 0 0 11 | 110 @Wachit0 @100Akshat @_Midoriyaa_ @NeohSJ La idea está la polla con cebolla pero obviamente no todos van a entrar por desconfianza, pero entrarán los que quieran ir y tengan confianza 0 0 0 12 | 111 Dos amigos hablando: -Raúl, q me das por mi mujer? -Nada -Trato hecho. 1 0 0 13 | 112 — Yo soy respetuoso con los demás, sólamente les recuerdo lo que son. Y si son escoria no es culpa mía. ... — ¡Claro que lo sé! ¿¡Por quién me tomas!? 0 0 0 14 | 113 +¿Es que solo sabes subir historias de tu perra? 
-¿Es que hay algo más que merezca la pena? 0 0 0 15 | 114 ¿Estás en contra de Erdogan? Pues te ponemos una foto con una polla en la cara y manipulamos el titular. https://t.co/qbG04sMOsv 0 0 0 16 | 115 En que se parece una mujer a un cientifico? En que al cientifico le pasan cosas por la cabeza, y a la mujer le pasan cabezas por la cosa 1 0 0 17 | 116 @vanessagirona @KRLS @mossos Vienes a las mías..., 'políticos' no España, con Franco eso no pasaba teníais lo que queríais, todos por igual, el fanatismo por la independencia llegó después con atontaos a los que votamos, yo soy de c's por innovar no por o 0 0 0 18 | 117 @Reforma @aurelionuno No seas hijo de puta Nuño según tu eres representante de la educación puto viejo chismoso chupate mi verga..... 0 0 0 19 | 118 @Barbijaputa Por culpa de los petardos tengo unos amigos locos por encontrar a su perrita. Se les escapó y salió huyendo por el ruido. https://t.co/pw4b7kWwoc 0 0 0 20 | 119 DIPUTADOS.HIJOS DE PUTA Q. VAN A PROVOCAR A LOS GENDARMES . QUIENES SON ?? KMAXIMO DEL CAÑO MORO BREGMAN CUANDO LUCHARON ×LOS JUBILADOS?? 0 0 0 21 | 120 @Guru_socket @joaquinbarucco La mejor forma de dormirte es pegarte un cachetazo, en la cara dormido... (Pensala) 😠 🙆... 0 0 0 22 | 121 @Relatofeminista Hemos empezado de puta madre, si seguimos este paso llegaremos a las cuevas antes de que rajoy diga la próxima gilipollez 0 0 0 23 | 122 Ains... ya no queda buen machismo como el de antes... 1 0 0 24 | 123 @OneforAllDeku_ @xkingxexplosion ¡Tú no me das ordenes imbécil voy a destruirle la polla a ese maldito subnormal y voy hacer las cosas a mi maldita manera! ¡Apártate parásito! 0 0 0 25 | 124 Jajajajaja le tengo que enseñar eso a mi perra😂 https://t.co/rlTRFU3Dx0 0 0 0 26 | 125 @OrgMujeresCI Estoy de acuerdo, pero no sé puede minimizar o relativizar la violación o el acoso sexual. 
Sea de rocas y famosas o no 0 0 0 27 | 126 @anaisbernal Pásalo muy bien,te lo mereces 😘😘 0 0 0 28 | 127 @erreleny @Fel_blan Buena descripción, aunque se te ha olvidado decir la puta moralina. Todos tienen que ENSEÑARTE y EDUCARTE con su puta moralina de los cojones. Es una serie muy milenial. 0 0 0 29 | 128 Esa gente insultando y condenando al asesino de Diana Quer pero que luego le quitan importancia al acoso, violación y maltrato a la mujer... 0 0 0 30 | 129 @Maridolvers Perra te lo merecías! 😎 1 1 1 31 | 130 Qué es un partido doble mixto de tenis? Un single de hombres con obstáculos. 1 0 0 32 | 131 @Milenio Cállate rata asquerosa priísta de toda tu perra vida. No te cansas de robar. Pudrete hijo de tu puta madre. 0 0 0 33 | 132 Que es una Mujer embarazada de dos nenas y un varon? Un kit de limpieza con instrucciones. 1 0 0 34 | 133 ¿Cuál es la peor desgracia para una mujer? Parir un varón, porque después de tener un cerebro dentro durante 9 meses, van y se lo sacan. 1 0 0 35 | 134 ¿Porque se dice que la cerveza tiene hormonas femeninas? Porque cuando la bebes conduces mal, dices tonterias y nadie te hace caso. 1 0 0 36 | 135 @BeeluArzola callate puta mal parida perra re culia inconciente callada deberias estar maldita 1 1 1 37 | 136 @erices16_ Cállate vos wacha trola deja de confundir las hormonas puta 1 1 1 38 | 137 YA CALLATE PUTA PERRA ME TIENES CANSADO CON TUS PENDEJADAS 1 1 1 39 | 138 Te odio Lidia, no te mereces a Francisco, #perra #LasChicasDelCable2 1 1 1 40 | 139 @GustavoOrozcoH No le enseñaron modales a esa antifujimorista perra mermelera del foro de sao paulodebretch?esta ofendiendo a millones de votantes 1 0 1 41 | 140 @gonzozaqw @_piturros tu callate un mes perra 1 1 1 42 | 141 Te voy a enseñar lo que es una perra, PERRA. 1 1 1 43 | 142 Gracias Twitter por enseñarme que el truco para ser famosa es estar bien buena y ser bien puta 1 0 0 44 | 143 @abc_es @HiginiaRoig Ya está la niña del exorcista diciendo tonterías. 
Has visto lo que a hecho la guarra de tú hija. 1 1 1 45 | 144 @Laura99CM Ahora que me has faltado el respeto te lo puedo decir por fin: ¡A fregar puta! 1 1 1 46 | 145 @xVolkovaVera Que guarra. Búscate ya un mozo. 1 0 1 47 | 146 Lisa agradece que tenes un dueño con corazón porque te mereces que te ahorque perra hdp😡 1 1 1 48 | 147 @SoyDas117 @LuckasDroiidYT1 @BrandonGaming6 @LlinxaYT Y tu eres mi esposa ve a la cocina zorra >:v 1 1 1 49 | 148 '¿La mitad de sueldo por ser mujer?' http://t.co/HXOVOaiOuj Así es, porque aguantarlas cuesta el doble. #TomaDosTazas 1 0 0 50 | 149 ¿Por qué una mujer no puede ser guapa e inteligente? Porque sería un hombre. 1 0 0 51 | 150 -Veis chicas? Nunca hay que tener piedad con estas putas. Hay que educarlas en todo momento y enseñarles para lo que sirven: para nuestra diversión. Le voy a reventar su culo de puta mientras te lo come, y por su bien, espero que se esfuerce o no podrá se 1 1 0 52 | 151 El ínclito @sanchezcastejon cobra la sanidad a los inmigrantes legales mientras la regala a los sin papeles | Economía Digital https://t.co/BCxnADAVH6 1 0 1 53 | 152 Que tierra dejamos a nuestras generaciones. El planeta es de todos; no solo de unos pocos. Ahora ya me estan quitando el aire para respirar. No. Aquí o nos salvamos o morimos todos¡ incluídos refugiados pateras-etc Manda Huevos. El calor me afecta al cerebro pk es inteligente 1 0 0 54 | 153 Llevan razón, no hay ningún motivo para estar en alerta por terrorismo. Los migrantes,inmigrantes o como quieran llamarlos q están estando en España es la religión del amor, son educados y cívicos. 
Yo ya puestos os quitaría las armas y os daría flores o piruletas para defenderos https://t.co/fVEb0Kiyul 1 0 0 55 | 154 ULTIMA HORA ., el ministro del interior Grande Marlaska ha dicho a los guardias civiles de Ceuta que cuando salte la valla el negro de wuassap, que lo detengan y lo trasladen inmediatamente a su domicilio en Madrid 1 1 1 56 | 155 (@RoseDMatagalpa) A esas Bestias irán a poner con Gabacha Blanca,? la Sangre de Cristo tiene poder y nos libre de esas hordas criminales #SOSNicaragua #AYUDA !!!!!Estamos rodeados de asesinos por doquier..#SALVENANICARAGUA!! #SOSNicaragua #fueraO… https://t.co/GXhaNGmT6Y https://t.co/1GuWYRk4jb 1 0 1 57 | 156 Hay que pasarle al canciller Moncada una bolsa de gabacha, para cuando salga de la sesión, se tape la cara 😂😂😂 1 1 0 58 | 157 El problema de algunos no es “la xenofobia" sino que ”la aporofobia” (rechazo al pobre). A nadie le molesta el inmigrante millonario como Paulman sino el haitiano pobre que necesita ayuda. 1 0 0 59 | 158 #politivida Si no te gusta, puedes cambiarlo. DESTINO SUDACA https://t.co/2tpPakccSk 1 0 0 60 | 159 Este negro políticamente incorrecto es un facha porque defiende a España mucho más que los politicuchos que nos gobiernan, si fuera presidente controlaría mucho mejor la inmigración indecente e invasora, que el okupa bocazas que nos gobierna. https://t.co/WXyLR7WtWb 1 0 1 61 | 160 Ideas: para intentar recuperar la popularidad perdida, el fujimorismo empezará con el discurso xenófobo anti inmigración venezolana. Y voilá, he ahí nuestro partido fascista. 1 0 1 62 | 161 En Barcelona un grupo de inmigrantes indocumentados agrede gravemente a un turista norteamericano qué defendió a una mujer acosada por estos bandidos,tan poca hombría tienen los españoles qué no defienden a sus mujeres o son como la manada.? 1 0 1 63 | 162 1.-Los va a mantener Ud., porque desde su Palacio de Doñana,?se vé todo muy bonito. 
Es que no hay manera de que comprendan, PSOE y PODEMOS que no hay presupuesto para esta avalancha? Que somos la mofa Europea? Que todos los camellos de Lavapiés, MADRID, son inmigrantes ilegales? https://t.co/t1yeNXJnGH 1 0 1 64 | 163 - Los catalanes somos españoles. Y si no te gusta te jodes, porque como catalán serás español toda tu vida. + Això és com a mínim discutible. Del que no hi ha dubte és que, per molta bandera espanyola que portis, per a gran part d'espanya tu sempre seràs "un puto catalufo" 🤷 1 0 0 65 | 164 Otro artículo más de la canalla inmigracionista justificando la necesidad de destruir nuestra cultura en aras de una economía que nada entiende de integración, costumbres y convivencia. Si el problema son las pensiones, que se reformen, no nuestras vidas. https://t.co/fmPHT5OjkD 1 0 1 66 | 165 Muchísimas gracias!!! Además, con la inmigración de los africanos, nos vais a llenar las calles de heroína y comercio ilegal... Gracias de todo corazón! Además de acortar nuestras listas de espera en la sanidad, gracias a la sanidad pública universal. 👍 https://t.co/7rm0mAai4C https://t.co/gY6bTH0MG3 1 0 0 67 | 166 Los árabes son los hombres más descarados y ligadores del mundo 1 0 1 68 | 167 tu vete a la verga, puto sudaca 1 1 1 69 | 168 A buenas horas, mangas verdes. Tenemos al líder del partido con más fuerza en el Congreso deshumanizando a los inmigrantes con cifras inventadas y luego haciéndose fotos con ellos. Pero no, aquí no habrá populismo:) https://t.co/57cuoVsTMx 1 0 0 70 | 169 Amaneció de GOLPE se acerca de nuevo? Palabras de INDOCUMENTADO. Análisi... https://t.co/OC4wmrbLgA vía @YouTube 1 0 1 71 | 170 https://t.co/l7lBo0jBCh... Y el estado se hace pendejo con pagarles sus perdidas, recordemos que el mayate Rafael Moreno Valle se robo la lana de varios seguros 1 1 1 72 | 171 Sánchez da Sanidad gratis a los sin papeles... 
pero no a los inmigrantes legales https://t.co/SNpZKhBIsh Enviado desde @updayESP 1 0 1 73 | 172 Decenas de víctimas en ataques de la coalición árabe contra Al Hudaida https://t.co/iiacL0qfmw 1 0 0 74 | 173 Los manteros subsaharianos corren a sillazos a la Policía https://t.co/OXY9ZhnW5y vía @Periodistadigit 1 0 1 75 | 174 Hay racismo cuando no creen que soy su hermana porque soy güera @unpentagono @medicenlore https://t.co/9iHeSpHOIi 1 0 0 76 | 175 Ada Colau no aprende. La alcaldesa de la Ciudad Condal quiere más inmigrantes subsaharianos en Barcelona. La podemita, que no ha condenado el ataque de los manteros al turista, solamente habla para exigir que lleguen más inmigrantes ilegales. https://t.co/kkNRjXWIZK 1 0 0 77 | 176 Una historia más de los "arabes" en nuestro país. Sumado a los hondureños colocados en "puestos clave". Ha sido histórico. #LADRONES https://t.co/UrDQbL4Ux8 0 0 0 78 | 177 Los árabes y los venezolanos jajaja https://t.co/4UPxNGicsX 0 0 0 79 | 178 Pues en los primeros quince días de vida, el blog ya ha recibido visitas de Austria, Emiratos Árabes Unidos, Costa Rica, Eslovenia, Armenia, Panamá, Malta, Japón, Perú o República Dominicana. Sois grandes. https://t.co/zvRie0eKsF 0 0 0 80 | 179 @TinedoGuia escuchándote en vivo en Abudhabi, emiratos árabes, un abrazo 0 0 0 81 | 180 FIFA eWorld Cup: un árabe se consagró campeón del Mundial de ... - https://t.co/CpW3k82CMV https://t.co/cWKCmchjUL https://t.co/7T8ibcr0Vh 0 0 0 82 | 181 Migrantes que saltaron la valla de Ceuta responden a las acusaciones de violencia: "La única arma que tenemos es el coraje" https://t.co/s3legM6gRv https://t.co/YjZwoNBCw5 0 0 0 83 | 182 Una valla de taxis y taxistas en la frontera de Ceuta y Melilla sería infranqueable. 
0 0 0 84 | 183 Migrantes que saltaron la valla de Ceuta responden a las acusaciones de violencia: "La única arma que tenemos es el coraje" https://t.co/5lNI8NjukF vía @desalambre 0 0 0 85 | 184 Unos manteros agreden a un turista en Barcelona, Sanchez se carga el jefe de la UCO, salto a la valla en Melilla por el efecto llamada, subida de impuestos, sanidad universal para los ilegales, concesiones a independentistas catalanes y vascos... Pero vamos a hablar de Franco... 0 0 0 86 | 185 ▶ En Foggia, Italia, 12 jornaleros inmigrantes han fallecido en un accidente de tráfico. Y en Bolonia, el choque entre dos camiones ha provocado una explosión que habría dejado dos muertos y 67 heridos. https://t.co/yxDAHs5ftl https://t.co/OseSRlYDFg 0 0 0 87 | 186 Yo COCINO..Y QUE? Prefiero comprar en DISCO, o JUMBO, y espero q Larreta deje entrar a la cadena Walmart y los franchutes de Carrefour dejen de robar 0 0 0 88 | 187 Me ha gustado un vídeo de @YouTube (https://t.co/WwDPX6yAGs - ASALTO A LA VALLA DE CEUTA: 22 Guardias Civiles heridos. Y el gobierno 0 0 0 89 | 188 "La Güera" le ofreció un trabajo de mesera, donde le presentaron al sujeto: "era un tipo guapo, alto, de ojos bonitos, como blanco, pelo rubio, cuerpazo", detalla la víctima. https://t.co/6icbcgOOza 0 0 0 90 | 189 Yo solo estoy esperando el momento en el que dj Alex suba el remix Arabe de Ecko y Papichamp 0 0 0 91 | 190 La izquierda española pide que se derribe la valla de Ceuta y Melilla para que entren los inmigrantes | El Municipio https://t.co/eg2e4aGhIi 0 0 0 92 | 191 La trampa de la ONU: "convertir 710 mil árabes en 5 millones de refugiados palestinos" https://t.co/dG85x0jgrq 0 0 0 93 | 192 @davidzepeda1 Hoy ha tocado un paseo por recintos árabes.Tú, siendo morenito y tal,te iría muy bien el papel de califa de aquella época ( siglo X) ¿no crees? De momento contemplaremos su legado!! 
1beso💕 https://t.co/aYjl6vvZaM 0 0 0 94 | 193 Agentes de la Patrulla Fronteriza de USA detuvo a un grupo de 95 inmigrantes #indocumentados en el desierto del suroeste de #Arizona. #BuildTheWall https://t.co/8Z2f4PFWPM 0 0 0 95 | 194 Pero mire que llegó, la güera Salomé 0 0 0 96 | 195 Qué ganas de comer unas árabes 0 0 0 97 | 196 si alguien tiene la imagen del pokemon ese bailando con el machupichu hoy no cobras tirititirititiritiirii que la ponga 0 0 0 98 | 197 El ciudadano ruso Denis Yakovlev facilitó en Florida al menos 50 matrimonios falsos entre estadounidenses e inmigrantes indocumentados —en su mayoría asiáticos— por los que cobraba hasta $20,000. https://t.co/u9g3SagXGi 0 0 0 99 | 198 Toy feliz porque me voy a comer 12 empanadas árabes viendo naruto 0 0 0 100 | 199 Los Barrios, un pueblo solidario ante el "caos" con la acogida de inmigrantes en pabellones https://t.co/QgZNL4ikm0 vía @elmundoes 0 0 0 101 | 200 los viernes por la noche los moritos y su reggaeton arabe no dejan descansar, https://t.co/flg5tQnp0S Forniteo 0 0 0 102 | -------------------------------------------------------------------------------- /SemEval2019-Task5/evaluation/README.md: -------------------------------------------------------------------------------- 1 | # SemEval-2019 Task 5 - evaluation script # 2 | 3 | 4 | This is the official evaluation script for SemEval-2019 Task 5: Multilingual detection of hate. The script is language-independent and has been conceived in order to evaluate submissions for both task A and task B. 5 | 6 | **NOTE** 7 | During the **Practice** phase, the prediction files submitted by participants to the task page will be evaluated for the task A, and for demonstration purposes only; if participants wish to test the script on prediction files for task B as well, they could use the version available here (see the instructions at the bottom of this page). 

For the **Development** and **Evaluation** phases, the script will provide a complete evaluation for each language and task for any submitted file, provided that the latter meets the submission requirements described below.

## Submission instructions ##

The script takes a single prediction file as input, which MUST be a TSV file structured as follows:

### Task A ###

    id[tab]{0|1}

e.g.

    101[tab]1
    102[tab]0
    103[tab]1

### Task B ###

    id[tab]{0|1}[tab]{0|1}[tab]{0|1}

e.g.

    101[tab]1[tab]1[tab]1
    102[tab]0[tab]0[tab]0
    103[tab]1[tab]1[tab]0
    104[tab]1[tab]0[tab]0
    105[tab]1[tab]0[tab]1

### File names ###

When submitting predictions to the task page in Codalab, a single file should be uploaded for each task and language, as a zip-compressed file, and it should be named according to the language and task the predictions are submitted for:

* *en_a.tsv* for predictions for taskA-English
* *es_a.tsv* for predictions for taskA-Spanish
* *en_b.tsv* for predictions for taskB-English
* *es_b.tsv* for predictions for taskB-Spanish


## Submission results ##

The script outputs a file **scores.txt** containing different scores, depending on the task.
For task A it returns accuracy, precision, recall and F1-score just for the HS category.
For task B it returns accuracy, precision, recall and F1-score for each category (HS, Target Type, Aggressiveness), along with the macro-averaged F1-score and the Exact Match Ratio.

## Testing the script offline ##

In order to run the script locally, the input and output directories must match the Codalab format.
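A minimal sketch of preparing such a local run is given here, assuming the Codalab-style layout with `input/res`, `input/ref` and `output` directories that this section describes. The helper name `prepare_offline_run`, its parameters and the toy predictions are illustrative assumptions and are not part of the official script.

```python
import csv
import tempfile
from pathlib import Path


def prepare_offline_run(root, predictions, lang="en", task="a"):
    """Create input/res, input/ref and output under root and write a prediction TSV.

    Hypothetical helper: `predictions` is a list of rows such as (id, HS) for
    task A or (id, HS, TR, AG) for task B.
    """
    root = Path(root)
    res_dir = root / "input" / "res"
    ref_dir = root / "input" / "ref"
    out_dir = root / "output"
    for directory in (res_dir, ref_dir, out_dir):
        directory.mkdir(parents=True, exist_ok=True)

    # File name follows the <lang>_<task>.tsv convention, e.g. en_a.tsv
    pred_path = res_dir / "{0}_{1}.tsv".format(lang, task)
    with pred_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerows(predictions)
    return pred_path, ref_dir, out_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        pred_path, ref_dir, out_dir = prepare_offline_run(
            tmp, [(101, 1), (102, 0), (103, 1)]
        )
        print(pred_path.read_text(encoding="utf-8"))
```

After copying the reference file into the `ref` directory (as `en.tsv` or `es.tsv`), the script can be started in the way the metadata file does it, with the input and output directories as its two arguments: `python evaluation.py input output`.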
The *input* directory must contain two subdirectories, namely **res** (containing the result file in TSV format with the naming convention described above) and **ref** (containing the reference dataset, called **en.tsv** for English and **es.tsv** for Spanish). The output will be written to the file **scores.txt** in the *output* directory.
Example of file structure:

    input/
    |- ref/
        |- en.tsv
    |- res/
        |- en_a.tsv
    output/
    |- scores.txt

--------------------------------------------------------------------------------
/SemEval2019-Task5/evaluation/evaluation.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python
import pandas as pd
import sys
import os
import numpy as np
from sklearn.metrics import precision_recall_fscore_support, accuracy_score


def isfloat(value):
    try:
        float(value)
        return True
    except ValueError:
        return False


def check_file(path, correct_number_of_columns):
    # Inspect only the first line: validate the column count and detect a header row
    with open(path, 'r') as f:
        first_line = f.readline().split("\t")
    if (len(first_line) != correct_number_of_columns):
        sys.exit('Column format problem.')
    # A non-numeric first field means the file starts with a header to be skipped
    if (isfloat(first_line[0])):
        has_header = 0
    else:
        has_header = 1
    return has_header


def evaluate_a(pred, gold):
    levels = ["HS"]

    ground_truth = pd.read_csv(gold, sep="\t", names=["ID", "Tweet-text", "HS", "TargetRange", "Aggressiveness"],
                               skiprows=check_file(gold, 5),
                               converters={0: str, 1: str, 2: int, 3: int, 4: int}, header=None)

    predicted = pd.read_csv(pred, sep="\t", names=["ID"] + levels, skiprows=check_file(pred, 2),
                            converters={0: str, 1: int}, header=None)

    # Check length files
    if (len(ground_truth) != len(predicted)):
        sys.exit('Prediction and gold data have different number of lines.')

    # Check predicted classes
    for c in levels:
        gt_class = list(ground_truth[c].value_counts().keys())
        if not (predicted[c].isin(gt_class).all()):
            sys.exit("Wrong value in " + c + " prediction column.")

    # After the merge, gold labels carry the suffix "_x" and predictions "_y"
    data = pd.merge(ground_truth, predicted, on="ID")

    if (len(ground_truth) != len(data)):
        sys.exit('Invalid tweet IDs in prediction.')

    # Compute Performance Measures HS
    acc_hs = accuracy_score(data["HS_x"], data["HS_y"])
    p_hs, r_hs, f1_hs, support = precision_recall_fscore_support(data["HS_x"], data["HS_y"], average="macro")

    return acc_hs, p_hs, r_hs, f1_hs


def evaluate_b(pred, gold):
    levels = ["HS", "TargetRange", "Aggressiveness"]

    ground_truth = pd.read_csv(gold, sep="\t", names=["ID", "Tweet-text", "HS", "TargetRange", "Aggressiveness"],
                               skiprows=check_file(gold, 5),
                               converters={0: str, 1: str, 2: int, 3: int, 4: int}, header=None)

    predicted = pd.read_csv(pred, sep="\t", names=["ID"] + levels, skiprows=check_file(pred, 4),
                            converters={0: str, 1: int, 2: int, 3: int}, header=None)

    # Check length files
    if (len(ground_truth) != len(predicted)):
        sys.exit('Prediction and gold data have different number of lines.')

    # Check predicted classes
    for c in levels:
        gt_class = list(ground_truth[c].value_counts().keys())
        if not (predicted[c].isin(gt_class).all()):
            sys.exit("Wrong value in " + c + " prediction column.")

    # After the merge, gold labels carry the suffix "_x" and predictions "_y"
    data = pd.merge(ground_truth, predicted, on="ID")

    if (len(ground_truth) != len(data)):
        sys.exit('Invalid tweet IDs in prediction.')

    # Compute Performance Measures
    acc_levels = dict.fromkeys(levels)
    p_levels = dict.fromkeys(levels)
    r_levels = dict.fromkeys(levels)
    f1_levels = dict.fromkeys(levels)
    for l in levels:
        acc_levels[l] = accuracy_score(data[l + "_x"], data[l + "_y"])
        p_levels[l], r_levels[l], f1_levels[l], _ = precision_recall_fscore_support(data[l + "_x"], data[l + "_y"], average="macro")
    macro_f1 = np.mean(list(f1_levels.values()))

    # Compute Exact Match Ratio: a tweet counts only if all three labels are correct
    check_emr = np.ones(len(data), dtype=bool)
    for l in levels:
        check_label = data[l + "_x"] == data[l + "_y"]
        check_emr = check_emr & check_label
    emr = sum(check_emr) / len(data)

    return macro_f1, emr, acc_levels, p_levels, r_levels, f1_levels


def main(argv):
    # https://github.com/Tivix/competition-examples/blob/master/compute_pi/program/evaluate.py
    # as per the metadata file, input and output directories are the arguments

    [input_dir, output_dir] = argv

    # unzipped submission data is always in the 'res' subdirectory
    # https://github.com/codalab/codalab-competitions/wiki/User_Building-a-Scoring-Program-for-a-Competition#directory-structure-for-submissions

    ref_dir = os.path.join(input_dir, 'ref')
    gold_standard = os.path.join(ref_dir, os.listdir(ref_dir)[0])
    lang = os.path.basename(gold_standard).replace('.tsv', '')
    res_dir = os.path.join(input_dir, 'res')
    submission_path = os.path.join(res_dir, os.listdir(res_dir)[0])
    # The task letter is taken from the submission file name, e.g. "en_a.tsv" -> "a"
    task = os.path.basename(submission_path).replace('.tsv', '').split('_')[1]

    # Text mode, so that the formatted strings below can be written as they are
    output_file = open(os.path.join(output_dir, 'scores.txt'), "w")
    if task == 'a':
        acc_hs, p_hs, r_hs, f1_hs = evaluate_a(submission_path, gold_standard)

        # the scores for the leaderboard must be in a file named "scores.txt"
        # https://github.com/codalab/codalab-competitions/wiki/User_Building-a-Scoring-Program-for-a-Competition#directory-structure-for-submissions

        output_file.write("taskA_fscore: {0}\n".format(f1_hs))
        output_file.write("taskA_precision: {0}\n".format(p_hs))
        output_file.write("taskA_recall: {0}\n".format(r_hs))
        output_file.write("taskA_accuracy: {0}\n".format(acc_hs))
        print("taskA_fscore: {0}".format(f1_hs))
        print("taskA_precision: {0}".format(p_hs))
        print("taskA_recall: {0}".format(r_hs))
        print("taskA_accuracy: {0}".format(acc_hs))
    elif task == 'b':
        macro_f1, emr, acc_levels, p_levels, r_levels, f1_levels = evaluate_b(submission_path, gold_standard)

        # the scores for the leaderboard must be in a file named "scores.txt"
        # https://github.com/codalab/codalab-competitions/wiki/User_Building-a-Scoring-Program-for-a-Competition#directory-structure-for-submissions

        output_file.write("taskB_fscore_macro: {0}\n".format(macro_f1))
        output_file.write("taskB_emr: {0}\n".format(emr))
        output_file.write("taskB_fscore_HS: {0}\n".format(f1_levels["HS"]))
        output_file.write("taskB_precision_HS: {0}\n".format(p_levels["HS"]))
        output_file.write("taskB_recall_HS: {0}\n".format(r_levels["HS"]))
        output_file.write("taskB_accuracy_HS: {0}\n".format(acc_levels["HS"]))
        output_file.write("taskB_fscore_TR: {0}\n".format(f1_levels["TargetRange"]))
        output_file.write("taskB_precision_TR: {0}\n".format(p_levels["TargetRange"]))
        output_file.write("taskB_recall_TR: {0}\n".format(r_levels["TargetRange"]))
        output_file.write("taskB_accuracy_TR: {0}\n".format(acc_levels["TargetRange"]))
        output_file.write("taskB_fscore_AG: {0}\n".format(f1_levels["Aggressiveness"]))
        output_file.write("taskB_precision_AG: {0}\n".format(p_levels["Aggressiveness"]))
        output_file.write("taskB_recall_AG: {0}\n".format(r_levels["Aggressiveness"]))
        output_file.write("taskB_accuracy_AG: {0}\n".format(acc_levels["Aggressiveness"]))

        print("taskB_fscore_macro: {0}".format(macro_f1))
        print("taskB_emr: {0}".format(emr))
        print("taskB_fscore_HS: {0}".format(f1_levels["HS"]))
        print("taskB_precision_HS: {0}".format(p_levels["HS"]))
        print("taskB_recall_HS: {0}".format(r_levels["HS"]))
        print("taskB_accuracy_HS: {0}".format(acc_levels["HS"]))
        print("taskB_fscore_TR: {0}".format(f1_levels["TargetRange"]))
        print("taskB_precision_TR: {0}".format(p_levels["TargetRange"]))
        print("taskB_recall_TR: {0}".format(r_levels["TargetRange"]))
        print("taskB_accuracy_TR: {0}".format(acc_levels["TargetRange"]))
        print("taskB_fscore_AG: {0}".format(f1_levels["Aggressiveness"]))
        print("taskB_precision_AG: {0}".format(p_levels["Aggressiveness"]))
        print("taskB_recall_AG: {0}".format(r_levels["Aggressiveness"]))
        print("taskB_accuracy_AG: {0}".format(acc_levels["Aggressiveness"]))

    output_file.close()


if __name__ == "__main__":
    main(sys.argv[1:])

--------------------------------------------------------------------------------
/SemEval2019-Task5/evaluation/metadata:
--------------------------------------------------------------------------------
command: python $program/evaluation.py $input $output
description: Scoring program for HatEval 2018.

--------------------------------------------------------------------------------
/annotation_guidelines.md:
--------------------------------------------------------------------------------
# 1. English language against immigrants

## Introduction


Welcome, and thank you for choosing to participate in this task.

You're asked to read a given set of tweets in English having immigrants and migration issues as the main topic, and, for each tweet, answer some questions regarding the presence or not of hate speech (HS) and other HS-related aspects.\
Hate Speech is commonly defined as any communication that disparages a person or a group on the basis of some characteristic such as race, color, ethnicity, gender, sexual orientation, nationality, religion, or other characteristics.

More specifically, HS against immigrants may include:

- insults, threats, denigrating or hateful expressions

- incitement to hatred, violence or violation of the rights of individuals or groups perceived as different for somatic traits (e.g. skin color), origin, cultural traits, language, etc.

- presumed association of origin/ethnicity with cognitive abilities, propensity to crime, laziness or other vices

- references to the alleged inferiority (or superiority) of some ethnic groups with respect to others

- delegitimation of social position or credibility based on origin/ethnicity

- references to certain backgrounds/ethnicities as a threat to the national security or welfare or as competitors in the distribution of government resources

- dehumanization or association with animals or entities considered inferior

HS identification is a challenging task that can be subject to individual biases, especially considering the fact that there is no single distinctive factor in drawing the line between HS and not-HS, but a set of variables that should be considered case by case. Therefore, while answering our questions, we advise you to CAREFULLY READ THE GUIDELINES provided in the next section.

Guidelines
=

Below are some instructions on the questions you have to answer for this job.

Hate Speech\
While answering the question "Is this tweet hateful?", you must take into account the following aspects:

1. the tweet content MUST have IMMIGRANTS/REFUGEES as main TARGET, or even a single individual, but considered for his/her membership in that category (and NOT for the individual characteristics)

2. we must deal with a message that spreads, incites, promotes or justifies HATRED OR VIOLENCE TOWARDS THE TARGET, or a message that aims at dehumanizing, hurting or intimidating the target

The joint presence of both elements in a tweet is considered essential to determine whether the tweet has hateful contents, therefore if both of them occur, your answer will be 'Yes'.

If even just one of these conditions is not detected, HS (at least against immigrants) is assumed not to occur, and your answer will be 'No'.

Here is a list of other aspects that are NOT considered hate speech in our study:

- HATE SPEECH AGAINST OTHER TARGETS

- offensive language

- blasphemy

- historical denial

- overt incitement to terrorism

- offense towards public servants and police officers

- defamation

## Target Type and Aggressive Language

If the answer to the first question is 'Yes', then TWO further distinctions (expressed with two different questions) must be drawn, namely on the following aspects:

1. the target type: the first question is on whether the text includes hateful messages purposely sent to a specific target (then you will answer 'Single person'), or it refers to hateful messages posted to many potential receivers (then the answer will be 'Whole group'), bearing in mind that the target at issue must always be immigrants.

2. the presence of aggressive language: the second one is on whether the tweet is aggressive or not.
A message is considered aggressive (then you are supposed to answer 'Yes'), if: 68 | 69 | - it implies or legitimates discriminating attitudes or policies against the given target (immigrants/migrants/refugees) 70 | 71 | - there is an allusion to a potential threat posed by the presence of the target, or its alleged outnumbering with respect to the native population 72 | 73 | - there is a sense of dissatisfaction and frustration, which may also result in overt hostility, due to the (perceived) privileged treatment granted to the target group by the government 74 | 75 | - there is the reference (whether explicit or just implied) to violent actions of any kind perpetrated against the given target of the message 76 | 77 | If none of these conditions hold, then your answer will be 'No'. 78 | 79 | ##Examples 80 | 81 | 82 | - HATEFUL but NOT AGGRESSIVE tweet,the target is the WHOLE GROUP:\ 83 | *@united**how about donating flights to deport the Invaders back to their homeland. **#DeportThemAll* 84 | 85 | - HATEFUL and AGGRESSIVE tweet, the target is the WHOLE GROUP:\ 86 | *WTH??? Sent them back home and DON'T let them back into Europe! The German Government Pays for 3 Week Vacation for Refugees to Go Home (?)* 87 | 88 | - HATEFUL, but NOT AGGRESSIVE tweet, the target is a SINGLE PERSON:\ 89 | *You are a scrounger/soldier coming to create a caliphate or create chaos. Go home, look after your girlfriends/wives/kids. The last thing you are, are a refugee! Bugger off! Bye, bey, cheers....* 90 | 91 | - NOT HATEFUL tweet: 92 | 93 | 1. It does not incite hatred*: Hundreds of Syrian refugees started crossing the border from Lebanon on Saturday. * 94 | 95 | 2. It does not have immigrants/refugees as main target*: * *@campaignforleo Ireland is a neutral and independent country which is NOT part of any military alliance. Please do not forget that and act like Ireland is a vassal state of the UK or the EU. PLEASE DO NOT ENGAGE IN STUPID and DAMAGING ACTIONS AGAINST RUSSIA. 
Keep good relations with Russia!*

# 2. English language against women


## Overview


This job aims at labelling English misogynous tweets shared by users in online social media.

## Steps


The first step is about

- **Misogyny Labelling:** you have to decide if a tweet is misogynous or not.

If the tweet has been labelled as *misogynous*, then two further questions will be asked:

- **Aggressive Language Labelling:** you should indicate if the misogynous content is aggressive or not.
- **Target Labelling:** you should indicate if the misogynous content is directed towards an individual target or a social group.

## Rules & Tips


**Misogyny Labelling**

- ***Misogynous***: a text that expresses hatred towards women in particular (in the form of insulting, sexual harassment, threats of violence, stereotype, objectification and negation of male responsibility).
- ***Not Misogynous***: a text that does not express hatred towards women in particular.

**IMPORTANT(!)**: a tweet is MISOGYNOUS only if it is related to woman/women.

**Aggressive Language Labelling**

- **Aggressive**: a message is considered aggressive if it (implicitly or explicitly) presents, incites, threatens, suggests or alludes to:

    - attitudes, violent actions, hostility or commission of offenses against women;
    - the justification or legitimization of an aggressive action against women.

- **Not Aggressive**: if none of the previous conditions hold.

**IMPORTANT(!)**: In order to recognize "aggressivity" please take into account that it can be **moderate** or **severe**. For example: "social isolation" is **moderate**, while "killing" is considered **severe**.

**Target Labelling**

- ***Individual***: unwelcome and offensive messages purposely sent to a specific individual person.
141 | - ***Generic***: messages intentionally posted to many potential victims belonging to a social group (e.g. feminists, women in politics, women in online games). 142 | 143 | ##Examples 144 | 145 | 146 | - Misogynous and Aggressive tweet, the target is individual: 147 | 148 | - *@BarbeeDani You ain't shit bitch shut your whore mouth* 149 | - *Punch that girl right in the skank* 150 | 151 | - Misogynous and Aggressive tweet, the target is generic: 152 | 153 | - *Fucking clueless women should stay in the kitchen and not ruin a good man's name. - A United fan after reading this, probably* 154 | - *Women are all harassed? And it's also all men unless we prove other wise? **You're a clueless cunt**👍* 155 | 156 | - Misogynous and Not Aggressive tweet, the target is individual: 157 | 158 | - *The only hoe I trust is that bitch Nina.* 159 | - *@xXKristyP Ok hoe or whore you choose sweetie?* 160 | 161 | - Misogynous and Not Aggressive tweet, the target is generic: 162 | 163 | - *sucking my own dick cause women are stupid* 164 | - *not all men are trash. but if you always pick the ones who are trash, or treat you like trash, then ya you'll learn to hate them for sure!* 165 | 166 | - Not Misogynous: 167 | 168 | - *Ex-Teacher Pleads Not guilty To Rape Charges https://t.co/D2mGu3VT5G* 169 | - *You couldn't possibly value a girl if you still refer to women as "bitch" **😒* 170 | 171 | * * * * * 172 | 173 | # 3. Spanish language against immigrants 174 | 175 | 176 | ##Introducción 177 | 178 | 179 | Bienvenido y gracias por participar en esta tarea. 180 | 181 | Se le pide que lea un conjunto de tweets en español que tienen como tema principal los inmigrantes y las migraciones. 
Para cada tweet, debe responder algunas preguntas sobre la presencia o no de discurso de odio ("hate speech", HS) y otros aspectos relacionados con el HS.\ 182 | El discurso de odio se define comúnmente como cualquier comunicación que menosprecia a una persona o un grupo en función de la raza, el color, la etnia, el género, la orientación sexual, la nacionalidad, la religión u otras características.\ 183 | No se confundan tweets *clasistas y racistas* con tweets de odio contra inmigrantes. 184 | 185 | Más específicamente, HS contra los inmigrantes puede incluir: 186 | 187 | - insultos, amenazas, denigrantes o expresiones de odio; 188 | 189 | - incitación al odio, la violencia o la violación de los derechos de individuos o grupos percibidos como diferentes por los rasgos somáticos (por ejemplo, el color de la piel), el origen, los rasgos culturales, el idioma, etc. 190 | 191 | - asociación de la origen/etnia con deficiencias cognitivas, propensión al delito, pereza u otros vicios; 192 | 193 | - referencias a la supuesta inferioridad (o superioridad) de algunos grupos étnicos con respecto a otros; 194 | 195 | - deslegitimación de la posición social o credibilidad basada en el origen/etnia; 196 | 197 | - referencias a ciertos background/etnicidades como una amenaza para la seguridad o el bienestar nacional o como competidores en la distribución de recursos publicos; 198 | 199 | - deshumanización o asociación con animales o entidades consideradas inferiores. 200 | 201 | La identificación del HS es un desafío, que puede estar sujeto a sesgos individuales, considerando que no existe una uníca manerade trazar la línea entre HS y no-HS, sino un conjunto de variables que deben considerarse caso por caso. Por lo tanto, al responder nuestras preguntas, le aconsejamos que lea detenidamente las pautas proporcionadas en la siguiente sección. 
202 | 203 | ##Líneas guía 204 | 205 | 206 | A continuación hay algunas instrucciones sobre las preguntas que debe responder para este trabajo. 207 | 208 | Discurso del odio\ 209 | Al responder a la pregunta "*¿Este tweet expresa odio contra los inmigrantes/refugiados?*", debe tener en cuenta los siguientes aspectos: 210 | 211 | - el contenido del tweet debe tener como objetivo principal a los inmigrantes, o incluso a un solo individuo, si considerado como membro de eso grupo (y NO por sus características individuales); 212 | 213 | - tratamos con un mensaje que propaga, incita, promueve o justifica el odio o la violencia hacia el objetivo, o un mensaje que apunta a deshumanizar, herir o intimidar al objetivo. 214 | 215 | La presencia conjunta de ambos elementos en un tweet se considera esencial para determinar si el tweet tiene contenido de odio. Por lo tanto, si ambos ocurren, tu respuesta será 'Sí'. 216 | 217 | En caso de que no se detecte siquiera una de estas condiciones, se asume que HS (al menos contra inmigrantes) no ocurre, entonces su respuesta será 'No'. 218 | 219 | Elencamos aquí otros aspectos que NO se consideran discurso de odio en nuestro estudio: 220 | 221 | - solo ofensividad; 222 | 223 | - blasfemia; 224 | 225 | - negación histórica; 226 | 227 | - abierta incitación al terrorismo; 228 | 229 | - ofensa a servidores públicos y policías; 230 | 231 | - difamación. 232 | 233 | No se confundan tweets *clasistas y racistas* con tweets de odio contra inmigrantes. 234 | 235 | Tipo de objetivo y lenguaje agresivo 236 | 237 | Si la respuesta a la primera pregunta es 'Sí', se debe responder DOS preguntas adicionales, en los siguientes aspectos: 238 | 239 | 1. tipo de objetivo: se le pregunta si el texto incluye mensajes de odio deliberadamente enviados a un objetivo específico (entonces la respuesta será 'Persona'), o se refiere a mensajes de odio enviados a muchos receptores potenciales (entonces la respuesta será ' Grupo '). 
Tenga en cuenta que el objetivo en cuestión siempre debe ser los inmigrantes. 240 | 241 | 2. presencia de lenguaje agresivo: se le pregunta si el tweet es agresivo o no. Un mensaje se considera agresivo (entonces la respuesta será 'Sí'), si: 242 | 243 | - implica o legitima actitudes o políticas discriminatorias; 244 | 245 | - hay una alusión a una amenaza potencial planteada por la presencia del objetivo, o su supuesta superioridad numerica con respecto a la población nativa; 246 | 247 | - hay una sensación de insatisfacción o frustración, que también puede dar lugar a hostilidad manifiesta, debido al tratamiento privilegiado (percibido) que el gobierno le otorga al grupo objetivo; 248 | 249 | - hay referencia (explícita o implícita) a acciones violentas de cualquier tipo. 250 | 251 | Si no se cumple ninguna de estas condiciones, su respuesta será 'No'. 252 | 253 | ##Ejemplos 254 | 255 | 256 | - Tweet con odio peró no agresivo, el objectivo es un grupo:\ 257 | *Los extranjeros indeseables en España no son los que saltan la valla de Melilla. https://t.co/pDNl4MUKtZ vía @elmundoes* 258 | 259 | - Tweet con odio y agresivo, el objectivo es un grupo:\ 260 | *No soy racista pero QUE ASCO DAN LOS PUTOS MOROS. 
COMO DEPENDIESE DE MI QUE ENTRASEN EN ESPAÑA LO LLEVABAN CLARO ESTOS IMPRESENTABLES GUARROS Y MAL OLIENTES* 261 | 262 | - Tweet con odio peró no agresivo, el objectivo es una persona:\ 263 | *Pablito indocumentado de mierda* 264 | 265 | - Tweets sin odio:\ 266 | *Cientos de niños inmigrantes se le escapan a la Junta de Andalucía Al menos 400 menores sin papeles se han fugado de un albergue de Granada, como está ocurriendo en otras provincias #EleccionesYa https://t.co/CELzxrtXCk* 267 | 268 | *Ahora resulta que un sudaca va a enseñarme a mi de tés** 269 | 270 | *Es una peli de un negrata macarra matando blancos idiotas* 271 | 272 | *El juez rechaza una denuncia contra 602 inmigrantes que saltaron la valla de Ceuta https://...*\ 273 | * 274 | 275 | # 4.Spanish language against women 276 | 277 | 278 | ##Tarea 279 | 280 | 281 | * * * * * 282 | 283 | La tarea consiste en identificar si el texto mostrado puede ser considerado o no misógino. 284 | 285 | Para comprender qué es la misoginia, se sugiere la lectura de la página correspondiente de la [Wikipedia](https://es.wikipedia.org/wiki/Misoginia), de la que se extraen los siguientes fragmentos: 286 | 287 | "La misoginia se define como la aversión y también el odio hacia las mujeres o las niñas." 288 | 289 | "De acuerdo con el sociólogo Allan G. Johnson, "la misoginia es la actitud cultural de odio hacia las mujeres, por el hecho de que son mujeres". 290 | 291 | Los diccionarios definen la misoginia como el "odio a la mujer", "odio, aversión o desconfianza a la mujer", o "prejuicios arraigados contra la mujer". 292 | 293 | Si un texto se considera misógino, se debe identificar (de manera independiente): 294 | 295 | - Si el texto es agresivo, si incita al odio. Se puede considerar que un texto incita al odio si fomenta o promueve, directa o indirectamente, el odio, la hostilidad, la discriminación o la violencia contra una mujer o un grupo de mujeres, sólo por hecho de ser mujer/mujeres. 
296 | - Si el texto misógino se refiere a un individuo concreto (una mujer en particular), o contra un colectivo (todas las mujeres, mujeres de un determinado grupo, etc.). 297 | 298 | ##Ejemplos 299 | 300 | * * * * * 301 | 302 | - MISÓGINO: Cállate puta zorra y chúpame la polla. 303 | 304 | - AGRESIVO: SI, porque está insultando y denigrando gravemente a una mujer. 305 | 306 | - INDIVIDUAL, porque se lo dice a una mujer en concreto. 307 | 308 | -- 309 | 310 | 311 | 312 | - MISÓGINO: ¿Cómo le das más libertad a una mujer? Ampliándole la cocina. 313 | 314 | - AGRESIVO: NO, está haciendo un chiste. 315 | 316 | - COLECTIVO, porque se burla de todas las mujeres. 317 | 318 | -- 319 | 320 | - NO MISÓGINO: Mi novia es la mujer más hermosa que he visto en mi perra vida. 321 | 322 | -- 323 | 324 | - NO MISÓGINO: No se puede minimizar o relativizar la violación o el acoso sexual. 325 | 326 | -- 327 | 328 | - NO MISÓGINO: No entiendo a las mujeres que defienden el machismo. 329 | 330 | -- 331 | 332 | - NO MISÓGINO: Me cago en mi puta vida. -------------------------------------------------------------------------------- /competition.yaml: -------------------------------------------------------------------------------- 1 | admin_names: boscoc,dnozza,msang,valeriob,vpatti 2 | allow_public_submissions: true 3 | allow_teams: true 4 | anonymous_leaderboard: false 5 | description: Given a tweet and a target (Woman or Immigrant), determine the presence 6 | of hate speech and whether it is against an individual or a group. 
7 | disallow_leaderboard_modifying: true 8 | enable_detailed_results: true 9 | enable_forum: true 10 | enable_per_submission_metadata: false 11 | force_submission_to_leaderboard: true 12 | has_registration: true 13 | html: 14 | data: data.html 15 | evaluation: evaluation.html 16 | overview: overview.html 17 | terms: terms.html 18 | image: logohateval.png 19 | 20 | leaderboard: 21 | leaderboards: 22 | RESULTS: &RESULTS 23 | label: Results 24 | rank: 1 25 | columns: 26 | taskA_fscore: 27 | label: F1 28 | leaderboard: *RESULTS 29 | rank: 4 30 | taskA_precision: 31 | label: P 32 | leaderboard: *RESULTS 33 | rank: 2 34 | taskA_recall: 35 | label: R 36 | leaderboard: *RESULTS 37 | rank: 3 38 | taskA_accuracy: 39 | label: Acc 40 | leaderboard: *RESULTS 41 | rank: 1 42 | taskB_emr: 43 | label: EMR 44 | leaderboard: *RESULTS 45 | rank: 5 46 | taskB_fscore_macro: 47 | label: F 48 | leaderboard: *RESULTS 49 | rank: 6 50 | 51 | phases: 52 | 1: 53 | color: grey 54 | description: 'Trial data available.' 
55 | is_scoring_only: false 56 | label: Trial 57 | max_submissions: 999 58 | max_submissions_per_day: 999 59 | phasenumber: 1 60 | public_data: misogyny-trial.zip 61 | reference_data: misogyny-trial.zip 62 | scoring_program: scoring_program_en_a.zip 63 | start_date: 2018-08-20 00:00:00+00:00 64 | starting_kit: starting_kit.zip 65 | 2: 66 | color: green 67 | description: 'English dataset for task A available for training' 68 | is_scoring_only: false 69 | label: Practice-English-A 70 | max_submissions: 999 71 | max_submissions_per_day: 999 72 | phasenumber: 2 73 | public_data: misogyny-trial.zip 74 | reference_data: prova_EN_A.zip 75 | scoring_program: scoring_program_en_a.zip 76 | start_date: 2018-09-17 00:00:00+00:00 77 | 3: 78 | color: red 79 | description: 'Spanish dataset for task A available for training' 80 | is_scoring_only: false 81 | label: Practice-Spanish-A 82 | max_submissions: 999 83 | max_submissions_per_day: 999 84 | phasenumber: 3 85 | public_data: misogyny-trial.zip 86 | reference_data: prova_ES_A.zip 87 | scoring_program: scoring_program_es_a.zip 88 | start_date: 2018-09-17 00:00:00+00:00 89 | 4: 90 | color: purple 91 | description: 'English dataset for task B available for training' 92 | is_scoring_only: false 93 | label: Practice-English-B 94 | max_submissions: 999 95 | max_submissions_per_day: 999 96 | phasenumber: 4 97 | public_data: misogyny-trial.zip 98 | reference_data: prova_EN_B.zip 99 | scoring_program: scoring_program_en_b.zip 100 | start_date: 2018-09-17 00:00:00+00:00 101 | 5: 102 | color: blue 103 | description: 'Spanish dataset for task B available for training' 104 | is_scoring_only: false 105 | label: Practice-Spanish-B 106 | max_submissions: 999 107 | max_submissions_per_day: 999 108 | phasenumber: 5 109 | public_data: misogyny-trial.zip 110 | reference_data: prova_ES_B.zip 111 | scoring_program: scoring_program_es_b.zip 112 | start_date: 2018-09-17 00:00:00+00:00 113 | 6: 114 | auto_migration: false 115 | color: green 116 | 
description: 'Runs submitted to the English task A by participants are evaluated. A single submission is allowed for each participant and for each task.' 117 | execution_time_limit: 500 118 | is_scoring_only: false 119 | label: Evaluation-English-A 120 | max_submissions: 2 121 | max_submissions_per_day: 2 122 | phasenumber: 6 123 | reference_data: prova_EN_A.zip 124 | scoring_program: scoring_program_en_a.zip 125 | start_date: 2019-01-10 00:00:00+00:00 126 | leaderboard_management_mode: hide_results 127 | 7: 128 | auto_migration: false 129 | color: red 130 | description: 'Runs submitted to the Spanish task A by participants are evaluated. A single submission is allowed for each participant and for each task.' 131 | execution_time_limit: 500 132 | is_scoring_only: false 133 | label: Evaluation-Spanish-A 134 | max_submissions: 2 135 | max_submissions_per_day: 2 136 | phasenumber: 7 137 | reference_data: prova_ES_A.zip 138 | scoring_program: scoring_program_es_a.zip 139 | start_date: 2019-01-10 00:00:00+00:00 140 | leaderboard_management_mode: hide_results 141 | 8: 142 | auto_migration: false 143 | color: purple 144 | description: 'Runs submitted to the English task B by participants are evaluated. A single submission is allowed for each participant and for each task.' 145 | execution_time_limit: 500 146 | is_scoring_only: false 147 | label: Evaluation-English-B 148 | max_submissions: 2 149 | max_submissions_per_day: 2 150 | phasenumber: 8 151 | reference_data: prova_EN_B.zip 152 | scoring_program: scoring_program_en_b.zip 153 | start_date: 2019-01-10 00:00:00+00:00 154 | leaderboard_management_mode: hide_results 155 | 9: 156 | auto_migration: false 157 | color: blue 158 | description: 'Runs submitted to the Spanish task B by participants are evaluated. A single submission is allowed for each participant and for each task.' 
159 | execution_time_limit: 500 160 | is_scoring_only: false 161 | label: Evaluation-Spanish-B 162 | max_submissions: 2 163 | max_submissions_per_day: 2 164 | phasenumber: 9 165 | reference_data: prova_ES_B.zip 166 | scoring_program: scoring_program_es_b.zip 167 | start_date: 2019-01-10 00:00:00+00:00 168 | leaderboard_management_mode: hide_results 169 | show_datasets_from_yaml: true 170 | title: SemEval 2019 Task 5 - Shared Task on Multilingual Detection of Hate 171 | -------------------------------------------------------------------------------- /data.html: -------------------------------------------------------------------------------- 1 |
Data
2 |All data for the competition are collected from Twitter and manually annotated, mainly via the Figure Eight crowdsourcing platform. They are organized in two datasets released specifically for the competition, one per language (English and Spanish), each containing tweets about hate against women and immigrants.
4 | 5 |A sample of each dataset is made available to participants from 08-20-2018, during the 'Practice' phase.
6 |According to the needs of the task and its subtasks, each dataset will include the following for each tweet:
8 |An annotated tweet is a tab-separated line with the following pattern:
16 |17 |19 |id[tab]text[tab]HS[tab]TR[tab]AG
18 |
where 'id' is a progressive number identifying the tweet and 'text' is the text of the tweet,
while the remaining fields (given in the trial and training data, and to be predicted for the test data) are: Hate Speech (HS), which marks the tweet as hateful (1) or not (0); Target Range (TR), which marks the target as a whole group (0) or a single individual (1); and Aggressiveness (AG), which is absent (0) or present (1). An example annotation follows:
42648663[tab]USER_NAME Stupid ugly cunt who needs to die[tab]1[tab]1[tab]121 |
Note that aggressiveness is not a necessary characteristic of hateful texts: some texts express hate against a target through disrespect, without using aggressive language.
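The annotation pattern described above can be read with a few lines of Python. This is only a minimal sketch: the `AnnotatedTweet` type and the `parse_annotated_line` function are hypothetical names introduced for illustration, and the sketch assumes that every line has exactly five tab-separated fields and that the tweet text itself contains no tab character.

```python
from typing import NamedTuple


class AnnotatedTweet(NamedTuple):
    """One annotated tweet: id, text, and the three binary labels."""
    tweet_id: str
    text: str
    hs: int  # Hate Speech: 1 = hateful, 0 = not hateful
    tr: int  # Target Range: 1 = single individual, 0 = whole group
    ag: int  # Aggressiveness: 1 = present, 0 = absent


def parse_annotated_line(line: str) -> AnnotatedTweet:
    """Parse one id[tab]text[tab]HS[tab]TR[tab]AG line (hypothetical helper)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 5:
        raise ValueError(f"expected 5 tab-separated fields, got {len(fields)}")
    tweet_id, text, *raw_labels = fields
    labels = []
    for name, value in zip(("HS", "TR", "AG"), raw_labels):
        if value not in ("0", "1"):
            raise ValueError(f"{name} must be 0 or 1, got {value!r}")
        labels.append(int(value))
    return AnnotatedTweet(tweet_id, text, *labels)


if __name__ == "__main__":
    # Placeholder text, not taken from the dataset.
    print(parse_annotated_line("101\tsome placeholder text\t1\t0\t1"))
```

Rejecting any label other than 0 or 1 up front makes malformed rows fail loudly instead of silently skewing later scores.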
22 |The evaluation script takes a single prediction file as input, which MUST be a .tsv file structured as follows:
24 | 25 |id[tab]{0|1}
26 |e.g.
27 |101[tab]1
28 |102[tab]0
29 |103[tab]1
30 | 31 |id[tab]{0|1}[tab]{0|1}[tab]{0|1}
32 |e.g.
33 |101[tab]1[tab]1[tab]1
34 |102[tab]0[tab]0[tab]0
35 |103[tab]1[tab]1[tab]0
36 |104[tab]1[tab]0[tab]0
37 |105[tab]1[tab]0[tab]1
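A prediction file in either of the two layouts shown above could be produced as follows. This is a sketch under stated assumptions: `format_predictions` is a hypothetical helper, the one-label layout is taken to be task A and the three-label layout task B (in the HS, TR, AG order of the annotation pattern), and no header row is written because the examples show none.

```python
import csv
import io


def format_predictions(rows):
    """Return TSV text for (id, label) or (id, hs, tr, ag) tuples.

    Hypothetical helper: it accepts one label per row (assumed task A)
    or three labels per row (assumed task B), each label being 0 or 1.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for row in rows:
        tweet_id, *labels = row
        if len(labels) not in (1, 3):
            raise ValueError(f"expected 1 or 3 labels, got {len(labels)}")
        if any(label not in (0, 1) for label in labels):
            raise ValueError(f"labels must be 0 or 1, got {labels}")
        writer.writerow([tweet_id, *labels])
    return buffer.getvalue()


if __name__ == "__main__":
    # Same ids and labels as the first example above.
    print(format_predictions([(101, 1), (102, 0), (103, 1)]), end="")
```

The returned string can then be written to a `.tsv` file and zip-compressed before upload.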
38 |When submitting predictions to the task page on CodaLab, upload a single zip-compressed file for each task, named according to the language and task the predictions are submitted for, hence:
41 |48 | -------------------------------------------------------------------------------- /evaluation.html: -------------------------------------------------------------------------------- 1 |
Evaluation
2 |Different strategies and metrics are applied to evaluate the results of tasks A and B, in order to allow for more fine-grained scores.
3 |TASK A.
Systems will be evaluated using standard evaluation metrics, including accuracy, precision, recall and F1-score. The submissions will be ranked by F1-score.
The metrics will be computed as follows:
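As an illustration of the four task A metrics, the sketch below uses their standard textbook definitions for a binary label. The function name `binary_metrics` is hypothetical, and the sketch scores the positive class (HS = 1) only; it is an assumption that this matches the official script, which may average over both classes instead.

```python
def binary_metrics(gold, predicted):
    """Accuracy, precision, recall and F1 for 0/1 labels (positive class = 1)."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("gold and predicted must be non-empty and of equal length")
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == 1 and p == 1)  # true positives
    fp = sum(1 for g, p in pairs if g == 0 and p == 1)  # false positives
    fn = sum(1 for g, p in pairs if g == 1 and p == 0)  # false negatives
    tn = sum(1 for g, p in pairs if g == 0 and p == 0)  # true negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    print(binary_metrics([1, 0, 1, 1, 0], [1, 1, 0, 1, 0]))
```

Returning 0.0 when a denominator is zero is a design choice of this sketch, made so that a system predicting no positives still receives a defined score.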
Systems will be evaluated on the basis of two criteria: partial match and exact match.
13 |20 |
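The two task B criteria can be sketched as follows. The details are assumptions, since they are not spelled out here: exact match is taken to be the share of tweets for which all three labels (HS, TR, AG) are predicted correctly, and partial match is taken to be the F1-score computed separately for each label and then averaged, in line with the `taskB_emr` and `taskB_fscore_macro` leaderboard columns in competition.yaml. All function names are hypothetical.

```python
def f1_positive(gold, predicted):
    """F1 of the positive class (label 1) for two equal-length 0/1 lists."""
    tp = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0


def _check(gold_rows, pred_rows):
    if len(gold_rows) != len(pred_rows) or not gold_rows:
        raise ValueError("inputs must be non-empty and of equal length")


def exact_match_ratio(gold_rows, pred_rows):
    """Share of tweets whose (HS, TR, AG) triple is predicted exactly."""
    _check(gold_rows, pred_rows)
    matches = sum(1 for g, p in zip(gold_rows, pred_rows) if tuple(g) == tuple(p))
    return matches / len(gold_rows)


def partial_match_f1(gold_rows, pred_rows):
    """Average of the per-label F1 scores over HS, TR and AG (assumed definition)."""
    _check(gold_rows, pred_rows)
    scores = [
        f1_positive([g[i] for g in gold_rows], [p[i] for p in pred_rows])
        for i in range(3)
    ]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    gold = [(1, 1, 1), (0, 0, 0), (1, 0, 1)]
    pred = [(1, 1, 1), (0, 0, 0), (1, 0, 0)]
    print(exact_match_ratio(gold, pred), partial_match_f1(gold, pred))
```

Exact match is the stricter of the two: a tweet with a single wrong label counts as a full miss there, while it only lowers one of the three per-label scores under partial match.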
The evaluation script is available in this GitHub repository: https://github.com/msang/hateval/tree/master/SemEval2019-Task5/evaluation
During the 'Practice' phase, prediction files submitted by participants to the task page will be evaluated for task A only, and for demonstration purposes only; participants who also wish to test the script on prediction files for task B can use the version available in the GitHub repository.
24 |For the 'Development' and 'Evaluation' phases, the script will provide a complete evaluation for each language and task for any submitted file, provided that the file meets the submission requirements (see Data).
-------------------------------------------------------------------------------- /keyword_set.md: -------------------------------------------------------------------------------- 1 | # Keyword set 2 | 3 | The sets of keywords used to filter Twitter streams for each target and language are reported below. 4 | Note that when an item is composed of more than one keyword, all of its keywords must appear in a tweet for the tweet to be selected. 5 | 6 | ## 1. English language against immigrants 7 | --- 8 | 9 | - migrant 10 | - refugee 11 | - rapefugee 12 | - golliwog 13 | - sendthemback 14 | - whitepride 15 | - whitegenocide 16 | - whitepower 17 | - whiteresistance 18 | - whiterevolution 19 | - sendthemhome 20 | - stoptheinvasion 21 | - stopimmigration 22 | - rang head 23 | - nomorerefugees 24 | - nomoremigrant 25 | - refugeesnotwelcome 26 | - migrants go home 27 | - refugees go home 28 | - buildthatwall 29 | - endchainmigration 30 | - stopdaca 31 | - nodaca 32 | - enddaca 33 | - deportthemall 34 | - illegalaliens 35 | - illegalimmigr 36 | 37 | ## 2. 
English language against women 38 | --- 39 | - women are inferior to men 40 | - womenareinferiortomen 41 | - stay in the kitchen 42 | - sandwich maker 43 | - sandwichmaker 44 | - getbackinthekitchen 45 | - makemeasandwich 46 | - feminismiscancer bitch 47 | - feminismiscancer women 48 | - feminismiscancer you 49 | - notallmen women 50 | - notallmen woman 51 | - notallmen you 52 | - bitch cunt 53 | - bitch kunt 54 | - bitch whore 55 | - bitch women 56 | - whore cunt 57 | - whore women 58 | - kunt you 59 | - cunt kunt 60 | - cunt women 61 | - cunt woman 62 | - slut cunt 63 | - slut kunt 64 | - slut bitch 65 | - slut whore 66 | - not all men cunt 67 | - not all men bitch 68 | - not all men women 69 | - not all men woman 70 | - not all men girl 71 | - not all men you 72 | - not all men stupid 73 | - not all men pussy 74 | - skank cunt 75 | - skank bitch 76 | - skank whore 77 | - skank women 78 | - skank woman 79 | - skank girl 80 | - skank girlfriend 81 | - skank you 82 | - skank stupid 83 | - skank pussy 84 | - skank your ass 85 | - hysterical cunt 86 | - hysterical bitch 87 | - hysterical women 88 | - hysterical woman 89 | - hysterical girl 90 | - hysterical girlfriend 91 | - hysterical you 92 | - hysterical stupid 93 | - hysterical pussy 94 | - hysterical your ass 95 | - rape cunt 96 | - rape bitch 97 | - rape whore 98 | - rape women 99 | - rape woman 100 | - hole cunt 101 | - hole bitch 102 | - hole whore 103 | - hole women 104 | - hoe cunt 105 | - hoe bitch 106 | - hoe whore 107 | - hoe women 108 | - you pussy 109 | 110 | ## 3. 
Spanish language against immigrants 111 | --- 112 | 113 | - valla ceuta 114 | - concertinas 115 | - valla melilla 116 | - moro 117 | - sudaca 118 | - negrata 119 | - islam escuelas 120 | - inmigra 121 | - refugiad 122 | - sudaka 123 | - sudamericano 124 | - arabe 125 | - subsahariano 126 | - clandestino 127 | - sin papel 128 | - indocumentado 129 | - espalda mojada 130 | - catalufo 131 | - franchute 132 | - gabach 133 | - gachup 134 | - guer 135 | - machupichu 136 | - mayate 137 | - moromierda 138 | - panchito 139 | - putos moro 140 | - frijoler 141 | 142 | ## 4. Spanish language against women 143 | --- 144 | 145 | - perra 146 | - puta 147 | - zorra 148 | - guarra 149 | - coño 150 | - escoria 151 | - polla 152 | - eres una imbécil 153 | - sirvienta 154 | - violación 155 | - acoso 156 | - te pego 157 | - pegarte 158 | - te follo 159 | - follarte 160 | - comemela 161 | - con franco eso no pasaba 162 | - las mujeres son inferiores que los hombres 163 | - a fregar puta 164 | - callate 165 | - que te calles 166 | - arandina 167 | - lamanada 168 | - foreversirvienta 169 | - huelesaindigena 170 | - eresputa 171 | - ereszorra 172 | - huelesasirvienta 173 | - frescaderaja 174 | - callatesirvienta 175 | - noesviolacion 176 | - vuelvealacocina 177 | - mujeresinferiores 178 | - todasputas 179 | - notodosloshombres 180 | - machismo100x100 181 | - feminazis 182 | 183 | -------------------------------------------------------------------------------- /logohateval.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msang/hateval/edeecf5efae3884cf1889b4298101d1e52c99efb/logohateval.png -------------------------------------------------------------------------------- /overview.html: -------------------------------------------------------------------------------- 1 |Multilingual detection of hate speech against immigrants and women in Twitter (hatEval)
2 |Hate Speech is commonly defined as any communication that disparages a person or a group on the basis of some characteristic such as race, color, ethnicity, gender, sexual orientation, nationality, religion, or other characteristics. Given the huge amount of user-generated content on the Web, and in particular on social media, the problem of detecting Hate Speech, and thereby possibly limiting its diffusion, is becoming fundamental, for instance in fighting misogyny and xenophobia.
3 |The proposed task consists of Hate Speech detection on Twitter, focused on two specific targets, immigrants and women, in a multilingual perspective, for Spanish and English.
The task is articulated around two related subtasks for each of the involved languages: a basic task about Hate Speech detection, and a second one in which fine-grained features of hateful content are investigated, in order to understand how existing approaches deal with the identification of especially dangerous forms of hate, i.e. those where the incitement is against an individual rather than a group of people, and where aggressive behavior by the author can be identified as a prominent feature of the expression of hate. Participants will be asked to identify, on the one hand, whether the target of hate is a single person or a group of persons and, on the other hand, whether the message author intends to be aggressive, harmful, or even to incite, in various forms, violent acts against the target.
Important dates
9 |Join the hatEval mailing group: semeval2019-task5-hateval[at]googlegroups.com
20 |Organizers:
21 |32 |
-------------------------------------------------------------------------------- /public_trial.zip: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msang/hateval/edeecf5efae3884cf1889b4298101d1e52c99efb/public_trial.zip -------------------------------------------------------------------------------- /reference_test_en.zip: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msang/hateval/edeecf5efae3884cf1889b4298101d1e52c99efb/reference_test_en.zip -------------------------------------------------------------------------------- /reference_test_es.zip: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msang/hateval/edeecf5efae3884cf1889b4298101d1e52c99efb/reference_test_es.zip -------------------------------------------------------------------------------- /reference_trial.zip: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msang/hateval/edeecf5efae3884cf1889b4298101d1e52c99efb/reference_trial.zip -------------------------------------------------------------------------------- /scoring_program.zip: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msang/hateval/edeecf5efae3884cf1889b4298101d1e52c99efb/scoring_program.zip -------------------------------------------------------------------------------- /terms.html: -------------------------------------------------------------------------------- 1 |
Terms and conditions
2 | 3 | 4 | 5 |
6 | By submitting results to this competition, you consent to the public release of your scores at the SemEval-2019 workshop and in the associated proceedings, at the task organizers' discretion. Scores may include, but are not limited to, automatic and manual quantitative judgements, qualitative judgements, and such other metrics as the task organizers see fit. You accept that the ultimate decision of metric choice and score value is that of the task organizers.
7 |
8 | You further agree that the task organizers are under no obligation to release scores and that scores may be withheld if it is the task organizers' judgement that the submission was incomplete, erroneous, deceptive, or violated the letter or spirit of the competition's rules. Inclusion of a submission's scores is not an endorsement of a team or individual's submission, system, or science.
9 |
10 | You further agree that your system may be named according to the team name provided at the time of submission, or to a suitable shorthand as determined by the task organizers.
11 |
12 | You agree not to redistribute the test data except in the manner prescribed by its licence.