├── frontpage.png
├── LICENSE
└── README.md


/frontpage.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SALT-NLP/implicit-hate/HEAD/frontpage.png


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2023 SALT

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Implicit Hate Speech

_Latent Hatred: A Benchmark for Understanding Implicit Hate Speech_

[[Read the Paper]](https://aclanthology.org/2021.emnlp-main.29/) | [[Take a Survey to Access the Data]](https://forms.gle/QxCpEbVp91Z35hWFA) | [[Download the Data]](https://www.dropbox.com/s/24meryhqi1oo0xk/implicit-hate-corpus.zip?dl=0)

![frontpage](frontpage.png)

## *Why Implicit Hate?*

It is important to consider the subtle tricks that many extremists use to mask their threats and abuse. These more implicit forms of hate speech may easily go undetected by keyword detection systems, and even the most advanced architectures can fail if they have not been trained on implicit hate speech ([Caselli et al. 2020](https://aclanthology.org/2020.lrec-1.760/)).

## *Where can I download the data?*

If you have not already, please first complete a short [survey](https://forms.gle/QxCpEbVp91Z35hWFA). Then follow [this link to download](https://www.dropbox.com/s/p1ctnsg3xlnupwr/implicit-hate-corpus.zip?dl=0) (2 MB, expands to 6 MB).

## *What's 'in the box?'*

This dataset contains **22,056** tweets from the most prominent extremist groups in the United States; **6,346** of these tweets contain *implicit hate speech.* We decompose the implicit hate class using the following taxonomy (distribution shown on the left).

* (24.2%) **Grievance:** frustration over a minority group's perceived privilege.
* (20.0%) **Incitement:** implicitly promoting known hate groups and ideologies (e.g. by flaunting in-group power).
* (13.6%) **Inferiority:** implying some group or person is of lesser value than another.
* (12.6%) **Irony:** using sarcasm, humor, and satire to demean someone.
* (17.9%) **Stereotypes:** associating a group with a negative attribute using euphemisms, circumlocution, or metaphorical language.
* (10.5%) **Threats:** making an indirect commitment to attack someone's body, well-being, reputation, liberty, etc.
* (1.2%) **Other**

Each of the 6,346 implicit hate tweets also has free-text annotations for *target demographic group* and an *implied statement* to describe the underlying message (see banner image above).

## *What can I do with this data?*

State-of-the-art neural models may be able to learn from our data how to (1) classify this more difficult class of hate speech and (2) explain implicit hate by generating descriptions of both the *target* and the *implied message.* As our [paper baselines](#) show, neural models still have a ways to go, especially with classifying *implicit hate categories*, but overall, the results are promising, especially with *implied statement generation,* an admittedly challenging task.

We hope you can extend our baselines and further our efforts to understand and address some of these most pernicious forms of language that plague the web, especially among extremist groups.

## *How do I cite this work?*

**Citation:**

> ElSherief, M., Ziems, C., Muchlinski, D., Anupindi, V., Seybolt, J., De Choudhury, M., & Yang, D. (2021). Latent Hatred: A Benchmark for Understanding Implicit Hate Speech. In _Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP)_.

**BibTeX:**

```tex
@inproceedings{elsherief-etal-2021-latent,
    title = "Latent Hatred: A Benchmark for Understanding Implicit Hate Speech",
    author = "ElSherief, Mai and
      Ziems, Caleb and
      Muchlinski, David and
      Anupindi, Vaishnavi and
      Seybolt, Jordyn and
      De Choudhury, Munmun and
      Yang, Diyi",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2021",
    address = "Online and Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.emnlp-main.29",
    pages = "345--363"
}
```

--------------------------------------------------------------------------------