├── Guidelines.pdf
├── corpus
    ├── news.zip
    └── travel.zip
├── corpus-BIO_format
    └── .DS_Store
└── README.md


/Guidelines.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dhfbk/Histo/HEAD/Guidelines.pdf


--------------------------------------------------------------------------------
/corpus/news.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dhfbk/Histo/HEAD/corpus/news.zip


--------------------------------------------------------------------------------
/corpus/travel.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dhfbk/Histo/HEAD/corpus/travel.zip


--------------------------------------------------------------------------------
/corpus-BIO_format/.DS_Store:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dhfbk/Histo/HEAD/corpus-BIO_format/.DS_Store


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# READ ME

This repository contains:
- annotation guidelines for detecting and classifying event mentions in texts
- a corpus of historical texts annotated with events (span + class) following these guidelines
- the corpus in BIO/IOB2 format, to be used with the BiLSTM implementation by Nils Reimers and Iryna Gurevych

Due to space limitations, the following resources are hosted in an external Google Drive folder (https://drive.google.com/open?id=1HVIZpCmei90tE2hMWIyH-b7_hhHUKnmb):
- a set of word embeddings pre-trained on a part of the COHA corpus (https://corpus.byu.edu/coha/) made of texts published between 1860 and 1939, for a total of more than 198 million tokens;
- the best models for the automatic detection of events and for the joint classification of event extent and type, developed with the BiLSTM implementation by Nils Reimers and Iryna Gurevych (https://github.com/UKPLab/emnlp2017-bilstm-cnn-crf)

--------------------------------------------------------------------------------
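
To illustrate the BIO/IOB2 format mentioned above, here is a minimal Python sketch that reads IOB2-tagged text and recovers event spans. It is not part of this repository. It assumes one token per line, whitespace-separated columns with the token first and the tag last, and blank lines between sentences; the actual column layout of the files in `corpus-BIO_format` may differ. The `EVENT` label and the sample sentences are made up for illustration; the real event classes are defined in `Guidelines.pdf`.

```python
from typing import Iterable, List, Tuple

Sentence = List[Tuple[str, str]]   # (token, tag) pairs
Span = Tuple[int, int, str]        # (start, end_exclusive, label)


def read_iob2(lines: Iterable[str]) -> List[Sentence]:
    """Group IOB2 lines into sentences (assumed layout: token ... tag)."""
    sentences: List[Sentence] = []
    current: Sentence = []
    for raw in lines:
        line = raw.strip()
        if not line:
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        columns = line.split()
        current.append((columns[0], columns[-1]))
    if current:
        sentences.append(current)
    return sentences


def extract_spans(sentence: Sentence) -> List[Span]:
    """Turn B-/I- tags into (start, end_exclusive, label) spans."""
    spans: List[Span] = []
    start, label = None, None
    for i, (_, tag) in enumerate(sentence):
        # An I- tag with the same label continues the open span.
        if tag.startswith("I-") and label == tag[2:]:
            continue
        # Anything else closes the open span, if there is one.
        if label is not None:
            spans.append((start, i, label))
            start, label = None, None
        # A B- tag opens a new span; stray I- tags are ignored.
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    if label is not None:
        spans.append((start, len(sentence), label))
    return spans


# Made-up sample with a hypothetical EVENT label.
sample = """The O
army O
marched B-EVENT
out I-EVENT
and O
attacked B-EVENT

We O
arrived B-EVENT
"""

sentences = read_iob2(sample.splitlines())
for sentence in sentences:
    tokens = [token for token, _ in sentence]
    for start, end, label in extract_spans(sentence):
        print(label, " ".join(tokens[start:end]))
```

Spans are returned as token offsets with an exclusive end, so `tokens[start:end]` yields the mention text directly.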