├── .gitignore
├── LICENSE
├── README.md
├── aggregate_tags.py
├── annotations
│   ├── D18-1062.txt
│   ├── D18-1220.txt
│   ├── D18-1276.txt
│   ├── D18-1332.txt
│   ├── D18-1494.txt
│   ├── D19-1350.txt
│   ├── D19-1555.txt
│   ├── D19-1597.txt
│   ├── N18-1009.txt
│   ├── N18-1045.txt
│   ├── N18-1158.txt
│   ├── N18-1176.txt
│   ├── N18-2031.txt
│   ├── N18-2075.txt
│   ├── N18-2097.txt
│   ├── N19-1015.txt
│   ├── N19-1071.txt
│   ├── N19-1154.txt
│   ├── N19-1157.txt
│   ├── N19-1185.txt
│   ├── N19-1329.txt
│   ├── N19-2009.txt
│   ├── P18-1089.txt
│   ├── P18-1145.txt
│   ├── P18-1192.txt
│   ├── P18-2040.txt
│   ├── P19-1009.txt
│   ├── P19-1085.txt
│   ├── P19-1113.txt
│   ├── P19-1178.txt
│   ├── P19-1201.txt
│   ├── P19-1263.txt
│   ├── P19-1286.txt
│   ├── P19-1326.txt
│   ├── P19-1336.txt
│   ├── P19-1447.txt
│   ├── P19-1511.txt
│   ├── P19-1540.txt
│   ├── P19-2032.txt
│   └── P19-2038.txt
├── concepts.md
├── draw_bar.py
├── fig
│   ├── annotations.png
│   └── auto.png
├── get_paper.py
├── requirements.txt
├── rule_classifier.py
└── template.cpt
/.gitignore:
--------------------------------------------------------------------------------
1 | acl-anthology
2 | __pycache__
3 | .idea
4 | papers/
5 | auto/
6 | auto.tsv
7 | annotations.tsv
8 | *.swp
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | Copyright 2020 Graham Neubig
2 |
3 | Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
4 |
5 | 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
6 |
7 | 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
8 |
9 | 3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
10 |
11 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Concepts in Neural Networks for NLP
2 | by [Graham Neubig](http://phontron.com), [Pengfei Liu](http://pfliu.com), and other contributors
3 |
4 | This is a repository that attempts to empirically take stock of the **most important concepts necessary to
5 | understand cutting-edge research in neural network models for NLP**. You can look at the two figures below, generated
6 | automatically and through manual annotation, to see which of these topics are most common in current NLP research
7 | (based on recent papers in the [ACL anthology](http://aclanthology.info)). See the [list of concepts](concepts.md)
8 | for a more complete description of what each of the tags below means, and see the directions below the figures if you want to contribute
9 | to the project, or to generate the graphs yourself.
10 |
11 | ![Manually Annotated Concepts in Neural Nets for NLP](fig/annotations.png)
12 |
13 | ![Automatically Annotated Concepts in Neural Nets for NLP](fig/auto.png)
14 |
15 | ## Contributing
16 |
17 | There are several ways to contribute:
18 | * **Perform annotation:** We could use more manual annotations of the concepts covered by papers, so please take a look
19 | below if you're interested in helping out in this regard.
20 | * **Improve the code:** Check the "issues" section of the GitHub repo; there are still several things that could be done
21 | to improve the functionality of the code.
22 |
23 | ## Setup
24 |
25 | If you want to run the code here, first get the requirements:
26 |
27 |     pip install -r requirements.txt
28 |
29 | Also, install `poppler` through your favorite package manager.
30 |
31 | Then download the ACL Anthology github repository:
32 |
33 |     git clone https://github.com/acl-org/acl-anthology.git
34 |
35 | ## How to Perform Annotation
36 |
37 | 1. Read `concepts.md` to learn more about the concepts that are annotated here.
38 | 2. Run `get_paper.py` (directions below) to get a paper to annotate with the concepts contained therein.
39 | 3. When the paper is downloaded, a text file corresponding to the paper ID will be written out to `auto/ID.txt`. This
40 | will include some comments with the paper title, PDF location, etc. In addition, it will have some
41 | automatically provided concept tags that were estimated based on the article text.
42 | 4. Manually confirm that each automatically annotated concept is correct. If it is, delete the corresponding comment
43 | line starting with "# CHECK:". If a concept is not actually covered by the paper, delete the concept tag itself.
44 | 5. Add any concepts that were not caught by the automatic process.
45 | 6. Once you are done annotating concepts, move `auto/ID.txt` to `annotations/ID.txt`. You can then send a pull request
46 | to the repository to contribute back.
47 |
48 | **Examples Running `get_paper.py`**
49 |
50 | Get random papers to annotate:
51 | * 1: `python get_paper.py --years 18-19 --confs "P,N,D" --n_sample 2 --template template.cpt --feature fulltext`
52 |
53 | Get a specific paper to annotate:
54 | * 2: `python get_paper.py --paper_id P19-1032 --template template.cpt --feature fulltext`
55 |
56 | where:
57 | * `paper_id`: usually takes the form P|N|D-1234. Once `paper_id` is specified, `years`, `confs`, and `n_sample` are not required.
58 | * `confs`: a comma-separated list of conference abbreviations from which papers can be selected (P,N,D)
59 | * `n_sample`: the number of papers to sample if `paper_id` is not specified
60 | * `template`: the concept template file (e.g. template.cpt)
61 | * `feature`: which part of the paper is used for classification (e.g. fulltext or title)
62 |
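For reference, the `auto/ID.txt` file produced in step 3 looks roughly like the following (a hypothetical example; the actual paper ID, tags, and justifications depend on the paper):

    # Title: Some Example Paper Title
    # Online location: https://www.aclweb.org/anthology/X00-0000.pdf
    # CHECK: confidence=0.9, justification=Matched regex Adam
    optim-adam
    # CHECK: confidence=0.9, justification=Matched regex attention
    arch-att

If both tags are indeed covered by the paper, you would delete the two "# CHECK:" lines, add any concepts the automatic process missed, and move the file to `annotations/`.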
63 | ## Generating Aggregate Statistics/Charts
64 |
65 | To generate bar charts like the ones above, use `aggregate_tags.py` to calculate aggregate statistics, then
66 | `draw_bar.py` to generate the bar chart: `aggregate_tags.py` writes a two-column TSV mapping each tag to its count,
67 | and `draw_bar.py` renders that TSV as a bar chart.
68 |
69 | If you want to do this for the hand-annotated concepts included in this repository, you can run the following commands.
70 |
71 |     python aggregate_tags.py annotations --concepts concepts.md > annotations.tsv
72 |     python draw_bar.py --tsv annotations.tsv --fig annotations.png
73 |
74 | If you want to do this for automatically-generated tags from some portion of the ACL anthology, you can run the
75 | following commands, which will download the papers from all the specified conferences (e.g. ACL, NAACL, and EMNLP from
76 | 2018 and 2019), tag them automatically, and then aggregate and plot the results.
77 |
78 |     python get_paper.py --years 18-19 --confs "P,N,D" --n_sample all --template template.cpt --feature fulltext
79 |     python aggregate_tags.py auto --concepts concepts.md > auto.tsv
80 |     python draw_bar.py --tsv auto.tsv --fig auto.png
81 |
82 | ## Acknowledgements
83 |
84 | Thanks to Yoav Goldberg and the [TAs for CS11-747](http://phontron.com/class/nn4nlp2020/description.html) who gave feedback on the categorization of concepts. Also, thanks, of course, to everyone who has contributed or will contribute.
--------------------------------------------------------------------------------
/aggregate_tags.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import os
3 | import re
4 | from collections import defaultdict
5 |
6 | def get_all_files(f):
7 |     if os.path.isfile(f):
8 |         return [f]
9 |     elif os.path.isdir(f):
10 |         ret = []
11 |         for x in os.listdir(f):
12 |             ret += get_all_files(os.path.join(f,x))
13 |         return ret
14 |     else:
15 |         raise ValueError(f'Could not find file {f}')
16 |
17 | def get_tags(f, legal_tags = None):
18 |     with open(f, 'r') as fin:
19 |         for line in fin:
20 |             line = line.strip()
21 |             if line and not line.startswith("#"):  # skip blank lines and comments
22 |                 if legal_tags and line not in legal_tags:
23 |                     raise ValueError(f'Illegal tag found in file {f}, {line}')
24 |                 yield line
25 |
26 | def parse_concepts(f):
27 |     concepts = {'not-neural': 1}
28 |     concept_re = re.compile(r': \[?`([a-z0-9-]+)`')
29 |     with open(f, 'r') as fin:
30 |         for line in fin:
31 |             m = re.search(concept_re, line)
32 |             if m:
33 |                 val = m.group(1)
34 |                 concepts[val] = 1
35 |     return concepts
36 |
37 | if __name__ == "__main__":
38 |
39 |     parser = argparse.ArgumentParser(description="Aggregate tags from several files or directories of files into a"
40 |                                                  " tag/count file for use in draw_bar.py.")
41 |
42 |     parser.add_argument("files", type=str, nargs='+',
43 |                         help="The files or directories over which you'd like to aggregate")
44 |     parser.add_argument("--concepts", type=str, default=None,
45 |                         help="The concepts.md file, which is parsed to find a list of legal tags")
46 |
47 |     args = parser.parse_args()
48 |
49 |     fs = []
50 |     for f in args.files:
51 |         fs += get_all_files(f)
52 |
53 |     legal_concepts = parse_concepts(args.concepts) if args.concepts else None
54 |
55 |     all_tags = defaultdict(lambda: 0)
56 |     for f in fs:
57 |         for tag in get_tags(f, legal_tags=legal_concepts):
58 |             all_tags[tag] += 1
59 |
60 |     for k, v in sorted(list(all_tags.items()), key=lambda x: -x[1]):
61 |         print(f'{k}\t{v}')
62 |
--------------------------------------------------------------------------------
/annotations/D18-1062.txt:
--------------------------------------------------------------------------------
1 | # Title: Unsupervised Bilingual Lexicon Induction via Latent Variable Models
2 | # Online location: https://www.aclweb.org/anthology/D18-1062.pdf
3 | train-mll
4 | adv-train
5 | latent-vae
6 | task-lexicon
--------------------------------------------------------------------------------
/annotations/D18-1220.txt:
-------------------------------------------------------------------------------- 1 | # Title: A Knowledge Hunting Framework for Common Sense Reasoning 2 | # Online location: https://www.aclweb.org/anthology/D18-1220.pdf 3 | not-neural 4 | -------------------------------------------------------------------------------- /annotations/D18-1276.txt: -------------------------------------------------------------------------------- 1 | # Title: Free as in Free Word Order: An Energy Based Model for Word Segmentation and Morphological Tagging in Sanskrit 2 | # Online location: https://www.aclweb.org/anthology/D18-1276.pdf 3 | init-glorot 4 | arch-lstm 5 | arch-att 6 | arch-subword 7 | arch-gnn 8 | arch-energy 9 | search-beam 10 | struct-crf 11 | task-seqlab 12 | task-seq2seq 13 | loss-margin 14 | -------------------------------------------------------------------------------- /annotations/D18-1332.txt: -------------------------------------------------------------------------------- 1 | # Title: Accelerating Asynchronous Stochastic Gradient Descent for Neural Machine Translation 2 | # Online location: https://www.aclweb.org/anthology/D18-1332.pdf 3 | train-parallel 4 | optim-adam 5 | norm-gradient 6 | arch-gru 7 | task-seq2seq 8 | -------------------------------------------------------------------------------- /annotations/D18-1494.txt: -------------------------------------------------------------------------------- 1 | # Title: Siamese Network-Based Supervised Topic Modeling 2 | # Online location: https://www.aclweb.org/anthology/D18-1494.pdf 3 | reg-stopping 4 | arch-subword 5 | pre-word2vec 6 | latent-topic 7 | task-textclass 8 | -------------------------------------------------------------------------------- /annotations/D19-1350.txt: -------------------------------------------------------------------------------- 1 | # Title: Neural Topic Model with Reinforcement Learning 2 | # Online location: https://www.aclweb.org/anthology/D19-1350.pdf 3 | latent-vae 4 | nondif-reinforce 5 | optim-adam 6 | -------------------------------------------------------------------------------- /annotations/D19-1555.txt: -------------------------------------------------------------------------------- 1 | # Title: Leveraging Structural and Semantic Correspondence for Attribute-Oriented Aspect Sentiment Discovery 2 | # Online location: https://www.aclweb.org/anthology/D19-1555.pdf 3 | # CHECK: confidence=0.9, justification=Matched regex attention 4 | arch-att 5 | pre-word2vec 6 | pre-use 7 | -------------------------------------------------------------------------------- /annotations/D19-1597.txt: -------------------------------------------------------------------------------- 1 | # Title: GeoSQA: A Benchmark for Scenario-based Question Answering in the Geography Domain at High School Level 2 | # Online location: https://www.aclweb.org/anthology/D19-1597.pdf 3 | not-neural 4 | -------------------------------------------------------------------------------- /annotations/N18-1009.txt: -------------------------------------------------------------------------------- 1 | # Title: Please Clap: Modeling Applause in Campaign Speeches 2 | # Online location: https://www.aclweb.org/anthology/N18-1009.pdf 3 | optim-adam 4 | reg-dropout 5 | arch-lstm 6 | arch-cnn 7 | task-textclass 8 | pre-skipthought 9 | -------------------------------------------------------------------------------- /annotations/N18-1045.txt: -------------------------------------------------------------------------------- 1 | # Title: Distributional Inclusion 
Vector Embedding for Unsupervised Hypernymy Detection 2 | # Online location: https://www.aclweb.org/anthology/N18-1045.pdf 3 | pre-word2vec 4 | optim-adam 5 | optim-projection 6 | -------------------------------------------------------------------------------- /annotations/N18-1158.txt: -------------------------------------------------------------------------------- 1 | # Title: Ranking Sentences for Extractive Summarization with Reinforcement Learning 2 | # Online location: https://www.aclweb.org/anthology/N18-1158.pdf 3 | optim-adam 4 | pool-max 5 | arch-lstm 6 | arch-cnn 7 | nondif-reinforce 8 | task-extractive 9 | task-seq2seq 10 | pre-word2vec 11 | -------------------------------------------------------------------------------- /annotations/N18-1176.txt: -------------------------------------------------------------------------------- 1 | # Title: Linguistic Cues to Deception and Perceived Deception in Interview Dialogues 2 | # Online location: https://www.aclweb.org/anthology/N18-1176.pdf 3 | not-neural 4 | -------------------------------------------------------------------------------- /annotations/N18-2031.txt: -------------------------------------------------------------------------------- 1 | # Title: Frustratingly Easy Meta-Embedding – Computing Meta-Embeddings by Averaging Source Word Embeddings 2 | # Online location: https://www.aclweb.org/anthology/N18-2031.pdf 3 | pool-mean 4 | pre-glove 5 | pre-word2vec 6 | -------------------------------------------------------------------------------- /annotations/N18-2075.txt: -------------------------------------------------------------------------------- 1 | # Title: Text Segmentation as a Supervised Learning Task 2 | # Online location: https://www.aclweb.org/anthology/N18-2075.pdf 3 | pool-max 4 | arch-bilstm 5 | pre-word2vec 6 | -------------------------------------------------------------------------------- /annotations/N18-2097.txt: -------------------------------------------------------------------------------- 1 | # Title: A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents 2 | # Online location: https://www.aclweb.org/anthology/N18-2097.pdf 3 | arch-bilstm 4 | arch-att 5 | arch-copy 6 | arch-coverage 7 | search-beam 8 | optim-adagrad 9 | task-seq2seq 10 | -------------------------------------------------------------------------------- /annotations/N19-1015.txt: -------------------------------------------------------------------------------- 1 | # Title: Topic-Guided Variational Auto-Encoder for Text Generation 2 | # Online location: https://www.aclweb.org/anthology/N19-1015.pdf 3 | reg-dropout 4 | arch-gru 5 | arch-att 6 | latent-vae 7 | latent-topic 8 | task-lm 9 | task-seq2seq 10 | -------------------------------------------------------------------------------- /annotations/N19-1071.txt: -------------------------------------------------------------------------------- 1 | # Title: SEQˆ3: Differentiable Sequence-to-Sequence-to-Sequence Autoencoder for Unsupervised Abstractive Sentence Compression 2 | # Online location: https://www.aclweb.org/anthology/N19-1071.pdf 3 | optim-adam 4 | task-lm 5 | reg-dropout 6 | reg-worddropout 7 | norm-layer 8 | arch-birnn 9 | arch-lstm 10 | arch-att 11 | pre-glove 12 | nondif-reinforce 13 | adv-gan 14 | latent-vae 15 | task-seq2seq 16 | -------------------------------------------------------------------------------- /annotations/N19-1154.txt: -------------------------------------------------------------------------------- 1 | # Title: One Size Does Not Fit All: 
Comparing NMT Representations of Different Granularities 2 | # Online location: https://www.aclweb.org/anthology/N19-1154.pdf 3 | optim-sgd 4 | arch-lstm 5 | arch-att 6 | arch-subword 7 | task-seqlab 8 | task-seq2seq 9 | -------------------------------------------------------------------------------- /annotations/N19-1157.txt: -------------------------------------------------------------------------------- 1 | # Title: Quantifying the morphosyntactic content of Brown Clusters 2 | # Online location: https://www.aclweb.org/anthology/N19-1157.pdf 3 | not-neural 4 | -------------------------------------------------------------------------------- /annotations/N19-1185.txt: -------------------------------------------------------------------------------- 1 | # Title: Tweet Stance Detection Using an Attention based Neural Ensemble Model 2 | # Online location: https://www.aclweb.org/anthology/N19-1185.pdf 3 | optim-adam 4 | arch-bilstm 5 | reg-norm 6 | pool-max 7 | arch-cnn 8 | arch-att 9 | task-textclass 10 | -------------------------------------------------------------------------------- /annotations/N19-1329.txt: -------------------------------------------------------------------------------- 1 | # Title: Understanding Learning Dynamics Of Language Models with SVCCA 2 | # Online location: https://www.aclweb.org/anthology/N19-1329.pdf 3 | pre-elmo 4 | arch-lstm 5 | loss-cca 6 | loss-svd 7 | task-seqlab 8 | task-lm 9 | -------------------------------------------------------------------------------- /annotations/N19-2009.txt: -------------------------------------------------------------------------------- 1 | # Title: Multi-Modal Generative Adversarial Network for Short Product Title Generation in Mobile E-Commerce 2 | # Online location: https://www.aclweb.org/anthology/N19-2009.pdf 3 | arch-cnn 4 | arch-lstm 5 | arch-att 6 | activ-tanh 7 | adv-gan 8 | task-condlm 9 | nondif-reinforce 10 | -------------------------------------------------------------------------------- /annotations/P18-1089.txt: -------------------------------------------------------------------------------- 1 | # Title: Identifying Transferable Information Across Domains for Cross-domain Sentiment Classification 2 | # Online location: https://www.aclweb.org/anthology/P18-1089.pdf 3 | train-transfer 4 | comb-ensemble 5 | pre-word2vec 6 | task-textclass 7 | -------------------------------------------------------------------------------- /annotations/P18-1145.txt: -------------------------------------------------------------------------------- 1 | # Title: Nugget Proposal Networks for Chinese Event Detection 2 | # Online location: https://www.aclweb.org/anthology/P18-1145.pdf 3 | arch-bilstm 4 | arch-cnn 5 | arch-gating 6 | task-seqlab 7 | activ-tanh 8 | -------------------------------------------------------------------------------- /annotations/P18-1192.txt: -------------------------------------------------------------------------------- 1 | # Title: Syntax for Semantic Role Labeling, To Be, Or Not To Be 2 | # Online location: https://www.aclweb.org/anthology/P18-1192.pdf 3 | optim-adam 4 | reg-worddropout 5 | arch-bilstm 6 | arch-cnn 7 | arch-gating 8 | pre-glove 9 | pre-word2vec 10 | task-seqlab 11 | task-relation 12 | -------------------------------------------------------------------------------- /annotations/P18-2040.txt: -------------------------------------------------------------------------------- 1 | # Title: Improving Topic Quality by Promoting Named Entities in Topic Modeling 2 | # Online location: 
https://www.aclweb.org/anthology/P18-2040.pdf 3 | not-neural 4 | -------------------------------------------------------------------------------- /annotations/P19-1009.txt: -------------------------------------------------------------------------------- 1 | # Title: AMR Parsing as Sequence-to-Graph Transduction 2 | # Online location: https://www.aclweb.org/anthology/P19-1009.pdf 3 | optim-adam 4 | reg-dropout 5 | reg-stopping 6 | arch-bilinear 7 | arch-cnn 8 | task-graph 9 | norm-gradient 10 | arch-lstm 11 | pool-mean 12 | pool-max 13 | arch-att 14 | arch-copy 15 | search-greedy 16 | search-beam 17 | pre-glove 18 | pre-bert 19 | -------------------------------------------------------------------------------- /annotations/P19-1085.txt: -------------------------------------------------------------------------------- 1 | # Title: GEAR: Graph-based Evidence Aggregating and Reasoning for Fact Verification 2 | # Online location: https://www.aclweb.org/anthology/P19-1085.pdf 3 | optim-adam 4 | reg-patience 5 | arch-att 6 | arch-transformer 7 | arch-gnn 8 | activ-relu 9 | pool-max 10 | pool-mean 11 | pre-bert 12 | task-textpair 13 | # CHECK: confidence=0.9, justification=Matched regex language modeling|language model 14 | task-lm 15 | -------------------------------------------------------------------------------- /annotations/P19-1113.txt: -------------------------------------------------------------------------------- 1 | # Title: Rumor Detection by Exploiting User Credibility Information, Attention and Multi-task Learning 2 | # Online location: https://www.aclweb.org/anthology/P19-1113.pdf 3 | reg-dropout 4 | optim-adadelta 5 | arch-lstm 6 | arch-gru 7 | train-mtl 8 | arch-att 9 | task-textclass 10 | pre-word2vec 11 | -------------------------------------------------------------------------------- /annotations/P19-1178.txt: -------------------------------------------------------------------------------- 1 | # Title: Self-Supervised Neural Machine Translation 2 | # Online location: https://www.aclweb.org/anthology/P19-1178.pdf 3 | optim-noam 4 | reg-dropout 5 | reg-labelsmooth 6 | arch-lstm 7 | arch-transformer 8 | loss-margin 9 | task-lm 10 | task-seq2seq 11 | train-transfer 12 | -------------------------------------------------------------------------------- /annotations/P19-1201.txt: -------------------------------------------------------------------------------- 1 | # Title: Jointly Learning Semantic Parser and Natural Language Generator via Dual Information Maximization 2 | # Online location: https://www.aclweb.org/anthology/P19-1201.pdf 3 | optim-adam 4 | reg-patience 5 | train-mtl 6 | arch-bilstm 7 | arch-att 8 | arch-copy 9 | search-beam 10 | nondif-reinforce 11 | task-seq2seq 12 | task-tree 13 | -------------------------------------------------------------------------------- /annotations/P19-1263.txt: -------------------------------------------------------------------------------- 1 | # Title: Exploiting Explicit Paths for Multi-hop Reading Comprehension 2 | # Online location: https://www.aclweb.org/anthology/P19-1263.pdf 3 | optim-adam 4 | norm-gradient 5 | reg-dropout 6 | arch-bilstm 7 | arch-gru 8 | arch-att 9 | search-beam 10 | pre-elmo 11 | pre-glove 12 | -------------------------------------------------------------------------------- /annotations/P19-1286.txt: -------------------------------------------------------------------------------- 1 | # Title: Domain Adaptation of Neural Machine Translation by Lexicon Induction 2 | # Online location: 
https://www.aclweb.org/anthology/P19-1286.pdf
3 | loss-svd
4 | arch-lstm
5 | arch-transformer
6 | adv-feat
7 | train-augment
8 | train-transfer
9 | optim-adam
10 | search-beam
11 | task-seq2seq
12 | task-lexicon
--------------------------------------------------------------------------------
/annotations/P19-1326.txt:
--------------------------------------------------------------------------------
1 | # Title: Embedding Imputation with Grounded Language Information
2 | # Online location: https://www.aclweb.org/anthology/P19-1326.pdf
3 | arch-gcnn
4 | activ-relu
5 | pre-glove
--------------------------------------------------------------------------------
/annotations/P19-1336.txt:
--------------------------------------------------------------------------------
1 | # Title: Dual Adversarial Neural Transfer for Low-Resource Named Entity Recognition
2 | # Online location: https://www.aclweb.org/anthology/P19-1336.pdf
3 | arch-transformer
4 | arch-att
5 | arch-bilstm
6 | adv-feat
7 | adv-train
8 | struct-crf
--------------------------------------------------------------------------------
/annotations/P19-1447.txt:
--------------------------------------------------------------------------------
1 | # Title: Reranking for Neural Semantic Parsing
2 | # Online location: https://www.aclweb.org/anthology/P19-1447.pdf
3 | arch-att
4 | arch-copy
5 | arch-bilstm
6 | task-tree
7 | task-textpair
8 | search-beam
9 | nondif-minrisk
--------------------------------------------------------------------------------
/annotations/P19-1511.txt:
--------------------------------------------------------------------------------
1 | # Title: Sequence-to-Nuggets: Nested Entity Mention Detection via Anchor-Region Networks
2 | # Online location: https://www.aclweb.org/anthology/P19-1511.pdf
3 | arch-bilstm
4 | arch-cnn
5 | pre-glove
6 | optim-adadelta
7 | struct-crf
8 | task-seqlab
--------------------------------------------------------------------------------
/annotations/P19-1540.txt:
--------------------------------------------------------------------------------
1 | # Title: Ordinal and Attribute Aware Response Generation in a Multimodal Dialogue System
2 | # Online location: https://www.aclweb.org/anthology/P19-1540.pdf
3 | arch-bilinear
4 | reg-dropout
5 | init-glorot
6 | reg-labelsmooth
7 | norm-gradient
8 | arch-gru
9 | arch-selfatt
10 | search-beam
11 | task-condlm
--------------------------------------------------------------------------------
/annotations/P19-2032.txt:
--------------------------------------------------------------------------------
1 | # Title: Automatic Generation of Personalized Comment Based on User Profile
2 | # Online location: https://www.aclweb.org/anthology/P19-2032.pdf
3 | arch-bilstm
4 | optim-sgd
5 | arch-att
6 | arch-memo
7 | arch-gating
8 | search-beam
9 | task-seq2seq
--------------------------------------------------------------------------------
/annotations/P19-2038.txt:
--------------------------------------------------------------------------------
1 | # Title: ARHNet - Leveraging Community Interaction for Detection of Religious Hate Speech in Arabic
2 | # Online location: https://www.aclweb.org/anthology/P19-2038.pdf
3 | optim-adam
4 | reg-dropout
5 | reg-norm
6 | arch-lstm
7 | arch-bilstm
8 | arch-gru
9 | arch-bigru
10 | arch-selfatt
11 | arch-gnn
12 | arch-cnn
13 | pre-word2vec
14 | task-textclass
--------------------------------------------------------------------------------
/concepts.md:
--------------------------------------------------------------------------------
1 | # Concept Hierarchy in Neural Networks for NLP
2 |
3 | Below is a list of important concepts in neural networks for NLP. In the `annotations/` directory in this repository,
4 | we have examples of papers annotated with these concepts that you can peruse.
5 |
6 | **Annotation Criteria**: For a particular paper, a concept should be annotated if it is important to understand the
7 | proposed method. It should also be annotated if it's important to understand the evaluation. For example, if a
8 | proposed self-attention model is compared to a baseline that uses an LSTM, and the difference between these two
9 | methods is important to understanding the experimental results, then the LSTM concept should also be annotated. Concepts
10 | do not need to be annotated if they are simply mentioned in passing, or in the related work section.
11 |
12 | **Implication**: Some tags are listed with "`XXX` (implies `YYY`)", which means you need to understand concept `YYY`
13 | in order to understand concept `XXX`. If `XXX` is annotated in a paper, you do not need to also annotate `YYY`.
14 |
15 | **Non-neural Papers**: This conceptual hierarchy is for tagging papers that are about neural network models for NLP.
16 | If a paper is not fundamentally about some application of neural networks to NLP, it should be tagged with `not-neural`,
17 | and no other tags need to be applied.
18 |
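For example, a paper whose proposed method is a BiLSTM-CRF sequence labeler might be annotated as follows (a hypothetical annotation file; since `arch-bilstm` implies `arch-birnn` and `arch-lstm`, the implied tags are omitted):

    # Title: A Hypothetical Sequence Labeling Paper
    # Online location: https://www.aclweb.org/anthology/X00-0000.pdf
    arch-bilstm
    struct-crf
    task-seqlab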
19 | ## Optimization/Learning
20 |
21 | ### Optimizers and Optimization Techniques
22 |
23 | * Mini-batch SGD: [`optim-sgd`](http://pfliu.com/pl-nlp2019/bs/optim-sgd.html)
24 | * Adam: [`optim-adam`](http://pfliu.com/pl-nlp2019/bs/optim-adam.html) (implies `optim-sgd`)
25 | * Adagrad: [`optim-adagrad`](http://pfliu.com/pl-nlp2019/bs/optim-adagrad.html) (implies `optim-sgd`)
26 | * Adadelta: [`optim-adadelta`](http://pfliu.com/pl-nlp2019/bs/optim-adadelta.html) (implies `optim-sgd`)
27 | * Adam with Specialized Transformer Learning Rate ("Noam" Schedule): [`optim-noam`](http://pfliu.com/pl-nlp2019/bs/optim-noam.html) (implies `optim-adam`)
28 | * SGD with Momentum: [`optim-momentum`](http://pfliu.com/pl-nlp2019/bs/optim-momentum.html) (implies `optim-sgd`)
29 | * AMSGrad: [`optim-amsgrad`](http://pfliu.com/pl-nlp2019/bs/optim-amsgrad.html) (implies `optim-sgd`)
30 | * Projection / Projected Gradient Descent: [`optim-projection`](http://pfliu.com/pl-nlp2019/bs/optim-projection.html) (implies `optim-sgd`)
31 |
32 | ### Initialization
33 |
34 | * Glorot/Xavier Initialization: [`init-glorot`](http://pfliu.com/pl-nlp2019/bs/init-glorot.html)
35 | * He Initialization: [`init-he`](http://pfliu.com/pl-nlp2019/bs/init-he.html)
36 |
37 | ### Regularization
38 |
39 | * Dropout: [`reg-dropout`](http://pfliu.com/pl-nlp2019/bs/reg-dropout.html)
40 | * Word Dropout: [`reg-worddropout`](http://pfliu.com/pl-nlp2019/bs/reg-worddropout.html) (implies `reg-dropout`)
41 | * Norm (L1/L2) Regularization: [`reg-norm`](http://pfliu.com/pl-nlp2019/bs/reg-norm.html)
42 | * Early Stopping: [`reg-stopping`](http://pfliu.com/pl-nlp2019/bs/reg-stopping.html)
43 | * Patience: [`reg-patience`](http://pfliu.com/pl-nlp2019/bs/reg-patience.html) (implies `reg-stopping`)
44 | * Weight Decay: [`reg-decay`](http://pfliu.com/pl-nlp2019/bs/reg-decay.html)
45 | * Label Smoothing: [`reg-labelsmooth`](http://pfliu.com/pl-nlp2019/bs/reg-labelsmooth.html)
46 |
47 | ### Normalization
48 |
49 | * Layer Normalization: [`norm-layer`](http://pfliu.com/pl-nlp2019/bs/norm-layer.html)
50 | * Batch Normalization: [`norm-batch`](http://pfliu.com/pl-nlp2019/bs/norm-batch.html)
51 | * Gradient Clipping: [`norm-gradient`](http://pfliu.com/pl-nlp2019/bs/norm-gradient.html)
52 |
53 | ### Loss Functions (other than cross-entropy)
54 |
55 | * Canonical Correlation Analysis (CCA): [`loss-cca`](http://pfliu.com/pl-nlp2019/bs/loss-cca.html)
56 | * Singular Value Decomposition (SVD): [`loss-svd`](http://pfliu.com/pl-nlp2019/bs/loss-svd.html)
57 | * Margin-based Loss Functions: [`loss-margin`](http://pfliu.com/pl-nlp2019/bs/loss-margin.html)
58 | * Contrastive Loss: [`loss-cons`](http://pfliu.com/pl-nlp2019/bs/loss-cons.html)
59 | * Noise Contrastive Estimation (NCE): [`loss-nce`](http://pfliu.com/pl-nlp2019/bs/loss-nce.html) (implies `loss-cons`)
60 | * Triplet Loss: [`loss-triplet`](http://pfliu.com/pl-nlp2019/bs/loss-triplet.html) (implies `loss-cons`)
61 |
62 | ### Training Paradigms
63 |
64 | * Multi-task Learning (MTL): [`train-mtl`](http://pfliu.com/pl-nlp2019/bs/train-mtl.html)
65 | * Multi-lingual Learning (MLL): [`train-mll`](http://pfliu.com/pl-nlp2019/bs/train-mll.html) (implies `train-mtl`)
66 | * Transfer Learning: [`train-transfer`](http://pfliu.com/pl-nlp2019/bs/train-transfer.html)
67 | * Active Learning: [`train-active`](http://pfliu.com/pl-nlp2019/bs/train-active.html)
68 | * Data Augmentation: [`train-augment`](http://pfliu.com/pl-nlp2019/bs/train-augment.html)
69 | * Curriculum Learning: [`train-curriculum`](http://pfliu.com/pl-nlp2019/bs/train-curriculum.html)
70 | * Parallel Training: [`train-parallel`](http://pfliu.com/pl-nlp2019/bs/train-parallel.html)
71 |
72 | ## Sequence Modeling Architectures
73 |
74 | ### Activation Functions
75 |
76 | * Hyperbolic Tangent (tanh): [`activ-tanh`](http://pfliu.com/pl-nlp2019/bs/activ-tanh.html)
77 | * Rectified Linear Units (ReLU): [`activ-relu`](http://pfliu.com/pl-nlp2019/bs/activ-relu.html)
78 |
79 | ### Pooling Operations
80 |
81 | * Max Pooling: [`pool-max`](http://pfliu.com/pl-nlp2019/bs/pool-max.html)
82 | * Mean Pooling: [`pool-mean`](http://pfliu.com/pl-nlp2019/bs/pool-mean.html)
83 | * k-Max Pooling: [`pool-kmax`](http://pfliu.com/pl-nlp2019/bs/pool-kmax.html)
84 |
85 | ### Recurrent Architectures
86 |
87 | * Recurrent Neural Network (RNN): [`arch-rnn`](http://pfliu.com/pl-nlp2019/bs/arch-rnn.html)
88 | * Bi-directional Recurrent Neural Network (Bi-RNN): [`arch-birnn`](http://pfliu.com/pl-nlp2019/bs/arch-birnn.html) (implies `arch-rnn`)
89 | * Long Short-term Memory (LSTM): [`arch-lstm`](http://pfliu.com/pl-nlp2019/bs/arch-lstm.html) (implies `arch-rnn`)
90 | * Bi-directional Long Short-term Memory (BiLSTM): [`arch-bilstm`](http://pfliu.com/pl-nlp2019/bs/arch-bilstm.html) (implies `arch-birnn`, `arch-lstm`)
91 | * Gated Recurrent Units (GRU): [`arch-gru`](http://pfliu.com/pl-nlp2019/bs/arch-gru.html) (implies `arch-rnn`)
92 | * Bi-directional Gated Recurrent Units (BiGRU): [`arch-bigru`](http://pfliu.com/pl-nlp2019/bs/arch-bigru.html) (implies `arch-birnn`, `arch-gru`)
93 |
94 | ### Other Sequential/Structured Architectures
95 |
96 | * Bag-of-words, Bag-of-embeddings, Continuous Bag-of-words (BOW): `arch-bow`
97 | * Convolutional Neural Networks (CNN): [`arch-cnn`](http://pfliu.com/pl-nlp2019/bs/arch-cnn.html)
98 | * Attention: [`arch-att`](http://pfliu.com/pl-nlp2019/bs/arch-att.html)
99 | * Self Attention: [`arch-selfatt`](http://pfliu.com/pl-nlp2019/bs/arch-selfatt.html) (implies `arch-att`)
100 | * Recursive Neural Network (RecNN): [`arch-recnn`](http://pfliu.com/pl-nlp2019/bs/arch-recnn.html)
101 | * Tree-structured Long Short-term Memory (TreeLSTM): [`arch-treelstm`](http://pfliu.com/pl-nlp2019/bs/arch-treelstm.html) (implies `arch-recnn`)
102 | * Graph Neural Network (GNN): [`arch-gnn`](http://pfliu.com/pl-nlp2019/bs/arch-gnn.html)
103 | * Graph Convolutional Neural Network (GCNN): [`arch-gcnn`](http://pfliu.com/pl-nlp2019/bs/arch-gcnn.html) (implies `arch-gnn`)
104 |
105 | ### Architectural Techniques
106 |
107 | * Residual Connections (ResNet): [`arch-residual`](http://pfliu.com/pl-nlp2019/bs/arch-residual.html)
108 | * Gating Connections, Highway Connections: [`arch-gating`](http://pfliu.com/pl-nlp2019/bs/arch-gating.html)
109 | * Memory: [`arch-memo`](http://pfliu.com/pl-nlp2019/bs/arch-memo.html)
110 | * Copy Mechanism: [`arch-copy`](http://pfliu.com/pl-nlp2019/bs/arch-copy.html)
111 | * Bilinear, Biaffine Models: [`arch-bilinear`](http://pfliu.com/pl-nlp2019/bs/arch-bilinear.html)
112 | * Coverage Vectors/Penalties: [`arch-coverage`](http://pfliu.com/pl-nlp2019/bs/arch-coverage.html)
113 | * Subword Units: [`arch-subword`](http://pfliu.com/pl-nlp2019/bs/arch-subword.html)
114 | * Energy-based, Globally-normalized Models: [`arch-energy`](http://pfliu.com/pl-nlp2019/bs/arch-energy.html)
115 |
116 | ### Standard Composite Architectures
117 |
118 | * Transformer: [`arch-transformer`](http://pfliu.com/pl-nlp2019/bs/arch-transformer.html) (implies `arch-selfatt`, `arch-residual`, `norm-layer`, `optim-noam`)
119 |
120 |
121 | ## Model Combination
122 |
123 | * Ensembling: [`comb-ensemble`](http://pfliu.com/pl-nlp2019/bs/comb-ensemble.html)
124 |
125 | ## Search Algorithms
126 |
127 | * Greedy Search: [`search-greedy`](http://pfliu.com/pl-nlp2019/bs/search-greedy.html)
128 | * Beam Search: [`search-beam`](http://pfliu.com/pl-nlp2019/bs/search-beam.html)
129 | * A* Search: [`search-astar`](http://pfliu.com/pl-nlp2019/bs/search-astar.html)
130 | * Viterbi Algorithm: [`search-viterbi`](http://pfliu.com/pl-nlp2019/bs/search-viterbi.html)
131 | * Ancestral Sampling: [`search-sampling`](http://pfliu.com/pl-nlp2019/bs/search-sampling.html)
132 | * Gumbel Max: [`search-gumbel`](http://pfliu.com/pl-nlp2019/bs/search-gumbel.html) (implies `search-sampling`)
133 |
134 | ## Prediction Tasks
135 |
136 | * Text Classification (text -> label): [`task-textclass`](http://pfliu.com/pl-nlp2019/bs/task-textclass.html)
137 | * Text Pair Classification (two texts -> label): [`task-textpair`](http://pfliu.com/pl-nlp2019/bs/task-textpair.html)
138 | * Sequence Labeling (text -> one label per token): [`task-seqlab`](http://pfliu.com/pl-nlp2019/bs/task-seqlab.html)
139 | * Extractive Summarization (text -> subset of text): [`task-extractive`](http://pfliu.com/pl-nlp2019/bs/task-extractive.html) (implies `task-seqlab`)
140 | * Span Labeling (text -> labels on spans): [`task-spanlab`](http://pfliu.com/pl-nlp2019/bs/task-spanlab.html)
141 | * Language Modeling (predict probability of text): [`task-lm`](http://pfliu.com/pl-nlp2019/bs/task-lm.html)
142 | * Conditioned Language Modeling (some input -> text): [`task-condlm`](http://pfliu.com/pl-nlp2019/bs/task-condlm.html) (implies `task-lm`)
143 | * Sequence-to-sequence Tasks (text -> text, including MT): [`task-seq2seq`](http://pfliu.com/pl-nlp2019/bs/task-seq2seq.html) (implies `task-condlm`)
144 | * Cloze-style Prediction, Masked Language Modeling (right and left context -> word): [`task-cloze`](http://pfliu.com/pl-nlp2019/bs/task-cloze.html)
145 | * Context Prediction (as in word2vec) (word -> right and left context): [`task-context`](http://pfliu.com/pl-nlp2019/bs/task-context.html)
146 | * Relation Prediction (text -> graph of relations between words, including dependency parsing): [`task-relation`](http://pfliu.com/pl-nlp2019/bs/task-relation.html)
147 | * Tree Prediction (text -> tree, including syntactic and some semantic parsing): [`task-tree`](http://pfliu.com/pl-nlp2019/bs/task-tree.html)
148 | * Graph Prediction (text -> graph whose nodes are not necessarily words, e.g. AMR parsing): [`task-graph`](http://pfliu.com/pl-nlp2019/bs/task-graph.html)
149 | * Lexicon Induction/Embedding Alignment (text/embeddings -> bi- or multi-lingual lexicon): [`task-lexicon`](http://pfliu.com/pl-nlp2019/bs/task-lexicon.html)
150 | * Word Alignment (parallel text -> alignment between words): [`task-alignment`](http://pfliu.com/pl-nlp2019/bs/task-alignment.html)
151 |
152 | ## Composite Pre-trained Embedding Techniques
153 |
154 | * word2vec: [`pre-word2vec`](http://pfliu.com/pl-nlp2019/bs/pre-word2vec.html) (implies `arch-bow`, `task-cloze`, `task-context`)
155 | * fasttext: [`pre-fasttext`](http://pfliu.com/pl-nlp2019/bs/pre-fasttext.html) (implies `arch-bow`, `arch-subword`, `task-cloze`, `task-context`)
156 | * GloVe: [`pre-glove`](http://pfliu.com/pl-nlp2019/bs/pre-glove.html)
157 | * Paragraph Vector (ParaVec): [`pre-paravec`](http://pfliu.com/pl-nlp2019/bs/pre-paravec.html)
158 | * Skip-thought: [`pre-skipthought`](http://pfliu.com/pl-nlp2019/bs/pre-skipthought.html) (implies `arch-lstm`, `task-seq2seq`)
159 | * ELMo: [`pre-elmo`](http://pfliu.com/pl-nlp2019/bs/pre-elmo.html) (implies `arch-bilstm`, `task-lm`)
160 | * BERT: [`pre-bert`](http://pfliu.com/pl-nlp2019/bs/pre-bert.html) (implies `arch-transformer`, `task-cloze`, `task-textpair`)
161 | * Universal Sentence Encoder (USE): [`pre-use`](http://pfliu.com/pl-nlp2019/bs/pre-use.html) (implies `arch-transformer`, `task-seq2seq`)
162 |
163 | ## Structured Models/Algorithms
164 |
165 | * Hidden Markov Models (HMM): [`struct-hmm`](http://pfliu.com/pl-nlp2019/bs/struct-hmm.html)
166 | * Conditional Random Fields (CRF): [`struct-crf`](http://pfliu.com/pl-nlp2019/bs/struct-crf.html)
167 | * Context-free Grammar (CFG): [`struct-cfg`](http://pfliu.com/pl-nlp2019/bs/struct-cfg.html)
168 | * Combinatory Categorial Grammar (CCG): [`struct-ccg`](http://pfliu.com/pl-nlp2019/bs/struct-ccg.html)
169 |
170 | ## Relaxation/Training Methods for Non-differentiable Functions
171 |
172 | * Complete Enumeration: [`nondif-enum`](http://pfliu.com/pl-nlp2019/bs/nondif-enum.html)
173 | * Straight-through Estimator: [`nondif-straightthrough`](http://pfliu.com/pl-nlp2019/bs/nondif-straightthrough.html)
174 | * Gumbel Softmax: [`nondif-gumbelsoftmax`](http://pfliu.com/pl-nlp2019/bs/nondif-gumbelsoftmax.html)
175 | * Minimum Risk Training: [`nondif-minrisk`](http://pfliu.com/pl-nlp2019/bs/nondif-minrisk.html)
176 | * REINFORCE: [`nondif-reinforce`](http://pfliu.com/pl-nlp2019/bs/nondif-reinforce.html)
177 |
178 | ## Adversarial Methods
179 |
180 | * Generative Adversarial Networks (GAN): [`adv-gan`](http://pfliu.com/pl-nlp2019/bs/adv-gan.html)
181 | * Adversarial Feature Learning: [`adv-feat`](http://pfliu.com/pl-nlp2019/bs/adv-feat.html)
182 | * Adversarial Examples: [`adv-examp`](http://pfliu.com/pl-nlp2019/bs/adv-examp.html)
183 | * Adversarial Training: [`adv-train`](http://pfliu.com/pl-nlp2019/bs/adv-train.html) (implies `adv-examp`)
184 |
185 | ## Latent Variable Models
186 |
187 | * Variational Auto-encoder (VAE): [`latent-vae`](http://pfliu.com/pl-nlp2019/bs/latent-vae.html)
188 | * Topic Model: [`latent-topic`](http://pfliu.com/pl-nlp2019/bs/latent-topic.html)
189 |
190 | ## Meta Learning
191 |
192 | * Meta-learning Initialization: [`meta-init`](http://pfliu.com/pl-nlp2019/bs/meta-init.html)
193 | * Meta-learning Optimizers: [`meta-optim`](http://pfliu.com/pl-nlp2019/bs/meta-optim.html)
194 | * Meta-learning Loss functions: [`meta-loss`](http://pfliu.com/pl-nlp2019/bs/meta-loss.html)
195 | * Neural Architecture Search: [`meta-arch`](http://pfliu.com/pl-nlp2019/bs/meta-arch.html)
--------------------------------------------------------------------------------
/draw_bar.py:
--------------------------------------------------------------------------------
1 | # import libraries
2 | import matplotlib
3 | matplotlib.use('Agg')
4 | import pandas as pd
5 | import matplotlib.pyplot as plt
6 | import argparse
7 | from collections import defaultdict
8 | #%matplotlib inline
9 |
10 | # set font
11 | plt.rcParams['font.family'] = 'sans-serif'
12 | plt.rcParams['font.sans-serif'] = 'Helvetica'
13 |
14 | # set the style of the axes and the text color
15 | plt.rcParams['axes.edgecolor']='#333F4B'
16 | plt.rcParams['axes.linewidth']=0.8
17 | plt.rcParams['xtick.color']='#333F4B'
18 | plt.rcParams['ytick.color']='#333F4B'
19 | plt.rcParams['text.color']='#333F4B'
20 |
21 |
22 |
23 |
24 | parser = argparse.ArgumentParser(description='Draw Bar')
25 | parser.add_argument('--tsv', default='input.tsv', help='input file separated by \'\\t\' ')
26 | parser.add_argument('--fig', default='out.png', help='the output figure')
27 | parser.add_argument('--title', default='Concept Count in All Papers', help='the title of the graph')
28 | parser.add_argument('--colored_concepts', default=None, nargs='+',
29 |                     help='An interleaved list of filenames containing concept tags (e.g. first.txt red second.txt purple)')
30 |
31 | args = parser.parse_args()
32 |
33 | concept_colors = defaultdict(lambda: '#007ACC')
34 | if args.colored_concepts:
35 |     for i in range(0, len(args.colored_concepts), 2):
36 |         print(f"opening {args.colored_concepts[i]} as {args.colored_concepts[i+1]}")
37 |         with open(args.colored_concepts[i], 'r') as f:
38 |             for line in f:
39 |                 line = line.strip()
40 |                 concept_colors[line] = args.colored_concepts[i+1]
41 |                 print(f'concept_colors[{line}] = {args.colored_concepts[i+1]}')
42 |
43 |
44 | tsv_file = args.tsv
45 | fig_file = args.fig
46 |
47 | fin = open(tsv_file,"r")
48 | cpt_list = []
49 | val_list = []
50 | for line in fin:
51 |     line = line.strip()
52 |     cpt, val = line.split("\t")
53 |     val_list.append(int(val))
54 |     cpt_list.append(cpt)
55 | fin.close()
56 |
57 | percentages = pd.Series(val_list,
58 |                         index=cpt_list)
59 |
60 | df = pd.DataFrame({'percentage' : percentages})
61 | df = df.sort_values(by='percentage')
62 |
63 | color_list = [concept_colors[x] for x in df.index]
64 |
65 | # we first need a numeric placeholder for the y axis
66 | my_range=list(range(1,len(df.index)+1))
67 |
68 | fig, ax = plt.subplots(figsize=(10,25))
69 |
70 | # create lines and dots for each bar
71 | plt.hlines(y=my_range, xmin=0, xmax=df['percentage'], colors=color_list, alpha=0.5, linewidth=5)
72 | # plt.plot(df['percentage'], my_range, "o", markersize=5, colors=color_list, alpha=0.6)
73 |
74 | # set labels
75 | ax.set_xlabel(args.title, fontsize=15, fontweight='black', color = '#333F4B')
76 | ax.xaxis.set_label_position('top')
77 | ax.xaxis.tick_top()
78 | #ax.set_ylabel('')
79 |
80 | # set axis
81 | ax.tick_params(axis='both', which='major', labelsize=12)
82 | plt.yticks(my_range, df.index)
83 |
84 | # add a horizontal label for the y axis
85 | #fig.text(-0.23, 0.86, 'Concept Coverage (Fulltext)', fontsize=15, fontweight='black', color = '#333F4B')
86 |
87 | # change the style of the axis spines
88 | ax.spines['bottom'].set_color('none')
89 | ax.spines['right'].set_color('none')
90 | ax.spines['left'].set_smart_bounds(True)
91 | ax.spines['top'].set_smart_bounds(True)
92 |
93 | '''
94 | # set the spines position
95 | ax.spines['bottom'].set_position(('axes', -0.04))
96 | ax.spines['left'].set_position(('axes', 0.015))
97 | '''
98 | plt.savefig(fig_file, dpi=300, bbox_inches='tight')
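# A hypothetical usage sketch (these file names are made up): color the tags
# listed in new_tags.txt red and leave all remaining tags the default '#007ACC':
#
#   python draw_bar.py --tsv annotations.tsv --fig annotations.png \
#       --colored_concepts new_tags.txt red
#
# Each file passed to --colored_concepts lists one concept tag per line, and is
# paired with the color name that follows it on the command line.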
--------------------------------------------------------------------------------
/fig/annotations.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/neulab/nn4nlp-concepts/064d45d94f15c499165e7fed086eb2c89ac12ef8/fig/annotations.png
--------------------------------------------------------------------------------
/fig/auto.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/neulab/nn4nlp-concepts/064d45d94f15c499165e7fed086eb2c89ac12ef8/fig/auto.png
--------------------------------------------------------------------------------
/get_paper.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import random
3 | import itertools
4 | import os
5 | import sys
6 | import rule_classifier as paper_classifier
7 | import urllib.request
8 | import bs4 as bs
9 | import time
10 |
11 |
12 |
13 |
14 | def label_paper(paper_id=None, paper_meta=None, cased_regexes=None, feature=None):
15 |     """Label one paper
16 |
17 |     :param paper_id: The paper ID
18 |     :param paper_meta: The paper's metadata, as parsed from the ACL Anthology XML
19 |     :param cased_regexes: A list of (regex, tag, confidence) tuples used to predict concept tags
20 |     :param feature: Which part of the content is used to label papers, i.e. "title" or "fulltext"
21 |     :return: Nothing.
22 |     """
23 |     if not os.path.isfile(f'papers/{paper_id}.pdf'):
24 |         os.makedirs(f'papers/', exist_ok=True)
25 |         try:
26 |             urllib.request.urlretrieve(f'https://www.aclweb.org/anthology/{paper_id}.pdf', f'papers/{paper_id}.pdf')
27 |             # time.sleep(2) # optionally, wait a little while for the download to finish before converting
28 |             os.system(f'pdftotext papers/{paper_id}.pdf papers/{paper_id}.txt')
29 |         except Exception:
30 |             print(f'WARNING: Error while downloading/processing https://www.aclweb.org/anthology/{paper_id}.pdf')
31 |             return
32 |
33 |     with open(f'papers/{paper_id}.txt', 'r') as f:
34 |         paper_text = '\n'.join(f.readlines())
35 |     paper_title = ''.join(paper_meta.title.findAll(text=True))
36 |
37 |     is_cased = 1  # whether regex matching is case-sensitive
38 |     if feature == "title":
39 |         feature = paper_title
40 |         is_cased = 0
41 |     elif feature == "fulltext":
42 |         feature = paper_text
43 |         is_cased = 1
44 |
45 |     predicted_tags = paper_classifier.classify(feature, cased_regexes, is_cased)
46 |     print(f'Title: {paper_title}\n'
47 |           f'Local location: papers/{paper_id}.pdf\n'
48 |           f'Online location: https://www.aclweb.org/anthology/{paper_id}.pdf\n'
49 |           f'Text file location: auto/{paper_id}.txt')
50 |     for i, tag in enumerate(predicted_tags):
51 |         print(f'Tag {i}: {tag}')
52 |     print("------------------------------------------------\n")
53 |
54 |     os.makedirs(f'auto/', exist_ok=True)
55 |     with open(f'auto/{paper_id}.txt', 'w') as fout:
56 |         print(f'# Title: {paper_title}\n# Online location: https://www.aclweb.org/anthology/{paper_id}.pdf', file=fout)
57 |         for tag, conf, just in predicted_tags:
58 |             print(f'# CHECK: confidence={conf}, justification={just}\n{tag}', file=fout)
59 |
60 |
61 |
62 |
63 | if __name__ == "__main__":
64 |
65 |     parser = argparse.ArgumentParser(description="Get a paper to try to read and annotate")
66 |
67 |     parser.add_argument("--paper_id", type=str, default=None,
68 |                         help="The paper ID to get, if you want to specify a single one (e.g. P84-1031)")
69 |     parser.add_argument("--years", type=str, default="19",
70 |                         help="If a paper ID is not specified, a year (e.g. 19) or range of years (e.g. 99-02) from which"+
71 |                              " to select a random paper.")
72 |     parser.add_argument("--confs", type=str, default="P,N,D",
73 |                         help="A comma-separated list of conference abbreviations from which papers can be selected")
74 |     parser.add_argument("--volumes", type=str, default="1,2",
75 |                         help="A comma-separated list of volumes to include (default is long and short research papers)."+
76 |                              " 'all' for no filtering.")
77 |     parser.add_argument("--n_sample", type=str, default="1",
78 |                         help="the number of sampled papers if paper_id is not specified (e.g. 1)."
79 |                              " Write 'all' to select all papers from those years/conferences/volumes.")
80 |
81 |     parser.add_argument("--template", type=str, default="template.cpt",
82 |                         help="The concept template file (e.g. template.cpt)")
83 |
84 |     parser.add_argument("--feature", type=str, default="fulltext",
85 |                         help="Which part of the paper is used to classify (e.g. fulltext or title)")
86 |
87 |     args = parser.parse_args()
88 |
89 |     # init variables
90 |     feature = args.feature
91 |     paper_id = args.paper_id
92 |     template = args.template
93 |     n_sample = args.n_sample
94 |     volumes = args.volumes.split(',')
95 |     paper_map = {}
96 |
97 |     # load the concept template
98 |     cased_regexes = paper_classifier.genConceptReg(file_concept=template, format_col=3)
99 |
100 |     # if paper_id has not been specified
101 |     if paper_id is None:
102 |         years = args.years.split('-')
103 |         confs = args.confs.split(',')
104 |         if len(years) == 2:
105 |             years = list(range(int(years[0]), int(years[1])+1))
106 |         else:
107 |             assert len(years) == 1, f"invalid format of years, {args.years}"
108 |         for pref, year in itertools.product(confs, years):
109 |             year = int(year)
110 |             pref = pref.upper()
111 |             with open(f'acl-anthology/data/xml/{pref}{year:02d}.xml', 'r') as f:
112 |                 soup = bs.BeautifulSoup(f, 'xml')
113 |             for vol in soup.collection.find_all('volume'):
114 |                 if vol.attrs['id'] in volumes:
115 |                     for pap in vol.find_all('paper'):
116 |                         if pap.url:
117 |                             paper_map[pap.url.contents[0]] = pap
118 |
119 |         paper_keys = list(paper_map.keys())
120 |         if n_sample == 'all':
121 |             for paper_id in paper_keys:
122 |                 paper_meta = paper_map[paper_id]
123 |                 label_paper(paper_id, paper_meta, cased_regexes, feature)
124 |         else:
125 |             for _ in range(int(n_sample)):
126 |                 randid = random.choice(paper_keys)
127 |                 if not os.path.isfile(f'annotations/{randid}.txt') and not os.path.isfile(f'auto/{randid}.txt'):
128 |                     paper_id = randid
129 |                     paper_meta = paper_map[paper_id]
130 |                     #print(paper_meta)
131 |                     label_paper(paper_id, paper_meta, cased_regexes, feature)
132 |                 else:
133 |                     print(f'Warning: {randid} has been labeled!')
134 |
135 |     # if paper_id is specified
136 |     else:
137 |         prefix = paper_id.split("-")[0]
138 |         with open(f'acl-anthology/data/xml/{prefix}.xml', 'r') as f:
139 |             soup = bs.BeautifulSoup(f, 'xml')
140 |         for vol in soup.collection.find_all('volume'):
141 |             if vol.attrs['id'] in volumes:
142 |                 for pap in vol.find_all('paper'):
143 |                     if pap.url and pap.url.contents[0] == paper_id:
144 |                         paper_map[pap.url.contents[0]] = pap
145 |                         #print(paper_map[pap.url.contents[0]])
146 |         if not os.path.isfile(f'annotations/{paper_id}.txt') and not os.path.isfile(f'auto/{paper_id}.txt'):
147 |             label_paper(paper_id, paper_map[paper_id], cased_regexes, feature)
148 |             sys.exit(0)
149 |         else:
150 |             print(f'Warning: {paper_id} has been labeled!')
151 |
152 |     if len(paper_map) == 0:
153 |         print(f'Warning: {paper_id} cannot be found!')
154 |         sys.exit(1)
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | beautifulsoup4
2 | matplotlib
3 | pandas
--------------------------------------------------------------------------------
/rule_classifier.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 | import re
4 |
5 | # generate regular expressions from a concept template
6 | # the format of the template is:
7 | # concept \t father_concept \t keywords
8 | def genConceptReg(file_concept="test.cpt", format_col=3):
9 |     if not os.path.exists(file_concept):
10 |         print("cannot find concept template")
11 |         sys.exit(1)
12 |
13 |     cased_regexes = []
14 |     fin = open(file_concept,"r")
15 |     for line in fin:
16 |         line = line.rstrip("\n")
17 |         if len(line.split("\t")) != format_col or line[0] == "#":
18 |             continue
19 |         # note: the father-concept column (info_list[1]) is not used here
20 |         info_list = line.split("\t")
21 |         cased_regexes.append((info_list[2].rstrip("\r"), info_list[0], 0.9))
22 |     fin.close()
23 |     return cased_regexes
24 |
25 |
26 | def classify(paper_text=None, cased_regexes=None, flag_cased=1, threshold=0.5):
27 |     # note: threshold is currently unused; every match is reported with its fixed confidence
28 |     ret = []
29 |     if paper_text is not None:
30 |         for reg, tag, certainty in cased_regexes:
31 |             if flag_cased == 1:
32 |                 m = re.search(reg, paper_text)
33 |             else:
34 |                 m = re.search(reg, paper_text, re.IGNORECASE)
35 |
36 |             if m:
37 |                 ret.append((tag, certainty, 'Matched regex {}'.format(str(reg))))
38 |     return ret
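# A minimal usage sketch of the two functions above (a hypothetical snippet;
# the 0.9 is the fixed confidence that genConceptReg attaches to every rule):
#
#   regexes = genConceptReg("template.cpt")
#   tags = classify("We train a BiLSTM with the Adam optimizer.", regexes)
#   # -> includes ('optim-adam', 0.9, 'Matched regex Adam') and
#   #    ('arch-bilstm', 0.9, ...); note that the patterns match substrings,
#   #    so 'LSTM' inside 'BiLSTM' also fires the arch-lstm rule.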
--------------------------------------------------------------------------------
/template.cpt:
--------------------------------------------------------------------------------
1 | #Concept Concept-fa Keyword
2 | # ---------- Optimizers ----------
3 | optim-sgd null SGD|gradient descent
4 | optim-adam optim-sgd Adam
5 | optim-adagrad optim-sgd Adagrad
6 | optim-adadelta optim-sgd Adadelta
7 | optim-noam optim-adam specialized Transformer learning rate
8 | optim-momentum optim-sgd SGD with Momentum
9 | optim-amsgrad optim-sgd AMSGrad
10 | optim-projection optim-sgd projection|projected gradient descent
11 | # ---------- Initialization ----------
12 | init-glorot null Xavier|Glorot
13 | init-he null He initialization
14 | # ---------- Regularization ----------
15 | reg-dropout null Dropout|dropout
16 | reg-worddropout null word dropout
17 | reg-stopping null early stopping
18 | reg-patience null patience
19 | reg-norm null Norm (L1/L2) Regularization|L2 regularization
20 | reg-decay null weight decay
21 | reg-labelsmooth null label smooth
22 | # ---------- Normalization ----------
23 | norm-layer null Layer Normalization|layer normalization
24 | norm-batch null Batch Normalization|batch normalization
25 | norm-gradient null gradient clipping|gradient normalization|clipnorm
26 | # ---------- Training Paradigms: Multi-task/Multi-lingual/Transfer ----------
27 | train-mtl null multi-task learning
28 | train-mll train-mtl cross-lingual|multi-lingual|cross language
29 | train-transfer null transfer learning|domain adaptation
30 | train-active null active learning
31 | train-augment null data augmentation|Data Augmentation
32 | train-curriculum null data curriculum
33 | train-parallel null parallelism
34 | # ---------- Activation Functions ----------
35 | activ-tanh null Hyperbolic Tangent|hyperbolic tangent
36 | activ-relu null Rectified Linear Units|rectified linear
37 | # ---------- Pooling Operations ----------
38 | pool-max null Max Pooling|max-pooling|max pooling
39 | pool-mean null Mean Pooling|mean pooling|Average Pooling|average pooling
40 | pool-kmax null k-Max Pooling|k-max pooling
41 | # ---------- Recurrent Architectures ----------
42 | arch-rnn null Recurrent Neural Network|RNN|recurrent neural networks
43 | arch-birnn arch-rnn Bi-directional Recurrent Neural Network|Bi-RNN|BiRNN
44 | arch-lstm arch-rnn Long Short-term Memory|LSTMs|LSTM
45 | arch-bilstm arch-rnn Bi-directional Long Short-term Memory|BiLSTM|BiLSTMs|Bi-LSTM|BLSTM
46 | arch-gru arch-rnn Gated Recurrent Units|GRU|GRUs
47 | arch-bigru arch-rnn Bi-directional GRU|Bi-GRU|BiGRU
48 | # ---------- Other Sequential Architectures ----------
49 | arch-bow null bag-of-words|bag-of-embeddings|deep averaging network
50 | arch-cnn null Convolutional Neural Networks|CNNs|convolutional neural network
51 | arch-att null attention
52 | arch-selfatt arch-att Self Attention|self attention|self-attention
53 | arch-recnn null Recursive Neural Network
54 | arch-treelstm null Tree-LSTM|TreeLSTM|Tree-structured Long Short-term Memory
55 | arch-gnn null Graph Neural Network|GNN
56 | arch-gcnn null Graph Convolutional Neural Network|GCNN
57 | # ---------- Architectural Techniques ----------
58 | arch-residual null residual connections
59 | arch-gating null gating connections|Highway
60 | arch-memo null memory network|external memory
61 | arch-copy null copy mechanism|copying mechanism
62 | arch-bilinear null bilinear|bi-linear|biaffine|bi-affine
63 | arch-coverage null coverage
64 | arch-subword null subword|BPE|sentencepiece
65 | arch-energy null energy-based|globally normalized|global normalization
66 | # ---------- Standard Composite Architectures ----------
67 | arch-transformer arch-selfatt Transformer
68 | # ---------- Model Combination ----------
69 | comb-ensemble null ensemble|ensembling
70 | # ---------- Search Algorithms ----------
71 | search-greedy null Greedy Search|greedy search
72 | search-beam null Beam Search|beam search
73 | search-astar null A\* Search
74 | search-viterbi null Viterbi Algorithm|Viterbi|viterbi
75 | search-sampling null Ancestral Sampling|ancestral sampling
76 | search-gumbel search-sampling Gumbel Max|gumbel max
77 | # ---------- Pre-trained Embedding Techniques ----------
78 | pre-word2vec null word2vec|Word2vec
79 | pre-fasttext null fasttext|FastText|fastText
80 | pre-glove null glove|GloVe
81 | pre-paravec null paragraph vector|ParaVector
82 | pre-skipthought task-seq2seq Skip-thought|skip-thought|skipthought
83 | pre-elmo task-lm ELMo
84 | pre-bert arch-transformer BERT
85 | pre-use null Universal Sentence Encoder|universal sentence encoder
86 | # ---------- Structured Models/Algorithms ----------
87 | struct-hmm null Hidden Markov Models|hidden markov
88 | struct-crf null Conditional Random Fields|conditional random fields|CRF
89 | struct-cfg null Context-free Grammar|context-free grammar
90 | struct-ccg null Combinatory Categorial Grammar|combinatory categorial grammar
91 | # ---------- Relaxation/Training Methods for Non-differentiable Functions ----------
92 | nondif-enum null Complete Enumeration|complete enumeration
93 | nondif-straightthrough null Straight-through Estimator|straight-through estimator
94 | nondif-gumbelsoftmax null Gumbel Softmax|gumbel softmax
95 | nondif-minrisk null Minimum Risk Training|minimum risk
96 | nondif-reinforce null REINFORCE
97 | # ---------- Adversarial Methods ----------
98 | adv-gan null Generative Adversarial Networks|GAN|generative adversarial
99 | adv-feat null Adversarial Feature Learning|adversarial feature
100 | adv-examp null Adversarial Examples|adversarial examples
101 | adv-train adv-examp Adversarial Training|adversarial training
102 | # ---------- Latent Variable Models ----------
103 | latent-vae null Variational Auto-encoder|variational auto-encoder|latent variable
104 | latent-topic null topic model
105 | # ---------- Loss Functions ----------
106 | loss-cca null Canonical Correlation Analysis|canonical correlation analysis
107 | loss-svd null Singular Value Decomposition|SVD|singular value decomposition
108 | loss-margin null Margin-based Loss Functions|margin-based|ranking-based loss
109 | loss-cons null Contrastive Loss
110 | loss-nce loss-cons Noise Contrastive Estimation|NCE
111 | loss-triplet loss-cons Triplet loss|triplet loss
112 | # ---------- Prediction Tasks ----------
113 | task-textclass null Text Classification|text classification
114 | task-textpair null natural language inference|semantic matching|question answering matching
115 | task-seqlab
null named entity recognition|Part-of-Speech|word segmentation|text chunking 116 | task-extractive task-seqlab extractive summarization 117 | task-spanlab null span labeling|machine reading comprehension|SQuAD 118 | task-lm null language model 119 | task-condlm null image caption 120 | task-seq2seq null machine translat|abstractive summarization 121 | task-cloze null cloze-style prediction|masked language model|text cloze 122 | task-context null context prediction 123 | task-relation null dependency pars 124 | task-tree null syntactic pars|semantic pars 125 | task-graph null AMR|UDD 126 | task-lexicon null lexicon induction|bi-lingual embedding|embedding alignment|MUSE 127 | task-alignment null word alignment|GIZA 128 | # ---------- Meta Learning ---------- 129 | meta-init null MAML 130 | meta-optim null meta optimizer|meta learner 131 | meta-loss null meta learning loss 132 | meta-arch null architecture search 133 | --------------------------------------------------------------------------------