├── .gitignore
├── README.md
├── benchmark.py
├── das.py
├── data.py
├── data
│   ├── templates
│   │   ├── data.json
│   │   ├── data_extra.json
│   │   ├── preposing_in_pp.json
│   │   ├── syntaxgym.json
│   │   └── syntaxgym_failed.json
│   └── test_suites
│       ├── center_embedding.json
│       ├── filler_gap_subject.json
│       ├── gss_subord_pp.json
│       ├── mvrr.json
│       ├── mvrr_mod.json
│       ├── npi.json
│       ├── npi2.json
│       ├── npi_ever.json
│       ├── npz_obj.json
│       ├── npz_obj_mod.json
│       ├── npz_v-trans.json
│       ├── reflexive_number_agreement_feminine_object_relative.json
│       ├── subject_verb_number_agreement_with_prepositional_phrase.json
│       ├── subject_verb_number_agreement_with_subject_relative_clause.json
│       └── subordination.json
├── diff_methods.py
├── eval.py
├── interventions.py
├── plot.py
├── prompt.py
├── requirements.txt
├── test_all.py
├── train.py
└── utils.py
/.gitignore:
--------------------------------------------------------------------------------
1 | *.out
2 | .DS_Store
3 | __pycache__/
4 | *.ipynb
5 | logs/
6 | .ipynb_checkpoints/
7 | .vscode/
8 | figs/
9 | deprecated/figs/
10 | *.profile
11 | data/huggingface/
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 |
2 |
3 | # CausalGym
4 |
5 |
6 |
7 | Aryaman Arora, Dan Jurafsky, and Christopher Potts. 2024. [CausalGym: Benchmarking causal interpretability methods on linguistic tasks](https://aclanthology.org/2024.acl-long.785/). In _Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)_, pages 14638–14663, Bangkok, Thailand. Association for Computational Linguistics.
8 |
9 | *HuggingFace dataset*: [aryaman/causalgym](https://huggingface.co/datasets/aryaman/causalgym)
10 |
11 |
12 |
13 | **CausalGym** is a benchmark for comparing the performance of causal interpretability methods on a variety of simple linguistic tasks taken from the SyntaxGym evaluation set ([Gauthier et al., 2020](https://aclanthology.org/2020.acl-demos.10/), [Hu et al., 2020](https://aclanthology.org/2020.acl-main.158/)) and converted into a format suitable for interventional interpretability.
14 |
15 | This repository includes code for:
16 | - Training DAS and all the other methods benchmarked in the paper on every region, layer, and task for a given model. This is sufficient to replicate all experiments in the paper (including hyperparameter sweeps and interpretability during training).
17 | - Reproducing every plot in the paper.
18 | - Template specifications for every task in the benchmark and utils for generating examples, tokenizing, generating non-overlapping train/test sets, and so on.
19 | - Testing model outputs on the task templates; this was used to design the benchmark tasks.
20 |
21 | You can also download the train/dev/test splits for each task as used in the paper via [HuggingFace](https://huggingface.co/datasets/aryaman/causalgym).
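
For example, with the `datasets` library (a quick sketch; the exact split names follow the HuggingFace dataset card):

```python
from datasets import load_dataset

causalgym = load_dataset("aryaman/causalgym")
print(causalgym)              # available splits and sizes
print(causalgym["train"][0])  # one minimal pair: base/src spans, types, and labels
```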
22 |
23 | If you are having trouble getting anything running, do not hesitate to file an issue! We would love to help you benchmark your new method or help you replicate the results from our paper.
24 |
25 | ## Instructions
26 |
27 | > [!IMPORTANT]
28 | > The implementations in this repo are only for `GPTNeoX`-type language models (e.g. the `pythia` series) and will probably not work for other architectures without some modifications.
29 |
30 | First install the requirements (a fresh environment is probably best):
31 |
32 | ```bash
33 | pip install -r requirements.txt
34 | ```
35 |
36 | ### Training
37 |
38 | To train every method, layer, region, and task for `pythia-70m` (results are logged to the directory `logs/das/`):
39 |
40 | ```bash
41 | python test_all.py --model EleutherAI/pythia-70m
42 | ```
43 |
44 | To do the same but with the dog-give control task used to compute selectivity:
45 |
46 | ```bash
47 | python test_all.py --model EleutherAI/pythia-70m --manipulate dog-give
48 | ```
49 |
50 | To run just the Preposing in PP extension:
51 |
52 | ```bash
53 | python test_all.py --model EleutherAI/pythia-70m --datasets preposing_in_pp/preposing_in_pp preposing_in_pp/preposing_in_pp_embed_1
54 | ```
55 |
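A single run can also be launched from Python via `experiment` in `das.py` (a minimal sketch that mirrors the CLI defaults defined in `das.py`):

```python
from das import experiment

experiment(
    model="EleutherAI/pythia-70m",
    dataset="syntaxgym/agr_gender",
    steps=100, eval_steps=25, grad_steps=1,
    batch_size=4, intervention_site="block_output",
    strategy="last", lr=5e-3,
)
```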
56 |
57 | ### Analysis + plots
58 |
59 | Once you have run this for several models, you can create results tables (like those found in the appendix) with:
60 |
61 | ```bash
62 | python plot.py --file logs/das/ --plot summary --metric odds --reload
63 | ```
64 |
65 | This also caches intermediate results in a CSV file in the same directory, so you don't need to pass the `--reload` option again unless you need to recompute statistics.
66 |
67 | To produce the causal tracing-style plots for all methods:
68 |
69 | ```bash
70 | python plot.py --file logs/das/ --plot pos_all --metric odds
71 | ```
72 |
73 | To visualize just the runs from the Preposing in PP extension:
74 |
75 | ```bash
76 | python plot.py --file logs/das/ --plot pos_all --metric odds --template_filename preposing_in_pp
77 | ```
78 |
79 | You can also specify a subset of methods:
80 |
81 | ```bash
82 | python plot.py --file logs/das/ --plot pos_t --metric odds --methods das vanilla probe
83 | ```
84 |
85 |
86 | ## Citation
87 |
88 | Please cite the CausalGym publication:
89 |
90 | ```bibtex
91 | @inproceedings{arora-etal-2024-causalgym,
92 | title = "{C}ausal{G}ym: Benchmarking causal interpretability methods on linguistic tasks",
93 | author = "Arora, Aryaman and Jurafsky, Dan and Potts, Christopher",
94 | editor = "Ku, Lun-Wei and Martins, Andre and Srikumar, Vivek",
95 | booktitle = "Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
96 | month = aug,
97 | year = "2024",
98 | address = "Bangkok, Thailand",
99 | publisher = "Association for Computational Linguistics",
100 | url = "https://aclanthology.org/2024.acl-long.785",
101 | doi = "10.18653/v1/2024.acl-long.785",
102 | pages = "14638--14663"
103 | }
104 |
105 | ```
106 |
107 | Also cite the earlier SyntaxGym papers:
108 |
109 | ```bibtex
110 | @inproceedings{gauthier-etal-2020-syntaxgym,
111 | title = "{S}yntax{G}ym: An Online Platform for Targeted Evaluation of Language Models",
112 | author = "Gauthier, Jon and Hu, Jennifer and Wilcox, Ethan and Qian, Peng and Levy, Roger",
113 | editor = "Celikyilmaz, Asli and Wen, Tsung-Hsien",
114 | booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations",
115 | month = jul,
116 | year = "2020",
117 | address = "Online",
118 | publisher = "Association for Computational Linguistics",
119 | url = "https://aclanthology.org/2020.acl-demos.10",
120 | doi = "10.18653/v1/2020.acl-demos.10",
121 | pages = "70--76",
122 | }
123 |
124 | @inproceedings{hu-etal-2020-systematic,
125 | title = "A Systematic Assessment of Syntactic Generalization in Neural Language Models",
126 | author = "Hu, Jennifer and Gauthier, Jon and Qian, Peng and Wilcox, Ethan and Levy, Roger",
127 | editor = "Jurafsky, Dan and Chai, Joyce and Schluter, Natalie and Tetreault, Joel",
128 | booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
129 | month = jul,
130 | year = "2020",
131 | address = "Online",
132 | publisher = "Association for Computational Linguistics",
133 | url = "https://aclanthology.org/2020.acl-main.158",
134 | doi = "10.18653/v1/2020.acl-main.158",
135 | pages = "1725--1744",
136 | }
137 | ```
138 |
139 | ## Task examples
140 |
141 | | **Task** | **Example** |
142 | |:-------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------|
143 | | ***Agreement*** (4) | |
144 | | `agr_gender` | \[**John**\]\[**Jane**\] walked because \[**he**\]\[**she**\] |
145 | | `agr_sv_num_subj-relc` | The \[**guard**\]\[**guards**\] that hated the manager \[**is**\]\[**are**\] |
146 | | `agr_sv_num_obj-relc` | The \[**guard**\]\[**guards**\] that the customers hated \[**is**\]\[**are**\] |
147 | | `agr_sv_num_pp` | The \[**guard**\]\[**guards**\] behind the managers \[**is**\]\[**are**\] |
148 | | ***Licensing*** (7) | |
149 | | `agr_refl_num_subj-relc` | The \[**farmer**\]\[**farmers**\] that loved the actors embarrassed \[**himself**\]\[**themselves**\] |
150 | | `agr_refl_num_obj-relc` | The \[**farmer**\]\[**farmers**\] that the actors loved embarrassed \[**himself**\]\[**themselves**\] |
151 | | `agr_refl_num_pp` | The \[**farmer**\]\[**farmers**\] behind the actors embarrassed \[**himself**\]\[**themselves**\] |
152 | | `npi_any_subj-relc` | \[**No**\]\[**The**\] consultant that has helped the taxi driver has shown \[**any**\]\[**some**\] |
153 | | `npi_any_obj-relc` | \[**No**\]\[**The**\] consultant that the taxi driver has helped has shown \[**any**\]\[**some**\] |
154 | | `npi_ever_subj-relc` | \[**No**\]\[**The**\] consultant that has helped the taxi driver has \[**ever**\]\[**never**\] |
155 | | `npi_ever_obj-relc` | \[**No**\]\[**The**\] consultant that the taxi driver has helped has \[**ever**\]\[**never**\] |
156 | | ***Garden path effects*** (6) | |
157 | | `garden_mvrr` | The infant \[**who was**\]\[**⌀**\] brought the sandwich from the kitchen \[**by**\]\[**.**\] |
158 | | `garden_mvrr_mod` | The infant \[**who was**\]\[**⌀**\] brought the sandwich from the kitchen with a new microwave \[**by**\]\[**.**\] |
159 | | `garden_npz_obj` | While the students dressed \[**,**\]\[**⌀**\] the comedian \[**was**\]\[**for**\] |
160 | | `garden_npz_obj_mod` | While the students dressed \[**,**\]\[**⌀**\] the comedian who told bad jokes \[**was**\]\[**for**\] |
161 | | `garden_npz_v-trans` | As the criminal \[**slept**\]\[**shot**\] the woman \[**was**\]\[**for**\] |
162 | | `garden_npz_v-trans_mod` | As the criminal \[**slept**\]\[**shot**\] the woman who told bad jokes \[**was**\]\[**for**\] |
163 | | ***Gross syntactic state*** (4) | |
164 | | `gss_subord` | \[**While the**\]\[**The**\] lawyers lost the plans \[**they**\]\[**.**\] |
165 | | `gss_subord_subj-relc` | \[**While the**\]\[**The**\] lawyers who wore white lab jackets studied the book that described several advances in cancer therapy \[**,**\]\[**.**\] |
166 | | `gss_subord_obj-relc` | \[**While the**\]\[**The**\] lawyers who the spy had contacted repeatedly studied the book that colleagues had written on cancer therapy \[**,**\]\[**.**\] |
167 | | `gss_subord_pp` | \[**While the**\]\[**The**\] lawyers in a long white lab jacket studied the book about several recent advances in cancer therapy \[**,**\]\[**.**\] |
168 | | ***Long-distance dependencies*** (8) | |
169 | | `cleft` | What the young man \[**did**\]\[**ate**\] was \[**make**\]\[**for**\] |
170 | | `cleft_mod` | What the young man \[**did**\]\[**ate**\] after the ingredients had been bought from the store was \[**make**\]\[**for**\] |
171 | | `filler_gap_embed_3` | I know \[**that**\]\[**what**\] the mother said the friend remarked the park attendant reported your friend sent \[**him**\]\[**.**\] |
172 | | `filler_gap_embed_4` | I know \[**that**\]\[**what**\] the mother said the friend remarked the park attendant reported the cop thinks your friend sent \[**him**\]\[**.**\] |
173 | | `filler_gap_hierarchy` | The fact that the brother said \[**that**\]\[**who**\] the friend trusted \[**the**\]\[**was**\] |
174 | | `filler_gap_obj` | I know \[**that**\]\[**what**\] the uncle grabbed \[**him**\]\[**.**\] |
175 | | `filler_gap_pp` | I know \[**that**\]\[**what**\] the uncle grabbed food in front of \[**him**\]\[**.**\] |
176 | | `filler_gap_subj` | I know \[**that**\]\[**who**\] the uncle grabbed food in front of \[**him**\]\[**.**\] |
177 |
--------------------------------------------------------------------------------
/benchmark.py:
--------------------------------------------------------------------------------
1 | """
2 | Check if a model produces the expected output for a task.
3 | """
4 |
5 | import torch
6 | from transformers import AutoTokenizer, AutoModelForCausalLM
7 | from data import Dataset, list_datasets
8 | from utils import WEIGHTS, MODELS, top_vals, format_token, get_last_token
9 | import argparse
10 | from tqdm import tqdm
11 | import json
12 |
13 | @torch.no_grad()
14 | def benchmark(model=None, task=None, debug=False, rank=False):
15 |
16 | # get models, data
17 | if model is None:
18 | models = reversed(MODELS)
19 | else:
20 | models = [model]
21 | datasets = [dataset for dataset in list_datasets() if dataset.startswith(f"syntaxgym/{task if task is not None else ''}")]
22 | data = []
23 |
24 | # benchmark
25 | for model in models:
26 |
27 | # load model
28 | device = "cuda:0" if torch.cuda.is_available() else "cpu"
29 | tokenizer = AutoTokenizer.from_pretrained(model)
30 | tokenizer.pad_token = tokenizer.eos_token
31 | gpt = AutoModelForCausalLM.from_pretrained(
32 | model,
33 | revision="main",
34 | torch_dtype=WEIGHTS.get(model, torch.bfloat16) if device == "cuda:0" else torch.float32,
35 | ).to(device)
36 | gpt.eval()
37 | # print model dtype
38 | print(f"{model:<30} {gpt.dtype}")
39 |
40 | # make data
41 | for dataset in datasets:
42 | data_source = Dataset.load_from(dataset)
43 | trainset = data_source.sample_batches(tokenizer, 8, 50, device, seed=42)
44 | count, correct = 0, 0
45 | probs = {}
46 |
47 | for batch in tqdm(trainset):
48 | # vars
49 | base_label = batch.base_labels
50 | src_label = batch.src_labels
51 | base_type = batch.base_types
52 | src_type = batch.src_types
53 |
54 | # inference
55 | base_output = gpt(**batch.base)
56 | src_output = gpt(**batch.src)
57 | base_logits = get_last_token(base_output.logits, batch.base['attention_mask'])
58 | src_logits = get_last_token(src_output.logits, batch.src['attention_mask'])
59 |
60 | # check for batch accuracy
61 | for i in range(len(batch.pairs) // 2):  # second half of the batch mirrors the first (swapped pairs)
62 | base_probs = torch.softmax(base_logits[i], dim=-1)
63 | src_probs = torch.softmax(src_logits[i], dim=-1)
64 | if base_probs[base_label[i]] > base_probs[src_label[i]] and src_probs[src_label[i]] > src_probs[base_label[i]]:
65 | correct += 1
66 | if debug:
67 | print(base_probs[base_label[i]] > base_probs[src_label[i]] and src_probs[src_label[i]] > src_probs[base_label[i]])
68 | print(tokenizer.decode(batch.base['input_ids'][i]))
69 | top_vals(tokenizer, base_probs, n=5, highlight=[base_label[i], src_label[i]])
70 | print(tokenizer.decode(batch.src['input_ids'][i]))
71 | top_vals(tokenizer, src_probs, n=5, highlight=[base_label[i], src_label[i]])
72 | input()
73 | if count == 0:
74 | probs[base_type[i]] = base_probs
75 | probs[src_type[i]] = src_probs
76 | else:
77 | probs[base_type[i]] += base_probs
78 | probs[src_type[i]] += src_probs
79 | count += 1
80 |
81 | # store stats
82 | data.append({
83 | "model": model,
84 | "dataset": dataset,
85 | "count": count,
86 | "correct": correct,
87 | "iia": correct / count,
88 | "parameters": gpt.num_parameters(),
89 | })
90 | print(f"{dataset:<30} {correct / count:>10.2%} ({correct} / {count})")
91 | if rank:
92 | for k, v in probs.items():
93 | probs[k] = (v / count)
94 | print(k.upper())
95 | top_vals(tokenizer, probs[k], n=50)
96 | print('---')
97 | print("DIFF")
98 | top_vals(tokenizer, list(probs.values())[1] - list(probs.values())[0], n=50)
99 | print('---')
100 | top_vals(tokenizer, list(probs.values())[0] - list(probs.values())[1], n=50)
101 | print('---')
102 |
103 | # save data
104 | with open("logs/benchmark.json", "w") as f:
105 | json.dump(data, f)
106 |
107 |
108 | def main():
109 | parser = argparse.ArgumentParser()
110 | parser.add_argument("--model", type=str, default=None)
111 | parser.add_argument("--task", type=str, default=None)
112 | parser.add_argument("--debug", action="store_true")
113 | parser.add_argument("--rank", action="store_true")
114 | args = parser.parse_args()
115 | benchmark(**vars(args))
116 |
117 | if __name__ == "__main__":
118 | main()
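
# Example invocations (a sketch; flags as defined in main() above):
#   python benchmark.py --model EleutherAI/pythia-70m --task agr_gender
#   python benchmark.py --rank   # also print averaged next-token distributions per type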
--------------------------------------------------------------------------------
/das.py:
--------------------------------------------------------------------------------
2 | import torch
3 | import os
4 | import argparse
5 | from transformers import AutoTokenizer, AutoModelForCausalLM, GPTNeoXForCausalLM
6 | from utils import WEIGHTS
7 | from data import Dataset
8 | from eval import eval, augment_data
9 | from train import train_das, train_feature_direction
10 | from diff_methods import method_mapping, additional_method_mapping, probe_mapping
11 | import datetime
12 | import json
13 | from typing import Union
14 |
15 | from pyvene.models.intervenable_base import IntervenableModel
16 | from interventions import *
17 |
18 |
19 | # make das subdir
20 | if not os.path.exists("figs/das"):
21 | os.makedirs("figs/das")
22 | if not os.path.exists("figs/das/steps"):
23 | os.makedirs("figs/das/steps")
24 | if not os.path.exists("logs/das"):
25 | os.makedirs("logs/das")
26 |
27 | # clear files from figs/das/steps
28 | for file in os.listdir("figs/das/steps"):
29 | os.remove(os.path.join("figs/das/steps", file))
30 |
31 |
32 | def experiment(
33 | model: str,
34 | dataset: str,
35 | steps: int,
36 | eval_steps: int,
37 | grad_steps: int,
38 | batch_size: int,
39 | intervention_site: str,
40 | strategy: str,
41 | lr: float,
42 | only_das: bool=False,
43 | hparam_non_das: bool=False,
44 | das_label: str=None,
45 | revision: str="main",
46 | log_folder: str="das",
47 | manipulate: Union[str, None]=None,
48 | tokenizer: Union[AutoTokenizer, None]=None,
49 | gpt: Union[AutoModelForCausalLM, None]=None,
50 | ):
51 | """Run a feature-finding experiment."""
52 |
53 | # load model
54 | total_data = []
55 | diff_vectors = []
56 | NOW = datetime.datetime.now().strftime("%Y%m%d%H%M%S%f")
57 | device = "cuda:0" if torch.cuda.is_available() else "cpu"
58 | if tokenizer is None:
59 | tokenizer = AutoTokenizer.from_pretrained(model)
60 | tokenizer.pad_token = tokenizer.eos_token
61 | if gpt is None:
62 | weight_type = WEIGHTS.get(model, torch.float16) if device == "cuda:0" else torch.float32
63 | gpt = GPTNeoXForCausalLM.from_pretrained(
64 | model,
65 | revision=revision,
66 | torch_dtype=weight_type,
67 | use_flash_attention_2=(weight_type in [torch.bfloat16, torch.float16] and device == "cuda:0"),
68 | ).to(device)
69 | print(model, gpt.config.num_hidden_layers)
70 | gpt.eval()
71 |
72 | # make dataset, ensuring examples in trainset are not in evalset
73 | data_source = Dataset.load_from(dataset)
74 | trainset = data_source.sample_batches(tokenizer, batch_size, steps, device, seed=42, manipulate=manipulate)
75 | print(trainset[0])
76 | discard = set()
77 | for batch in trainset:
78 | for pair in batch.pairs:
79 | discard.add(''.join(pair.base))
80 |
81 | # evalset
82 | eval_seed = 420 if hparam_non_das else 1
83 | evalset = data_source.sample_batches(tokenizer, batch_size, 25, device, seed=eval_seed, discard=discard, manipulate=manipulate)
84 |
85 | # methods
86 | if hparam_non_das:
87 | method_mapping.update(additional_method_mapping)
88 | if model in probe_mapping:
89 | for i, probe_func in enumerate(probe_mapping[model]):
90 | method_mapping[f"probe_{i}"] = probe_func
91 | print(list(method_mapping.keys()))
92 |
93 | # entering train loops
94 | for pos_i in range(data_source.first_var_pos, data_source.length):
95 | if trainset[0].compute_pos(strategy)[0][0][pos_i][0] == -1:  # skip spans with no aligned tokens
96 | continue
97 |
98 | # per-layer training loop
99 | iterator = range(gpt.config.num_hidden_layers)
100 | for layer_i in iterator:
101 | print(f"position {pos_i} ({data_source.span_names[pos_i]}), layer {layer_i}")
102 | data = []
103 |
104 | # vanilla intervention
105 | if strategy != "all" and not only_das:
106 | intervenable_config = intervention_config(
107 | intervention_site, pv.VanillaIntervention, layer_i, 0
108 | )
109 | intervenable = IntervenableModel(intervenable_config, gpt)
110 | intervenable.set_device(device)
111 | intervenable.disable_model_gradients()
112 |
113 | more_data, summary, _ = eval(intervenable, evalset, layer_i, pos_i, strategy)
114 | intervenable._cleanup_states()
115 | data.extend(augment_data(more_data, {"method": "vanilla", "step": -1}))
116 | print(f"vanilla: {summary}")
117 |
118 | # DAS intervention
119 | intervenable_config = intervention_config(
120 | intervention_site,
121 | pv.LowRankRotatedSpaceIntervention if strategy != "all" else PooledLowRankRotatedSpaceIntervention,
122 | layer_i, 1
123 | )
124 | intervenable = IntervenableModel(intervenable_config, gpt)
125 | intervenable.set_device(device)
126 | intervenable.disable_model_gradients()
127 |
128 | _, more_data, activations, eval_activations, diff_vector = train_das(
129 | intervenable, trainset, evalset, layer_i, pos_i, strategy,
130 | eval_steps, grad_steps, lr=lr, das_label="das" if das_label is None else das_label)
131 | diff_vectors.append({"method": "das" if das_label is None else das_label,
132 | "layer": layer_i, "pos": pos_i, "vec": diff_vector})
133 | data.extend(more_data)
134 |
135 | # test other methods
136 | if not only_das:
137 | for method in list(method_mapping.keys()):
138 | try:
139 | more_data, summary, diff_vector = train_feature_direction(
140 | method, intervenable, activations, eval_activations,
141 | evalset, layer_i, pos_i, strategy, intervention_site,
142 | method_mapping
143 | )
144 | print(f"{method}: {summary}")
145 | diff_vectors.append({"method": method, "layer": layer_i, "pos": pos_i, "vec": diff_vector})
146 | data.extend(more_data)
147 | except Exception:  # some methods fail on certain layers/positions; skip them
148 | continue
149 |
150 | # store all data
151 | total_data.extend(augment_data(data, {"layer": layer_i, "pos": pos_i}))
152 |
153 | # make data dump
154 | short_dataset_name = dataset.split('/')[-1]
155 | short_model_name = model.split('/')[-1] + (f"_{revision}" if revision != "main" else "")
156 | filedump = {
157 | "metadata": {
158 | "model": model + (f"_{revision}" if revision != "main" else ""),
159 | "dataset": dataset,
160 | "steps": steps,
161 | "eval_steps": eval_steps,
162 | "grad_steps": grad_steps,
163 | "batch_size": batch_size,
164 | "intervention_site": intervention_site,
165 | "strategy": strategy,
166 | "lr": lr,
167 | "span_names": data_source.span_names,
168 | "manipulate": manipulate,
169 | },
170 | "data": total_data,
171 | "vec": diff_vectors,
172 | }
173 |
174 | # log
175 | if manipulate is None:
176 | manipulate = "orig"
177 | log_file = f"logs/{log_folder}/{NOW}__{short_model_name}__{short_dataset_name}__{manipulate}.json"
178 | print(f"logging to {log_file}")
179 | with open(log_file, "w") as f:
180 | json.dump(filedump, f)
181 |
182 |
183 | def main():
184 | parser = argparse.ArgumentParser()
185 | parser.add_argument("--model", type=str, default="EleutherAI/pythia-70m")
186 | parser.add_argument("--dataset", type=str, default="syntaxgym/agr_gender")
187 | parser.add_argument("--steps", type=int, default=100)
188 | parser.add_argument("--eval-steps", type=int, default=25)
189 | parser.add_argument("--grad-steps", type=int, default=1)
190 | parser.add_argument("--batch-size", type=int, default=4)
191 | parser.add_argument("--intervention-site", type=str, default="block_output")
192 | parser.add_argument("--strategy", type=str, default="last")
193 | parser.add_argument("--lr", type=float, default=5e-3)
194 | parser.add_argument("--only-das", action="store_true")
195 | parser.add_argument("--hparam-non-das", action="store_true")
196 | parser.add_argument("--das-label", type=str, default=None)
197 | parser.add_argument("--revision", type=str, default="main")
198 | parser.add_argument("--log-folder", type=str, default="das")
199 | parser.add_argument("--manipulate", type=str, default=None)
200 | args = parser.parse_args()
201 | print(vars(args))
202 | experiment(**vars(args))
203 |
204 |
205 | if __name__ == "__main__":
206 | main()
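
# Example invocation (a sketch; flags as defined in main() above):
#   python das.py --model EleutherAI/pythia-70m --dataset syntaxgym/agr_gender --strategy last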
207 |
--------------------------------------------------------------------------------
/data.py:
--------------------------------------------------------------------------------
1 | import json
2 | from transformers import AutoTokenizer, AutoModel
3 | import random
4 | import torch
5 | from collections import defaultdict, namedtuple
7 | import glob
8 | from typing import Union
9 | import re
10 | from tqdm import tqdm
11 |
12 | random.seed(42)
13 | Tokenized = namedtuple("Tokenized", ["base", "src", "alignment_base", "alignment_src"])
14 |
15 |
16 | class Pair:
17 | """
18 | A pair of sentences where all features except one are held constant.
19 |
20 | Each pair has a base sentence and a source sentence. These two sentences
21 | have different "types" (the value of the differing feature) and different
22 | "labels" (expected continuation of the sentence).
23 | """
24 |
25 | base: list[str]
26 | src: list[str]
27 | base_type: str
28 | src_type: str
29 | base_label: str
30 | src_label: str
31 |
32 |
33 | def __init__(self, base: list[str], src: list[str], base_type: str, src_type: str, base_label: str, src_label: str):
34 | self.base = base
35 | self.src = src
36 | self.base_type = base_type
37 | self.src_type = src_type
38 | self.base_label = base_label
39 | self.src_label = src_label
40 |
41 |
42 | def tokenize(self, tokenizer: AutoTokenizer, device: str="cpu") -> Tokenized:
43 | """Tokenize the pair and produce alignments."""
44 | alignment_base, alignment_src = [], []
45 | pos_base, pos_src = 0, 0
46 | for span_i in range(len(self.base)):
47 |
48 | # get span lengths in tokens
49 | tok_base = tokenizer.tokenize(self.base[span_i])
50 | tok_src = tokenizer.tokenize(self.src[span_i])
51 | alignment = [
52 | list(range(pos_base, pos_base + len(tok_base))),
53 | list(range(pos_src, pos_src + len(tok_src)))
54 | ]
55 |
56 | # update positions
57 | alignment_base.append(alignment[0])
58 | alignment_src.append(alignment[1])
59 | pos_base += len(tok_base)
60 | pos_src += len(tok_src)
61 |
62 | # tokenize full pair and return
63 | base_tok = tokenizer(''.join(self.base), return_tensors="pt", padding=True).to(device)
64 | src_tok = tokenizer(''.join(self.src), return_tensors="pt", padding=True).to(device)
65 | return Tokenized(base=base_tok, src=src_tok, alignment_base=alignment_base, alignment_src=alignment_src)
66 |
67 |
68 | def swap(self) -> "Pair":
69 | """Swap the base and src sentences."""
70 | return Pair(self.src, self.base, self.src_type, self.base_type, self.src_label, self.base_label)
71 |
72 |
73 | def __repr__(self):
74 | return f"Pair('{self.base}' > '{self.base_label}', '{self.src}' > '{self.src_label}', {self.base_type}, {self.src_type})"
75 |
76 |
77 | class Batch:
78 | """
79 | A Batch is a collection of Pairs that have been tokenized and padded.
80 | The messy part is figuring out where to do interventions, so a Batch
81 | encapsulates the functions for computing pos_i for the intervention
82 | at inference time, using the tokenized pair and alignments.
83 | """
84 |
85 | def __init__(self, pairs: list[Pair], tokenizer: AutoTokenizer, device: str="cpu"):
86 | self.pairs = pairs
87 |
88 | # tokenize base and src
89 | tokenized = [pair.tokenize(tokenizer, device) for pair in pairs]
90 | max_len = max([max(x.base.input_ids.shape[-1], x.src.input_ids.shape[-1]) for x in tokenized])
91 | self.base = self._stack_and_pad([x.base for x in tokenized], max_len=max_len)
92 | self.src = self._stack_and_pad([x.src for x in tokenized], max_len=max_len)
93 | self.alignment_base = [x.alignment_base for x in tokenized]
94 | self.alignment_src = [x.alignment_src for x in tokenized]
95 |
96 | # labels
97 | self.base_labels = torch.LongTensor([tokenizer.encode(pair.base_label)[0] for pair in pairs]).to(device)
98 | self.src_labels = torch.LongTensor([tokenizer.encode(pair.src_label)[0] for pair in pairs]).to(device)
99 | self.base_types = [pair.base_type for pair in pairs]
100 | self.src_types = [pair.src_type for pair in pairs]
101 | self.cached_pos = {}
102 |
103 |
104 | def _pos_bounds(self, span1: list[int], span2: list[int]) -> tuple[list[int], list[int]]:
105 | """Compute the intervention token positions for an aligned pair of spans."""
106 | if self.pos_strategy == "first":
107 | return span1[:1], span2[:1]
108 | elif self.pos_strategy == "last":
109 | return span1[-1:], span2[-1:]
110 | elif self.pos_strategy == "all":
111 | max_len = max(len(span1), len(span2))
112 | return span1 + [span1[-1]] * (max_len - len(span1)), span2 + [span2[-1]] * (max_len - len(span2))
113 |
114 |
115 | def compute_pos(self, strategy: str) -> list:
116 | """Compute pos alignments as nested lists (see shape note below)."""
117 | # shape of alignment: [batch_size, 2, num_spans, tokens_in_span]
118 | # not a proper tensor though! tokens_in_span is variable, rest is constant
119 | if strategy in self.cached_pos:
120 | return self.cached_pos[strategy]
121 | self.pos_strategy = strategy
122 | assert self.pos_strategy in ["first", "last", "all"]
123 | rets_base, rets_src = [], []
124 | for batch_i in range(len(self.pairs)):
125 | ret_base, ret_src = [], []
126 | for span_i in range(len(self.alignment_src[batch_i])):
127 | # skip null alignments
128 | if len(self.alignment_base[batch_i][span_i]) == 0 or len(self.alignment_src[batch_i][span_i]) == 0:
129 | ret_base.append([-1])
130 | ret_src.append([-1])
131 | else:
132 | bounds = self._pos_bounds(self.alignment_base[batch_i][span_i], self.alignment_src[batch_i][span_i])
133 | ret_base.append(bounds[0])
134 | ret_src.append(bounds[1])
135 | rets_base.append(ret_base)
136 | rets_src.append(ret_src)
137 |
138 | # shape: [2, batch_size, length, 1]
139 | # dim 0 -> src, base (the intervention code wants src first)
140 | ret = [rets_src, rets_base]
141 | self.cached_pos[strategy] = ret
142 | return ret
143 |
144 |
145 | def _stack_and_pad(self, input_list: list, pad_token: int=0, max_len: int=100) -> dict:
146 | """Stack and pad a list of tensors outputs from a tokenizer."""
147 | input_ids = torch.stack([torch.nn.functional.pad(x.input_ids[0], (0, max_len - x.input_ids.shape[-1]), mode='constant', value=pad_token) for x in input_list])
148 | attention_mask = torch.stack([torch.nn.functional.pad(x.attention_mask[0], (0, max_len - x.attention_mask.shape[-1]), mode='constant', value=0) for x in input_list])
149 | return {"input_ids": input_ids, "attention_mask": attention_mask}
150 |
151 |
152 | def __repr__(self):
153 | return f"Batch({len(self.pairs)} pairs)\n" + "\n".join([f" {pair}" for pair in self.pairs])
154 |
155 |
156 | class Dataset:
157 | """
158 | A Dataset is a template for generating minimal pairs that is loaded
159 | from a JSON specification.
160 |
161 | We importantly want examples generated from a dataset to include token-
162 | level alignments.
163 | """
164 |
165 | templates: list[str]
166 | label_vars: list[str]
167 | labels: dict[str, list[str]]
168 | variables: dict[str, Union[list[str], dict[str, list[str]]]]
169 | result_prepend_space: bool
170 |
171 |
172 | def __init__(self, data: dict):
173 | # load basics
174 | self.templates = data["templates"]
175 | self.template = [x for x in re.split(r"(?<=\})|(?= \{)", random.choice(self.templates)) if x != '']  # split into spans: after "}" and before " {"
176 | self.label_vars = data["label"] if isinstance(data["label"], list) else [data["label"]]
177 | self.labels = data["labels"]
178 | self.types = list(self.labels.keys())
179 | self.variables = data["variables"]
180 |
181 | # split template into spans and find variables
182 | self.vars_per_span, self.span_names = [], []
183 | self.first_var_pos = None
184 | for token_i in range(len(self.template)):
185 | var = re.findall(r"\{(.+?)\}", self.template[token_i])
186 | if len(var) > 0 and self.first_var_pos is None:
187 | if var[0] in self.label_vars:
188 | self.first_var_pos = token_i
189 | self.vars_per_span.append(var)
190 | self.span_names.append("{" + var[0] + "}" if len(var) == 1 else self.template[token_i].replace(' ', '_'))
191 |
192 | # other stuff
193 | length = {}
194 | for var in self.variables:
195 | if '.' in var:
196 | head_var = var.split('.')[0]
197 | if head_var not in length:
198 | length[head_var] = len(self.variables[var])
199 | else:
200 | assert length[head_var] == len(self.variables[var]), f"Variable {var} has length {len(self.variables[var])} but {head_var} has length {length[head_var]}"
201 | self.result_prepend_space = data.get("result_prepend_space", data.get("label_prepend_space", False))  # templates use either key name
202 |
203 |
204 | @classmethod
205 | def load_from(cls, template: str) -> "Dataset":
206 | """Load a Dataset from a json template."""
207 | template_file, template_name = template.split('/')
208 | if template_name.endswith("_inverted"):
209 | template_name = template_name[:-len("_inverted")]
210 | data = json.load(open(f"data/templates/{template_file}.json", "r"))
211 | return Dataset(data[template_name])
212 |
213 |
214 | @property
215 | def length(self) -> int:
216 | return len(self.template)
217 |
218 |
219 | def sample_pair(self) -> Pair:
220 | """Sample a minimal pair from the dataset."""
221 | # pick types (should differ)
222 | base_type = random.choice(self.types)
223 | src_type = base_type
224 | while src_type == base_type:
225 | src_type = random.choice(self.types)
226 |
227 | # make template
228 | base, src = self.template[:], self.template[:]
229 |
230 | # go token by token
231 | stored_choices = {}
232 | for token_i in range(len(self.template)):
233 | var = self.vars_per_span[token_i]
234 | if len(var) == 0: continue
235 | var = var[0]
236 | var_temp = '{' + var + '}'
237 |
238 | # set label vars (different)
239 | if var in self.label_vars:
240 | base_choice = random.choice(self.variables[var][base_type])
241 | src_choice = random.choice(self.variables[var][src_type])
242 | base[token_i] = base[token_i].replace(var_temp, base_choice)
243 | src[token_i] = src[token_i].replace(var_temp, src_choice)
244 | # set other vars (same for both)
245 | elif '.' in var:
246 | head_var = var.split('.')[0]
247 | if head_var not in stored_choices:
248 | stored_choices[head_var] = random.randint(0, len(self.variables[var]) - 1)
249 | base[token_i] = base[token_i].replace(var_temp, self.variables[var][stored_choices[head_var]])
250 | src[token_i] = src[token_i].replace(var_temp, self.variables[var][stored_choices[head_var]])
251 | else:
252 | choice = random.choice(self.variables[var])
253 | base[token_i] = base[token_i].replace(var_temp, choice)
254 | src[token_i] = src[token_i].replace(var_temp, choice)
255 |
256 | # get continuations
257 | base_label = random.choice(self.labels[base_type])
258 | src_label = random.choice(self.labels[src_type])
259 | if self.result_prepend_space:
260 | base_label = " " + base_label
261 | src_label = " " + src_label
262 |
263 | return Pair(base, src, base_type, src_type, base_label, src_label)
264 |
265 |
266 | @torch.no_grad()
267 | def _sample_doable_pair(self, model: AutoModel, tokenizer: AutoTokenizer, device: str="cpu", discard: set[str]=set()) -> Pair:
268 | """Sample a minimal pair from the dataset that is correctly labelled by a model."""
269 |
270 | # keep resampling until we get a pair that is correctly labelled
271 | correct, ct = False, 0
272 | while not correct:
273 | pair = self.sample_pair()
274 | if ''.join(pair.base) in discard:
275 | continue
276 | base = tokenizer(''.join(pair.base), return_tensors="pt").to(device)
277 | src = tokenizer(''.join(pair.src), return_tensors="pt").to(device)
278 | base_logits = model(**base).logits[0, -1]
279 | src_logits = model(**src).logits[0, -1]
280 | base_label = tokenizer.encode(pair.base_label)[0]
281 | src_label = tokenizer.encode(pair.src_label)[0]
282 | if base_logits[base_label] > base_logits[src_label] and src_logits[src_label] > src_logits[base_label]:
283 | correct = True
284 | ct += 1
285 | if ct == 20 and not correct:
286 | print("WARNING: could not find a doable pair after 20 iterations")
287 | print("Using a random pair instead")
288 | break
289 |
290 | return pair
291 |
292 |
293 | def sample_batch(
294 | self, tokenizer: AutoTokenizer, batch_size: int, device: str="cpu",
295 | model: Union[AutoModel, None]=None, discard: set[str]=set(),
296 | manipulate: Union[str, None]=None) -> Batch:
297 | """Sample a batch of minimal pairs from the dataset."""
298 | pairs = []
299 |
300 | # get the pairs
301 | if model is None:
302 | while len(pairs) < batch_size // 2:
303 | pair = self.sample_pair()
304 | ct = 0
305 | while ''.join(pair.base) in discard:
306 | pair = self.sample_pair()
307 | ct += 1
308 | if ct == 20:
309 | print("WARNING: could not find a pair not in the discard set after 20 iterations")
310 | print("Using a random pair instead")
311 | break
312 | pairs.append(pair)
313 | else:
314 | pairs = [
315 | self._sample_doable_pair(model, tokenizer, device, discard)
316 | for _ in range(batch_size // 2)
317 | ]
318 |
319 | # control tasks
320 | if manipulate == "invert":
321 | for i in range(len(pairs)):
322 | pairs[i].base_label, pairs[i].src_label = pairs[i].src_label, pairs[i].base_label
323 | elif manipulate == "dog-give":
324 | for i in range(len(pairs)):
325 | pairs[i].base_label = " dog" if pairs[i].base_type == self.types[0] else " give"
326 | pairs[i].src_label = " dog" if pairs[i].src_type == self.types[0] else " give"
327 | elif manipulate == "random":
328 | for i in range(len(pairs)):
329 | if random.random() < 0.5:
330 | pairs[i].base_label, pairs[i].src_label = pairs[i].src_label, pairs[i].base_label
331 | pairs[i].base_type, pairs[i].src_type = pairs[i].src_type, pairs[i].base_type
332 |
333 | # add flipped pairs
334 | for i in range(batch_size // 2):
335 | pairs.append(pairs[i].swap())
336 |
337 | # make batch
338 | return Batch(pairs, tokenizer, device)
339 |
340 |
341 | def sample_batches(
342 | self, tokenizer: AutoTokenizer, batch_size: int, num_batches: int,
343 | device: str="cpu", seed: int=42, model: Union[AutoModel, None]=None,
344 | discard: set[str]=set(), manipulate: Union[str, None]=None) -> list[Batch]:
345 | """Sample a list of batches of minimal pairs from the dataset."""
346 | random.seed(seed)
347 | return [self.sample_batch(tokenizer, batch_size, device, model, discard, manipulate) for _ in range(num_batches)]
348 |
349 |
350 | def load_from_syntaxgym():
351 | for suite_file in glob.glob("data/test_suites/*.json"):
352 | print(suite_file.split('/')[-1])
353 | with open(suite_file, "r") as f:
354 | suite = json.load(f)
355 | if "items" not in suite:
356 | continue
357 | print(len(suite["items"]))
358 |
359 | region_numbers = defaultdict(list)
360 | for i, item in enumerate(suite["items"]):
361 | for condition in item["conditions"]:
362 | for region in condition["regions"]:
363 | region_numbers[f"{condition['condition_name']}_{region['region_number']}"].append(region["content"])
364 |
365 | for key in region_numbers:
366 | print(key, json.dumps(region_numbers[key]))
367 |
368 |
369 | def list_datasets() -> list[str]:
370 | """List all available datasets."""
371 | datasets = []
372 | for template_file in glob.glob("data/templates/*.json"):
373 | name = template_file.split("/")[-1].split(".json")[0]
374 | with open(template_file, "r") as f:
375 | data = json.load(f)
376 | datasets.extend([name + "/" + x for x in data.keys()])
377 | return datasets
378 |
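# Example usage of the Dataset API (a sketch; mirrors how das.py builds its train/eval sets):
#   tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-70m")
#   tokenizer.pad_token = tokenizer.eos_token
#   data = Dataset.load_from("syntaxgym/agr_gender")
#   batches = data.sample_batches(tokenizer, batch_size=4, num_batches=2, seed=42)
#   print(batches[0])                      # Batch(4 pairs): 2 sampled + their 2 swaps
#   print(batches[0].compute_pos("last"))  # [src, base] intervention positions per span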
379 |
380 | def convert_to_huggingface_format():
381 | """Generate dataset files for HuggingFace upload."""
382 | tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-70m")
383 | tokenizer.pad_token = tokenizer.eos_token
384 | datasets = [x for x in list_datasets() if x.startswith("syntaxgym/")]
385 | all_data = defaultdict(list)
386 | for dataset in tqdm(datasets):
387 | data = Dataset.load_from(dataset)
388 |
389 | # sample trainset
390 | trainset = data.sample_batches(tokenizer, 4, 100, "cpu", seed=42, manipulate=None)
391 | discard = set()
392 | for batch in trainset:
393 | for pair in batch.pairs:
394 | discard.add(''.join(pair.base))
395 |
396 | # dev + test exclude trainset
397 | devset = data.sample_batches(tokenizer, 4, 25, "cpu", seed=420, manipulate=None, discard=discard)
398 | testset = data.sample_batches(tokenizer, 4, 25, "cpu", seed=1, manipulate=None, discard=discard)
399 |
400 | # get pairs from each batch
401 | groups = {"train": trainset, "dev": devset, "test": testset}
402 | for split, batches in groups.items():
403 | pairs = [pair for batch in batches for pair in batch.pairs]
404 | all_data[split].extend([{
405 | "base": pair.base,
406 | "src": pair.src,
407 | "base_type": pair.base_type,
408 | "src_type": pair.src_type,
409 | "base_label": pair.base_label,
410 | "src_label": pair.src_label,
411 | "task": dataset.split('/')[1]
412 | } for pair in pairs])
413 |
414 | # dump
415 | for split in all_data:
416 | with open(f"data/huggingface/{split}.json", "w") as f:
417 | json.dump(all_data[split], f, indent=2)
418 |
419 |
420 | if __name__ == "__main__":
421 | convert_to_huggingface_format()
--------------------------------------------------------------------------------
/data/templates/data.json:
--------------------------------------------------------------------------------
1 | {
2 | "gender_basic": {
3 | "templates": [
4 | "{name} {completion} because"
5 | ],
6 | "label": "name",
7 | "label_prepend_space": false,
8 | "variables": {
9 | "name": {
10 | "he": ["James", "Robert", "John", "Michael", "David", "William", "Richard", "Joseph", "Thomas", "Christopher", "Charles", "Daniel", "Matthew", "Anthony", "Mark", "Donald", "Steven", "Andrew", "Paul", "Joshua", "Kenneth", "Kevin", "Brian", "George", "Timothy", "Ronald", "Jason", "Edward", "Jeffrey", "Ryan", "Jacob", "Gary", "Nicholas", "Eric", "Jonathan", "Stephen", "Larry", "Justin", "Scott", "Brandon", "Benjamin", "Samuel", "Gregory", "Alexander", "Patrick", "Frank", "Raymond", "Jack", "Dennis", "Jerry", "Tyler", "Aaron", "Jose", "Adam", "Nathan", "Henry", "Zachary", "Douglas", "Peter", "Kyle", "Noah", "Ethan", "Jeremy", "Walter", "Christian", "Keith", "Roger", "Terry", "Austin", "Sean", "Gerald", "Carl", "Harold", "Dylan", "Arthur", "Lawrence", "Jordan", "Jesse", "Bryan", "Billy", "Bruce", "Gabriel", "Joe", "Logan", "Alan", "Juan", "Albert", "Willie", "Elijah", "Wayne", "Randy", "Vincent", "Mason", "Roy", "Ralph", "Bobby", "Russell", "Bradley", "Philip", "Eugene"],
11 | "she": ["Mary", "Patricia", "Jennifer", "Linda", "Elizabeth", "Barbara", "Susan", "Jessica", "Sarah", "Karen", "Lisa", "Nancy", "Betty", "Sandra", "Margaret", "Ashley", "Kimberly", "Emily", "Donna", "Michelle", "Carol", "Amanda", "Melissa", "Deborah", "Stephanie", "Dorothy", "Rebecca", "Sharon", "Laura", "Cynthia", "Amy", "Kathleen", "Angela", "Shirley", "Brenda", "Emma", "Anna", "Pamela", "Nicole", "Samantha", "Katherine", "Christine", "Helen", "Debra", "Rachel", "Carolyn", "Janet", "Maria", "Catherine", "Heather", "Diane", "Olivia", "Julie", "Joyce", "Victoria", "Ruth", "Virginia", "Lauren", "Kelly", "Christina", "Joan", "Evelyn", "Judith", "Andrea", "Hannah", "Megan", "Cheryl", "Jacqueline", "Martha", "Madison", "Teresa", "Gloria", "Sara", "Janice", "Ann", "Kathryn", "Abigail", "Sophia", "Frances", "Jean", "Alice", "Judy", "Isabella", "Julia", "Grace", "Amber", "Denise", "Danielle", "Marilyn", "Beverly", "Charlotte", "Natalie", "Theresa", "Diana", "Brittany", "Doris", "Kayla", "Alexis", "Lori", "Marie"]
12 | },
13 | "completion": [
14 | "walked"
15 | ]
16 | }
17 | },
18 | "gender": {
19 | "templates": [
20 | "{name} {completion} because"
21 | ],
22 | "label": "name",
23 | "label_prepend_space": false,
24 | "variables": {
25 | "name": {
26 | "he": ["James", "Robert", "John", "Michael", "David", "William", "Richard", "Joseph", "Thomas", "Christopher", "Charles", "Daniel", "Matthew", "Anthony", "Mark", "Donald", "Steven", "Andrew", "Paul", "Joshua", "Kenneth", "Kevin", "Brian", "George", "Timothy", "Ronald", "Jason", "Edward", "Jeffrey", "Ryan", "Jacob", "Gary", "Nicholas", "Eric", "Jonathan", "Stephen", "Larry", "Justin", "Scott", "Brandon", "Benjamin", "Samuel", "Gregory", "Alexander", "Patrick", "Frank", "Raymond", "Jack", "Dennis", "Jerry", "Tyler", "Aaron", "Jose", "Adam", "Nathan", "Henry", "Zachary", "Douglas", "Peter", "Kyle", "Noah", "Ethan", "Jeremy", "Walter", "Christian", "Keith", "Roger", "Terry", "Austin", "Sean", "Gerald", "Carl", "Harold", "Dylan", "Arthur", "Lawrence", "Jordan", "Jesse", "Bryan", "Billy", "Bruce", "Gabriel", "Joe", "Logan", "Alan", "Juan", "Albert", "Willie", "Elijah", "Wayne", "Randy", "Vincent", "Mason", "Roy", "Ralph", "Bobby", "Russell", "Bradley", "Philip", "Eugene"],
27 | "she": ["Mary", "Patricia", "Jennifer", "Linda", "Elizabeth", "Barbara", "Susan", "Jessica", "Sarah", "Karen", "Lisa", "Nancy", "Betty", "Sandra", "Margaret", "Ashley", "Kimberly", "Emily", "Donna", "Michelle", "Carol", "Amanda", "Melissa", "Deborah", "Stephanie", "Dorothy", "Rebecca", "Sharon", "Laura", "Cynthia", "Amy", "Kathleen", "Angela", "Shirley", "Brenda", "Emma", "Anna", "Pamela", "Nicole", "Samantha", "Katherine", "Christine", "Helen", "Debra", "Rachel", "Carolyn", "Janet", "Maria", "Catherine", "Heather", "Diane", "Olivia", "Julie", "Joyce", "Victoria", "Ruth", "Virginia", "Lauren", "Kelly", "Christina", "Joan", "Evelyn", "Judith", "Andrea", "Hannah", "Megan", "Cheryl", "Jacqueline", "Martha", "Madison", "Teresa", "Gloria", "Sara", "Janice", "Ann", "Kathryn", "Abigail", "Sophia", "Frances", "Jean", "Alice", "Judy", "Isabella", "Julia", "Grace", "Amber", "Denise", "Danielle", "Marilyn", "Beverly", "Charlotte", "Natalie", "Theresa", "Diana", "Brittany", "Doris", "Kayla", "Alexis", "Lori", "Marie"]
28 | },
29 | "completion": [
30 | "walked", "ran", "agreed", "blinked", "bounced", "called", "disappeared", "lied", "laughed", "paid"
31 | ]
32 | }
33 | },
34 | "gender_is_a": {
35 | "templates": [
36 | "{name} is a"
37 | ],
38 | "label": "name",
39 | "label_prepend_space": false,
40 | "variables": {
41 | "name": {
42 | "man": ["James", "Robert", "John", "Michael", "David", "William", "Richard", "Joseph", "Thomas", "Christopher", "Charles", "Daniel", "Matthew", "Anthony", "Mark", "Donald", "Steven", "Andrew", "Paul", "Joshua", "Kenneth", "Kevin", "Brian", "George", "Timothy", "Ronald", "Jason", "Edward", "Jeffrey", "Ryan", "Jacob", "Gary", "Nicholas", "Eric", "Jonathan", "Stephen", "Larry", "Justin", "Scott", "Brandon", "Benjamin", "Samuel", "Gregory", "Alexander", "Patrick", "Frank", "Raymond", "Jack", "Dennis", "Jerry", "Tyler", "Aaron", "Jose", "Adam", "Nathan", "Henry", "Zachary", "Douglas", "Peter", "Kyle", "Noah", "Ethan", "Jeremy", "Walter", "Christian", "Keith", "Roger", "Terry", "Austin", "Sean", "Gerald", "Carl", "Harold", "Dylan", "Arthur", "Lawrence", "Jordan", "Jesse", "Bryan", "Billy", "Bruce", "Gabriel", "Joe", "Logan", "Alan", "Juan", "Albert", "Willie", "Elijah", "Wayne", "Randy", "Vincent", "Mason", "Roy", "Ralph", "Bobby", "Russell", "Bradley", "Philip", "Eugene"],
43 | "woman": ["Mary", "Patricia", "Jennifer", "Linda", "Elizabeth", "Barbara", "Susan", "Jessica", "Sarah", "Karen", "Lisa", "Nancy", "Betty", "Sandra", "Margaret", "Ashley", "Kimberly", "Emily", "Donna", "Michelle", "Carol", "Amanda", "Melissa", "Deborah", "Stephanie", "Dorothy", "Rebecca", "Sharon", "Laura", "Cynthia", "Amy", "Kathleen", "Angela", "Shirley", "Brenda", "Emma", "Anna", "Pamela", "Nicole", "Samantha", "Katherine", "Christine", "Helen", "Debra", "Rachel", "Carolyn", "Janet", "Maria", "Catherine", "Heather", "Diane", "Olivia", "Julie", "Joyce", "Victoria", "Ruth", "Virginia", "Lauren", "Kelly", "Christina", "Joan", "Evelyn", "Judith", "Andrea", "Hannah", "Megan", "Cheryl", "Jacqueline", "Martha", "Madison", "Teresa", "Gloria", "Sara", "Janice", "Ann", "Kathryn", "Abigail", "Sophia", "Frances", "Jean", "Alice", "Judy", "Isabella", "Julia", "Grace", "Amber", "Denise", "Danielle", "Marilyn", "Beverly", "Charlotte", "Natalie", "Theresa", "Diana", "Brittany", "Doris", "Kayla", "Alexis", "Lori", "Marie"]
44 | }
45 | }
46 | },
47 | "number": {
48 | "templates": [
49 | "The {noun}"
50 | ],
51 | "label": "noun",
52 | "label_prepend_space": true,
53 | "variables": {
54 | "noun": {
55 | "is": ["manager", "doctor", "clerk", "officer", "nurse", "woman", "man", "pilot", "architect", "actor", "minister", "manager"],
56 | "are": ["managers", "doctors", "clerks", "officers", "nurses", "women", "men", "pilots", "architects", "actors", "ministers", "managers"]
57 | }
58 | }
59 | },
60 | "animacy": {
61 | "templates": [
62 | "The {noun} fell because"
63 | ],
64 | "label": "noun",
65 | "label_prepend_space": true,
66 | "variables": {
67 | "noun": {
68 | "he": ["manager", "doctor", "clerk", "officer", "nurse", "woman", "man", "pilot", "architect", "actor", "minister", "manager"],
69 | "it": ["box", "stone", "table", "chair", "book", "car", "house", "tree", "rock", "ball", "computer", "phone", "desk", "bed"]
70 | }
71 | }
72 | }
73 | }
74 |
--------------------------------------------------------------------------------
/data/templates/data_extra.json:
--------------------------------------------------------------------------------
1 | {
2 | "gender": {
3 | "templates": [
4 | "{name} {completion} because"
5 | ],
6 | "label": "name",
7 | "label_prepend_space": false,
8 | "variables": {
9 | "name": {
10 | "he": ["James", "Robert", "John", "Michael", "David", "William", "Richard", "Joseph", "Thomas", "Christopher", "Charles", "Daniel", "Matthew", "Anthony", "Mark", "Donald", "Steven", "Andrew", "Paul", "Joshua", "Kenneth", "Kevin", "Brian", "George", "Timothy", "Ronald", "Jason", "Edward", "Jeffrey", "Ryan", "Jacob", "Gary", "Nicholas", "Eric", "Jonathan", "Stephen", "Larry", "Justin", "Scott", "Brandon", "Benjamin", "Samuel", "Gregory", "Alexander", "Patrick", "Frank", "Raymond", "Jack", "Dennis", "Jerry", "Tyler", "Aaron", "Jose", "Adam", "Nathan", "Henry", "Zachary", "Douglas", "Peter", "Kyle", "Noah", "Ethan", "Jeremy", "Walter", "Christian", "Keith", "Roger", "Terry", "Austin", "Sean", "Gerald", "Carl", "Harold", "Dylan", "Arthur", "Lawrence", "Jordan", "Jesse", "Bryan", "Billy", "Bruce", "Gabriel", "Joe", "Logan", "Alan", "Juan", "Albert", "Willie", "Elijah", "Wayne", "Randy", "Vincent", "Mason", "Roy", "Ralph", "Bobby", "Russell", "Bradley", "Philip", "Eugene"],
11 | "she": ["Mary", "Patricia", "Jennifer", "Linda", "Elizabeth", "Barbara", "Susan", "Jessica", "Sarah", "Karen", "Lisa", "Nancy", "Betty", "Sandra", "Margaret", "Ashley", "Kimberly", "Emily", "Donna", "Michelle", "Carol", "Amanda", "Melissa", "Deborah", "Stephanie", "Dorothy", "Rebecca", "Sharon", "Laura", "Cynthia", "Amy", "Kathleen", "Angela", "Shirley", "Brenda", "Emma", "Anna", "Pamela", "Nicole", "Samantha", "Katherine", "Christine", "Helen", "Debra", "Rachel", "Carolyn", "Janet", "Maria", "Catherine", "Heather", "Diane", "Olivia", "Julie", "Joyce", "Victoria", "Ruth", "Virginia", "Lauren", "Kelly", "Christina", "Joan", "Evelyn", "Judith", "Andrea", "Hannah", "Megan", "Cheryl", "Jacqueline", "Martha", "Madison", "Teresa", "Gloria", "Sara", "Janice", "Ann", "Kathryn", "Abigail", "Sophia", "Frances", "Jean", "Alice", "Judy", "Isabella", "Julia", "Grace", "Amber", "Denise", "Danielle", "Marilyn", "Beverly", "Charlotte", "Natalie", "Theresa", "Diana", "Brittany", "Doris", "Kayla", "Alexis", "Lori", "Marie"]
12 | },
13 | "completion": [
14 | "walked",
15 | "is tired", "is excited", "is ready", "went home",
16 | "is walking", "ran", "is running", "works there",
17 | "joined the army", "plays soccer", "likes playing games",
18 | "said no to me"
19 | ]
20 | }
21 | },
22 | "location": {
23 | "templates": [
24 | "{object} is a famous"
25 | ],
26 | "label": "object",
27 | "label_prepend_space": false,
28 | "variables": {
29 | "object": {
30 | "country": [
31 | "Canada", "America", "Mexico", "Brazil", "Argentina",
32 | "Chile", "Peru", "Colombia", "Venezuela", "Ecuador",
33 | "Spain", "Portugal", "France", "Germany", "Italy",
34 | "England", "Ireland", "Scotland", "Wales", "Sweden",
35 | "Norway", "Finland", "Denmark", "Russia", "China",
36 | "Japan", "Korea", "India", "Pakistan", "Iran",
37 | "Iraq", "Egypt", "Nigeria", "South Africa", "Kenya",
38 | "Australia", "New Zealand"
39 | ],
40 | "city": [
41 | "Madrid", "Rome", "London", "Paris", "Berlin", "Moscow",
42 | "Beijing", "Tokyo", "Seoul", "Delhi", "Mumbai",
43 | "Bangalore", "Lagos", "Cairo", "Johannesburg",
44 | "Sydney", "Melbourne", "Auckland", "Wellington",
45 | "Toronto", "Montreal", "Vancouver", "New York",
46 | "Los Angeles", "Chicago", "Houston", "Philadelphia",
47 | "Phoenix", "San Antonio", "San Diego", "Dallas",
48 | "San Jose", "Austin", "Jacksonville", "San Francisco",
49 | "Amman", "Baghdad", "Damascus", "Jerusalem", "Kabul",
50 | "Tehran", "Ankara", "Athens", "Budapest", "Dublin",
51 | "Riyadh", "Kuwait City", "Nairobi", "Lima", "Bogota",
52 | "Caracas", "Santiago", "Buenos Aires", "Mexico City",
53 | "Brasilia", "Lisbon", "Barcelona", "Vienna", "Prague",
54 | "Warsaw", "Stockholm", "Copenhagen", "Oslo", "Helsinki",
55 | "Reykjavik", "Bucharest", "Sofia", "Belgrade", "Kiev",
56 | "Minsk", "Tbilisi", "Yerevan", "Baku", "Ashgabat",
57 | "Tashkent", "Dushanbe", "Kathmandu", "Islamabad",
58 | "Kabul", "Kathmandu", "Dhaka", "Colombo", "Yangon",
59 | "Bangkok", "Hanoi", "Manila", "Jakarta", "Kuala Lumpur"
60 | ]
61 | }
62 | }
63 | },
64 | "polarity__9_shot": {
65 | "templates": [
66 | "- advantage: good\n- robbery: bad\n- destruction: bad\n- health: good\n- positivity: good\n- war: bad\n- peace: good\n- beautiful: good\n- family: good\n- {word}:"
67 | ],
68 | "label": "word",
69 | "label_prepend_space": true,
70 | "variables": {
71 | "word": {
72 | "good": ["toy", "happy", "friend", "child", "help", "nice", "kind", "clean"],
73 | "bad": ["kill", "hurt", "sad", "angry", "mean", "enemy", "rude", "prevent", "dirty"]
74 | }
75 | }
76 | },
77 | "syntaxgen_number_prep": {
78 | "templates": [
79 | "The leader was smart, but the {subj} {prep} the {prepobj}"
80 | ],
81 | "label": "subj",
82 | "label_prepend_space": true,
83 | "variables": {
84 | "subj": {
85 | "was": ["manager", "doctor", "clerk", "officer", "nurse", "woman", "man", "pilot", "architect", "actor", "minister", "manager"],
86 | "were": ["managers", "doctors", "clerks", "officers", "nurses", "women", "men", "pilots", "architects", "actors", "ministers", "managers"]
87 | },
88 | "prep": [
89 | "in front of", "behind", "next to", "near", "across from", "to the side of"
90 | ],
91 | "prepobj": [
92 | "manager", "doctor", "clerk", "officer", "nurse", "woman", "man", "pilot", "architect", "actor", "minister", "manager",
93 | "managers", "doctors", "clerks", "officers", "nurses", "women", "men", "pilots", "architects", "actors", "ministers", "managers"
94 | ]
95 | }
96 | },
97 | "syntaxgen_reflexive_prep": {
98 | "templates": [
99 | "After falling, the {subj} {prep} the {prepobj} {verb}"
100 | ],
101 | "label": "subj",
102 | "label_prepend_space": true,
103 | "variables": {
104 | "subj": {
105 | "himself": ["manager", "doctor", "clerk", "officer", "nurse", "woman", "man", "pilot", "architect", "actor", "minister", "manager"],
106 | "themselves": ["managers", "doctors", "clerks", "officers", "nurses", "women", "men", "pilots", "architects", "actors", "ministers", "managers"]
107 | },
108 | "prep": [
109 | "in front of", "behind", "next to", "near", "across from", "to the side of"
110 | ],
111 | "prepobj": [
112 | "manager", "doctor", "clerk", "officer", "nurse", "woman", "man", "pilot", "architect", "actor", "minister", "manager",
113 | "managers", "doctors", "clerks", "officers", "nurses", "women", "men", "pilots", "architects", "actors", "ministers", "managers"
114 | ],
115 | "verb": [
116 | "hurt", "injured", "trusted", "embarrassed", "disguised", "hated", "doubted"
117 | ]
118 | }
119 | },
120 | "gender_is_a__2_shot": {
121 | "templates": [
122 | "John is a man. Jane is a woman. {name} is a"
123 | ],
124 | "label": "name",
125 | "label_prepend_space": true,
126 | "variables": {
127 | "name": {
128 | "man": ["James", "Robert", "John", "Michael", "David", "William", "Richard", "Joseph", "Thomas", "Christopher", "Charles", "Daniel", "Matthew", "Anthony", "Mark", "Donald", "Steven", "Andrew", "Paul", "Joshua", "Kenneth", "Kevin", "Brian", "George", "Timothy", "Ronald", "Jason", "Edward", "Jeffrey", "Ryan", "Jacob", "Gary", "Nicholas", "Eric", "Jonathan", "Stephen", "Larry", "Justin", "Scott", "Brandon", "Benjamin", "Samuel", "Gregory", "Alexander", "Patrick", "Frank", "Raymond", "Jack", "Dennis", "Jerry", "Tyler", "Aaron", "Jose", "Adam", "Nathan", "Henry", "Zachary", "Douglas", "Peter", "Kyle", "Noah", "Ethan", "Jeremy", "Walter", "Christian", "Keith", "Roger", "Terry", "Austin", "Sean", "Gerald", "Carl", "Harold", "Dylan", "Arthur", "Lawrence", "Jordan", "Jesse", "Bryan", "Billy", "Bruce", "Gabriel", "Joe", "Logan", "Alan", "Juan", "Albert", "Willie", "Elijah", "Wayne", "Randy", "Vincent", "Mason", "Roy", "Ralph", "Bobby", "Russell", "Bradley", "Philip", "Eugene"],
129 | "woman": ["Mary", "Patricia", "Jennifer", "Linda", "Elizabeth", "Barbara", "Susan", "Jessica", "Sarah", "Karen", "Lisa", "Nancy", "Betty", "Sandra", "Margaret", "Ashley", "Kimberly", "Emily", "Donna", "Michelle", "Carol", "Amanda", "Melissa", "Deborah", "Stephanie", "Dorothy", "Rebecca", "Sharon", "Laura", "Cynthia", "Amy", "Kathleen", "Angela", "Shirley", "Brenda", "Emma", "Anna", "Pamela", "Nicole", "Samantha", "Katherine", "Christine", "Helen", "Debra", "Rachel", "Carolyn", "Janet", "Maria", "Catherine", "Heather", "Diane", "Olivia", "Julie", "Joyce", "Victoria", "Ruth", "Virginia", "Lauren", "Kelly", "Christina", "Joan", "Evelyn", "Judith", "Andrea", "Hannah", "Megan", "Cheryl", "Jacqueline", "Martha", "Madison", "Teresa", "Gloria", "Sara", "Janice", "Ann", "Kathryn", "Abigail", "Sophia", "Frances", "Jean", "Alice", "Judy", "Isabella", "Julia", "Grace", "Amber", "Denise", "Danielle", "Marilyn", "Beverly", "Charlotte", "Natalie", "Theresa", "Diana", "Brittany", "Doris", "Kayla", "Alexis", "Lori", "Marie"]
130 | }
131 | }
132 | },
133 | "gender_colon__2_shot": {
134 | "templates": [
135 | "- John: man\n- Jane: woman\n- {name}:"
136 | ],
137 | "label": "name",
138 | "label_prepend_space": true,
139 | "variables": {
140 | "name": {
141 | "man": ["James", "Robert", "John", "Michael", "David", "William", "Richard", "Joseph", "Thomas", "Christopher", "Charles", "Daniel", "Matthew", "Anthony", "Mark", "Donald", "Steven", "Andrew", "Paul", "Joshua", "Kenneth", "Kevin", "Brian", "George", "Timothy", "Ronald", "Jason", "Edward", "Jeffrey", "Ryan", "Jacob", "Gary", "Nicholas", "Eric", "Jonathan", "Stephen", "Larry", "Justin", "Scott", "Brandon", "Benjamin", "Samuel", "Gregory", "Alexander", "Patrick", "Frank", "Raymond", "Jack", "Dennis", "Jerry", "Tyler", "Aaron", "Jose", "Adam", "Nathan", "Henry", "Zachary", "Douglas", "Peter", "Kyle", "Noah", "Ethan", "Jeremy", "Walter", "Christian", "Keith", "Roger", "Terry", "Austin", "Sean", "Gerald", "Carl", "Harold", "Dylan", "Arthur", "Lawrence", "Jordan", "Jesse", "Bryan", "Billy", "Bruce", "Gabriel", "Joe", "Logan", "Alan", "Juan", "Albert", "Willie", "Elijah", "Wayne", "Randy", "Vincent", "Mason", "Roy", "Ralph", "Bobby", "Russell", "Bradley", "Philip", "Eugene"],
142 | "woman": ["Mary", "Patricia", "Jennifer", "Linda", "Elizabeth", "Barbara", "Susan", "Jessica", "Sarah", "Karen", "Lisa", "Nancy", "Betty", "Sandra", "Margaret", "Ashley", "Kimberly", "Emily", "Donna", "Michelle", "Carol", "Amanda", "Melissa", "Deborah", "Stephanie", "Dorothy", "Rebecca", "Sharon", "Laura", "Cynthia", "Amy", "Kathleen", "Angela", "Shirley", "Brenda", "Emma", "Anna", "Pamela", "Nicole", "Samantha", "Katherine", "Christine", "Helen", "Debra", "Rachel", "Carolyn", "Janet", "Maria", "Catherine", "Heather", "Diane", "Olivia", "Julie", "Joyce", "Victoria", "Ruth", "Virginia", "Lauren", "Kelly", "Christina", "Joan", "Evelyn", "Judith", "Andrea", "Hannah", "Megan", "Cheryl", "Jacqueline", "Martha", "Madison", "Teresa", "Gloria", "Sara", "Janice", "Ann", "Kathryn", "Abigail", "Sophia", "Frances", "Jean", "Alice", "Judy", "Isabella", "Julia", "Grace", "Amber", "Denise", "Danielle", "Marilyn", "Beverly", "Charlotte", "Natalie", "Theresa", "Diana", "Brittany", "Doris", "Kayla", "Alexis", "Lori", "Marie"]
143 | }
144 | }
145 | },
146 | "gender_is_a__5_shot": {
147 | "templates": [
148 | "Tom is a man. Janet is a woman. David is a man. Jessica is a woman. Richard is a man. {name} is a"
149 | ],
150 | "label": "name",
151 | "label_prepend_space": true,
152 | "variables": {
153 | "name": {
154 | "man": ["James", "Robert", "John", "Michael", "David", "William", "Richard", "Joseph", "Thomas", "Christopher", "Charles", "Daniel", "Matthew", "Anthony", "Mark", "Donald", "Steven", "Andrew", "Paul", "Joshua", "Kenneth", "Kevin", "Brian", "George", "Timothy", "Ronald", "Jason", "Edward", "Jeffrey", "Ryan", "Jacob", "Gary", "Nicholas", "Eric", "Jonathan", "Stephen", "Larry", "Justin", "Scott", "Brandon", "Benjamin", "Samuel", "Gregory", "Alexander", "Patrick", "Frank", "Raymond", "Jack", "Dennis", "Jerry", "Tyler", "Aaron", "Jose", "Adam", "Nathan", "Henry", "Zachary", "Douglas", "Peter", "Kyle", "Noah", "Ethan", "Jeremy", "Walter", "Christian", "Keith", "Roger", "Terry", "Austin", "Sean", "Gerald", "Carl", "Harold", "Dylan", "Arthur", "Lawrence", "Jordan", "Jesse", "Bryan", "Billy", "Bruce", "Gabriel", "Joe", "Logan", "Alan", "Juan", "Albert", "Willie", "Elijah", "Wayne", "Randy", "Vincent", "Mason", "Roy", "Ralph", "Bobby", "Russell", "Bradley", "Philip", "Eugene"],
155 | "woman": ["Mary", "Patricia", "Jennifer", "Linda", "Elizabeth", "Barbara", "Susan", "Jessica", "Sarah", "Karen", "Lisa", "Nancy", "Betty", "Sandra", "Margaret", "Ashley", "Kimberly", "Emily", "Donna", "Michelle", "Carol", "Amanda", "Melissa", "Deborah", "Stephanie", "Dorothy", "Rebecca", "Sharon", "Laura", "Cynthia", "Amy", "Kathleen", "Angela", "Shirley", "Brenda", "Emma", "Anna", "Pamela", "Nicole", "Samantha", "Katherine", "Christine", "Helen", "Debra", "Rachel", "Carolyn", "Janet", "Maria", "Catherine", "Heather", "Diane", "Olivia", "Julie", "Joyce", "Victoria", "Ruth", "Virginia", "Lauren", "Kelly", "Christina", "Joan", "Evelyn", "Judith", "Andrea", "Hannah", "Megan", "Cheryl", "Jacqueline", "Martha", "Madison", "Teresa", "Gloria", "Sara", "Janice", "Ann", "Kathryn", "Abigail", "Sophia", "Frances", "Jean", "Alice", "Judy", "Isabella", "Julia", "Grace", "Amber", "Denise", "Danielle", "Marilyn", "Beverly", "Charlotte", "Natalie", "Theresa", "Diana", "Brittany", "Doris", "Kayla", "Alexis", "Lori", "Marie"]
156 | }
157 | }
158 | },
159 | "gender_colon__5_shot": {
160 | "templates": [
161 | "- Tom: man\n- Janet: woman\n- David: man\n- Richard: man\n- Jessica: woman\n- {name}:"
162 | ],
163 | "label": "name",
164 | "label_prepend_space": true,
165 | "variables": {
166 | "name": {
167 | "man": ["James", "Robert", "John", "Michael", "David", "William", "Richard", "Joseph", "Thomas", "Christopher", "Charles", "Daniel", "Matthew", "Anthony", "Mark", "Donald", "Steven", "Andrew", "Paul", "Joshua", "Kenneth", "Kevin", "Brian", "George", "Timothy", "Ronald", "Jason", "Edward", "Jeffrey", "Ryan", "Jacob", "Gary", "Nicholas", "Eric", "Jonathan", "Stephen", "Larry", "Justin", "Scott", "Brandon", "Benjamin", "Samuel", "Gregory", "Alexander", "Patrick", "Frank", "Raymond", "Jack", "Dennis", "Jerry", "Tyler", "Aaron", "Jose", "Adam", "Nathan", "Henry", "Zachary", "Douglas", "Peter", "Kyle", "Noah", "Ethan", "Jeremy", "Walter", "Christian", "Keith", "Roger", "Terry", "Austin", "Sean", "Gerald", "Carl", "Harold", "Dylan", "Arthur", "Lawrence", "Jordan", "Jesse", "Bryan", "Billy", "Bruce", "Gabriel", "Joe", "Logan", "Alan", "Juan", "Albert", "Willie", "Elijah", "Wayne", "Randy", "Vincent", "Mason", "Roy", "Ralph", "Bobby", "Russell", "Bradley", "Philip", "Eugene"],
168 | "woman": ["Mary", "Patricia", "Jennifer", "Linda", "Elizabeth", "Barbara", "Susan", "Jessica", "Sarah", "Karen", "Lisa", "Nancy", "Betty", "Sandra", "Margaret", "Ashley", "Kimberly", "Emily", "Donna", "Michelle", "Carol", "Amanda", "Melissa", "Deborah", "Stephanie", "Dorothy", "Rebecca", "Sharon", "Laura", "Cynthia", "Amy", "Kathleen", "Angela", "Shirley", "Brenda", "Emma", "Anna", "Pamela", "Nicole", "Samantha", "Katherine", "Christine", "Helen", "Debra", "Rachel", "Carolyn", "Janet", "Maria", "Catherine", "Heather", "Diane", "Olivia", "Julie", "Joyce", "Victoria", "Ruth", "Virginia", "Lauren", "Kelly", "Christina", "Joan", "Evelyn", "Judith", "Andrea", "Hannah", "Megan", "Cheryl", "Jacqueline", "Martha", "Madison", "Teresa", "Gloria", "Sara", "Janice", "Ann", "Kathryn", "Abigail", "Sophia", "Frances", "Jean", "Alice", "Judy", "Isabella", "Julia", "Grace", "Amber", "Denise", "Danielle", "Marilyn", "Beverly", "Charlotte", "Natalie", "Theresa", "Diana", "Brittany", "Doris", "Kayla", "Alexis", "Lori", "Marie"]
169 | }
170 | }
171 | }
172 | }
--------------------------------------------------------------------------------
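The template files above are dense, so as a reading aid, here is a minimal sketch of how a template such as `syntaxgen_reflexive_prep` can be expanded into a contrastive pair. The helper below is hypothetical (the repository's actual generation logic lives in `data.py`): the `label` variable (`subj`) is a dict whose keys (`himself`, `themselves`) double as the expected continuations, while the remaining variables are filler slots held constant across the pair.

```python
import random

# Hypothetical expansion of "syntaxgen_reflexive_prep" with abridged option
# lists; an illustration of the format, not the repo's data.py API.
template = "After falling, the {subj} {prep} the {prepobj} {verb}"
label_options = {  # keys double as the expected next token (the label)
    "himself": ["manager", "doctor", "clerk"],
    "themselves": ["managers", "doctors", "clerks"],
}
shared = {
    "prep": ["in front of", "behind", "next to"],
    "prepobj": ["manager", "doctors"],
    "verb": ["hurt", "trusted"],
}

def sample_pair(seed: int = 0) -> dict:
    rng = random.Random(seed)
    # Non-label variables are fixed across the pair, so the two prompts
    # differ only in the causally relevant slot ({subj}).
    fixed = {k: rng.choice(v) for k, v in shared.items()}
    return {
        label: template.format(subj=rng.choice(options), **fixed)
        for label, options in label_options.items()
    }

# With "label_prepend_space": true, the gold continuation would be
# tokenized with a leading space, e.g. " himself".
print(sample_pair())
```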
/data/templates/preposing_in_pp.json:
--------------------------------------------------------------------------------
1 | {
2 | "preposing_in_pp": {
3 | "templates": [
4 | "{prefix}{filler} though {subj} {verb}"
5 | ],
6 | "label": "filler",
7 | "result_prepend_space": false,
8 | "labels": {
9 | "pp": [" happy", " sad", " anxious", " afraid", " nervous", " wary", " weary",
10 | " suspicious", " doubtful", " clever", " witty", " young", " old",
11 | " sharp", " bright", " intelligent"
12 | ],
13 | "pipp": ["."]
14 | },
15 | "variables": {
16 | "prefix": ["The work continued,", "The plan was in motion,"],
17 | "subj": ["the mother", "the security guard", "the man", "the delivery boy",
18 | "the judge", "the reporter", "the accountant", "the secretary",
19 | "the investigator", "the businessman", "the friend", "the painter",
20 | "the neighbor", "the woman", "the politician", "the old man"
21 | ],
22 |
23 | "filler": {
24 | "pp": [""],
25 | "pipp": [" happy", " sad", " anxious", " afraid", " nervous", " wary", " weary",
26 | " suspicious", " doubtful", " clever", " witty", " young", " old",
27 | " sharp", " bright", " intelligent"
28 | ]
29 | },
30 | "verb": ["seemed", "seems"]
31 | }
32 | },
33 | "preposing_in_pp_embed_1": {
34 | "templates": [
35 | "{prefix}{filler} though {subj1} {verb1} that {subj2} {verb2}"
36 | ],
37 | "label": "filler",
38 | "result_prepend_space": false,
39 | "labels": {
40 | "pp": [" happy", " sad", " anxious", " afraid", " nervous", " wary", " weary",
41 | " suspicious", " doubtful", " clever", " witty", " young", " old",
42 | " sharp", " bright", " intelligent"
43 | ],
44 | "pipp": ["."]
45 | },
46 | "variables": {
47 | "prefix": ["The work continued,", "The plan was in motion,"],
48 | "subj1": ["the mother", "the security guard", "the man", "the delivery boy",
49 | "the judge", "the reporter", "the accountant", "the secretary",
50 | "the investigator", "the businessman", "the friend", "the painter",
51 | "the neighbor", "the woman", "the politician", "the old man"
52 | ],
53 | "verb1": ["said", "believed", "knew", "remarked", "heard",
54 | "thought", "stated", "said", "said", "thought",
55 | "believes", "believed", "said", "said", "said",
56 | "thinks", "said", "stated", "believed", "stated"
57 | ],
58 | "subj2": ["the friend", "the assistant", "the woman", "the worker", "the newspaper",
59 | "the cop", "the television host", "the colleague", "the journalist",
60 | "the banker", "the colleague", "the rival", "the reporter", "the associate",
61 | "the worker", "the cop", "the mother", "the secretary", "the friend",
62 | "the press secretary", "the woman"
63 | ],
64 | "filler": {
65 | "pp": [""],
66 | "pipp": [" happy", " sad", " anxious", " afraid", " nervous", " wary", " weary",
67 | " suspicious", " doubtful", " clever", " witty", " young", " old",
68 | " sharp", " bright", " intelligent"
69 | ]
70 | },
71 | "verb2": ["seemed", "seems"]
72 | }
73 | }
74 | }
75 |
--------------------------------------------------------------------------------
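Note one wrinkle in `preposing_in_pp.json` relative to the other template files: its `labels` block lists several acceptable continuations per condition (a set of adjectives for `pp`, just `"."` for `pipp`), and the label strings already include their leading space, which presumably pairs with `"result_prepend_space": false`. A hypothetical sketch of how such a label set could be scored (the repository's real evaluation lives elsewhere, e.g. `eval.py` and `benchmark.py`):

```python
# Abridged from the "labels" block above; illustration only.
labels = {
    "pp": [" happy", " sad", " anxious"],
    "pipp": ["."],
}

def prefers_correct_label(logprob, prompt: str, condition: str) -> bool:
    """True if the best continuation from `condition`'s label set beats
    the best continuation from the contrasting label set."""
    other = next(k for k in labels if k != condition)
    best_correct = max(logprob(prompt, tok) for tok in labels[condition])
    best_other = max(logprob(prompt, tok) for tok in labels[other])
    return best_correct > best_other

# Toy scorer standing in for a language model, for illustration only.
toy = lambda prompt, tok: 0.0 if tok == "." else -1.0
print(prefers_correct_label(toy, "Happy though the man seemed", "pipp"))  # True
```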
/data/templates/syntaxgym_failed.json:
--------------------------------------------------------------------------------
1 | {
2 | "passive_wh_extraction": {
3 | "templates": [
4 | "{prefix} who the {subject} {hadwas} {verb}"
5 | ],
6 | "label": "hadwas",
7 | "label_prepend_space": true,
8 | "labels": {
9 | "had": ["."],
10 | "was": ["by"]
11 | },
12 | "variables": {
13 | "prefix": ["Our neighbor reminded us", "Our neighbor said", "My sister told me", "The shop owner told me", "My friend reported", "My friend remembers", "My friend told me", "We all remember", "Our friend knew", "The mayor told me", "The newspaper said", "She told me", "She can guess", "I know", "I do not know", "You remember", "The newspaper reported", "I remember", "We recall", "She can not believe", "We remember", "My neighbor told me", "She knows"],
14 | "subject": ["farmer", "author", "taxi driver", "consultant", "executive", "actor", "teacher", "architect", "senator", "secretary", "clerk", "officer", "pilot", "manager", "doctor", "minister", "guard", "athlete", "customer"],
15 | "hadwas": {
16 | "had": ["had"],
17 | "was": ["was"]
18 | },
19 | "verb": ["killed", "attacked", "admired", "amazed", "impressed", "disturbed", "injured", "hurt", "shot"]
20 | }
21 | },
22 | "from_to": {
23 | "templates": [
24 | "The {subject} {verb} {fromto} the {object}"
25 | ],
26 | "label": "fromto",
27 | "label_prepend_space": true,
28 | "labels": {
29 | "from": ["to"],
30 | "to": ["from"]
31 | },
32 | "variables": {
33 | "subject": ["farmer", "author", "taxi driver", "consultant", "executive", "actor", "teacher", "architect", "senator", "secretary", "clerk", "officer", "pilot", "manager", "doctor", "minister", "guard", "athlete", "customer"],
34 | "verb": ["drove", "walked", "flew", "ran", "swam", "crawled", "sailed", "skated", "biked", "hiked", "skied", "climbed", "traveled", "jogged", "sprinted", "raced", "strolled", "marched", "ambled", "wandered"],
35 | "fromto": {
36 | "from": ["from"],
37 | "to": ["to"]
38 | },
39 | "object": ["city", "town", "village", "beach", "north", "south"]
40 | }
41 | },
42 | "filler_gap_time_extraction": {
43 | "templates": [
44 | "{prefix} {comp} {np1} will {verb}"
45 | ],
46 | "label": "comp",
47 | "label_prepend_space": true,
48 | "labels": {
49 | "wh": ["."],
50 | "th": ["soon"]
51 | },
52 | "variables": {
53 | "prefix": ["Our neighbor reminded us", "Our neighbor said", "My sister told me", "The shop owner told me", "My friend reported", "My friend remembers", "My friend told me", "We all remember", "Our friend knew", "The mayor told me", "The newspaper said", "She told me", "She can guess", "I know", "I do not know", "You remember", "The newspaper reported", "I remember", "We recall", "She can not believe", "We remember", "My neighbor told me", "She knows"],
54 | "comp": {
55 | "wh": ["when"],
56 | "th": ["that"]
57 | },
58 | "np1": ["our new friend", "the star student", "the nurse", "the student", "the movie star", "the man", "the convict", "the collector", "our uncle", "her rival", "my good friend", "the suspect", "the businessman"],
59 | "verb": ["arrive", "depart", "leave", "come back", "return"]
60 | }
61 | },
62 | "agreement_number_reflexive_obj-relc2": {
63 | "templates": [
64 | "The {subject} that the {embed_np} {embed_vp} {matrix_v}"
65 | ],
66 | "label": ["subject", "embed_np"],
67 | "label_prepend_space": true,
68 | "labels": {
69 | "plural": ["themselves", "themselves"],
70 | "singular": ["herself", "himself"]
71 | },
72 | "variables": {
73 | "subject": {
74 | "singular": ["farmer", "author", "taxi driver", "consultant", "executive", "actor", "teacher", "architect", "senator", "secretary", "clerk", "officer", "pilot", "manager", "doctor", "minister", "guard", "athlete", "customer"],
75 | "plural": ["managers", "farmers", "architects", "pilots", "doctors", "authors", "consultants", "taxi drivers", "customers", "secretaries", "officers", "actors", "guards", "teachers", "executives", "senators", "ministers", "clerks", "athletes"]
76 | },
77 | "embed_np": {
78 | "singular": ["actors", "managers", "authors", "guards", "customers", "farmers", "architects", "pilots", "ministers", "clerks", "secretaries", "teachers", "doctors", "executives", "officers", "athletes", "senators"],
79 | "plural": ["doctor", "farmer", "minister", "officer", "guard", "author", "customer", "pilot", "architect", "senator", "athlete", "executive", "actor", "secretary", "clerk", "teacher", "manager"]
80 | },
81 | "embed_vp": ["loved", "liked", "discussed", "met", "hated"],
82 | "matrix_v": ["embarrassed", "disguised", "injured", "suspected", "doubted", "hated", "hurt", "trusted"]
83 | }
84 | },
85 | "fillergap_subj": {
86 | "templates": [
87 | "{prefix} {comp}"
88 | ],
89 | "label": "comp",
90 | "label_prepend_space": true,
91 | "labels": {
92 | "wh": ["did"],
93 | "th": ["there"]
94 | },
95 | "variables": {
96 | "prefix": ["Our neighbor reminded us", "Our neighbor said", "My sister told me", "The shop owner told me", "My friend reported", "My friend remembers", "My friend told me", "We all remember", "Our friend knew", "The mayor told me", "The newspaper said", "She told me", "She can guess", "I know", "I do not know", "You remember", "The newspaper reported", "I remember", "We recall", "She can not believe", "We remember", "My neighbor told me", "She knows"],
97 | "comp": {
98 | "wh": ["who"],
99 | "th": ["that"]
100 | }
101 | }
102 | },
103 | "fillergap_obj-him": {
104 | "templates": [
105 | "{prefix} {comp} {np1} {verb}"
106 | ],
107 | "label": "comp",
108 | "label_prepend_space": true,
109 | "labels": {
110 | "wh": ["when"],
111 | "th": ["him"]
112 | },
113 | "variables": {
114 | "prefix": ["Our neighbor reminded us", "Our neighbor said", "My sister told me", "The shop owner told me", "My friend reported", "My friend remembers", "My friend told me", "We all remember", "Our friend knew", "The mayor told me", "The newspaper said", "She told me", "She can guess", "I know", "I do not know", "You remember", "The newspaper reported", "I remember", "We recall", "She can not believe", "We remember", "My neighbor told me", "She knows"],
115 | "comp": {
116 | "wh": ["who"],
117 | "th": ["that"]
118 | },
119 | "np1": ["our new friend", "the star student", "the nurse", "the student", "the movie star", "the man", "the convict", "the collector", "our uncle", "her rival", "my good friend", "the suspect", "the businessman"],
120 | "verb": ["killed", "met", "attacked", "saw"]
121 | }
122 | },
123 | "fillergap_obj-it": {
124 | "templates": [
125 | "{prefix} {comp} {np1} {verb}"
126 | ],
127 | "label": "comp",
128 | "label_prepend_space": true,
129 | "labels": {
130 | "wh": ["when"],
131 | "th": ["it"]
132 | },
133 | "variables": {
134 | "prefix": ["Our neighbor reminded us", "Our neighbor said", "My sister told me", "The shop owner told me", "My friend reported", "My friend remembers", "My friend told me", "We all remember", "Our friend knew", "The mayor told me", "The newspaper said", "She told me", "She can guess", "I know", "I do not know", "You remember", "The newspaper reported", "I remember", "We recall", "She can not believe", "We remember", "My neighbor told me", "She knows"],
135 | "comp": {
136 | "wh": ["what"],
137 | "th": ["that"]
138 | },
139 | "np1": ["our new friend", "the star student", "the nurse", "the student", "the movie star", "the man", "the convict", "the collector", "our uncle", "her rival", "my good friend", "the suspect", "the businessman"],
140 | "verb": ["grabbed", "caught", "stole", "forged", "derailed", "will get", "will be awarded", "placed", "repaired", "dragged"]
141 | }
142 | },
143 | "fillergap_passive_subj-pp": {
144 | "templates": [
145 | "{prefix} {wh} the {subject} was {verb} by"
146 | ],
147 | "label": "wh",
148 | "label_prepend_space": true,
149 | "labels": {
150 | "why": ["them"],
151 | "who": ["."]
152 | },
153 | "variables": {
154 | "prefix": ["Our neighbor reminded us", "Our neighbor said", "My sister told me", "The shop owner told me", "My friend reported", "My friend remembers", "My friend told me", "We all remember", "Our friend knew", "The mayor told me", "The newspaper said", "She told me", "She can guess", "I know", "I do not know", "You remember", "The newspaper reported", "I remember", "We recall", "She can not believe", "We remember", "My neighbor told me", "She knows"],
155 | "subject": ["farmer", "author", "taxi driver", "consultant", "executive", "actor", "teacher", "architect", "senator", "secretary", "clerk", "officer", "pilot", "manager", "doctor", "minister", "guard", "athlete", "customer"],
156 | "wh": {
157 | "why": ["why"],
158 | "who": ["who"]
159 | },
160 | "verb": ["killed", "attacked", "admired", "amazed", "impressed", "disturbed", "injured", "hurt", "shot"]
161 | }
162 | },
163 | "fillergap_ditransitive_recipient": {
164 | "templates": [
165 | "{prefix} {wh} the {subject} {verb} the {object} to"
166 | ],
167 | "label": "wh",
168 | "label_prepend_space": true,
169 | "labels": {
170 | "that": ["them"],
171 | "who": ["."]
172 | },
173 | "variables": {
174 | "prefix": ["Our neighbor reminded us", "Our neighbor said", "My sister told me", "The shop owner told me", "My friend reported", "My friend remembers", "My friend told me", "We all remember", "Our friend knew", "The mayor told me", "The newspaper said", "She told me", "She can guess", "I know", "I do not know", "You remember", "The newspaper reported", "I remember", "We recall", "She can not believe", "We remember", "My neighbor told me", "She knows"],
175 | "subject": ["farmer", "author", "taxi driver", "consultant", "executive", "actor", "teacher", "architect", "senator", "secretary", "clerk", "officer", "pilot", "manager", "doctor", "minister", "guard", "athlete", "customer"],
176 | "wh": {
177 | "that": ["that"],
178 | "who": ["who"]
179 | },
180 | "object": ["box", "toy", "present", "gift", "package", "letter", "card", "book", "note", "envelope", "package", "ball", "flower", "message", "email", "bill", "check", "money", "package", "parcel"],
181 | "verb": ["showed", "gave", "presented", "offered", "sent", "handed", "delivered", "sold"]
182 | }
183 | },
184 | "fillergap_ditransitive_time": {
185 | "templates": [
186 | "{prefix} {wh} the {subject} {verb} the {object} to them"
187 | ],
188 | "label": "wh",
189 | "label_prepend_space": true,
190 | "labels": {
191 | "that": ["today"],
192 | "when": ["."]
193 | },
194 | "variables": {
195 | "prefix": ["Our neighbor reminded us", "Our neighbor said", "My sister told me", "The shop owner told me", "My friend reported", "My friend remembers", "My friend told me", "We all remember", "Our friend knew", "The mayor told me", "The newspaper said", "She told me", "She can guess", "I know", "I do not know", "You remember", "The newspaper reported", "I remember", "We recall", "She can not believe", "We remember", "My neighbor told me", "She knows"],
196 | "subject": ["farmer", "author", "taxi driver", "consultant", "executive", "actor", "teacher", "architect", "senator", "secretary", "clerk", "officer", "pilot", "manager", "doctor", "minister", "guard", "athlete", "customer"],
197 | "wh": {
198 | "that": ["that"],
199 | "when": ["when"]
200 | },
201 | "object": ["box", "toy", "present", "gift", "package", "letter", "card", "book", "note", "envelope", "package", "ball", "flower", "message", "email", "bill", "check", "money", "package", "parcel"],
202 | "verb": ["showed", "gave", "presented", "offered", "sent", "handed", "delivered", "sold"]
203 | }
204 | },
205 | "npi_obj-relc2": {
206 | "templates": [
207 | "{det1} {np} that {det2} {rc_subj} {rc_verb} {matrix_v}"
208 | ],
209 | "label": ["det1", "det2"],
210 | "label_prepend_space": true,
211 | "labels": {
212 | "any": ["any"],
213 | "some": ["some"]
214 | },
215 | "variables": {
216 | "det1": {
217 | "any": ["No"],
218 | "some": ["The"]
219 | },
220 | "np": ["consultant", "taxi driver", "farmer", "architects", "pilots", "architect", "athlete", "authors", "journalists", "secretaries", "pilot", "minister", "clerks", "senators", "clerk", "ministers", "dancer", "officers", "athletes", "secretary", "senator", "farmers", "guard", "author", "manager", "doctors", "taxi drivers", "teacher", "customers", "executives", "managers", "teachers", "officer", "surgeon", "consultants", "executive", "guards", "customer"],
221 | "det2": {
222 | "any": ["the"],
223 | "some": ["no"]
224 | },
225 | "rc_subj": ["consultant", "taxi driver", "architects", "farmer", "pilots", "architect", "journalists", "authors", "athlete", "secretaries", "clerks", "minister", "pilot", "senators", "clerk", "ministers", "dancer", "officers", "secretary", "senator", "farmers", "guard", "doctors", "manager", "taxi drivers", "teacher", "customers", "executives", "managers", "teachers", "officer", "surgeon", "consultants", "executive", "guards", "customer"],
226 | "rc_verb": ["helped", "liked", "admired", "contacted", "respected", "discussed", "impressed", "praised", "knew", "hated", "loved"],
227 | "matrix_v": ["has shown", "has fired", "has broken", "have planted", "has passed up", "have had", "has spent", "have missed", "has seen", "has had", "have landed", "has crashed", "have refused", "have passed", "has missed", "has known", "have failed", "has burned", "have advocated", "has completed", "have caught", "has failed", "has passed", "have purchased", "have broken", "have read", "have crashed", "have burned", "have arrested"]
228 | }
229 | },
230 | "passive": {
231 | "templates": [
232 | "The {subject} {hadwas} {verb}"
233 | ],
234 | "label": "hadwas",
235 | "label_prepend_space": true,
236 | "labels": {
237 | "had": ["him"],
238 | "was": ["by"]
239 | },
240 | "variables": {
241 | "subject": ["farmer", "author", "taxi driver", "consultant", "executive", "actor", "teacher", "architect", "senator", "secretary", "clerk", "officer", "pilot", "manager", "doctor", "minister", "guard", "athlete", "customer"],
242 | "hadwas": {
243 | "had": ["had"],
244 | "was": ["was"]
245 | },
246 | "verb": ["killed", "attacked", "admired", "amazed", "impressed", "disturbed", "injured", "hurt", "shot"]
247 | }
248 | }
249 | }
--------------------------------------------------------------------------------
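Two of the suites above (`agreement_number_reflexive_obj-relc2` and `npi_obj-relc2`) use a list-valued `label` that ties several variables together: the same option key must be chosen in every listed variable so the slots stay consistent (e.g. `det1 = "No"` forces `det2 = "the"`, licensing the continuation `any`). A hypothetical expander sketching that behavior (again, not the repo's `data.py` logic):

```python
# Sketch of multi-slot labels as in "npi_obj-relc2" above: one option key
# selects the matching option list in every tied variable.
template = "{det1} {np} that {det2} {rc_subj} {rc_verb} {matrix_v}"
tied = {
    "det1": {"any": ["No"], "some": ["The"]},
    "det2": {"any": ["the"], "some": ["no"]},
}
free = {"np": "consultant", "rc_subj": "farmer",
        "rc_verb": "helped", "matrix_v": "has shown"}

def fill(option: str) -> str:
    slots = {var: choices[option][0] for var, choices in tied.items()}
    return template.format(**slots, **free)

print(fill("any"))   # "No consultant that the farmer helped has shown"
print(fill("some"))  # "The consultant that no farmer helped has shown"
```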
/data/test_suites/center_embedding.json:
--------------------------------------------------------------------------------
1 | {"items":[{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"painting","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"artist","region_number":5},{"content":"deteriorated","region_number":6},{"content":"painted","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"painting","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"artist","region_number":5},{"content":"painted","region_number":6},{"content":"deteriorated","region_number":7}]}],"item_number":1},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"storm","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"captain","region_number":5},{"content":"subsided","region_number":6},{"content":"feared","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"storm","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"captain","region_number":5},{"content":"feared","region_number":6},{"content":"subsided","region_number":7}]}],"item_number":2},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"girl","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"bug","region_number":5},{"content":"shouted","region_number":6},{"content":"frightened","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"girl","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"bug","region_number":5},{"content":"frightened","region_number":6},{"content":"shouted","region_number":7}]}],"item_number":3},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"baby","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"woman","region_number":5},{"content":"yelled","region_number":6},{"content":"held","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"baby","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"woman","region_number":5},{"content":"held","region_number":6},{"content":"yelled","region_number":7}]}],"item_number":4},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"soldier","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"bullet","region_number":5},{"content":"died","region_number":6},{"content":"wounded","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"soldier","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"bullet","region_number":5},{"content":"wounded","region_number":6},{"content":"died","region_number":7}]}],"item_number":5},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"storm","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"scientist","region_number":5},{"content":"intensified","region_number":6},{"content":"predicted","reg
ion_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"storm","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"scientist","region_number":5},{"content":"predicted","region_number":6},{"content":"intensified","region_number":7}]}],"item_number":6},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"convict","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"cop","region_number":5},{"content":"escaped","region_number":6},{"content":"arrested","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"convict","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"cop","region_number":5},{"content":"arrested","region_number":6},{"content":"escaped","region_number":7}]}],"item_number":7},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"computer","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"secretary","region_number":5},{"content":"crashed","region_number":6},{"content":"bought","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"computer","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"secretary","region_number":5},{"content":"bought","region_number":6},{"content":"crashed","region_number":7}]}],"item_number":8},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"floor","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"maid","region_number":5},{"content":"cracked","region_number":6},{"content":"swept","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"floor","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"maid","region_number":5},{"content":"swept","region_number":6},{"content":"cracked","region_number":7}]}],"item_number":9},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"yacht","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"millionaires","region_number":5},{"content":"sank","region_number":6},{"content":"bought","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"yacht","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"millionaires","region_number":5},{"content":"bought","region_number":6},{"content":"sank","region_number":7}]}],"item_number":10},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"shirt","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"man","region_number":5},{"content":"ripped","region_number":6},{"content":"bought","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"shirt","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"man","region_number":5},{"content":"bought","region_number":6},{"content":"ripped","region_number":7}]}],"
item_number":11},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"water","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"maid","region_number":5},{"content":"evaporated","region_number":6},{"content":"poured","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"water","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"maid","region_number":5},{"content":"poured","region_number":6},{"content":"evaporated","region_number":7}]}],"item_number":12},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"building","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"workers","region_number":5},{"content":"collapsed","region_number":6},{"content":"built","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"building","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"workers","region_number":5},{"content":"built","region_number":6},{"content":"collapsed","region_number":7}]}],"item_number":13},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"bones","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"doctor","region_number":5},{"content":"broke","region_number":6},{"content":"examined","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"bones","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"doctor","region_number":5},{"content":"examined","region_number":6},{"content":"broke","region_number":7}]}],"item_number":14},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"building","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"workers","region_number":5},{"content":"deteriorated","region_number":6},{"content":"repaired","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"building","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"workers","region_number":5},{"content":"repaired","region_number":6},{"content":"deteriorated","region_number":7}]}],"item_number":15},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"ship","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"workers","region_number":5},{"content":"sank","region_number":6},{"content":"built","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"ship","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"workers","region_number":5},{"content":"built","region_number":6},{"content":"sank","region_number":7}]}],"item_number":16},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"horse","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"boy","region_number":5},{"content":"bucked","region_number":6},{"content":"rode
","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"horse","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"boy","region_number":5},{"content":"rode","region_number":6},{"content":"bucked","region_number":7}]}],"item_number":17},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"water","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"chef","region_number":5},{"content":"evaporated","region_number":6},{"content":"needed","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"water","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"chef","region_number":5},{"content":"needed","region_number":6},{"content":"evaporated","region_number":7}]}],"item_number":18},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"tree","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"old man","region_number":5},{"content":"fell","region_number":6},{"content":"cut","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"tree","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"old man","region_number":5},{"content":"cut","region_number":6},{"content":"fell","region_number":7}]}],"item_number":19},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"letter","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"author","region_number":5},{"content":"arrived","region_number":6},{"content":"wrote","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"letter","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"author","region_number":5},{"content":"wrote","region_number":6},{"content":"arrived","region_number":7}]}],"item_number":20},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"glass","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"athlete","region_number":5},{"content":"cracked","region_number":6},{"content":"hit","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"glass","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"athlete","region_number":5},{"content":"hit","region_number":6},{"content":"cracked","region_number":7}]}],"item_number":21},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"bomb","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"terrorist","region_number":5},{"content":"exploded","region_number":6},{"content":"built","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"bomb","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"terrorist","region_number":5},{"content":"built","region_number":6},{"content":"exploded","region_number":7}]}],"item_number":22},{"con
ditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"meat","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"man","region_number":5},{"content":"burned","region_number":6},{"content":"cooked","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"meat","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"man","region_number":5},{"content":"cooked","region_number":6},{"content":"burned","region_number":7}]}],"item_number":23},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"sugar","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"visitor","region_number":5},{"content":"dissolved","region_number":6},{"content":"bought","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"sugar","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"visitor","region_number":5},{"content":"bought","region_number":6},{"content":"dissolved","region_number":7}]}],"item_number":24},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"pants","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"woman","region_number":5},{"content":"ripped","region_number":6},{"content":"bought","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"pants","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"woman","region_number":5},{"content":"bought","region_number":6},{"content":"ripped","region_number":7}]}],"item_number":25},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"toilet","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"worker","region_number":5},{"content":"clogged","region_number":6},{"content":"fixed","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"toilet","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"worker","region_number":5},{"content":"fixed","region_number":6},{"content":"clogged","region_number":7}]}],"item_number":26},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"window","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"boy","region_number":5},{"content":"shattered","region_number":6},{"content":"wiped","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"window","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"boy","region_number":5},{"content":"wiped","region_number":6},{"content":"shattered","region_number":7}]}],"item_number":27},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"child","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"shadow","region_number":5},{"content":"yelled","region_number":6},{"content":"frightened","region_number":7}]},{"condition_name":"plaus",
"regions":[{"content":"The","region_number":1},{"content":"child","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"shadow","region_number":5},{"content":"frightened","region_number":6},{"content":"yelled","region_number":7}]}],"item_number":28}],"meta":{"author":"","comment":null,"metric":"sum","name":"center_embed","reference":"\"Wilcox E. Levy R. & Futrell R. (2019). Hierarchical representation in neural language models: Suppression and recovery of expectations.\""},"predictions":[{"formula":"( (6;%plaus%) + (7;%plaus%) ) < ( (6;%implaus%) + (7;%implaus%) )","type":"formula"}],"region_meta":{"1":"intro","2":"np_1","3":"that","4":"det_2","5":"np_2","6":"verb1","7":"verb2"}}
--------------------------------------------------------------------------------
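For reference, the `predictions` entry at the end of this suite, `((6;%plaus%) + (7;%plaus%)) < ((6;%implaus%) + (7;%implaus%))`, asks whether the summed surprisal over regions 6 and 7 (`verb1` and `verb2`) is lower in the plausible condition than in the implausible one. A minimal sketch of checking that for a single item, with a hypothetical `surprisal(item, condition, region_number)` hook standing in for a language-model scorer (the repository's actual harness may differ):

```python
def check_center_embed_item(item: dict, surprisal) -> bool:
    """True if the item satisfies the suite's prediction formula:
    summed surprisal of regions 6-7 is lower under "plaus" than "implaus"."""
    def total(condition: str, regions=(6, 7)) -> float:
        return sum(surprisal(item, condition, n) for n in regions)
    return total("plaus") < total("implaus")

# `surprisal` is assumed to score the given region's tokens in the context
# of the preceding regions, under the named condition.
```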
/data/test_suites/reflexive_number_agreement_feminine_object_relative.json:
--------------------------------------------------------------------------------
1 | {"items":[{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"authors","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"senator","region_number":5},{"content":"liked","region_number":6},{"content":"hurt","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"author","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"senators","region_number":5},{"content":"liked","region_number":6},{"content":"hurt","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"authors","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"senator","region_number":5},{"content":"liked","region_number":6},{"content":"hurt","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"author","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"senators","region_number":5},{"content":"liked","region_number":6},{"content":"hurt","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":1},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"pilots","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"teacher","region_number":5},{"content":"met","region_number":6},{"content":"injured","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"pilot","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"teachers","region_number":5},{"content":"met","region_number":6},{"content":"injured","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"pilots","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"teacher","region_number":5},{"content":"met","region_number":6},{"content":"injured","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"pilot","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"teachers","region_number":5},{"content":"met","region_number":6},{"content":"injured","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":2},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"doctors","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"guard","region_number":5},{"content":"hated","region_number":6},{"content":"suspected","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"doctor","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"guards","region_number":5},{"content":"hated","region_number":6},{"content":"suspected
","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"doctors","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"guard","region_number":5},{"content":"hated","region_number":6},{"content":"suspected","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"doctor","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"guards","region_number":5},{"content":"hated","region_number":6},{"content":"suspected","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":3},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"farmers","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"clerk","region_number":5},{"content":"discussed","region_number":6},{"content":"injured","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"farmer","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"clerks","region_number":5},{"content":"discussed","region_number":6},{"content":"embarrassed","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"farmers","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"clerk","region_number":5},{"content":"discussed","region_number":6},{"content":"injured","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"farmer","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"clerks","region_number":5},{"content":"discussed","region_number":6},{"content":"embarrassed","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":4},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"managers","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"architect","region_number":5},{"content":"loved","region_number":6},{"content":"suspected","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"manager","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"architects","region_number":5},{"content":"loved","region_number":6},{"content":"disguised","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"managers","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"architect","region_number":5},{"content":"loved","region_number":6},{"content":"suspected","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"manager","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"con
tent":"architects","region_number":5},{"content":"loved","region_number":6},{"content":"disguised","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":5},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"customers","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"athlete","region_number":5},{"content":"liked","region_number":6},{"content":"embarrassed","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"customer","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"athletes","region_number":5},{"content":"liked","region_number":6},{"content":"hated","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"customers","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"athlete","region_number":5},{"content":"liked","region_number":6},{"content":"embarrassed","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"customer","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"athletes","region_number":5},{"content":"liked","region_number":6},{"content":"hated","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":6},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"officers","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"actor","region_number":5},{"content":"met","region_number":6},{"content":"disguised","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"officer","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"actors","region_number":5},{"content":"met","region_number":6},{"content":"doubted","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"officers","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"actor","region_number":5},{"content":"met","region_number":6},{"content":"disguised","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"officer","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"actors","region_number":5},{"content":"met","region_number":6},{"content":"doubted","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":7},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"teachers","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"minister","region_number":5},{"content":"hated","region_number":6},{"content":"hated","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"tea
cher","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"ministers","region_number":5},{"content":"hated","region_number":6},{"content":"hurt","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"teachers","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"minister","region_number":5},{"content":"hated","region_number":6},{"content":"hated","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"teacher","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"ministers","region_number":5},{"content":"hated","region_number":6},{"content":"hurt","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":8},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"senators","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"actor","region_number":5},{"content":"discussed","region_number":6},{"content":"doubted","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"senator","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"actors","region_number":5},{"content":"discussed","region_number":6},{"content":"injured","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"senators","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"actor","region_number":5},{"content":"discussed","region_number":6},{"content":"doubted","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"senator","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"actors","region_number":5},{"content":"discussed","region_number":6},{"content":"injured","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":9},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"consultants","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"secretary","region_number":5},{"content":"loved","region_number":6},{"content":"hurt","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"consultant","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"secretaries","region_number":5},{"content":"loved","region_number":6},{"content":"suspected","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"consultants","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"secretary","region_number":5},{"content":"loved","region_number":6},{"content":"hurt","region_number":7},{"content":"herself","region_number":8}]},{"conditio
n_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"consultant","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"secretaries","region_number":5},{"content":"loved","region_number":6},{"content":"suspected","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":10},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"guards","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"executive","region_number":5},{"content":"liked","region_number":6},{"content":"injured","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"guard","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"executives","region_number":5},{"content":"liked","region_number":6},{"content":"embarrassed","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"guards","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"executive","region_number":5},{"content":"liked","region_number":6},{"content":"injured","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"guard","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"executives","region_number":5},{"content":"liked","region_number":6},{"content":"embarrassed","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":11},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"clerks","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"author","region_number":5},{"content":"met","region_number":6},{"content":"suspected","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"clerk","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"authors","region_number":5},{"content":"met","region_number":6},{"content":"disguised","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"clerks","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"author","region_number":5},{"content":"met","region_number":6},{"content":"suspected","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"clerk","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"authors","region_number":5},{"content":"met","region_number":6},{"content":"disguised","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":12},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"architects","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"pilot","region_number":5},{"content":"hated",
"region_number":6},{"content":"embarrassed","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"architect","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"pilots","region_number":5},{"content":"hated","region_number":6},{"content":"hated","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"architects","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"pilot","region_number":5},{"content":"hated","region_number":6},{"content":"embarrassed","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"architect","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"pilots","region_number":5},{"content":"hated","region_number":6},{"content":"hated","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":13},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"athletes","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"doctor","region_number":5},{"content":"discussed","region_number":6},{"content":"disguised","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"athlete","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"doctors","region_number":5},{"content":"discussed","region_number":6},{"content":"doubted","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"athletes","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"doctor","region_number":5},{"content":"discussed","region_number":6},{"content":"disguised","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"athlete","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"doctors","region_number":5},{"content":"discussed","region_number":6},{"content":"doubted","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":14},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"actors","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"farmer","region_number":5},{"content":"loved","region_number":6},{"content":"hated","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"actor","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"farmers","region_number":5},{"content":"loved","region_number":6},{"content":"hurt","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"actors","region_number":2},{"content":"that","region_number":3},{"content":"the
","region_number":4},{"content":"farmer","region_number":5},{"content":"loved","region_number":6},{"content":"hated","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"actor","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"farmers","region_number":5},{"content":"loved","region_number":6},{"content":"hurt","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":15},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"ministers","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"manager","region_number":5},{"content":"liked","region_number":6},{"content":"doubted","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"minister","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"managers","region_number":5},{"content":"liked","region_number":6},{"content":"injured","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"ministers","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"manager","region_number":5},{"content":"liked","region_number":6},{"content":"doubted","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"minister","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"managers","region_number":5},{"content":"liked","region_number":6},{"content":"injured","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":16},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"taxi drivers","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"customer","region_number":5},{"content":"met","region_number":6},{"content":"hurt","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"taxi driver","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"customers","region_number":5},{"content":"met","region_number":6},{"content":"suspected","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"taxi drivers","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"customer","region_number":5},{"content":"met","region_number":6},{"content":"hurt","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"taxi 
driver","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"customers","region_number":5},{"content":"met","region_number":6},{"content":"suspected","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":17},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"secretaries","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"officer","region_number":5},{"content":"hated","region_number":6},{"content":"injured","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"secretary","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"officers","region_number":5},{"content":"hated","region_number":6},{"content":"embarrassed","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"secretaries","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"officer","region_number":5},{"content":"hated","region_number":6},{"content":"injured","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"secretary","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"officers","region_number":5},{"content":"hated","region_number":6},{"content":"embarrassed","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":18},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"executives","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"teacher","region_number":5},{"content":"discussed","region_number":6},{"content":"suspected","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"executive","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"teachers","region_number":5},{"content":"discussed","region_number":6},{"content":"disguised","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"executives","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"teacher","region_number":5},{"content":"discussed","region_number":6},{"content":"suspected","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"executive","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"teachers","region_number":5},{"content":"discussed","region_number":6},{"content":"disguised","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":19}],"meta":{"author":"","comment":null,"metric":"sum","name":"reflexive_orc_fem","reference":"\"Marvin R. & Linzen T. (2018). Targeted syntactic evaluation of language models. 
\""},"predictions":[{"formula":"(8;%match_sing%) < (8;%mismatch_sing%)","type":"formula"},{"formula":"(8;%match_plural%) < (8;%mismatch_plural%)","type":"formula"}],"region_meta":{"1":"intro","2":"np_subject","3":"that","4":"the","5":"embed_np","6":"embed_vp","7":"matrix_v","8":"reflexive"}}
--------------------------------------------------------------------------------
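All of the test-suite files in `data/test_suites/` share the schema visible above: a suite is a list of `items`, each item a set of minimally differing `conditions`, and each condition a list of numbered `regions` whose contents concatenate into the stimulus sentence; the top-level `region_meta` names each region and `predictions` states the expected surprisal inequalities. A minimal sketch of loading one suite and reconstructing its stimuli (the path comes from the tree above; the `render` helper is illustrative, not part of the repo):

```python
import json

# Load one suite; the field names below match the JSON shown above.
with open("data/test_suites/reflexive_number_agreement_feminine_object_relative.json") as f:
    suite = json.load(f)

print(suite["region_meta"])  # {"1": "intro", ..., "8": "reflexive"}

def render(condition):
    # Join the per-region contents back into the full stimulus sentence.
    return " ".join(region["content"] for region in condition["regions"])

for item in suite["items"][:2]:
    for cond in item["conditions"]:
        print(item["item_number"], cond["condition_name"], "->", render(cond))
```

For the `mismatch_*` conditions this yields the ungrammatical strings (e.g. "The teachers that the minister hated hurt herself") that the suite's predictions expect the model to disprefer.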
/data/test_suites/subject_verb_number_agreement_with_prepositional_phrase.json:
--------------------------------------------------------------------------------
1 | {"items":[{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"authors","region_number":2},{"content":"next to","region_number":3},{"content":"the","region_number":4},{"content":"senator","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"author","region_number":2},{"content":"next to","region_number":3},{"content":"the","region_number":4},{"content":"senators","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"authors","region_number":2},{"content":"next to","region_number":3},{"content":"the","region_number":4},{"content":"senator","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"author","region_number":2},{"content":"next to","region_number":3},{"content":"the","region_number":4},{"content":"senators","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]}],"item_number":1},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"pilots","region_number":2},{"content":"behind","region_number":3},{"content":"the","region_number":4},{"content":"teacher","region_number":5},{"content":"bring","region_number":6},{"content":"love to people","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"pilot","region_number":2},{"content":"behind","region_number":3},{"content":"the","region_number":4},{"content":"teachers","region_number":5},{"content":"brings","region_number":6},{"content":"love to people","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"pilots","region_number":2},{"content":"behind","region_number":3},{"content":"the","region_number":4},{"content":"teacher","region_number":5},{"content":"brings","region_number":6},{"content":"love to people","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"pilot","region_number":2},{"content":"behind","region_number":3},{"content":"the","region_number":4},{"content":"teachers","region_number":5},{"content":"bring","region_number":6},{"content":"love to people","region_number":7}]}],"item_number":2},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"doctors","region_number":2},{"content":"in front of","region_number":3},{"content":"the","region_number":4},{"content":"guard","region_number":5},{"content":"interest","region_number":6},{"content":"people","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"doctor","region_number":2},{"content":"in front of","region_number":3},{"content":"the","region_number":4},{"content":"guards","region_number":5},{"content":"interests","region_number":6},{"content":"people","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"doctors","region_number":2},{"content":"in front 
of","region_number":3},{"content":"the","region_number":4},{"content":"guard","region_number":5},{"content":"interests","region_number":6},{"content":"people","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"doctor","region_number":2},{"content":"in front of","region_number":3},{"content":"the","region_number":4},{"content":"guards","region_number":5},{"content":"interest","region_number":6},{"content":"people","region_number":7}]}],"item_number":3},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"farmers","region_number":2},{"content":"near","region_number":3},{"content":"the","region_number":4},{"content":"clerk","region_number":5},{"content":"know","region_number":6},{"content":"many people","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"farmer","region_number":2},{"content":"near","region_number":3},{"content":"the","region_number":4},{"content":"clerks","region_number":5},{"content":"knows","region_number":6},{"content":"many people","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"farmers","region_number":2},{"content":"near","region_number":3},{"content":"the","region_number":4},{"content":"clerk","region_number":5},{"content":"knows","region_number":6},{"content":"many people","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"farmer","region_number":2},{"content":"near","region_number":3},{"content":"the","region_number":4},{"content":"clerks","region_number":5},{"content":"know","region_number":6},{"content":"many people","region_number":7}]}],"item_number":4},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"managers","region_number":2},{"content":"to the side of","region_number":3},{"content":"the","region_number":4},{"content":"architect","region_number":5},{"content":"like","region_number":6},{"content":"to gamble","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"manager","region_number":2},{"content":"to the side of","region_number":3},{"content":"the","region_number":4},{"content":"architects","region_number":5},{"content":"likes","region_number":6},{"content":"to gamble","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"managers","region_number":2},{"content":"to the side of","region_number":3},{"content":"the","region_number":4},{"content":"architect","region_number":5},{"content":"likes","region_number":6},{"content":"to gamble","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"manager","region_number":2},{"content":"to the side of","region_number":3},{"content":"the","region_number":4},{"content":"architects","region_number":5},{"content":"like","region_number":6},{"content":"to gamble","region_number":7}]}],"item_number":5},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"customers","region_number":2},{"content":"across from","region_number":3},{"content":"the","region_number":4},{"content":"athlete","region_number":5},{"content":"enjoy","region_number":6},{"content":"playing 
tennis","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"customer","region_number":2},{"content":"across from","region_number":3},{"content":"the","region_number":4},{"content":"athletes","region_number":5},{"content":"enjoys","region_number":6},{"content":"playing tennis","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"customers","region_number":2},{"content":"across from","region_number":3},{"content":"the","region_number":4},{"content":"athlete","region_number":5},{"content":"enjoys","region_number":6},{"content":"playing tennis","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"customer","region_number":2},{"content":"across from","region_number":3},{"content":"the","region_number":4},{"content":"athletes","region_number":5},{"content":"enjoy","region_number":6},{"content":"playing tennis","region_number":7}]}],"item_number":6},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"officers","region_number":2},{"content":"next to","region_number":3},{"content":"the","region_number":4},{"content":"actor","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"officer","region_number":2},{"content":"next to","region_number":3},{"content":"the","region_number":4},{"content":"actors","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"officers","region_number":2},{"content":"next to","region_number":3},{"content":"the","region_number":4},{"content":"actor","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"officer","region_number":2},{"content":"next 
to","region_number":3},{"content":"the","region_number":4},{"content":"actors","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]}],"item_number":7},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"teachers","region_number":2},{"content":"behind","region_number":3},{"content":"the","region_number":4},{"content":"minister","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"teacher","region_number":2},{"content":"behind","region_number":3},{"content":"the","region_number":4},{"content":"ministers","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"teachers","region_number":2},{"content":"behind","region_number":3},{"content":"the","region_number":4},{"content":"minister","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"teacher","region_number":2},{"content":"behind","region_number":3},{"content":"the","region_number":4},{"content":"ministers","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]}],"item_number":8},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"senators","region_number":2},{"content":"in front of","region_number":3},{"content":"the","region_number":4},{"content":"actor","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"senator","region_number":2},{"content":"in front of","region_number":3},{"content":"the","region_number":4},{"content":"actors","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"senators","region_number":2},{"content":"in front of","region_number":3},{"content":"the","region_number":4},{"content":"actor","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"senator","region_number":2},{"content":"in front 
of","region_number":3},{"content":"the","region_number":4},{"content":"actors","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]}],"item_number":9},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"consultants","region_number":2},{"content":"near","region_number":3},{"content":"the","region_number":4},{"content":"secretary","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"consultant","region_number":2},{"content":"near","region_number":3},{"content":"the","region_number":4},{"content":"secretaries","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"consultants","region_number":2},{"content":"near","region_number":3},{"content":"the","region_number":4},{"content":"secretary","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"consultant","region_number":2},{"content":"near","region_number":3},{"content":"the","region_number":4},{"content":"secretaries","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]}],"item_number":10},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"guards","region_number":2},{"content":"to the side of","region_number":3},{"content":"the","region_number":4},{"content":"executive","region_number":5},{"content":"are","region_number":6},{"content":"playing tennis","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"guard","region_number":2},{"content":"to the side of","region_number":3},{"content":"the","region_number":4},{"content":"executives","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"guards","region_number":2},{"content":"to the side of","region_number":3},{"content":"the","region_number":4},{"content":"executive","region_number":5},{"content":"is","region_number":6},{"content":"playing tennis","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"guard","region_number":2},{"content":"to the side of","region_number":3},{"content":"the","region_number":4},{"content":"executives","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]}],"item_number":11},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"clerks","region_number":2},{"content":"across from","region_number":3},{"content":"the","region_number":4},{"content":"author","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"clerk","region_number":2},{"content":"across 
from","region_number":3},{"content":"the","region_number":4},{"content":"authors","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"clerks","region_number":2},{"content":"across from","region_number":3},{"content":"the","region_number":4},{"content":"author","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"clerk","region_number":2},{"content":"across from","region_number":3},{"content":"the","region_number":4},{"content":"authors","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]}],"item_number":12},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"architects","region_number":2},{"content":"next to","region_number":3},{"content":"the","region_number":4},{"content":"pilot","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"architect","region_number":2},{"content":"next to","region_number":3},{"content":"the","region_number":4},{"content":"pilots","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"architects","region_number":2},{"content":"next to","region_number":3},{"content":"the","region_number":4},{"content":"pilot","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"architect","region_number":2},{"content":"next to","region_number":3},{"content":"the","region_number":4},{"content":"pilots","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]}],"item_number":13},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"athletes","region_number":2},{"content":"behind","region_number":3},{"content":"the","region_number":4},{"content":"doctor","region_number":5},{"content":"bring","region_number":6},{"content":"good feelings","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"athlete","region_number":2},{"content":"behind","region_number":3},{"content":"the","region_number":4},{"content":"doctors","region_number":5},{"content":"brings","region_number":6},{"content":"good feelings","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"athletes","region_number":2},{"content":"behind","region_number":3},{"content":"the","region_number":4},{"content":"doctor","region_number":5},{"content":"brings","region_number":6},{"content":"good feelings","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"athlete","region_number":2},{"content":"behind","region_number":3},{"content":"the","region_number":4},{"content":"doctors","region_number":5},{"content":"bring","region_number":6},{"content":"good 
feelings","region_number":7}]}],"item_number":14},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"actors","region_number":2},{"content":"in front of","region_number":3},{"content":"the","region_number":4},{"content":"farmer","region_number":5},{"content":"interest","region_number":6},{"content":"people","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"actor","region_number":2},{"content":"in front of","region_number":3},{"content":"the","region_number":4},{"content":"farmers","region_number":5},{"content":"interests","region_number":6},{"content":"people","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"actors","region_number":2},{"content":"in front of","region_number":3},{"content":"the","region_number":4},{"content":"farmer","region_number":5},{"content":"interests","region_number":6},{"content":"people","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"actor","region_number":2},{"content":"in front of","region_number":3},{"content":"the","region_number":4},{"content":"farmers","region_number":5},{"content":"interest","region_number":6},{"content":"people","region_number":7}]}],"item_number":15},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"ministers","region_number":2},{"content":"near","region_number":3},{"content":"the","region_number":4},{"content":"manager","region_number":5},{"content":"know","region_number":6},{"content":"tennis","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"minister","region_number":2},{"content":"near","region_number":3},{"content":"the","region_number":4},{"content":"managers","region_number":5},{"content":"knows","region_number":6},{"content":"many people","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"ministers","region_number":2},{"content":"near","region_number":3},{"content":"the","region_number":4},{"content":"manager","region_number":5},{"content":"knows","region_number":6},{"content":"tennis","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"minister","region_number":2},{"content":"near","region_number":3},{"content":"the","region_number":4},{"content":"managers","region_number":5},{"content":"know","region_number":6},{"content":"many people","region_number":7}]}],"item_number":16},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"taxi drivers","region_number":2},{"content":"to the side of","region_number":3},{"content":"the","region_number":4},{"content":"customer","region_number":5},{"content":"like","region_number":6},{"content":"tennis","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"taxi driver","region_number":2},{"content":"to the side of","region_number":3},{"content":"the","region_number":4},{"content":"customers","region_number":5},{"content":"likes","region_number":6},{"content":"to gamble","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"taxi drivers","region_number":2},{"content":"to the side 
of","region_number":3},{"content":"the","region_number":4},{"content":"customer","region_number":5},{"content":"likes","region_number":6},{"content":"tennis","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"taxi driver","region_number":2},{"content":"to the side of","region_number":3},{"content":"the","region_number":4},{"content":"customers","region_number":5},{"content":"like","region_number":6},{"content":"to gamble","region_number":7}]}],"item_number":17},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"secretaries","region_number":2},{"content":"across from","region_number":3},{"content":"the","region_number":4},{"content":"officer","region_number":5},{"content":"enjoy","region_number":6},{"content":"tennis","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"secretary","region_number":2},{"content":"across from","region_number":3},{"content":"the","region_number":4},{"content":"officers","region_number":5},{"content":"enjoys","region_number":6},{"content":"playing tennis","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"secretaries","region_number":2},{"content":"across from","region_number":3},{"content":"the","region_number":4},{"content":"officer","region_number":5},{"content":"enjoys","region_number":6},{"content":"tennis","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"secretary","region_number":2},{"content":"across from","region_number":3},{"content":"the","region_number":4},{"content":"officers","region_number":5},{"content":"enjoy","region_number":6},{"content":"playing tennis","region_number":7}]}],"item_number":18},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"executives","region_number":2},{"content":"next to","region_number":3},{"content":"the","region_number":4},{"content":"teacher","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"executive","region_number":2},{"content":"next to","region_number":3},{"content":"the","region_number":4},{"content":"teachers","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"executives","region_number":2},{"content":"next to","region_number":3},{"content":"the","region_number":4},{"content":"teacher","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"executive","region_number":2},{"content":"next to","region_number":3},{"content":"the","region_number":4},{"content":"teachers","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]}],"item_number":19}],"meta":{"author":"","comment":null,"metric":"sum","name":"number_prep","reference":"\"Marvin R. & Linzen T. (2018). Targeted syntactic evaluation of language models. 
\""},"predictions":[{"formula":"(6;%match_sing%) < (6;%mismatch_sing%)","type":"formula"},{"formula":"(6;%match_plural%) < (6;%mismatch_plural%)","type":"formula"}],"region_meta":{"1":"intro","2":"np_subject","3":"prep","4":"the","5":"prep_np","6":"matrix_v","7":"continuation"}}
2 |
--------------------------------------------------------------------------------
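Each suite's `predictions` encode the expected behaviour as inequalities over region surprisals, e.g. `(6;%match_sing%) < (6;%mismatch_sing%)`: with `metric` set to `sum`, the summed surprisal at region 6 (the `matrix_v` region in this suite) should be lower in the grammatical condition. A rough sketch of checking such a formula, assuming per-region surprisals have already been computed; this is only an illustration of the formula format, not the repo's own evaluator:

```python
import re

def check_formula(formula: str, surprisal: dict) -> bool:
    # surprisal: condition_name -> {region_number: summed surprisal}
    m = re.fullmatch(r"\((\d+);%(\w+)%\) < \((\d+);%(\w+)%\)", formula)
    lhs_region, lhs_cond, rhs_region, rhs_cond = m.groups()
    return surprisal[lhs_cond][int(lhs_region)] < surprisal[rhs_cond][int(rhs_region)]

# Hypothetical surprisal values, for illustration only:
scores = {"match_sing": {6: 3.2}, "mismatch_sing": {6: 5.7}}
print(check_formula("(6;%match_sing%) < (6;%mismatch_sing%)", scores))  # True
```

Note that the critical region differs across suites (region 6 here, region 8 for the reflexive suite above); `region_meta` is what ties a formula's region number back to a linguistic label.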
/data/test_suites/subject_verb_number_agreement_with_subject_relative_clause.json:
--------------------------------------------------------------------------------
1 | {"items":[{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"authors","region_number":2},{"content":"that","region_number":3},{"content":"hurt","region_number":4},{"content":"the","region_number":5},{"content":"senator","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"author","region_number":2},{"content":"that","region_number":3},{"content":"hurt","region_number":4},{"content":"the","region_number":5},{"content":"senators","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"authors","region_number":2},{"content":"that","region_number":3},{"content":"hurt","region_number":4},{"content":"the","region_number":5},{"content":"senator","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"author","region_number":2},{"content":"that","region_number":3},{"content":"hurt","region_number":4},{"content":"the","region_number":5},{"content":"senators","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]}],"item_number":1},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"pilots","region_number":2},{"content":"that","region_number":3},{"content":"injured","region_number":4},{"content":"the","region_number":5},{"content":"teacher","region_number":6},{"content":"bring","region_number":7},{"content":"love to people","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"pilot","region_number":2},{"content":"that","region_number":3},{"content":"injured","region_number":4},{"content":"the","region_number":5},{"content":"teachers","region_number":6},{"content":"brings","region_number":7},{"content":"love to people","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"pilots","region_number":2},{"content":"that","region_number":3},{"content":"injured","region_number":4},{"content":"the","region_number":5},{"content":"teacher","region_number":6},{"content":"brings","region_number":7},{"content":"love to people","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"pilot","region_number":2},{"content":"that","region_number":3},{"content":"injured","region_number":4},{"content":"the","region_number":5},{"content":"teachers","region_number":6},{"content":"bring","region_number":7},{"content":"love to 
people","region_number":8}]}],"item_number":2},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"doctors","region_number":2},{"content":"that","region_number":3},{"content":"ignored","region_number":4},{"content":"the","region_number":5},{"content":"guard","region_number":6},{"content":"interest","region_number":7},{"content":"people","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"doctor","region_number":2},{"content":"that","region_number":3},{"content":"ignored","region_number":4},{"content":"the","region_number":5},{"content":"guards","region_number":6},{"content":"interests","region_number":7},{"content":"people","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"doctors","region_number":2},{"content":"that","region_number":3},{"content":"ignored","region_number":4},{"content":"the","region_number":5},{"content":"guard","region_number":6},{"content":"interests","region_number":7},{"content":"people","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"doctor","region_number":2},{"content":"that","region_number":3},{"content":"ignored","region_number":4},{"content":"the","region_number":5},{"content":"guards","region_number":6},{"content":"interest","region_number":7},{"content":"people","region_number":8}]}],"item_number":3},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"farmers","region_number":2},{"content":"that","region_number":3},{"content":"embarrassed","region_number":4},{"content":"the","region_number":5},{"content":"clerk","region_number":6},{"content":"know","region_number":7},{"content":"many people","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"farmer","region_number":2},{"content":"that","region_number":3},{"content":"embarrassed","region_number":4},{"content":"the","region_number":5},{"content":"clerks","region_number":6},{"content":"knows","region_number":7},{"content":"many people","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"farmers","region_number":2},{"content":"that","region_number":3},{"content":"embarrassed","region_number":4},{"content":"the","region_number":5},{"content":"clerk","region_number":6},{"content":"knows","region_number":7},{"content":"many people","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"farmer","region_number":2},{"content":"that","region_number":3},{"content":"embarrassed","region_number":4},{"content":"the","region_number":5},{"content":"clerks","region_number":6},{"content":"know","region_number":7},{"content":"many people","region_number":8}]}],"item_number":4},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"managers","region_number":2},{"content":"that","region_number":3},{"content":"disguised","region_number":4},{"content":"the","region_number":5},{"content":"architect","region_number":6},{"content":"like","region_number":7},{"content":"to 
gamble","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"manager","region_number":2},{"content":"that","region_number":3},{"content":"disguised","region_number":4},{"content":"the","region_number":5},{"content":"architects","region_number":6},{"content":"likes","region_number":7},{"content":"to gamble","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"managers","region_number":2},{"content":"that","region_number":3},{"content":"disguised","region_number":4},{"content":"the","region_number":5},{"content":"architect","region_number":6},{"content":"likes","region_number":7},{"content":"to gamble","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"manager","region_number":2},{"content":"that","region_number":3},{"content":"disguised","region_number":4},{"content":"the","region_number":5},{"content":"architects","region_number":6},{"content":"like","region_number":7},{"content":"to gamble","region_number":8}]}],"item_number":5},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"customers","region_number":2},{"content":"that","region_number":3},{"content":"hated","region_number":4},{"content":"the","region_number":5},{"content":"athlete","region_number":6},{"content":"enjoy","region_number":7},{"content":"playing tennis","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"customer","region_number":2},{"content":"that","region_number":3},{"content":"hated","region_number":4},{"content":"the","region_number":5},{"content":"athletes","region_number":6},{"content":"enjoys","region_number":7},{"content":"playing tennis","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"customers","region_number":2},{"content":"that","region_number":3},{"content":"hated","region_number":4},{"content":"the","region_number":5},{"content":"athlete","region_number":6},{"content":"enjoys","region_number":7},{"content":"playing tennis","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"customer","region_number":2},{"content":"that","region_number":3},{"content":"hated","region_number":4},{"content":"the","region_number":5},{"content":"athletes","region_number":6},{"content":"enjoy","region_number":7},{"content":"playing 
tennis","region_number":8}]}],"item_number":6},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"officers","region_number":2},{"content":"that","region_number":3},{"content":"liked","region_number":4},{"content":"the","region_number":5},{"content":"actor","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"officer","region_number":2},{"content":"that","region_number":3},{"content":"liked","region_number":4},{"content":"the","region_number":5},{"content":"actors","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"officers","region_number":2},{"content":"that","region_number":3},{"content":"liked","region_number":4},{"content":"the","region_number":5},{"content":"actor","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"officer","region_number":2},{"content":"that","region_number":3},{"content":"liked","region_number":4},{"content":"the","region_number":5},{"content":"actors","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]}],"item_number":7},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"teachers","region_number":2},{"content":"that","region_number":3},{"content":"hurt","region_number":4},{"content":"the","region_number":5},{"content":"minister","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"teacher","region_number":2},{"content":"that","region_number":3},{"content":"hurt","region_number":4},{"content":"the","region_number":5},{"content":"ministers","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"teachers","region_number":2},{"content":"that","region_number":3},{"content":"hurt","region_number":4},{"content":"the","region_number":5},{"content":"minister","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"teacher","region_number":2},{"content":"that","region_number":3},{"content":"hurt","region_number":4},{"content":"the","region_number":5},{"content":"ministers","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]}],"item_number":8},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"senators","region_number":2},{"content":"that","region_number":3},{"content":"injured","region_number":4},{"content":"the","region_number":5},{"content":"actor","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"senator","region_number":2},{"content":"that","region_number":3},{"content":"injured","region_number":4},{"content":"the","region_number":5},{"content":"actors","region_number":6},{"content":"is","region_number":7},{"conte
nt":"good","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"senators","region_number":2},{"content":"that","region_number":3},{"content":"injured","region_number":4},{"content":"the","region_number":5},{"content":"actor","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"senator","region_number":2},{"content":"that","region_number":3},{"content":"injured","region_number":4},{"content":"the","region_number":5},{"content":"actors","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]}],"item_number":9},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"consultants","region_number":2},{"content":"that","region_number":3},{"content":"ignored","region_number":4},{"content":"the","region_number":5},{"content":"secretary","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"consultant","region_number":2},{"content":"that","region_number":3},{"content":"ignored","region_number":4},{"content":"the","region_number":5},{"content":"secretaries","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"consultants","region_number":2},{"content":"that","region_number":3},{"content":"ignored","region_number":4},{"content":"the","region_number":5},{"content":"secretary","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"consultant","region_number":2},{"content":"that","region_number":3},{"content":"ignored","region_number":4},{"content":"the","region_number":5},{"content":"secretaries","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]}],"item_number":10},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"guards","region_number":2},{"content":"that","region_number":3},{"content":"embarrassed","region_number":4},{"content":"the","region_number":5},{"content":"executive","region_number":6},{"content":"are","region_number":7},{"content":"playing tennis","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"guard","region_number":2},{"content":"that","region_number":3},{"content":"embarrassed","region_number":4},{"content":"the","region_number":5},{"content":"executives","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"guards","region_number":2},{"content":"that","region_number":3},{"content":"embarrassed","region_number":4},{"content":"the","region_number":5},{"content":"executive","region_number":6},{"content":"is","region_number":7},{"content":"playing 
tennis","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"guard","region_number":2},{"content":"that","region_number":3},{"content":"embarrassed","region_number":4},{"content":"the","region_number":5},{"content":"executives","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]}],"item_number":11},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"clerks","region_number":2},{"content":"that","region_number":3},{"content":"disguised","region_number":4},{"content":"the","region_number":5},{"content":"author","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"clerk","region_number":2},{"content":"that","region_number":3},{"content":"disguised","region_number":4},{"content":"the","region_number":5},{"content":"authors","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"clerks","region_number":2},{"content":"that","region_number":3},{"content":"disguised","region_number":4},{"content":"the","region_number":5},{"content":"author","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"clerk","region_number":2},{"content":"that","region_number":3},{"content":"disguised","region_number":4},{"content":"the","region_number":5},{"content":"authors","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]}],"item_number":12},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"architects","region_number":2},{"content":"that","region_number":3},{"content":"hated","region_number":4},{"content":"the","region_number":5},{"content":"pilot","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"architect","region_number":2},{"content":"that","region_number":3},{"content":"hated","region_number":4},{"content":"the","region_number":5},{"content":"pilots","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"architects","region_number":2},{"content":"that","region_number":3},{"content":"hated","region_number":4},{"content":"the","region_number":5},{"content":"pilot","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"architect","region_number":2},{"content":"that","region_number":3},{"content":"hated","region_number":4},{"content":"the","region_number":5},{"content":"pilots","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]}],"item_number":13},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"athletes","region_number":2},{"content":"that","region_number":3},{"content":"admired","region_number":4},{"content":"the","region_number":5},{"content":"doctor","region_number":6},{"content":"bring
","region_number":7},{"content":"good feelings","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"athlete","region_number":2},{"content":"that","region_number":3},{"content":"admired","region_number":4},{"content":"the","region_number":5},{"content":"doctors","region_number":6},{"content":"brings","region_number":7},{"content":"good feelings","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"athletes","region_number":2},{"content":"that","region_number":3},{"content":"admired","region_number":4},{"content":"the","region_number":5},{"content":"doctor","region_number":6},{"content":"brings","region_number":7},{"content":"good feelings","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"athlete","region_number":2},{"content":"that","region_number":3},{"content":"admired","region_number":4},{"content":"the","region_number":5},{"content":"doctors","region_number":6},{"content":"bring","region_number":7},{"content":"good feelings","region_number":8}]}],"item_number":14},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"actors","region_number":2},{"content":"that","region_number":3},{"content":"hurt","region_number":4},{"content":"the","region_number":5},{"content":"farmer","region_number":6},{"content":"interest","region_number":7},{"content":"people","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"actor","region_number":2},{"content":"that","region_number":3},{"content":"hurt","region_number":4},{"content":"the","region_number":5},{"content":"farmers","region_number":6},{"content":"interests","region_number":7},{"content":"people","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"actors","region_number":2},{"content":"that","region_number":3},{"content":"hurt","region_number":4},{"content":"the","region_number":5},{"content":"farmer","region_number":6},{"content":"interests","region_number":7},{"content":"people","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"actor","region_number":2},{"content":"that","region_number":3},{"content":"hurt","region_number":4},{"content":"the","region_number":5},{"content":"farmers","region_number":6},{"content":"interest","region_number":7},{"content":"people","region_number":8}]}],"item_number":15},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"ministers","region_number":2},{"content":"that","region_number":3},{"content":"injured","region_number":4},{"content":"the","region_number":5},{"content":"manager","region_number":6},{"content":"know","region_number":7},{"content":"tennis","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"minister","region_number":2},{"content":"that","region_number":3},{"content":"injured","region_number":4},{"content":"the","region_number":5},{"content":"managers","region_number":6},{"content":"knows","region_number":7},{"content":"many 
people","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"ministers","region_number":2},{"content":"that","region_number":3},{"content":"injured","region_number":4},{"content":"the","region_number":5},{"content":"manager","region_number":6},{"content":"knows","region_number":7},{"content":"tennis","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"minister","region_number":2},{"content":"that","region_number":3},{"content":"injured","region_number":4},{"content":"the","region_number":5},{"content":"managers","region_number":6},{"content":"know","region_number":7},{"content":"many people","region_number":8}]}],"item_number":16},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"taxi drivers","region_number":2},{"content":"that","region_number":3},{"content":"ignored","region_number":4},{"content":"the","region_number":5},{"content":"customer","region_number":6},{"content":"like","region_number":7},{"content":"tennis","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"taxi driver","region_number":2},{"content":"that","region_number":3},{"content":"ignored","region_number":4},{"content":"the","region_number":5},{"content":"customers","region_number":6},{"content":"likes","region_number":7},{"content":"to gamble","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"taxi drivers","region_number":2},{"content":"that","region_number":3},{"content":"ignored","region_number":4},{"content":"the","region_number":5},{"content":"customer","region_number":6},{"content":"likes","region_number":7},{"content":"tennis","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"taxi driver","region_number":2},{"content":"that","region_number":3},{"content":"ignored","region_number":4},{"content":"the","region_number":5},{"content":"customers","region_number":6},{"content":"like","region_number":7},{"content":"to gamble","region_number":8}]}],"item_number":17},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"secretaries","region_number":2},{"content":"that","region_number":3},{"content":"embarrassed","region_number":4},{"content":"the","region_number":5},{"content":"officer","region_number":6},{"content":"enjoy","region_number":7},{"content":"tennis","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"secretary","region_number":2},{"content":"that","region_number":3},{"content":"embarrassed","region_number":4},{"content":"the","region_number":5},{"content":"officers","region_number":6},{"content":"enjoys","region_number":7},{"content":"playing 
tennis","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"secretaries","region_number":2},{"content":"that","region_number":3},{"content":"embarrassed","region_number":4},{"content":"the","region_number":5},{"content":"officer","region_number":6},{"content":"enjoys","region_number":7},{"content":"tennis","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"secretary","region_number":2},{"content":"that","region_number":3},{"content":"embarrassed","region_number":4},{"content":"the","region_number":5},{"content":"officers","region_number":6},{"content":"enjoy","region_number":7},{"content":"playing tennis","region_number":8}]}],"item_number":18},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"executives","region_number":2},{"content":"that","region_number":3},{"content":"disguised","region_number":4},{"content":"the","region_number":5},{"content":"teacher","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"executive","region_number":2},{"content":"that","region_number":3},{"content":"disguised","region_number":4},{"content":"the","region_number":5},{"content":"teachers","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"executives","region_number":2},{"content":"that","region_number":3},{"content":"disguised","region_number":4},{"content":"the","region_number":5},{"content":"teacher","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"executive","region_number":2},{"content":"that","region_number":3},{"content":"disguised","region_number":4},{"content":"the","region_number":5},{"content":"teachers","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]}],"item_number":19}],"meta":{"author":"","comment":null,"metric":"sum","name":"number_src","reference":"\"Marvin R. & Linzen T. (2018). Targeted syntactic evaluation of language models. \""},"predictions":[{"formula":"(7;%match_sing%) < (7;%mismatch_sing%)","type":"formula"},{"formula":"(7;%match_plural%) < (7;%mismatch_plural%)","type":"formula"}],"region_meta":{"1":"intro","2":"np_subject","3":"that","4":"embed_vp","5":"the","6":"embed_np","7":"matrix_v","8":"continuation"}}
2 |
--------------------------------------------------------------------------------
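The suite above follows the SyntaxGym schema: a list of items, each with four conditions split into numbered regions, plus `predictions` asserting that the summed surprisal at region 7 (the matrix verb, per `region_meta`) is lower in each `match` condition than in its `mismatch` counterpart. Below is a minimal sketch of loading it; the filename and the top-level `items` key are assumptions based on the repository layout and the standard SyntaxGym format:

```python
import json

# hypothetical path; this suite ("number_src") appears to correspond to the
# subject/verb agreement task with a subject relative clause
path = "data/test_suites/subject_verb_number_agreement_with_subject_relative_clause.json"
with open(path) as f:
    suite = json.load(f)

print(suite["region_meta"]["7"])  # "matrix_v", the region the predictions compare
for cond in suite["items"][0]["conditions"]:  # "items" key assumed per SyntaxGym
    print(cond["condition_name"], " ".join(r["content"] for r in cond["regions"]))
```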
/diff_methods.py:
--------------------------------------------------------------------------------
1 | """Baseline methods for finding a feature direction from activations (compared against DAS)."""
2 | import torch
3 | from collections import defaultdict
4 |
5 | from sklearn.cluster import KMeans
6 | from sklearn.decomposition import PCA
7 | from sklearn.linear_model import LogisticRegression
8 | from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
9 | from sklearn.utils._testing import ignore_warnings
10 |
11 |
12 | # mean diff
13 | def mean_diff(activations, labels, eval_activations, eval_labels):
14 | means, counts = {}, defaultdict(int)
15 |
16 | # accumulate
17 | for activation, label in zip(activations, labels):
18 | if label not in means:
19 | means[label] = torch.zeros_like(activation)
20 | means[label] += activation
21 | counts[label] += 1
22 |
23 | # calc means
24 | for k in means:
25 | means[k] /= counts[k]
26 |
27 | # make vector
28 | vecs = list(means.values())
29 | vec = vecs[1] - vecs[0]
30 | return vec / torch.norm(vec), None
31 |
32 |
33 | @ignore_warnings(category=Warning)
34 | def kmeans_diff(activations, labels, eval_activations, eval_labels):
35 | # fit kmeans
36 | kmeans = KMeans(n_clusters=2, random_state=0, n_init=10).fit(activations)
37 |
38 | # make vector
39 | vecs = kmeans.cluster_centers_
40 | vec = torch.tensor(vecs[0] - vecs[1], dtype=torch.float32)
41 | return vec / torch.norm(vec), None
42 |
43 |
44 | def pca_diff(n_components=1):
45 | def diff_func(activations, labels, eval_activations, eval_labels):
46 | # fit pca
47 | pca = PCA(n_components=n_components).fit(activations)
48 | explained_variance = sum(pca.explained_variance_ratio_)
49 |
50 | # average all components
51 | vec = torch.tensor(pca.components_.mean(axis=0), dtype=torch.float32)
52 | return vec / torch.norm(vec), explained_variance
53 | return diff_func
54 |
55 |
56 | def probe_diff(fit_intercept=False, penalty='l2', solver="lbfgs", C=1.0) -> callable:
57 | @ignore_warnings(category=Warning)
58 | def diff_func(activations, labels, eval_activations, eval_labels):
59 | # fit lr
60 | lr = LogisticRegression(random_state=0, max_iter=1000, l1_ratio=0.5,
61 | fit_intercept=fit_intercept, C=C,
62 | penalty=penalty, solver=solver).fit(activations, labels)
63 | accuracy = lr.score(eval_activations, eval_labels)
64 |
65 | # extract weight
66 | vec = torch.tensor(lr.coef_[0], dtype=torch.float32)
67 | return vec / torch.norm(vec), accuracy
68 | return diff_func
69 |
70 |
71 | def lda_diff(activations, labels, eval_activations, eval_labels):
72 | # fit lda
73 | lda = LinearDiscriminantAnalysis(n_components=1).fit(activations, labels)
74 | accuracy = lda.score(eval_activations, eval_labels)
75 |
76 | # extract weight
77 | vec = torch.tensor(lda.coef_[0], dtype=torch.float32)
78 | return vec / torch.norm(vec), accuracy
79 |
80 |
81 | def random_diff(activations, labels, eval_activations, eval_labels):
82 | vec = torch.randn_like(activations[0])
83 | return vec / torch.norm(vec), None
84 |
85 |
86 | method_mapping = {
87 | "mean": mean_diff,
88 | "kmeans": kmeans_diff,
89 | "pca": pca_diff(n_components=1),
90 | "lda": lda_diff,
91 | "random": random_diff,
92 | }
93 |
94 | probe_mapping = {
95 | "EleutherAI/pythia-14m": [probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-1)],
96 | "EleutherAI/pythia-31m": [probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-2)],
97 | "EleutherAI/pythia-70m": [probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-3)],
98 | "EleutherAI/pythia-160m": [
99 | probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-4),
100 | probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-5)
101 | ],
102 | "EleutherAI/pythia-410m": [
103 | probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-4),
104 | probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-5)
105 | ],
106 | "EleutherAI/pythia-1b": [
107 | probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-5),
108 | probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-6)
109 | ],
110 | "EleutherAI/pythia-1.4b": [
111 | probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-5),
112 | probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-6)
113 | ],
114 | "EleutherAI/pythia-2.8b": [
115 | probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-5),
116 | probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-6)
117 | ],
118 | "EleutherAI/pythia-6.9b": [
119 | probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-6),
120 | probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-7)
121 | ],
122 | "EleutherAI/pythia-12b": [
123 | probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-6),
124 | probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-7)
125 | ],
126 | }
127 |
128 | additional_method_mapping = {
129 | # various pca components (up to 5)
130 | "pca_2": pca_diff(n_components=2),
131 | "pca_3": pca_diff(n_components=3),
132 | "pca_4": pca_diff(n_components=4),
133 | "pca_5": pca_diff(n_components=5),
134 |
135 | # various linear probe types
136 | "probe_noreg_noint": probe_diff(fit_intercept=False, penalty=None, solver="saga", C=1.0),
137 | "probe_noreg_int": probe_diff(fit_intercept=True, penalty=None, solver="saga", C=1.0),
138 |
139 | "probe_l1_noint_1": probe_diff(fit_intercept=False, penalty='l1', solver="saga", C=1.0),
140 | "probe_l2_noint_1": probe_diff(fit_intercept=False, penalty='l2', solver="saga", C=1.0),
141 | "probe_elastic_noint_1": probe_diff(fit_intercept=False, penalty="elasticnet", solver="saga", C=1.0),
142 | "probe_l1_int_1": probe_diff(fit_intercept=True, penalty='l1', solver="saga", C=1.0),
143 | "probe_l2_int_1": probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1.0),
144 | "probe_elastic_int_1": probe_diff(fit_intercept=True, penalty="elasticnet", solver="saga", C=1.0),
145 |
146 | "probe_l1_noint_0.1": probe_diff(fit_intercept=False, penalty='l1', solver="saga", C=0.1),
147 | "probe_l2_noint_0.1": probe_diff(fit_intercept=False, penalty='l2', solver="saga", C=0.1),
148 | "probe_elastic_noint_0.1": probe_diff(fit_intercept=False, penalty="elasticnet", solver="saga", C=0.1),
149 | "probe_l1_int_0.1": probe_diff(fit_intercept=True, penalty='l1', solver="saga", C=0.1),
150 | "probe_l2_int_0.1": probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=0.1),
151 | "probe_elastic_int_0.1": probe_diff(fit_intercept=True, penalty="elasticnet", solver="saga", C=0.1),
152 |
153 | "probe_l1_noint_0.001": probe_diff(fit_intercept=False, penalty='l1', solver="saga", C=0.001),
154 | "probe_l2_noint_0.001": probe_diff(fit_intercept=False, penalty='l2', solver="saga", C=0.001),
155 | "probe_elastic_noint_0.001": probe_diff(fit_intercept=False, penalty="elasticnet", solver="saga", C=0.001),
156 | "probe_l1_int_0.001": probe_diff(fit_intercept=True, penalty='l1', solver="saga", C=0.001),
157 | "probe_l2_int_0.001": probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=0.001),
158 | "probe_elastic_int_0.001": probe_diff(fit_intercept=True, penalty="elasticnet", solver="saga", C=0.001),
159 |
160 | "probe_l2_int_0.01": probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=0.01),
161 | "probe_l2_int_0.0001": probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=0.0001),
162 | }
--------------------------------------------------------------------------------
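Every baseline above shares one interface: it receives training activations and string labels (plus held-out activations, used by methods that report a diagnostic) and returns a unit-norm direction together with an optional scalar (explained variance for PCA, held-out accuracy for probes and LDA, `None` otherwise). A minimal sketch on synthetic data (the toy activations below are an assumption, not repository data):

```python
import torch
from diff_methods import method_mapping

# two classes offset along the first coordinate of a 16-dim activation space
torch.manual_seed(0)
acts = [torch.randn(16) + (i % 2) * 5.0 * torch.eye(16)[0] for i in range(100)]
labels = ["plural" if i % 2 else "singular" for i in range(100)]

for name in ["mean", "kmeans", "pca", "lda", "random"]:
    vec, diag = method_mapping[name](acts, labels, acts, labels)
    print(f"{name}: shape={tuple(vec.shape)}, norm={torch.norm(vec).item():.4f}, diag={diag}")
```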
/eval.py:
--------------------------------------------------------------------------------
1 | import torch
2 | from torch.nn import CrossEntropyLoss
3 | from utils import get_last_token
4 | import pyvene as pv
5 | from data import Batch
6 |
7 | loss_fct = CrossEntropyLoss()
8 |
9 |
10 | def calculate_loss(logits: torch.Tensor, label: torch.Tensor) -> torch.Tensor:
11 |     """Calculate cross-entropy between logits and a single target label (can be batched)."""
12 | shift_logits = logits.contiguous()
13 | shift_labels = label.to(shift_logits.device)
14 | loss = loss_fct(shift_logits, shift_labels)
15 | return loss
16 |
17 |
18 | @torch.no_grad()
19 | def eval(intervenable: pv.IntervenableModel, evalset: list[Batch],
20 | layer_i: int, pos_i: int, strategy: str) -> tuple[list[dict], dict, list[tuple]]:
21 | """Evaluate an intervention on an evalset."""
22 |
23 | data, activations = [], []
24 | for batch in evalset:
25 |
26 | # inference
27 | pos_interv = [[x[pos_i] for x in y] for y in batch.compute_pos(strategy)]
28 | base_outputs, counterfactual_outputs = intervenable(
29 | batch.base,
30 | [None, batch.src],
31 | {"sources->base": ([None, pos_interv[1]], pos_interv)},
32 | output_original_output=True
33 | )
34 |
35 | # store activations/labels for training non-causal methods
36 | for batch_i in range(len(batch.pairs)):
37 | for unit_i in range(base_outputs[-1][batch_i].shape[0]):
38 | activation = base_outputs[-1][batch_i][unit_i].detach().cpu()
39 | activations.append((activation, batch.base_types[batch_i]))
40 |
41 | # get last token probs
42 | logits = get_last_token(counterfactual_outputs.logits, batch.base['attention_mask'])
43 | probs = logits.log_softmax(dim=-1)
44 | base_logits = get_last_token(base_outputs[0].logits, batch.base['attention_mask'])
45 | base_probs = base_logits.log_softmax(dim=-1)
46 | loss = calculate_loss(logits, batch.src_labels)
47 |
48 | # get probs
49 | for batch_i in range(len(batch.pairs)):
50 | src_label = batch.src_labels[batch_i]
51 | base_label = batch.base_labels[batch_i]
52 | # riia = 1 if logits[batch_i][src_label].item() > logits[batch_i][base_label].item() else 0
53 | # odds_ratio = (base_probs[batch_i][base_label] - base_probs[batch_i][src_label]) + (probs[batch_i][src_label] - probs[batch_i][base_label])
54 |
55 | # store stats
56 | data.append({
57 | "src_label": src_label.item(),
58 | "base_label": base_label.item(),
59 | "loss": loss.item(),
60 | "p_base": probs[batch_i][base_label].item(),
61 | "p_src": probs[batch_i][src_label].item(),
62 | "base_p_base": base_probs[batch_i][base_label].item(),
63 | "base_p_src": base_probs[batch_i][src_label].item(),
64 | "layer": layer_i,
65 | "pos": pos_i
66 | })
67 |
68 | # summary metrics
69 | summary = {
70 | "iia": sum([d['p_src'] > d['p_base'] for d in data]) / len(data),
71 | "iia-flip": sum([d['p_src'] > d['p_base'] for d in data if d['base_p_base'] > d['base_p_src']]) / len(data),
72 | "odds_ratio": sum([d['base_p_base'] - d['base_p_src'] + d['p_src'] - d['p_base'] for d in data]) / len(data),
73 | "eval_loss": sum([d['loss'] for d in data]) / len(data),
74 | }
75 |
76 |     # return per-example stats, summary metrics, and collected base activations
77 | return data, summary, activations
78 |
79 |
80 | def augment_data(data: list[dict], information: dict) -> list[dict]:
81 | """Add information to a list of dicts."""
82 | for d in data:
83 | d.update(information)
84 | return data
85 |
--------------------------------------------------------------------------------
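The `summary` block computes the benchmark's main metric, interchange intervention accuracy (IIA): the fraction of eval pairs where patching makes the source label more probable than the base label. `odds_ratio` instead accumulates the log-odds movement from the unintervened run to the counterfactual run. A hedged illustration with made-up log-probabilities:

```python
# two toy eval records (assumed values); p_* are post-intervention log-probs,
# base_p_* are the unintervened model's log-probs
data = [
    {"p_src": -0.2, "p_base": -1.9, "base_p_base": -0.3, "base_p_src": -2.1},
    {"p_src": -1.5, "p_base": -0.4, "base_p_base": -0.5, "base_p_src": -1.8},
]
iia = sum(d["p_src"] > d["p_base"] for d in data) / len(data)  # 0.5
odds_ratio = sum(d["base_p_base"] - d["base_p_src"] + d["p_src"] - d["p_base"]
                 for d in data) / len(data)
print(iia, odds_ratio)
```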
/interventions.py:
--------------------------------------------------------------------------------
1 | import pyvene as pv
2 | from pyvene.models.layers import LowRankRotateLayer
3 | from pyvene.models.modeling_utils import b_sd_to_bsd, bsd_to_b_sd
4 | import torch
5 |
6 | class PooledLowRankRotatedSpaceIntervention(pv.TrainableIntervention, pv.DistributedRepresentationIntervention):
7 |
8 |     """Low-rank rotated-space intervention that mean-pools the source representation across intervened units."""
9 |
10 | def __init__(self, **kwargs):
11 | super().__init__(**kwargs)
12 | rotate_layer = LowRankRotateLayer(self.embed_dim, kwargs["low_rank_dimension"])
13 | self.rotate_layer = torch.nn.utils.parametrizations.orthogonal(rotate_layer)
14 | # TODO: put them into a parent class
15 | self.register_buffer('embed_dim', torch.tensor(self.embed_dim))
16 | self.register_buffer('interchange_dim', torch.tensor(self.embed_dim))
17 |
18 | def forward(self, base, source, subspaces=None):
19 | num_unit = (base.shape[1] // int(self.embed_dim))
20 | base = b_sd_to_bsd(base, num_unit)
21 | source = b_sd_to_bsd(source, num_unit)
22 | rotated_base = self.rotate_layer(base)
23 | rotated_source = self.rotate_layer(source).mean(dim=1).unsqueeze(1).repeat(1, base.shape[1], 1)
24 | output = base + torch.matmul(
25 | (rotated_source - rotated_base), self.rotate_layer.weight.T
26 | )
27 | output = bsd_to_b_sd(output)
28 | return output.to(base.dtype)
29 |
30 | def __str__(self):
31 |         return f"PooledLowRankRotatedSpaceIntervention(embed_dim={self.embed_dim})"
32 |
33 |
34 | class CollectIntervention(pv.CollectIntervention):
35 |
36 | """Collect activations."""
37 |
38 | def __init__(self, **kwargs):
39 | super().__init__(**kwargs)
40 |
41 | def forward(self, base, source=None, subspaces=None):
42 | return base
43 |
44 | def __str__(self):
45 | return f"CollectIntervention(embed_dim={self.embed_dim})"
46 |
47 |
48 | def intervention_config(intervention_site, intervention_type, layer, num_dims):
49 | """Generate intervention config."""
50 | intervenable_config = pv.IntervenableConfig([
51 | {
52 | "layer": layer,
53 | "component": intervention_site,
54 | "intervention_type": CollectIntervention
55 | },
56 | {
57 | "layer": layer,
58 | "component": intervention_site,
59 | "intervention_type": intervention_type,
60 | "low_rank_dimension": num_dims
61 | }
62 | ])
63 | return intervenable_config
--------------------------------------------------------------------------------
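`intervention_config` always pairs a `CollectIntervention` with the trainable intervention at the same site, which is why `eval()` and `train_das()` read harvested base activations out of `base_outputs[-1]`. A sketch of instantiating it, assuming a loaded `GPTNeoX`-family model `gpt`:

```python
import pyvene as pv
from interventions import intervention_config, PooledLowRankRotatedSpaceIntervention

config = intervention_config(
    intervention_site="block_output",  # residual-stream site used throughout the repo
    intervention_type=pv.LowRankRotatedSpaceIntervention,  # pooled variant for strategy="all"
    layer=3,
    num_dims=1,  # rank-1 DAS subspace
)
intervenable = pv.IntervenableModel(config, gpt)
intervenable.set_device(gpt.device)
intervenable.disable_model_gradients()  # only the rotation gets trained
```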
/prompt.py:
--------------------------------------------------------------------------------
1 | from transformers import AutoTokenizer, AutoModelForCausalLM
2 | from utils import WEIGHTS, top_vals, format_token
3 | import torch
4 |
5 | with torch.no_grad():
6 | # load model
7 | model = input("Model: ")
8 | device = "cuda:0" if torch.cuda.is_available() else "cpu"
9 | tokenizer = AutoTokenizer.from_pretrained(model)
10 | tokenizer.pad_token = tokenizer.eos_token
11 | gpt = AutoModelForCausalLM.from_pretrained(
12 | model,
13 | revision="main",
14 | torch_dtype=WEIGHTS.get(model, torch.bfloat16) if device == "cuda:0" else torch.float32,
15 | ).to(device)
16 |
17 | # make data
18 | while True:
19 | text = input("Text: ")
20 | text = tokenizer(text, return_tensors="pt").to(device)
21 | print([format_token(tokenizer, i) for i in text.input_ids[0]])
22 | logits = gpt(**text).logits[0, -1]
23 | probs = logits.softmax(-1)
24 | top_vals(tokenizer, probs)
--------------------------------------------------------------------------------
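`prompt.py` is a small interactive loop for sanity-checking next-token predictions. A non-interactive equivalent (a sketch; the model choice and prompt are assumptions):

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from utils import top_vals

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-70m")
gpt = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-70m")

inputs = tokenizer("The keys to the cabinet", return_tensors="pt")
with torch.no_grad():
    probs = gpt(**inputs).logits[0, -1].softmax(-1)
top_vals(tokenizer, probs)  # pretty-prints the top-10 next tokens
```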
/requirements.txt:
--------------------------------------------------------------------------------
1 | numpy==1.26.4
2 | pandas==2.2.2
3 | plotnine==0.14.1
4 | pyvene==0.1.6
5 | scikit_learn==1.5.2
6 | scipy==1.13.1
7 | torch==2.5.1
8 | tqdm==4.66.6
9 | transformers==4.46.2
10 |
--------------------------------------------------------------------------------
/test_all.py:
--------------------------------------------------------------------------------
1 | from data import list_datasets
2 | from das import experiment
3 | import argparse
4 | from transformers import AutoTokenizer, AutoModelForCausalLM
5 | from utils import WEIGHTS
6 | import torch
7 | import os
8 |
9 | def run_command(
10 | tokenizer: AutoTokenizer,
11 | gpt: AutoModelForCausalLM,
12 | model_name: str,
13 | dataset: str,
14 | lr: float,
15 | only_das: bool,
16 | hparam_non_das: bool,
17 | das_label: str,
18 | revision: str,
19 | folder: str,
20 | manipulate: str,
21 | ):
22 | # command = f"python das.py --model EleutherAI/pythia-70m --intervention {method} --dataset {dataset} --position each --num-tokens 1 --num-dims 1 --steps {steps}"
23 | print(dataset)
24 | experiment(
25 | model=model_name,
26 | dataset=dataset,
27 | steps=100,
28 | eval_steps=100,
29 | grad_steps=1,
30 | batch_size=4,
31 | intervention_site="block_output",
32 | strategy="last",
33 | lr=lr,
34 | only_das=only_das,
35 | hparam_non_das=hparam_non_das,
36 | das_label=das_label,
37 | revision=revision,
38 | log_folder=folder,
39 | manipulate=manipulate,
40 | tokenizer=tokenizer,
41 | gpt=gpt,
42 | )
43 |
44 | def main(
45 | model: str, lr: float=5e-3, hparam_non_das: bool=False, only_das: bool=False,
46 | das_label: str=None, start: int=None, end: int=None, folder: str="das", revision: str="main",
47 |     manipulate: str=None, datasets: list=None):
48 |
49 | # load model + tokenizer
50 | device = "cuda:0" if torch.cuda.is_available() else "cpu"
51 | tokenizer = AutoTokenizer.from_pretrained(model)
52 | tokenizer.pad_token = tokenizer.eos_token
53 | gpt = AutoModelForCausalLM.from_pretrained(
54 | model,
55 | revision=revision,
56 | torch_dtype=WEIGHTS.get(model, torch.bfloat16) if device == "cuda:0" else torch.float32,
57 | ).to(device)
58 |
59 | # run commands
60 | if datasets is None:
61 | datasets = [d for d in list_datasets() if d.startswith("syntaxgym/")]
62 | print(len(datasets))
63 |
64 | # start/end
65 | if start is None:
66 | start = 0
67 | if end is None:
68 | end = len(datasets)
69 |
70 | # make folder
71 | if not os.path.exists(f"logs/{folder}"):
72 | os.makedirs(f"logs/{folder}")
73 |
74 | for dataset in datasets[start:end]:
75 | run_command(tokenizer, gpt, model, dataset, lr, only_das, hparam_non_das, das_label, revision, folder, manipulate)
76 |
77 | if __name__ == "__main__":
78 | parser = argparse.ArgumentParser()
79 | parser.add_argument("--model", type=str, default="EleutherAI/pythia-70m")
80 | parser.add_argument("--lr", type=float, default=5e-3)
81 | parser.add_argument("--only-das", action="store_true")
82 |     parser.add_argument("--hparam-non-das", action="store_true")
83 | parser.add_argument("--das-label", type=str, default=None)
84 | parser.add_argument("--start", type=int, default=None)
85 | parser.add_argument("--end", type=int, default=None)
86 | parser.add_argument("--folder", type=str, default="das")
87 | parser.add_argument("--revision", type=str, default="main")
88 | parser.add_argument("--manipulate", type=str, default=None)
89 | parser.add_argument("--datasets", nargs='+', default=None)
90 | args = parser.parse_args()
91 | main(**vars(args))
92 |
--------------------------------------------------------------------------------
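The sweep can also be driven from Python rather than the CLI; a sketch equivalent to the default argparse invocation (the task id passed to `datasets` is hypothetical):

```python
from test_all import main

main(
    model="EleutherAI/pythia-70m",
    lr=5e-3,
    folder="das",
    datasets=["syntaxgym/number_src"],  # hypothetical id; omit to sweep all syntaxgym/* tasks
)
```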
/train.py:
--------------------------------------------------------------------------------
1 | import torch
2 | from transformers import get_linear_schedule_with_warmup
3 |
4 | from eval import augment_data, calculate_loss, eval
5 | from utils import get_last_token
6 | from interventions import intervention_config, PooledLowRankRotatedSpaceIntervention
7 | import pyvene as pv
8 | from data import Batch
9 |
10 | def train_das(
11 | intervenable: pv.IntervenableModel, trainset: list[Batch], evalset: list[Batch],
12 | layer_i: int, pos_i: int, strategy: str, eval_steps: int, grad_steps: int, lr: float,
13 | epochs: int=1, das_label: str="das"):
14 | """Train DAS or Boundless DAS on a model."""
15 |
16 | # setup
17 | data, activations, eval_activations, stats = [], [], [], {}
18 | total_steps = len(trainset) * epochs
19 |     warm_up_steps = int(0.1 * total_steps)
20 |
21 | # optimizer
22 | optimizer_params = []
23 | for k, v in intervenable.interventions.items():
24 | if isinstance(v[0], pv.LowRankRotatedSpaceIntervention) or isinstance(v[0], PooledLowRankRotatedSpaceIntervention):
25 | optimizer_params.append({"params": v[0].rotate_layer.parameters()})
26 | elif isinstance(v[0], pv.BoundlessRotatedSpaceIntervention):
27 | optimizer_params.append({"params": v[0].rotate_layer.parameters()})
28 | optimizer_params.append({"params": v[0].intervention_boundaries, "lr": 1e-2})
29 | optimizer = torch.optim.Adam(optimizer_params, lr=lr)
30 | # print("model trainable parameters: ", count_parameters(intervenable.model))
31 | # print("intervention trainable parameters: ", intervenable.count_parameters())
32 |
33 | # scheduler
34 | scheduler = get_linear_schedule_with_warmup(
35 | optimizer,
36 | num_warmup_steps=warm_up_steps,
37 | num_training_steps=total_steps,
38 | )
39 |
40 | # temperature for boundless
41 | total_step = 0
42 | temperature_start = 50.0
43 | temperature_end = 0.1
44 | temperature_schedule = (
45 | torch.linspace(temperature_start, temperature_end, total_steps)
46 | .to(torch.bfloat16)
47 | .to(intervenable.get_device())
48 | )
49 | intervenable.set_temperature(temperature_schedule[total_step])
50 |
51 | # train
52 | iterator = trainset * epochs
53 | total_loss = torch.tensor(0.0).to(intervenable.get_device())
54 |
55 | for step, batch in enumerate(iterator):
56 |
57 | # inference
58 | pos_interv = [[x[pos_i] for x in y] for y in batch.compute_pos(strategy)]
59 | base_outputs, counterfactual_outputs = intervenable(
60 | base=batch.base,
61 | sources=[None, batch.src],
62 | unit_locations={"sources->base": ([None, pos_interv[1]], pos_interv)},
63 | )
64 |
65 | # store activations/labels for training non-causal methods
66 | for batch_i in range(len(batch.pairs)):
67 | for unit_i in range(base_outputs[-1][batch_i].shape[0]):
68 | activation = base_outputs[-1][batch_i][unit_i].detach().cpu()
69 | activations.append((activation, batch.base_types[batch_i]))
70 |
71 | # get last token logits
72 | logits = get_last_token(counterfactual_outputs.logits, batch.base['attention_mask'])
73 |
74 | # loss and backprop
75 | loss = calculate_loss(logits, batch.src_labels)
76 | total_loss += loss
77 |
78 | # gradient accumulation
79 | if total_step % grad_steps == 0:
80 |
81 | # print stats
82 | stats["lr"] = scheduler.optimizer.param_groups[0]['lr']
83 | stats["loss"] = total_loss.item()
84 | for k, v in intervenable.interventions.items():
85 | if isinstance(v[0], pv.BoundlessRotatedSpaceIntervention):
86 | stats["bound"] = v[0].intervention_boundaries.sum() * v[0].embed_dim
87 |
88 | # backward
89 | if not (grad_steps > 1 and total_step == 0):
90 | total_loss.backward()
91 | total_loss = torch.tensor(0.0).to(intervenable.get_device())
92 | optimizer.step()
93 | scheduler.step()
94 | intervenable.set_zero_grad()
95 | intervenable.set_temperature(temperature_schedule[total_step])
96 |
97 | # eval
98 | if (step % eval_steps == 0 or step == total_steps - 1) and step != 0:
99 | more_data, summary, eval_activation = eval(intervenable, evalset, layer_i, pos_i, strategy)
100 | if eval_activations == []:
101 | eval_activations = eval_activation
102 | stats.update(summary)
103 | print(step, stats)
104 | data.extend(augment_data(more_data, {"method": das_label, "step": step}))
105 |
106 | total_step += 1
107 |
108 | # return data
109 | diff_vector = None
110 | for k, v in intervenable.interventions.items():
111 | if isinstance(v[0], pv.LowRankRotatedSpaceIntervention) or isinstance(v[0], PooledLowRankRotatedSpaceIntervention):
112 |             diff_vector = v[0].rotate_layer.weight.detach().cpu().tolist()
113 | break
114 | intervenable._cleanup_states()
115 | return intervenable, data, activations, eval_activations, diff_vector
116 |
117 |
118 | def train_feature_direction(
119 | method: str, intervenable: pv.IntervenableModel, activations: list[tuple[torch.tensor, str]],
120 | eval_activations: list[tuple[torch.tensor, str]], evalset: list[Batch], layer_i: int,
121 | pos_i: int, strategy: str, intervention_site: str, method_mapping: dict[str, callable]) -> tuple[list[dict], dict]:
122 | """Train/compute and evaluate an intervention direction on some activations."""
123 |
124 | # get diff vector based on method
125 | labels = [label for _, label in activations]
126 | activations = [activation.type(torch.float32) for activation, _ in activations]
127 | eval_labels = [label for _, label in eval_activations]
128 | eval_activations = [activation.type(torch.float32) for activation, _ in eval_activations]
129 |
130 | diff_vector, accuracy = method_mapping[method](activations, labels, eval_activations, eval_labels)
131 | diff_vector = diff_vector.to(intervenable.get_device()).unsqueeze(1)
132 |
133 | # new config
134 | eval_config = intervention_config(
135 | intervention_site,
136 | pv.LowRankRotatedSpaceIntervention if strategy != "all" else PooledLowRankRotatedSpaceIntervention,
137 | layer_i, 1
138 | )
139 | intervenable2 = pv.IntervenableModel(eval_config, intervenable.model)
140 | intervenable2.set_device(intervenable.get_device())
141 | intervenable2.disable_model_gradients()
142 | for k, v in intervenable2.interventions.items():
143 | if isinstance(v[0], pv.LowRankRotatedSpaceIntervention) or isinstance(v[0], PooledLowRankRotatedSpaceIntervention):
144 | v[0].rotate_layer.weight = diff_vector
145 |
146 | # eval
147 | data, summary, _ = eval(intervenable2, evalset, layer_i, pos_i, strategy)
148 | if accuracy is not None:
149 | summary["accuracy"] = accuracy
150 |
151 | # done
152 | intervenable2._cleanup_states()
153 | return augment_data(data, {"method": method, "step": -1, "accuracy": accuracy}), summary, diff_vector.detach().cpu().tolist()
--------------------------------------------------------------------------------
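A sketch of running one benchmark cell with `train_das`, assuming an `intervenable`, `trainset`, and `evalset` have already been constructed (in the repo that plumbing lives in `das.py`); the hyperparameters mirror the defaults in `test_all.py`:

```python
from train import train_das

intervenable, data, acts, eval_acts, direction = train_das(
    intervenable, trainset, evalset,
    layer_i=3, pos_i=0, strategy="last",
    eval_steps=100, grad_steps=1, lr=5e-3,
)
# `direction` is the learned rank-1 rotation, directly comparable to the
# unit-norm vectors produced by the diff_methods baselines
```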
/utils.py:
--------------------------------------------------------------------------------
1 | from torch import float32, bfloat16, float16, topk, arange
2 | from collections import namedtuple
3 | import random
4 | from transformers import AutoTokenizer
5 | import csv
6 |
7 |
8 | # models and weight format
9 | MODELS = [
10 | # "gpt2",
11 | # "gpt2-medium",
12 | # "gpt2-large",
13 | # "gpt2-xl",
14 | "EleutherAI/pythia-14m",
15 | "EleutherAI/pythia-31m",
16 | "EleutherAI/pythia-70m",
17 | "EleutherAI/pythia-160m",
18 | "EleutherAI/pythia-410m",
19 | "EleutherAI/pythia-1b",
20 | "EleutherAI/pythia-1.4b",
21 | "EleutherAI/pythia-2.8b",
22 | "EleutherAI/pythia-6.9b",
23 | "EleutherAI/pythia-12b",
24 | ]
25 |
26 |
27 | WEIGHTS = {
28 | # "gpt2": float32,
29 | # "gpt2-medium": float32,
30 | # "gpt2-large": float32,
31 | # "gpt2-xl": float32,
32 | "EleutherAI/pythia-14m": float32,
33 | "EleutherAI/pythia-31m": float32,
34 | "EleutherAI/pythia-70m": float32,
35 | "EleutherAI/pythia-160m": float32,
36 | "EleutherAI/pythia-410m": float32,
37 | "EleutherAI/pythia-1b": bfloat16,
38 | "EleutherAI/pythia-1.4b": float16,
39 | "EleutherAI/pythia-2.8b": float16,
40 | "EleutherAI/pythia-6.9b": float16,
41 | "EleutherAI/pythia-12b": float16,
42 | }
43 |
44 |
45 | parameters = {
46 | "pythia-12b": 11846072320,
47 | "pythia-6.9b": 6857302016,
48 | "pythia-2.8b": 2775208960,
49 | "pythia-1.4b": 1414647808,
50 | "pythia-1b": 1011781632,
51 | "pythia-410m": 405334016,
52 | "pythia-160m": 162322944,
53 | "pythia-70m": 70426624,
54 | "pythia-31m": 31000000,
55 | "pythia-14m": 14000000,
56 | }
57 |
58 |
59 | def format_token(tokenizer, tok):
60 | """Format the token for some path patching experiment to show decoding diff"""
61 | return tokenizer.decode(tok).replace(" ", "_").replace("\n", "\\n")
62 |
63 | def top_vals(tokenizer, res, highlight=[], n=10):
64 | """Pretty print the top n values of a distribution over the vocabulary"""
65 | _, top_indices = topk(res, n)
66 | top_indices = top_indices.tolist() + highlight
67 | for i in range(len(top_indices)):
68 | val = top_indices[i]
69 | tok = format_token(tokenizer, val)
70 | if val in highlight:
71 | tok = f"\x1b[6;30;42m{tok}\x1b[0m"
72 |             print(f"{tok:<34} {val:>5} {res[top_indices[i]].item():>10.4%}")  # extra width offsets the 14 invisible ANSI chars
73 | else:
74 | print(f"{tok:<20} {val:>5} {res[top_indices[i]].item():>10.4%}")
75 |
76 | def get_last_token(logits, attention_mask):
77 | last_token_indices = attention_mask.sum(1) - 1
78 | batch_indices = arange(logits.size(0)).unsqueeze(1)
79 | return logits[batch_indices, last_token_indices.unsqueeze(1)].squeeze(1)
--------------------------------------------------------------------------------