├── .gitignore
├── README.md
├── benchmark.py
├── das.py
├── data.py
├── data
│   ├── templates
│   │   ├── data.json
│   │   ├── data_extra.json
│   │   ├── preposing_in_pp.json
│   │   ├── syntaxgym.json
│   │   └── syntaxgym_failed.json
│   └── test_suites
│       ├── center_embedding.json
│       ├── filler_gap_subject.json
│       ├── gss_subord_pp.json
│       ├── mvrr.json
│       ├── mvrr_mod.json
│       ├── npi.json
│       ├── npi2.json
│       ├── npi_ever.json
│       ├── npz_obj.json
│       ├── npz_obj_mod.json
│       ├── npz_v-trans.json
│       ├── reflexive_number_agreement_feminine_object_relative.json
│       ├── subject_verb_number_agreement_with_prepositional_phrase.json
│       ├── subject_verb_number_agreement_with_subject_relative_clause.json
│       └── subordination.json
├── diff_methods.py
├── eval.py
├── interventions.py
├── plot.py
├── prompt.py
├── requirements.txt
├── test_all.py
├── train.py
└── utils.py
/.gitignore: -------------------------------------------------------------------------------- 1 | *.out 2 | .DS_Store 3 | __pycache__/ 4 | *.ipynb 5 | logs/ 6 | .ipynb_checkpoints/ 7 | .vscode/ 8 | figs/ 9 | deprecated/figs/ 10 | *.profile 11 | data/huggingface/ -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 
2 | 3 | # CausalGym 4 | 5 |
6 | 7 | Aryaman Arora, Dan Jurafsky, and Christopher Potts. 2024. [CausalGym: Benchmarking causal interpretability methods on linguistic tasks](https://aclanthology.org/2024.acl-long.785/). In _Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)_, pages 14638–14663, Bangkok, Thailand. Association for Computational Linguistics. 8 | 9 | *HuggingFace dataset*: [aryaman/causalgym](https://huggingface.co/datasets/aryaman/causalgym) 10 | 11 | 12 | 13 | **CausalGym** is a benchmark for comparing the performance of causal interpretability methods on a variety of simple linguistic tasks taken from the SyntaxGym evaluation set ([Gauthier et al., 2020](https://aclanthology.org/2020.acl-demos.10/), [Hu et al., 2020](https://aclanthology.org/2020.acl-main.158/)) and converted into a format suitable for interventional interpretability. 14 | 15 | This repository includes code for: 16 | - Training DAS and all the other methods benchmarked in the paper, on every region, layer, and task for a given model. This is sufficient for replicating all experiments in the paper (including hyperparameter sweeps and interpretability during training). 17 | - Reproducing every plot in the paper. 18 | - Template specifications for every task in the benchmark and utils for generating examples, tokenizing, generating non-overlapping train/test sets, and so on. 19 | - Testing model outputs on the task templates; this was used to design the benchmark tasks. 20 | 21 | You can also download the train/dev/test splits for each task as used in the paper via [HuggingFace](https://huggingface.co/datasets/aryaman/causalgym). 22 | 23 | If you are having trouble getting anything running, do not hesitate to file an issue! We would love to help you benchmark your new method or help you replicate the results from our paper. 24 | 25 | ## Instructions 26 | 27 | > [!IMPORTANT] 28 | > The implementations in this repo are only for `GPTNeoX`-type language models (e.g. the `pythia` series) and will probably not work for other architectures without some modifications. 29 | 30 | First install the requirements (a fresh environment is probably best): 31 | 32 | ```bash 33 | pip install -r requirements.txt 34 | ``` 35 | 36 | ### Training 37 | 38 | To train every method, layer, region, and task for `pythia-70m` (results are logged to the directory `logs/das/`): 39 | 40 | ```bash 41 | python test_all.py --model EleutherAI/pythia-70m 42 | ``` 43 | 44 | To do the same but with the dog-give control task used to compute selectivity: 45 | 46 | ```bash 47 | python test_all.py --model EleutherAI/pythia-70m --manipulate dog-give 48 | ``` 49 | 50 | To run just the Preposing in PP extension: 51 | 52 | ```bash 53 | python test_all.py --model EleutherAI/pythia-70m --datasets preposing_in_pp/preposing_in_pp preposing_in_pp/preposing_in_pp_embed_1 54 | ``` 55 | 56 | 57 | ### Analysis + plots 58 | 59 | Once you have run this for several models, you can create results tables (like those found in the appendix) with: 60 | 61 | ```bash 62 | python plot.py --file logs/das/ --plot summary --metric odds --reload 63 | ``` 64 | 65 | This also caches intermediate results in a CSV file in the same directory, so you don't need to use the `--reload` option again unless you need to recompute statistics. 
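If you would rather post-process the raw results yourself, each run is dumped as a single JSON log by `das.py`, with three top-level keys: `metadata` (the run configuration), `data` (per-method evaluation records), and `vec` (the learned feature direction for each method, layer, and position). A minimal sketch of iterating over the logs (assuming the default `logs/das/` location used above):

```python
import glob
import json

# each log file corresponds to one (model, dataset, manipulation) run
for path in glob.glob("logs/das/*.json"):
    with open(path) as f:
        log = json.load(f)
    print(log["metadata"]["model"], log["metadata"]["dataset"], len(log["data"]))
```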
66 | 67 | To produce the causal tracing-style plots for all methods: 68 | 69 | ```bash 70 | python plot.py --file logs/das/ --plot pos_all --metric odds 71 | ``` 72 | 73 | To visualize just runs from the Preposing in PP extension: 74 | 75 | ```bash 76 | python plot.py --file logs/das/ --plot pos_all --metric odds --template_filename preposing_in_pp 77 | ``` 78 | 79 | You can also specify a subset of methods: 80 | 81 | ```bash 82 | python plot.py --file logs/das/ --plot pos_t --metric odds --methods das vanilla probe 83 | ``` 84 | 85 | 86 | ## Citation 87 | 88 | Please cite the CausalGym publication: 89 | 90 | ```bibtex 91 | @inproceedings{arora-etal-2024-causalgym, 92 | title = "{C}ausal{G}ym: Benchmarking causal interpretability methods on linguistic tasks", 93 | author = "Arora, Aryaman and Jurafsky, Dan and Potts, Christopher", 94 | editor = "Ku, Lun-Wei and Martins, Andre and Srikumar, Vivek", 95 | booktitle = "Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)", 96 | month = aug, 97 | year = "2024", 98 | address = "Bangkok, Thailand", 99 | publisher = "Association for Computational Linguistics", 100 | url = "https://aclanthology.org/2024.acl-long.785", 101 | doi = "10.18653/v1/2024.acl-long.785", 102 | pages = "14638--14663" 103 | } 104 | 105 | ``` 106 | 107 | Also cite the earlier SyntaxGym papers: 108 | 109 | ```bibtex 110 | @inproceedings{gauthier-etal-2020-syntaxgym, 111 | title = "{S}yntax{G}ym: An Online Platform for Targeted Evaluation of Language Models", 112 | author = "Gauthier, Jon and Hu, Jennifer and Wilcox, Ethan and Qian, Peng and Levy, Roger", 113 | editor = "Celikyilmaz, Asli and Wen, Tsung-Hsien", 114 | booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations", 115 | month = jul, 116 | year = "2020", 117 | address = "Online", 118 | publisher = "Association for Computational Linguistics", 119 | url = "https://aclanthology.org/2020.acl-demos.10", 120 | doi = "10.18653/v1/2020.acl-demos.10", 121 | pages = "70--76", 122 | } 123 | 124 | @inproceedings{hu-etal-2020-systematic, 125 | title = "A Systematic Assessment of Syntactic Generalization in Neural Language Models", 126 | author = "Hu, Jennifer and Gauthier, Jon and Qian, Peng and Wilcox, Ethan and Levy, Roger", 127 | editor = "Jurafsky, Dan and Chai, Joyce and Schluter, Natalie and Tetreault, Joel", 128 | booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics", 129 | month = jul, 130 | year = "2020", 131 | address = "Online", 132 | publisher = "Association for Computational Linguistics", 133 | url = "https://aclanthology.org/2020.acl-main.158", 134 | doi = "10.18653/v1/2020.acl-main.158", 135 | pages = "1725--1744", 136 | } 137 | ``` 138 | 139 | ## Task examples 140 | 141 | | **Task** | **Example** | 142 | |:-------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------| 143 | | ***Agreement*** (4) | | 144 | | `agr_gender` | \[**John**\]\[**Jane**\] walked because \[**he**\]\[**she**\] | 145 | | `agr_sv_num_subj-relc` | The \[**guard**\]\[**guards**\] that hated the manager \[**is**\]\[**are**\] | 146 | | `agr_sv_num_obj-relc` | The \[**guard**\]\[**guards**\] that the customers hated \[**is**\]\[**are**\] | 147 | | `agr_sv_num_pp` | The \[**guard**\]\[**guards**\] behind the managers \[**is**\]\[**are**\] | 148 | | 
***Licensing*** (7) | | 149 | | `agr_refl_num_subj-relc` | The \[**farmer**\]\[**farmers**\] that loved the actors embarrassed \[**himself**\]\[**themselves**\] | 150 | | `agr_refl_num_obj-relc` | The \[**farmer**\]\[**farmers**\] that the actors loved embarrassed \[**himself**\]\[**themselves**\] | 151 | | `agr_refl_num_pp` | The \[**farmer**\]\[**farmers**\] behind the actors embarrassed \[**himself**\]\[**themselves**\] | 152 | | `npi_any_subj-relc` | \[**No**\]\[**The**\] consultant that has helped the taxi driver has shown \[**any**\]\[**some**\] | 153 | | `npi_any_obj-relc` | \[**No**\]\[**The**\] consultant that the taxi driver has helped has shown \[**any**\]\[**some**\] | 154 | | `npi_ever_subj-relc` | \[**No**\]\[**The**\] consultant that has helped the taxi driver has \[**ever**\]\[**never**\] | 155 | | `npi_ever_obj-relc` | \[**No**\]\[**The**\] consultant that the taxi driver has helped has \[**ever**\]\[**never**\] | 156 | | ***Garden path effects*** (6) | | 157 | | `garden_mvrr` | The infant \[**who was**\]\[**⌀**\] brought the sandwich from the kitchen \[**by**\]\[**.**\] | 158 | | `garden_mvrr_mod` | The infant \[**who was**\]\[**⌀**\] brought the sandwich from the kitchen with a new microwave \[**by**\]\[**.**\] | 159 | | `garden_npz_obj` | While the students dressed \[**,**\]\[**⌀**\] the comedian \[**was**\]\[**for**\] | 160 | | `garden_npz_obj_mod` | While the students dressed \[**,**\]\[**⌀**\] the comedian who told bad jokes \[**was**\]\[**for**\] | 161 | | `garden_npz_v-trans` | As the criminal \[**slept**\]\[**shot**\] the woman \[**was**\]\[**for**\] | 162 | | `garden_npz_v-trans_mod` | As the criminal \[**slept**\]\[**shot**\] the woman who told bad jokes \[**was**\]\[**for**\] | 163 | | ***Gross syntactic state*** (4) | | 164 | | `gss_subord` | \[**While the**\]\[**The**\] lawyers lost the plans \[**they**\]\[**.**\] | 165 | | `gss_subord_subj-relc` | \[**While the**\]\[**The**\] lawyers who wore white lab jackets studied the book that described several advances in cancer therapy \[**,**\]\[**.**\] | 166 | | `gss_subord_obj-relc` | \[**While the**\]\[**The**\] lawyers who the spy had contacted repeatedly studied the book that colleagues had written on cancer therapy \[**,**\]\[**.**\] | 167 | | `gss_subord_pp` | \[**While the**\]\[**The**\] lawyers in a long white lab jacket studied the book about several recent advances in cancer therapy \[**,**\]\[**.**\] | 168 | | ***Long-distance dependencies*** (8) | | 169 | | `cleft` | What the young man \[**did**\]\[**ate**\] was \[**make**\]\[**for**\] | 170 | | `cleft_mod` | What the young man \[**did**\]\[**ate**\] after the ingredients had been bought from the store was \[**make**\]\[**for**\] | 171 | | `filler_gap_embed_3` | I know \[**that**\]\[**what**\] the mother said the friend remarked the park attendant reported your friend sent \[**him**\]\[**.**\] | 172 | | `filler_gap_embed_4` | I know \[**that**\]\[**what**\] the mother said the friend remarked the park attendant reported the cop thinks your friend sent \[**him**\]\[**.**\] | 173 | | `filler_gap_hierarchy` | The fact that the brother said \[**that**\]\[**who**\] the friend trusted \[**the**\]\[**was**\] | 174 | | `filler_gap_obj` | I know \[**that**\]\[**what**\] the uncle grabbed \[**him**\]\[**.**\] | 175 | | `filler_gap_pp` | I know \[**that**\]\[**what**\] the uncle grabbed food in front of \[**him**\]\[**.**\] | 176 | | `filler_gap_subj` | I know \[**that**\]\[**who**\] the uncle grabbed food in front of \[**him**\]\[**.**\] | 177 | 
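## Loading the data

To use the exact train/dev/test splits from the paper without generating them locally, you can load the dataset from the HuggingFace Hub. A minimal sketch (assumes the `datasets` library is installed; the field names follow `convert_to_huggingface_format` in `data.py`, and the split names are an assumption based on that function):

```python
from datasets import load_dataset

causalgym = load_dataset("aryaman/causalgym")

# each example is a minimal pair, stored span-by-span
ex = causalgym["train"][0]
print("".join(ex["base"]), "->", ex["base_label"])
print("".join(ex["src"]), "->", ex["src_label"])
```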
-------------------------------------------------------------------------------- /benchmark.py: -------------------------------------------------------------------------------- 1 | """ 2 | Check if a model produces the expected output for a task. 3 | """ 4 | 5 | import torch 6 | from transformers import AutoTokenizer, AutoModelForCausalLM 7 | from data import Dataset, list_datasets 8 | from utils import WEIGHTS, MODELS, top_vals, format_token, get_last_token 9 | import argparse 10 | from tqdm import tqdm 11 | import json 12 | 13 | @torch.no_grad() 14 | def benchmark(model=None, task=None, debug=False, rank=False): 15 | 16 | # get models, data 17 | if model is None: 18 | models = reversed(MODELS) 19 | else: 20 | models = [model] 21 | datasets = [dataset for dataset in list_datasets() if dataset.startswith(f"syntaxgym/{task if task is not None else ''}")] 22 | data = [] 23 | 24 | # benchmark 25 | for model in models: 26 | 27 | # load model 28 | device = "cuda:0" if torch.cuda.is_available() else "cpu" 29 | tokenizer = AutoTokenizer.from_pretrained(model) 30 | tokenizer.pad_token = tokenizer.eos_token 31 | gpt = AutoModelForCausalLM.from_pretrained( 32 | model, 33 | revision="main", 34 | torch_dtype=WEIGHTS.get(model, torch.bfloat16) if device == "cuda:0" else torch.float32, 35 | ).to(device) 36 | gpt.eval() 37 | # print model dtype 38 | print(f"{model:<30} {gpt.dtype}") 39 | 40 | # make data 41 | for dataset in datasets: 42 | data_source = Dataset.load_from(dataset) 43 | trainset = data_source.sample_batches(tokenizer, 8, 50, device, seed=42) 44 | count, correct = 0, 0 45 | probs = {} 46 | 47 | for batch in tqdm(trainset): 48 | # vars 49 | base_label = batch.base_labels 50 | src_label = batch.src_labels 51 | base_type = batch.base_types 52 | src_type = batch.src_types 53 | 54 | # inference 55 | base_output = gpt(**batch.base) 56 | src_output = gpt(**batch.src) 57 | base_logits = get_last_token(base_output.logits, batch.base['attention_mask']) 58 | src_logits = get_last_token(src_output.logits, batch.src['attention_mask']) 59 | 60 | # check accuracy on each unique pair (the second half of the batch repeats the first half with base/src swapped, and the check below is symmetric, so skip it) 61 | for i in range(len(batch.pairs) // 2): 62 | base_probs = torch.softmax(base_logits[i], dim=-1) 63 | src_probs = torch.softmax(src_logits[i], dim=-1) 64 | if base_probs[base_label[i]] > base_probs[src_label[i]] and src_probs[src_label[i]] > src_probs[base_label[i]]: 65 | correct += 1 66 | if debug: 67 | print(base_probs[base_label[i]] > base_probs[src_label[i]] and src_probs[src_label[i]] > src_probs[base_label[i]]) 68 | print(tokenizer.decode(batch.base['input_ids'][i])) 69 | top_vals(tokenizer, base_probs, n=5, highlight=[base_label[i], src_label[i]]) 70 | print(tokenizer.decode(batch.src['input_ids'][i])) 71 | top_vals(tokenizer, src_probs, n=5, highlight=[base_label[i], src_label[i]]) 72 | input() 73 | if count == 0: 74 | probs[base_type[i]] = base_probs 75 | probs[src_type[i]] = src_probs 76 | else: 77 | probs[base_type[i]] += base_probs 78 | probs[src_type[i]] += src_probs 79 | count += 1 80 | 81 | # store stats 82 | data.append({ 83 | "model": model, 84 | "dataset": dataset, 85 | "count": count, 86 | "correct": correct, 87 | "iia": correct / count, 88 | "parameters": gpt.num_parameters(), 89 | }) 90 | print(f"{dataset:<30} {correct / count:>10.2%} ({correct} / {count})") 91 | if rank: 92 | for k, v in probs.items(): 93 | probs[k] = (v / count) 94 | print(k.upper()) 95 | top_vals(tokenizer, probs[k], n=50) 96 | print('---') 97 | print("DIFF") 98 | top_vals(tokenizer, list(probs.values())[1] - list(probs.values())[0], n=50) 99 | print('---') 100 | 
top_vals(tokenizer, list(probs.values())[0] - list(probs.values())[1], n=50) 101 | print('---') 102 | 103 | # save data 104 | with open("logs/benchmark.json", "w") as f: 105 | json.dump(data, f) 106 | 107 | 108 | def main(): 109 | parser = argparse.ArgumentParser() 110 | parser.add_argument("--model", type=str, default=None) 111 | parser.add_argument("--task", type=str, default=None) 112 | parser.add_argument("--debug", action="store_true") 113 | parser.add_argument("--rank", action="store_true") 114 | args = parser.parse_args() 115 | benchmark(**vars(args)) 116 | 117 | if __name__ == "__main__": 118 | main() -------------------------------------------------------------------------------- /das.py: -------------------------------------------------------------------------------- 1 | from numpy import add 2 | import torch 3 | import os 4 | import argparse 5 | from transformers import AutoTokenizer, AutoModelForCausalLM, GPTNeoXForCausalLM 6 | from utils import WEIGHTS 7 | from data import Dataset 8 | from eval import eval, augment_data 9 | from train import train_das, train_feature_direction 10 | from diff_methods import method_mapping, additional_method_mapping, probe_mapping 11 | import datetime 12 | import json 13 | from typing import Union 14 | 15 | from pyvene.models.intervenable_base import IntervenableModel 16 | from interventions import * 17 | 18 | 19 | # make das subdir 20 | if not os.path.exists("figs/das"): 21 | os.makedirs("figs/das") 22 | if not os.path.exists("figs/das/steps"): 23 | os.makedirs("figs/das/steps") 24 | if not os.path.exists("logs/das"): 25 | os.makedirs("logs/das") 26 | 27 | # clear files from figs/das/steps 28 | for file in os.listdir("figs/das/steps"): 29 | os.remove(os.path.join("figs/das/steps", file)) 30 | 31 | 32 | def experiment( 33 | model: str, 34 | dataset: str, 35 | steps: int, 36 | eval_steps: int, 37 | grad_steps: int, 38 | batch_size: int, 39 | intervention_site: str, 40 | strategy: str, 41 | lr: float, 42 | only_das: bool=False, 43 | hparam_non_das: bool=False, 44 | das_label: str=None, 45 | revision: str="main", 46 | log_folder: str="das", 47 | manipulate: Union[str, None]=None, 48 | tokenizer: Union[AutoTokenizer, None]=None, 49 | gpt: Union[AutoModelForCausalLM, None]=None, 50 | ): 51 | """Run a feature-finding experiment.""" 52 | 53 | # load model 54 | total_data = [] 55 | diff_vectors = [] 56 | NOW = datetime.datetime.now().strftime("%Y%m%d%H%M%S%f") 57 | device = "cuda:0" if torch.cuda.is_available() else "cpu" 58 | if tokenizer is None: 59 | tokenizer = AutoTokenizer.from_pretrained(model) 60 | tokenizer.pad_token = tokenizer.eos_token 61 | if gpt is None: 62 | weight_type = WEIGHTS.get(model, torch.float16) if device == "cuda:0" else torch.float32 63 | gpt = GPTNeoXForCausalLM.from_pretrained( 64 | model, 65 | revision=revision, 66 | torch_dtype=weight_type, 67 | use_flash_attention_2=(weight_type in [torch.bfloat16, torch.float16] and device == "cuda:0"), 68 | ).to(device) 69 | print(model, gpt.config.num_hidden_layers) 70 | gpt.eval() 71 | 72 | # make dataset, ensuring examples in trainset are not in evalset 73 | data_source = Dataset.load_from(dataset) 74 | trainset = data_source.sample_batches(tokenizer, batch_size, steps, device, seed=42, manipulate=manipulate) 75 | print(trainset[0]) 76 | discard = set() 77 | for batch in trainset: 78 | for pair in batch.pairs: 79 | discard.add(''.join(pair.base)) 80 | 81 | # evalset 82 | eval_seed = 420 if hparam_non_das else 1 83 | evalset = data_source.sample_batches(tokenizer, batch_size, 25, 
device, seed=eval_seed, discard=discard, manipulate=manipulate) 84 | 85 | # methods 86 | if hparam_non_das: 87 | method_mapping.update(additional_method_mapping) 88 | if model in probe_mapping: 89 | for i, probe_func in enumerate(probe_mapping[model]): 90 | method_mapping[f"probe_{i}"] = probe_func 91 | print(list(method_mapping.keys())) 92 | 93 | # entering train loops 94 | for pos_i in range(data_source.first_var_pos, data_source.length): 95 | if trainset[0].compute_pos(strategy)[0][0][pos_i][0] == -1: 96 | continue 97 | 98 | # per-layer training loop 99 | iterator = range(gpt.config.num_hidden_layers) 100 | for layer_i in iterator: 101 | print(f"position {pos_i} ({data_source.span_names[pos_i]}), layer {layer_i}") 102 | data = [] 103 | 104 | # vanilla intervention 105 | if strategy != "all" and not only_das: 106 | intervenable_config = intervention_config( 107 | intervention_site, pv.VanillaIntervention, layer_i, 0 108 | ) 109 | intervenable = IntervenableModel(intervenable_config, gpt) 110 | intervenable.set_device(device) 111 | intervenable.disable_model_gradients() 112 | 113 | more_data, summary, _ = eval(intervenable, evalset, layer_i, pos_i, strategy) 114 | intervenable._cleanup_states() 115 | data.extend(augment_data(more_data, {"method": "vanilla", "step": -1})) 116 | print(f"vanilla: {summary}") 117 | 118 | # DAS intervention 119 | intervenable_config = intervention_config( 120 | intervention_site, 121 | pv.LowRankRotatedSpaceIntervention if strategy != "all" else PooledLowRankRotatedSpaceIntervention, 122 | layer_i, 1 123 | ) 124 | intervenable = IntervenableModel(intervenable_config, gpt) 125 | intervenable.set_device(device) 126 | intervenable.disable_model_gradients() 127 | 128 | _, more_data, activations, eval_activations, diff_vector = train_das( 129 | intervenable, trainset, evalset, layer_i, pos_i, strategy, 130 | eval_steps, grad_steps, lr=lr, das_label="das" if das_label is None else das_label) 131 | diff_vectors.append({"method": "das" if das_label is None else das_label, 132 | "layer": layer_i, "pos": pos_i, "vec": diff_vector}) 133 | data.extend(more_data) 134 | 135 | # test other methods 136 | if not only_das: 137 | for method in list(method_mapping.keys()): 138 | try: 139 | more_data, summary, diff_vector = train_feature_direction( 140 | method, intervenable, activations, eval_activations, 141 | evalset, layer_i, pos_i, strategy, intervention_site, 142 | method_mapping 143 | ) 144 | print(f"{method}: {summary}") 145 | diff_vectors.append({"method": method, "layer": layer_i, "pos": pos_i, "vec": diff_vector}) 146 | data.extend(more_data) 147 | except: 148 | continue 149 | 150 | # store all data 151 | total_data.extend(augment_data(data, {"layer": layer_i, "pos": pos_i})) 152 | 153 | # make data dump 154 | short_dataset_name = dataset.split('/')[-1] 155 | short_model_name = model.split('/')[-1] + (f"_{revision}" if revision != "main" else "") 156 | filedump = { 157 | "metadata": { 158 | "model": model + (f"_{revision}" if revision != "main" else ""), 159 | "dataset": dataset, 160 | "steps": steps, 161 | "eval_steps": eval_steps, 162 | "grad_steps": grad_steps, 163 | "batch_size": batch_size, 164 | "intervention_site": intervention_site, 165 | "strategy": strategy, 166 | "lr": lr, 167 | "span_names": data_source.span_names, 168 | "manipulate": manipulate, 169 | }, 170 | "data": total_data, 171 | "vec": diff_vectors, 172 | } 173 | 174 | # log 175 | if manipulate is None: 176 | manipulate = "orig" 177 | log_file = 
f"logs/{log_folder}/{NOW}__{short_model_name}__{short_dataset_name}__{manipulate}.json" 178 | print(f"logging to {log_file}") 179 | with open(log_file, "w") as f: 180 | json.dump(filedump, f) 181 | 182 | 183 | def main(): 184 | parser = argparse.ArgumentParser() 185 | parser.add_argument("--model", type=str, default="EleutherAI/pythia-70m") 186 | parser.add_argument("--dataset", type=str, default="syntaxgym/agr_gender") 187 | parser.add_argument("--steps", type=int, default=100) 188 | parser.add_argument("--eval-steps", type=int, default=25) 189 | parser.add_argument("--grad-steps", type=int, default=1) 190 | parser.add_argument("--batch-size", type=int, default=4) 191 | parser.add_argument("--intervention-site", type=str, default="block_output") 192 | parser.add_argument("--strategy", type=str, default="last") 193 | parser.add_argument("--lr", type=float, default=5e-3) 194 | parser.add_argument("--only-das", action="store_true") 195 | parser.add_argument("--hparam-non-das", action="store_true") 196 | parser.add_argument("--das-label", type=str, default=None) 197 | parser.add_argument("--revision", type=str, default="main") 198 | parser.add_argument("--log-folder", type=str, default="das") 199 | parser.add_argument("--manipulate", type=str, default=None) 200 | args = parser.parse_args() 201 | print(vars(args)) 202 | experiment(**vars(args)) 203 | 204 | 205 | if __name__ == "__main__": 206 | main() 207 | -------------------------------------------------------------------------------- /data.py: -------------------------------------------------------------------------------- 1 | import json 2 | from transformers import AutoTokenizer, AutoModel 3 | import random 4 | import torch 5 | from collections import defaultdict, namedtuple 6 | import json 7 | import glob 8 | from typing import Union 9 | import re 10 | from tqdm import tqdm 11 | 12 | random.seed(42) 13 | Tokenized = namedtuple("Tokenized", ["base", "src", "alignment_base", "alignment_src"]) 14 | 15 | 16 | class Pair: 17 | """ 18 | A pair of sentences where all features except one are held constant. 19 | 20 | Each pair has a base sentence and a source sentence. These two sentences 21 | have different "types" (the value of the differing feature) and different 22 | "labels" (expected continuation of the sentence). 
23 | """ 24 | 25 | base: list[str] 26 | src: list[str] 27 | base_type: str 28 | src_type: str 29 | base_label: str 30 | src_label: str 31 | 32 | 33 | def __init__(self, base: list[str], src: list[str], base_type: str, src_type: str, base_label: str, src_label: str): 34 | self.base = base 35 | self.src = src 36 | self.base_type = base_type 37 | self.src_type = src_type 38 | self.base_label = base_label 39 | self.src_label = src_label 40 | 41 | 42 | def tokenize(self, tokenizer: AutoTokenizer, device: str="cpu") -> Tokenized: 43 | """Tokenize the pair and produce alignments.""" 44 | alignment_base, alignment_src = [], [] 45 | pos_base, pos_src = 0, 0 46 | for span_i in range(len(self.base)): 47 | 48 | # get span lengths in tokens 49 | tok_base = tokenizer.tokenize(self.base[span_i]) 50 | tok_src = tokenizer.tokenize(self.src[span_i]) 51 | alignment = [ 52 | list(range(pos_base, pos_base + len(tok_base))), 53 | list(range(pos_src, pos_src + len(tok_src))) 54 | ] 55 | 56 | # update positions 57 | alignment_base.append(alignment[0]) 58 | alignment_src.append(alignment[1]) 59 | pos_base += len(tok_base) 60 | pos_src += len(tok_src) 61 | 62 | # tokenize full pair and return 63 | base_tok = tokenizer(''.join(self.base), return_tensors="pt", padding=True).to(device) 64 | src_tok = tokenizer(''.join(self.src), return_tensors="pt", padding=True).to(device) 65 | return Tokenized(base=base_tok, src=src_tok, alignment_base=alignment_base, alignment_src=alignment_src) 66 | 67 | 68 | def swap(self) -> "Pair": 69 | """Swap the base and src sentences.""" 70 | return Pair(self.src, self.base, self.src_type, self.base_type, self.src_label, self.base_label) 71 | 72 | 73 | def __repr__(self): 74 | return f"Pair('{self.base}' > '{self.base_label}', '{self.src}' > '{self.src_label}', {self.base_type}, {self.src_type})" 75 | 76 | 77 | class Batch: 78 | """ 79 | A Batch is a collection of Pairs that have been tokenized and padded. 80 | The messy part is figuring out where to do interventions, so a Batch 81 | encapsulates the functions for computing pos_i for the intervention 82 | at inference time, using the tokenized pair and alignments. 
83 | """ 84 | 85 | def __init__(self, pairs: list[Pair], tokenizer: AutoTokenizer, device: str="cpu"): 86 | self.pairs = pairs 87 | 88 | # tokenize base and src 89 | tokenized = [pair.tokenize(tokenizer, device) for pair in pairs] 90 | max_len = max([max(x.base.input_ids.shape[-1], x.src.input_ids.shape[-1]) for x in tokenized]) 91 | self.base = self._stack_and_pad([x.base for x in tokenized], max_len=max_len) 92 | self.src = self._stack_and_pad([x.src for x in tokenized], max_len=max_len) 93 | self.alignment_base = [x.alignment_base for x in tokenized] 94 | self.alignment_src = [x.alignment_src for x in tokenized] 95 | 96 | # labels 97 | self.base_labels = torch.LongTensor([tokenizer.encode(pair.base_label)[0] for pair in pairs]).to(device) 98 | self.src_labels = torch.LongTensor([tokenizer.encode(pair.src_label)[0] for pair in pairs]).to(device) 99 | self.base_types = [pair.base_type for pair in pairs] 100 | self.src_types = [pair.src_type for pair in pairs] 101 | self.cached_pos = {} 102 | 103 | 104 | def _pos_bounds(self, span1: list[int], span2: list[int]) -> list[int]: 105 | """Compute the bounds of a span.""" 106 | if self.pos_strategy == "first": 107 | return span1[:1], span2[:1] 108 | elif self.pos_strategy == "last": 109 | return span1[-1:], span2[-1:] 110 | elif self.pos_strategy == "all": 111 | max_len = max(len(span1), len(span2)) 112 | return span1 + [span1[-1]] * (max_len - len(span1)), span2 + [span2[-1]] * (max_len - len(span2)) 113 | 114 | 115 | def compute_pos(self, strategy: str) -> torch.LongTensor: 116 | """Compute pos alignments as tensors.""" 117 | # shape of alignment: [batch_size, 2, num_spans, tokens_in_span] 118 | # not a proper tensor though! tokens_in_span is variable, rest is constant 119 | if strategy in self.cached_pos: 120 | return self.cached_pos[strategy] 121 | self.pos_strategy = strategy 122 | assert self.pos_strategy in ["first", "last", "all"] 123 | rets_base, rets_src = [], [] 124 | for batch_i in range(len(self.pairs)): 125 | ret_base, ret_src = [], [] 126 | for span_i in range(len(self.alignment_src[batch_i])): 127 | # skip null alignments 128 | if len(self.alignment_base[batch_i][span_i]) == 0 or len(self.alignment_src[batch_i][span_i]) == 0: 129 | ret_base.append([-1]) 130 | ret_src.append([-1]) 131 | else: 132 | bounds = self._pos_bounds(self.alignment_base[batch_i][span_i], self.alignment_src[batch_i][span_i]) 133 | ret_base.append(bounds[0]) 134 | ret_src.append(bounds[1]) 135 | rets_base.append(ret_base) 136 | rets_src.append(ret_src) 137 | 138 | # shape: [2, batch_size, length, 1] 139 | # dim 0 -> src, base (the intervention code wants src first) 140 | ret = [rets_src, rets_base] 141 | self.cached_pos[strategy] = ret 142 | return ret 143 | 144 | 145 | def _stack_and_pad(self, input_list: dict, pad_token: int=0, max_len: int=100) -> dict: 146 | """Stack and pad a list of tensors outputs from a tokenizer.""" 147 | input_ids = torch.stack([torch.nn.functional.pad(x.input_ids[0], (0, max_len - x.input_ids.shape[-1]), mode='constant', value=pad_token) for x in input_list]) 148 | attention_mask = torch.stack([torch.nn.functional.pad(x.attention_mask[0], (0, max_len - x.attention_mask.shape[-1]), mode='constant', value=0) for x in input_list]) 149 | return {"input_ids": input_ids, "attention_mask": attention_mask} 150 | 151 | 152 | def __repr__(self): 153 | return f"Batch({len(self.pairs)} pairs)\n" + "\n".join([f" {pair}" for pair in self.pairs]) 154 | 155 | 156 | class Dataset: 157 | """ 158 | A Dataset is a template for generating minimal pairs 
that is loaded 159 | from a JSON specification. 160 | 161 | We importantly want examples generated from a dataset to include token- 162 | level alignments. 163 | """ 164 | 165 | templates: list[str] 166 | label_vars: list[str] 167 | labels: dict[str, list[str]] 168 | variables: dict[str, Union[list[str], dict[str, list[str]]]] 169 | result_prepend_space: bool 170 | 171 | 172 | def __init__(self, data: dict): 173 | # load basics 174 | self.templates = data["templates"] 175 | self.template = [x for x in re.split(r"(?<=\})|(?= \{)|(?<=>)", '<|endoftext|>' + random.choice(self.templates)) if x != ''] 176 | self.label_vars = data["label"] if isinstance(data["label"], list) else [data["label"]] 177 | self.labels = data["labels"] 178 | self.types = list(self.labels.keys()) 179 | self.variables = data["variables"] 180 | 181 | # split template into spans and find variables 182 | self.vars_per_span, self.span_names = [], [] 183 | self.first_var_pos = None 184 | for token_i in range(len(self.template)): 185 | var = re.findall(r"\{(.+?)\}", self.template[token_i]) 186 | if len(var) > 0 and self.first_var_pos is None: 187 | if var[0] in self.label_vars: 188 | self.first_var_pos = token_i 189 | self.vars_per_span.append(var) 190 | self.span_names.append("{" + var[0] + "}" if len(var) == 1 else self.template[token_i].replace(' ', '_')) 191 | 192 | # other stuff 193 | length = {} 194 | for var in self.variables: 195 | if '.' in var: 196 | head_var = var.split('.')[0] 197 | if head_var not in length: 198 | length[head_var] = len(self.variables[var]) 199 | else: 200 | assert length[head_var] == len(self.variables[var]), f"Variable {var} has length {len(self.variables[var])} but {head_var} has length {length[head_var]}" 201 | self.result_prepend_space = data["result_prepend_space"] 202 | 203 | 204 | @classmethod 205 | def load_from(cls, template: str) -> "Dataset": 206 | """Load a Dataset from a json template.""" 207 | template_file, template_name = template.split('/') 208 | if template_name.endswith("_inverted"): 209 | template_name = template_name[:-len("_inverted")] 210 | data = json.load(open(f"data/templates/{template_file}.json", "r")) 211 | return Dataset(data[template_name]) 212 | 213 | 214 | @property 215 | def length(self) -> int: 216 | return len(self.template) 217 | 218 | 219 | def sample_pair(self) -> Pair: 220 | """Sample a minimal pair from the dataset.""" 221 | # pick types (should differ) 222 | base_type = random.choice(self.types) 223 | src_type = base_type 224 | while src_type == base_type: 225 | src_type = random.choice(self.types) 226 | 227 | # make template 228 | base, src = self.template[:], self.template[:] 229 | 230 | # go token by token 231 | stored_choices = {} 232 | for token_i in range(len(self.template)): 233 | var = self.vars_per_span[token_i] 234 | if len(var) == 0: continue 235 | var = var[0] 236 | var_temp = '{' + var + '}' 237 | 238 | # set label vars (different) 239 | if var in self.label_vars: 240 | base_choice = random.choice(self.variables[var][base_type]) 241 | src_choice = random.choice(self.variables[var][src_type]) 242 | base[token_i] = base[token_i].replace(var_temp, base_choice) 243 | src[token_i] = src[token_i].replace(var_temp, src_choice) 244 | # set other vars (same for both) 245 | elif '.' 
in var: 246 | head_var = var.split('.')[0] 247 | if head_var not in stored_choices: 248 | stored_choices[head_var] = random.randint(0, len(self.variables[var]) - 1) 249 | base[token_i] = base[token_i].replace(var_temp, self.variables[var][stored_choices[head_var]]) 250 | src[token_i] = src[token_i].replace(var_temp, self.variables[var][stored_choices[head_var]]) 251 | else: 252 | choice = random.choice(self.variables[var]) 253 | base[token_i] = base[token_i].replace(var_temp, choice) 254 | src[token_i] = src[token_i].replace(var_temp, choice) 255 | 256 | # get continuations 257 | base_label = random.choice(self.labels[base_type]) 258 | src_label = random.choice(self.labels[src_type]) 259 | if self.result_prepend_space: 260 | base_label = " " + base_label 261 | src_label = " " + src_label 262 | 263 | return Pair(base, src, base_type, src_type, base_label, src_label) 264 | 265 | 266 | @torch.no_grad() 267 | def _sample_doable_pair(self, model: AutoModel, tokenizer: AutoTokenizer, device: str="cpu", discard: set[str]=set()) -> Pair: 268 | """Sample a minimal pair from the dataset that is correctly labelled by a model.""" 269 | 270 | # keep resampling until we get a pair that is correctly labelled 271 | correct, ct = False, 0 272 | while not correct: 273 | pair = self.sample_pair() 274 | if ''.join(pair.base) in discard: 275 | continue 276 | base = tokenizer(''.join(pair.base), return_tensors="pt").to(device) 277 | src = tokenizer(''.join(pair.src), return_tensors="pt").to(device) 278 | base_logits = model(**base).logits[0, -1] 279 | src_logits = model(**src).logits[0, -1] 280 | base_label = tokenizer.encode(pair.base_label)[0] 281 | src_label = tokenizer.encode(pair.src_label)[0] 282 | if base_logits[base_label] > base_logits[src_label] and src_logits[src_label] > src_logits[base_label]: 283 | correct = True 284 | ct += 1 285 | if ct == 20 and not correct: 286 | print("WARNING: could not find a doable pair after 20 iterations") 287 | print("Using a random pair instead") 288 | break 289 | 290 | return pair 291 | 292 | 293 | def sample_batch( 294 | self, tokenizer: AutoTokenizer, batch_size: int, device: str="cpu", 295 | model: Union[AutoModel, None]=None, discard: set[str]=set(), 296 | manipulate: Union[str, None]=None) -> Batch: 297 | """Sample a batch of minimal pairs from the dataset.""" 298 | pairs = [] 299 | 300 | # get the pairs 301 | if model is None: 302 | while len(pairs) < batch_size // 2: 303 | pair = self.sample_pair() 304 | ct = 0 305 | while ''.join(pair.base) in discard: 306 | pair = self.sample_pair() 307 | ct += 1 308 | if ct == 20: 309 | print("WARNING: could not find a pair not in the discard set after 20 iterations") 310 | print("Using a random pair instead") 311 | break 312 | pairs.append(pair) 313 | else: 314 | pairs = [ 315 | self._sample_doable_pair(model, tokenizer, device, discard) 316 | for _ in range(batch_size // 2) 317 | ] 318 | 319 | # control tasks 320 | if manipulate == "invert": 321 | for i in range(len(pairs)): 322 | pairs[i].base_label, pairs[i].src_label = pairs[i].src_label, pairs[i].base_label 323 | elif manipulate == "dog-give": 324 | for i in range(len(pairs)): 325 | pairs[i].base_label = " dog" if pairs[i].base_type == self.types[0] else " give" 326 | pairs[i].src_label = " dog" if pairs[i].src_type == self.types[0] else " give" 327 | elif manipulate == "random": 328 | for i in range(len(pairs)): 329 | if random.random() < 0.5: 330 | pairs[i].base_label, pairs[i].src_label = pairs[i].src_label, pairs[i].base_label 331 | pairs[i].base_type, 
pairs[i].src_type = pairs[i].src_type, pairs[i].base_type 332 | 333 | # add flipped pairs 334 | for i in range(batch_size // 2): 335 | pairs.append(pairs[i].swap()) 336 | 337 | # make batch 338 | return Batch(pairs, tokenizer, device) 339 | 340 | 341 | def sample_batches( 342 | self, tokenizer: AutoTokenizer, batch_size: int, num_batches: int, 343 | device: str="cpu", seed: int=42, model: Union[AutoModel, None]=None, 344 | discard: set[str]=set(), manipulate: Union[str, None]=None) -> list[Batch]: 345 | """Sample a list of batches of minimal pairs from the dataset.""" 346 | random.seed(seed) 347 | return [self.sample_batch(tokenizer, batch_size, device, model, discard, manipulate) for _ in range(num_batches)] 348 | 349 | 350 | def load_from_syntaxgym(): 351 | for suite_file in glob.glob("data/test_suites/gss_subord_pp.json"): 352 | print(suite_file.split('/')[-1]) 353 | with open(suite_file, "r") as f: 354 | suite = json.load(f) 355 | if "items" not in suite: 356 | continue 357 | print(len(suite["items"])) 358 | 359 | region_numbers = defaultdict(list) 360 | for i, item in enumerate(suite["items"]): 361 | for condition in item["conditions"]: 362 | for region in condition["regions"]: 363 | region_numbers[f"{condition['condition_name']}_{region['region_number']}"].append(region["content"]) 364 | 365 | for key in region_numbers: 366 | print(key, json.dumps(region_numbers[key])) 367 | 368 | 369 | def list_datasets() -> list[str]: 370 | """List all available datasets.""" 371 | datasets = [] 372 | for template_file in glob.glob("data/templates/*.json"): 373 | name = template_file.split("/")[-1].split(".json")[0] 374 | with open(template_file, "r") as f: 375 | data = json.load(f) 376 | datasets.extend([name + "/" + x for x in data.keys()]) 377 | return datasets 378 | 379 | 380 | def convert_to_huggingface_format(): 381 | """Generate dataset files for HuggingFace upload.""" 382 | tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-70m") 383 | tokenizer.pad_token = tokenizer.eos_token 384 | datasets = [x for x in list_datasets() if x.startswith("syntaxgym/")] 385 | all_data = defaultdict(list) 386 | for dataset in tqdm(datasets): 387 | data = Dataset.load_from(dataset) 388 | 389 | # sample trainset 390 | trainset = data.sample_batches(tokenizer, 4, 100, "cpu", seed=42, manipulate=None) 391 | discard = set() 392 | for batch in trainset: 393 | for pair in batch.pairs: 394 | discard.add(''.join(pair.base)) 395 | 396 | # dev + test exclude trainset 397 | devset = data.sample_batches(tokenizer, 4, 25, "cpu", seed=420, manipulate=None, discard=discard) 398 | testset = data.sample_batches(tokenizer, 4, 25, "cpu", seed=1, manipulate=None, discard=discard) 399 | 400 | # get pairs from each batch 401 | groups = {"train": trainset, "dev": devset, "test": testset} 402 | for split, batches in groups.items(): 403 | pairs = [pair for batch in batches for pair in batch.pairs] 404 | all_data[split].extend([{ 405 | "base": pair.base, 406 | "src": pair.src, 407 | "base_type": pair.base_type, 408 | "src_type": pair.src_type, 409 | "base_label": pair.base_label, 410 | "src_label": pair.src_label, 411 | "task": dataset.split('/')[1] 412 | } for pair in pairs]) 413 | 414 | # dump 415 | for split in all_data: 416 | with open(f"data/huggingface/{split}.json", "w") as f: 417 | json.dump(all_data[split], f, indent=2) 418 | 419 | 420 | if __name__ == "__main__": 421 | convert_to_huggingface_format() -------------------------------------------------------------------------------- /data/templates/data.json: 
-------------------------------------------------------------------------------- 1 | { 2 | "gender_basic": { 3 | "templates": [ 4 | "{name} {completion} because" 5 | ], 6 | "label": "name", 7 | "label_prepend_space": false, 8 | "variables": { 9 | "name": { 10 | "he": ["James", "Robert", "John", "Michael", "David", "William", "Richard", "Joseph", "Thomas", "Christopher", "Charles", "Daniel", "Matthew", "Anthony", "Mark", "Donald", "Steven", "Andrew", "Paul", "Joshua", "Kenneth", "Kevin", "Brian", "George", "Timothy", "Ronald", "Jason", "Edward", "Jeffrey", "Ryan", "Jacob", "Gary", "Nicholas", "Eric", "Jonathan", "Stephen", "Larry", "Justin", "Scott", "Brandon", "Benjamin", "Samuel", "Gregory", "Alexander", "Patrick", "Frank", "Raymond", "Jack", "Dennis", "Jerry", "Tyler", "Aaron", "Jose", "Adam", "Nathan", "Henry", "Zachary", "Douglas", "Peter", "Kyle", "Noah", "Ethan", "Jeremy", "Walter", "Christian", "Keith", "Roger", "Terry", "Austin", "Sean", "Gerald", "Carl", "Harold", "Dylan", "Arthur", "Lawrence", "Jordan", "Jesse", "Bryan", "Billy", "Bruce", "Gabriel", "Joe", "Logan", "Alan", "Juan", "Albert", "Willie", "Elijah", "Wayne", "Randy", "Vincent", "Mason", "Roy", "Ralph", "Bobby", "Russell", "Bradley", "Philip", "Eugene"], 11 | "she": ["Mary", "Patricia", "Jennifer", "Linda", "Elizabeth", "Barbara", "Susan", "Jessica", "Sarah", "Karen", "Lisa", "Nancy", "Betty", "Sandra", "Margaret", "Ashley", "Kimberly", "Emily", "Donna", "Michelle", "Carol", "Amanda", "Melissa", "Deborah", "Stephanie", "Dorothy", "Rebecca", "Sharon", "Laura", "Cynthia", "Amy", "Kathleen", "Angela", "Shirley", "Brenda", "Emma", "Anna", "Pamela", "Nicole", "Samantha", "Katherine", "Christine", "Helen", "Debra", "Rachel", "Carolyn", "Janet", "Maria", "Catherine", "Heather", "Diane", "Olivia", "Julie", "Joyce", "Victoria", "Ruth", "Virginia", "Lauren", "Kelly", "Christina", "Joan", "Evelyn", "Judith", "Andrea", "Hannah", "Megan", "Cheryl", "Jacqueline", "Martha", "Madison", "Teresa", "Gloria", "Sara", "Janice", "Ann", "Kathryn", "Abigail", "Sophia", "Frances", "Jean", "Alice", "Judy", "Isabella", "Julia", "Grace", "Amber", "Denise", "Danielle", "Marilyn", "Beverly", "Charlotte", "Natalie", "Theresa", "Diana", "Brittany", "Doris", "Kayla", "Alexis", "Lori", "Marie"] 12 | }, 13 | "completion": [ 14 | "walked" 15 | ] 16 | } 17 | }, 18 | "gender": { 19 | "templates": [ 20 | "{name} {completion} because" 21 | ], 22 | "label": "name", 23 | "label_prepend_space": false, 24 | "variables": { 25 | "name": { 26 | "he": ["James", "Robert", "John", "Michael", "David", "William", "Richard", "Joseph", "Thomas", "Christopher", "Charles", "Daniel", "Matthew", "Anthony", "Mark", "Donald", "Steven", "Andrew", "Paul", "Joshua", "Kenneth", "Kevin", "Brian", "George", "Timothy", "Ronald", "Jason", "Edward", "Jeffrey", "Ryan", "Jacob", "Gary", "Nicholas", "Eric", "Jonathan", "Stephen", "Larry", "Justin", "Scott", "Brandon", "Benjamin", "Samuel", "Gregory", "Alexander", "Patrick", "Frank", "Raymond", "Jack", "Dennis", "Jerry", "Tyler", "Aaron", "Jose", "Adam", "Nathan", "Henry", "Zachary", "Douglas", "Peter", "Kyle", "Noah", "Ethan", "Jeremy", "Walter", "Christian", "Keith", "Roger", "Terry", "Austin", "Sean", "Gerald", "Carl", "Harold", "Dylan", "Arthur", "Lawrence", "Jordan", "Jesse", "Bryan", "Billy", "Bruce", "Gabriel", "Joe", "Logan", "Alan", "Juan", "Albert", "Willie", "Elijah", "Wayne", "Randy", "Vincent", "Mason", "Roy", "Ralph", "Bobby", "Russell", "Bradley", "Philip", "Eugene"], 27 | "she": ["Mary", "Patricia", "Jennifer", "Linda", 
"Elizabeth", "Barbara", "Susan", "Jessica", "Sarah", "Karen", "Lisa", "Nancy", "Betty", "Sandra", "Margaret", "Ashley", "Kimberly", "Emily", "Donna", "Michelle", "Carol", "Amanda", "Melissa", "Deborah", "Stephanie", "Dorothy", "Rebecca", "Sharon", "Laura", "Cynthia", "Amy", "Kathleen", "Angela", "Shirley", "Brenda", "Emma", "Anna", "Pamela", "Nicole", "Samantha", "Katherine", "Christine", "Helen", "Debra", "Rachel", "Carolyn", "Janet", "Maria", "Catherine", "Heather", "Diane", "Olivia", "Julie", "Joyce", "Victoria", "Ruth", "Virginia", "Lauren", "Kelly", "Christina", "Joan", "Evelyn", "Judith", "Andrea", "Hannah", "Megan", "Cheryl", "Jacqueline", "Martha", "Madison", "Teresa", "Gloria", "Sara", "Janice", "Ann", "Kathryn", "Abigail", "Sophia", "Frances", "Jean", "Alice", "Judy", "Isabella", "Julia", "Grace", "Amber", "Denise", "Danielle", "Marilyn", "Beverly", "Charlotte", "Natalie", "Theresa", "Diana", "Brittany", "Doris", "Kayla", "Alexis", "Lori", "Marie"] 28 | }, 29 | "completion": [ 30 | "walked", "ran", "agreed", "blinked", "bounced", "called", "disappeared", "lied", "laughed", "paid" 31 | ] 32 | } 33 | }, 34 | "gender_is_a": { 35 | "templates": [ 36 | "{name} is a" 37 | ], 38 | "label": "name", 39 | "label_prepend_space": false, 40 | "variables": { 41 | "name": { 42 | "man": ["James", "Robert", "John", "Michael", "David", "William", "Richard", "Joseph", "Thomas", "Christopher", "Charles", "Daniel", "Matthew", "Anthony", "Mark", "Donald", "Steven", "Andrew", "Paul", "Joshua", "Kenneth", "Kevin", "Brian", "George", "Timothy", "Ronald", "Jason", "Edward", "Jeffrey", "Ryan", "Jacob", "Gary", "Nicholas", "Eric", "Jonathan", "Stephen", "Larry", "Justin", "Scott", "Brandon", "Benjamin", "Samuel", "Gregory", "Alexander", "Patrick", "Frank", "Raymond", "Jack", "Dennis", "Jerry", "Tyler", "Aaron", "Jose", "Adam", "Nathan", "Henry", "Zachary", "Douglas", "Peter", "Kyle", "Noah", "Ethan", "Jeremy", "Walter", "Christian", "Keith", "Roger", "Terry", "Austin", "Sean", "Gerald", "Carl", "Harold", "Dylan", "Arthur", "Lawrence", "Jordan", "Jesse", "Bryan", "Billy", "Bruce", "Gabriel", "Joe", "Logan", "Alan", "Juan", "Albert", "Willie", "Elijah", "Wayne", "Randy", "Vincent", "Mason", "Roy", "Ralph", "Bobby", "Russell", "Bradley", "Philip", "Eugene"], 43 | "woman": ["Mary", "Patricia", "Jennifer", "Linda", "Elizabeth", "Barbara", "Susan", "Jessica", "Sarah", "Karen", "Lisa", "Nancy", "Betty", "Sandra", "Margaret", "Ashley", "Kimberly", "Emily", "Donna", "Michelle", "Carol", "Amanda", "Melissa", "Deborah", "Stephanie", "Dorothy", "Rebecca", "Sharon", "Laura", "Cynthia", "Amy", "Kathleen", "Angela", "Shirley", "Brenda", "Emma", "Anna", "Pamela", "Nicole", "Samantha", "Katherine", "Christine", "Helen", "Debra", "Rachel", "Carolyn", "Janet", "Maria", "Catherine", "Heather", "Diane", "Olivia", "Julie", "Joyce", "Victoria", "Ruth", "Virginia", "Lauren", "Kelly", "Christina", "Joan", "Evelyn", "Judith", "Andrea", "Hannah", "Megan", "Cheryl", "Jacqueline", "Martha", "Madison", "Teresa", "Gloria", "Sara", "Janice", "Ann", "Kathryn", "Abigail", "Sophia", "Frances", "Jean", "Alice", "Judy", "Isabella", "Julia", "Grace", "Amber", "Denise", "Danielle", "Marilyn", "Beverly", "Charlotte", "Natalie", "Theresa", "Diana", "Brittany", "Doris", "Kayla", "Alexis", "Lori", "Marie"] 44 | } 45 | } 46 | }, 47 | "number": { 48 | "templates": [ 49 | "The {noun}" 50 | ], 51 | "label": "noun", 52 | "label_prepend_space": true, 53 | "variables": { 54 | "noun": { 55 | "is": ["manager", "doctor", "clerk", "officer", "nurse", "woman", 
"man", "pilot", "architect", "actor", "minister", "manager"], 56 | "are": ["managers", "doctors", "clerks", "officers", "nurses", "women", "men", "pilots", "architects", "actors", "ministers", "managers"] 57 | } 58 | } 59 | }, 60 | "animacy": { 61 | "templates": [ 62 | "The {noun} fell because" 63 | ], 64 | "label": "noun", 65 | "label_prepend_space": true, 66 | "variables": { 67 | "noun": { 68 | "he": ["manager", "doctor", "clerk", "officer", "nurse", "woman", "man", "pilot", "architect", "actor", "minister", "manager"], 69 | "it": ["box", "stone", "table", "chair", "book", "car", "house", "tree", "rock", "ball", "computer", "phone", "desk", "bed"] 70 | } 71 | } 72 | } 73 | } 74 | -------------------------------------------------------------------------------- /data/templates/data_extra.json: -------------------------------------------------------------------------------- 1 | { 2 | "gender": { 3 | "templates": [ 4 | "{name} {completion} because" 5 | ], 6 | "label": "name", 7 | "label_prepend_space": false, 8 | "variables": { 9 | "name": { 10 | "he": ["James", "Robert", "John", "Michael", "David", "William", "Richard", "Joseph", "Thomas", "Christopher", "Charles", "Daniel", "Matthew", "Anthony", "Mark", "Donald", "Steven", "Andrew", "Paul", "Joshua", "Kenneth", "Kevin", "Brian", "George", "Timothy", "Ronald", "Jason", "Edward", "Jeffrey", "Ryan", "Jacob", "Gary", "Nicholas", "Eric", "Jonathan", "Stephen", "Larry", "Justin", "Scott", "Brandon", "Benjamin", "Samuel", "Gregory", "Alexander", "Patrick", "Frank", "Raymond", "Jack", "Dennis", "Jerry", "Tyler", "Aaron", "Jose", "Adam", "Nathan", "Henry", "Zachary", "Douglas", "Peter", "Kyle", "Noah", "Ethan", "Jeremy", "Walter", "Christian", "Keith", "Roger", "Terry", "Austin", "Sean", "Gerald", "Carl", "Harold", "Dylan", "Arthur", "Lawrence", "Jordan", "Jesse", "Bryan", "Billy", "Bruce", "Gabriel", "Joe", "Logan", "Alan", "Juan", "Albert", "Willie", "Elijah", "Wayne", "Randy", "Vincent", "Mason", "Roy", "Ralph", "Bobby", "Russell", "Bradley", "Philip", "Eugene"], 11 | "she": ["Mary", "Patricia", "Jennifer", "Linda", "Elizabeth", "Barbara", "Susan", "Jessica", "Sarah", "Karen", "Lisa", "Nancy", "Betty", "Sandra", "Margaret", "Ashley", "Kimberly", "Emily", "Donna", "Michelle", "Carol", "Amanda", "Melissa", "Deborah", "Stephanie", "Dorothy", "Rebecca", "Sharon", "Laura", "Cynthia", "Amy", "Kathleen", "Angela", "Shirley", "Brenda", "Emma", "Anna", "Pamela", "Nicole", "Samantha", "Katherine", "Christine", "Helen", "Debra", "Rachel", "Carolyn", "Janet", "Maria", "Catherine", "Heather", "Diane", "Olivia", "Julie", "Joyce", "Victoria", "Ruth", "Virginia", "Lauren", "Kelly", "Christina", "Joan", "Evelyn", "Judith", "Andrea", "Hannah", "Megan", "Cheryl", "Jacqueline", "Martha", "Madison", "Teresa", "Gloria", "Sara", "Janice", "Ann", "Kathryn", "Abigail", "Sophia", "Frances", "Jean", "Alice", "Judy", "Isabella", "Julia", "Grace", "Amber", "Denise", "Danielle", "Marilyn", "Beverly", "Charlotte", "Natalie", "Theresa", "Diana", "Brittany", "Doris", "Kayla", "Alexis", "Lori", "Marie"] 12 | }, 13 | "completion": [ 14 | "walked", 15 | "is tired", "is excited", "is ready", "went home", 16 | "is walking", "ran", "is running", "works there", 17 | "joined the army", "plays soccer", "likes playing games", 18 | "said no to me" 19 | ] 20 | } 21 | }, 22 | "location": { 23 | "templates": [ 24 | "{object} is a famous" 25 | ], 26 | "label": "object", 27 | "label_prepend_space": false, 28 | "variables": { 29 | "object": { 30 | "country": [ 31 | "Canada", "America", 
"Mexico", "Brazil", "Argentina", 32 | "Chile", "Peru", "Colombia", "Venezuela", "Ecuador", 33 | "Spain", "Portugal", "France", "Germany", "Italy", 34 | "England", "Ireland", "Scotland", "Wales", "Sweden", 35 | "Norway", "Finland", "Denmark", "Russia", "China", 36 | "Japan", "Korea", "India", "Pakistan", "Iran", 37 | "Iraq", "Egypt", "Nigeria", "South Africa", "Kenya", 38 | "Australia", "New Zealand" 39 | ], 40 | "city": [ 41 | "Madrid", "Rome", "London", "Paris", "Berlin", "Moscow", 42 | "Beijing", "Tokyo", "Seoul", "Delhi", "Mumbai", 43 | "Bangalore", "Lagos", "Cairo", "Johannesburg", 44 | "Sydney", "Melbourne", "Auckland", "Wellington", 45 | "Toronto", "Montreal", "Vancouver", "New York", 46 | "Los Angeles", "Chicago", "Houston", "Philadelphia", 47 | "Phoenix", "San Antonio", "San Diego", "Dallas", 48 | "San Jose", "Austin", "Jacksonville", "San Francisco", 49 | "Amman", "Baghdad", "Damascus", "Jerusalem", "Kabul", 50 | "Tehran", "Ankara", "Athens", "Budapest", "Dublin", 51 | "Riyadh", "Kuwait City", "Nairobi", "Lima", "Bogota", 52 | "Caracas", "Santiago", "Buenos Aires", "Mexico City", 53 | "Brasilia", "Lisbon", "Barcelona", "Vienna", "Prague", 54 | "Warsaw", "Stockholm", "Copenhagen", "Oslo", "Helsinki", 55 | "Reykjavik", "Bucharest", "Sofia", "Belgrade", "Kiev", 56 | "Minsk", "Tbilisi", "Yerevan", "Baku", "Ashgabat", 57 | "Tashkent", "Dushanbe", "Kathmandu", "Islamabad", 58 | "Kabul", "Kathmandu", "Dhaka", "Colombo", "Yangon", 59 | "Bangkok", "Hanoi", "Manila", "Jakarta", "Kuala Lumpur" 60 | ] 61 | } 62 | } 63 | }, 64 | "polarity__9_shot": { 65 | "templates": [ 66 | "- advantage: good\n- robbery: bad\n- destruction: bad\n- health: good\n- positivity: good\n- war: bad\n- peace: good\n- beautiful: good\n- family: good\n- {word}:" 67 | ], 68 | "label": "word", 69 | "label_prepend_space": true, 70 | "variables": { 71 | "word": { 72 | "good": ["toy", "happy", "friend", "child", "help", "nice", "kind", "clean"], 73 | "bad": ["kill", "hurt", "sad", "angry", "mean", "enemy", "rude", "prevent", "dirty"] 74 | } 75 | } 76 | }, 77 | "syntaxgen_number_prep": { 78 | "templates": [ 79 | "The leader was smart, but the {subj} {prep} the {prepobj}" 80 | ], 81 | "label": "subj", 82 | "label_prepend_space": true, 83 | "variables": { 84 | "subj": { 85 | "was": ["manager", "doctor", "clerk", "officer", "nurse", "woman", "man", "pilot", "architect", "actor", "minister", "manager"], 86 | "were": ["managers", "doctors", "clerks", "officers", "nurses", "women", "men", "pilots", "architects", "actors", "ministers", "managers"] 87 | }, 88 | "prep": [ 89 | "in front of", "behind", "next to", "near", "across from", "to the side of" 90 | ], 91 | "prepobj": [ 92 | "manager", "doctor", "clerk", "officer", "nurse", "woman", "man", "pilot", "architect", "actor", "minister", "manager", 93 | "managers", "doctors", "clerks", "officers", "nurses", "women", "men", "pilots", "architects", "actors", "ministers", "managers" 94 | ] 95 | } 96 | }, 97 | "syntaxgen_reflexive_prep": { 98 | "templates": [ 99 | "After falling, the {subj} {prep} the {prepobj} {verb}" 100 | ], 101 | "label": "subj", 102 | "label_prepend_space": true, 103 | "variables": { 104 | "subj": { 105 | "himself": ["manager", "doctor", "clerk", "officer", "nurse", "woman", "man", "pilot", "architect", "actor", "minister", "manager"], 106 | "themselves": ["managers", "doctors", "clerks", "officers", "nurses", "women", "men", "pilots", "architects", "actors", "ministers", "managers"] 107 | }, 108 | "prep": [ 109 | "in front of", "behind", "next to", "near", "across 
from", "to the side of" 110 | ], 111 | "prepobj": [ 112 | "manager", "doctor", "clerk", "officer", "nurse", "woman", "man", "pilot", "architect", "actor", "minister", "manager", 113 | "managers", "doctors", "clerks", "officers", "nurses", "women", "men", "pilots", "architects", "actors", "ministers", "managers" 114 | ], 115 | "verb": [ 116 | "hurt", "injured", "trusted", "embarrassed", "disguised", "hated", "doubted" 117 | ] 118 | } 119 | }, 120 | "gender_is_a__2_shot": { 121 | "templates": [ 122 | "John is a man. Jane is a woman. {name} is a" 123 | ], 124 | "label": "name", 125 | "label_prepend_space": true, 126 | "variables": { 127 | "name": { 128 | "man": ["James", "Robert", "John", "Michael", "David", "William", "Richard", "Joseph", "Thomas", "Christopher", "Charles", "Daniel", "Matthew", "Anthony", "Mark", "Donald", "Steven", "Andrew", "Paul", "Joshua", "Kenneth", "Kevin", "Brian", "George", "Timothy", "Ronald", "Jason", "Edward", "Jeffrey", "Ryan", "Jacob", "Gary", "Nicholas", "Eric", "Jonathan", "Stephen", "Larry", "Justin", "Scott", "Brandon", "Benjamin", "Samuel", "Gregory", "Alexander", "Patrick", "Frank", "Raymond", "Jack", "Dennis", "Jerry", "Tyler", "Aaron", "Jose", "Adam", "Nathan", "Henry", "Zachary", "Douglas", "Peter", "Kyle", "Noah", "Ethan", "Jeremy", "Walter", "Christian", "Keith", "Roger", "Terry", "Austin", "Sean", "Gerald", "Carl", "Harold", "Dylan", "Arthur", "Lawrence", "Jordan", "Jesse", "Bryan", "Billy", "Bruce", "Gabriel", "Joe", "Logan", "Alan", "Juan", "Albert", "Willie", "Elijah", "Wayne", "Randy", "Vincent", "Mason", "Roy", "Ralph", "Bobby", "Russell", "Bradley", "Philip", "Eugene"], 129 | "woman": ["Mary", "Patricia", "Jennifer", "Linda", "Elizabeth", "Barbara", "Susan", "Jessica", "Sarah", "Karen", "Lisa", "Nancy", "Betty", "Sandra", "Margaret", "Ashley", "Kimberly", "Emily", "Donna", "Michelle", "Carol", "Amanda", "Melissa", "Deborah", "Stephanie", "Dorothy", "Rebecca", "Sharon", "Laura", "Cynthia", "Amy", "Kathleen", "Angela", "Shirley", "Brenda", "Emma", "Anna", "Pamela", "Nicole", "Samantha", "Katherine", "Christine", "Helen", "Debra", "Rachel", "Carolyn", "Janet", "Maria", "Catherine", "Heather", "Diane", "Olivia", "Julie", "Joyce", "Victoria", "Ruth", "Virginia", "Lauren", "Kelly", "Christina", "Joan", "Evelyn", "Judith", "Andrea", "Hannah", "Megan", "Cheryl", "Jacqueline", "Martha", "Madison", "Teresa", "Gloria", "Sara", "Janice", "Ann", "Kathryn", "Abigail", "Sophia", "Frances", "Jean", "Alice", "Judy", "Isabella", "Julia", "Grace", "Amber", "Denise", "Danielle", "Marilyn", "Beverly", "Charlotte", "Natalie", "Theresa", "Diana", "Brittany", "Doris", "Kayla", "Alexis", "Lori", "Marie"] 130 | } 131 | } 132 | }, 133 | "gender_colon__2_shot": { 134 | "templates": [ 135 | "- John: man\n- Jane: woman\n- {name}:" 136 | ], 137 | "label": "name", 138 | "label_prepend_space": true, 139 | "variables": { 140 | "name": { 141 | "man": ["James", "Robert", "John", "Michael", "David", "William", "Richard", "Joseph", "Thomas", "Christopher", "Charles", "Daniel", "Matthew", "Anthony", "Mark", "Donald", "Steven", "Andrew", "Paul", "Joshua", "Kenneth", "Kevin", "Brian", "George", "Timothy", "Ronald", "Jason", "Edward", "Jeffrey", "Ryan", "Jacob", "Gary", "Nicholas", "Eric", "Jonathan", "Stephen", "Larry", "Justin", "Scott", "Brandon", "Benjamin", "Samuel", "Gregory", "Alexander", "Patrick", "Frank", "Raymond", "Jack", "Dennis", "Jerry", "Tyler", "Aaron", "Jose", "Adam", "Nathan", "Henry", "Zachary", "Douglas", "Peter", "Kyle", "Noah", "Ethan", "Jeremy", "Walter", 
"Christian", "Keith", "Roger", "Terry", "Austin", "Sean", "Gerald", "Carl", "Harold", "Dylan", "Arthur", "Lawrence", "Jordan", "Jesse", "Bryan", "Billy", "Bruce", "Gabriel", "Joe", "Logan", "Alan", "Juan", "Albert", "Willie", "Elijah", "Wayne", "Randy", "Vincent", "Mason", "Roy", "Ralph", "Bobby", "Russell", "Bradley", "Philip", "Eugene"], 142 | "woman": ["Mary", "Patricia", "Jennifer", "Linda", "Elizabeth", "Barbara", "Susan", "Jessica", "Sarah", "Karen", "Lisa", "Nancy", "Betty", "Sandra", "Margaret", "Ashley", "Kimberly", "Emily", "Donna", "Michelle", "Carol", "Amanda", "Melissa", "Deborah", "Stephanie", "Dorothy", "Rebecca", "Sharon", "Laura", "Cynthia", "Amy", "Kathleen", "Angela", "Shirley", "Brenda", "Emma", "Anna", "Pamela", "Nicole", "Samantha", "Katherine", "Christine", "Helen", "Debra", "Rachel", "Carolyn", "Janet", "Maria", "Catherine", "Heather", "Diane", "Olivia", "Julie", "Joyce", "Victoria", "Ruth", "Virginia", "Lauren", "Kelly", "Christina", "Joan", "Evelyn", "Judith", "Andrea", "Hannah", "Megan", "Cheryl", "Jacqueline", "Martha", "Madison", "Teresa", "Gloria", "Sara", "Janice", "Ann", "Kathryn", "Abigail", "Sophia", "Frances", "Jean", "Alice", "Judy", "Isabella", "Julia", "Grace", "Amber", "Denise", "Danielle", "Marilyn", "Beverly", "Charlotte", "Natalie", "Theresa", "Diana", "Brittany", "Doris", "Kayla", "Alexis", "Lori", "Marie"] 143 | } 144 | } 145 | }, 146 | "gender_is_a__5_shot": { 147 | "templates": [ 148 | "Tom is a man. Janet is a woman. David is a man. Jessica is a woman. Richard is a man. {name} is a" 149 | ], 150 | "label": "name", 151 | "label_prepend_space": true, 152 | "variables": { 153 | "name": { 154 | "man": ["James", "Robert", "John", "Michael", "David", "William", "Richard", "Joseph", "Thomas", "Christopher", "Charles", "Daniel", "Matthew", "Anthony", "Mark", "Donald", "Steven", "Andrew", "Paul", "Joshua", "Kenneth", "Kevin", "Brian", "George", "Timothy", "Ronald", "Jason", "Edward", "Jeffrey", "Ryan", "Jacob", "Gary", "Nicholas", "Eric", "Jonathan", "Stephen", "Larry", "Justin", "Scott", "Brandon", "Benjamin", "Samuel", "Gregory", "Alexander", "Patrick", "Frank", "Raymond", "Jack", "Dennis", "Jerry", "Tyler", "Aaron", "Jose", "Adam", "Nathan", "Henry", "Zachary", "Douglas", "Peter", "Kyle", "Noah", "Ethan", "Jeremy", "Walter", "Christian", "Keith", "Roger", "Terry", "Austin", "Sean", "Gerald", "Carl", "Harold", "Dylan", "Arthur", "Lawrence", "Jordan", "Jesse", "Bryan", "Billy", "Bruce", "Gabriel", "Joe", "Logan", "Alan", "Juan", "Albert", "Willie", "Elijah", "Wayne", "Randy", "Vincent", "Mason", "Roy", "Ralph", "Bobby", "Russell", "Bradley", "Philip", "Eugene"], 155 | "woman": ["Mary", "Patricia", "Jennifer", "Linda", "Elizabeth", "Barbara", "Susan", "Jessica", "Sarah", "Karen", "Lisa", "Nancy", "Betty", "Sandra", "Margaret", "Ashley", "Kimberly", "Emily", "Donna", "Michelle", "Carol", "Amanda", "Melissa", "Deborah", "Stephanie", "Dorothy", "Rebecca", "Sharon", "Laura", "Cynthia", "Amy", "Kathleen", "Angela", "Shirley", "Brenda", "Emma", "Anna", "Pamela", "Nicole", "Samantha", "Katherine", "Christine", "Helen", "Debra", "Rachel", "Carolyn", "Janet", "Maria", "Catherine", "Heather", "Diane", "Olivia", "Julie", "Joyce", "Victoria", "Ruth", "Virginia", "Lauren", "Kelly", "Christina", "Joan", "Evelyn", "Judith", "Andrea", "Hannah", "Megan", "Cheryl", "Jacqueline", "Martha", "Madison", "Teresa", "Gloria", "Sara", "Janice", "Ann", "Kathryn", "Abigail", "Sophia", "Frances", "Jean", "Alice", "Judy", "Isabella", "Julia", "Grace", "Amber", "Denise", "Danielle", 
"Marilyn", "Beverly", "Charlotte", "Natalie", "Theresa", "Diana", "Brittany", "Doris", "Kayla", "Alexis", "Lori", "Marie"] 156 | } 157 | } 158 | }, 159 | "gender_colon__5_shot": { 160 | "templates": [ 161 | "- Tom: man\n- Janet: woman\n- David: man\n- Richard: man\n- Jessica: woman\n- {name}:" 162 | ], 163 | "label": "name", 164 | "label_prepend_space": true, 165 | "variables": { 166 | "name": { 167 | "man": ["James", "Robert", "John", "Michael", "David", "William", "Richard", "Joseph", "Thomas", "Christopher", "Charles", "Daniel", "Matthew", "Anthony", "Mark", "Donald", "Steven", "Andrew", "Paul", "Joshua", "Kenneth", "Kevin", "Brian", "George", "Timothy", "Ronald", "Jason", "Edward", "Jeffrey", "Ryan", "Jacob", "Gary", "Nicholas", "Eric", "Jonathan", "Stephen", "Larry", "Justin", "Scott", "Brandon", "Benjamin", "Samuel", "Gregory", "Alexander", "Patrick", "Frank", "Raymond", "Jack", "Dennis", "Jerry", "Tyler", "Aaron", "Jose", "Adam", "Nathan", "Henry", "Zachary", "Douglas", "Peter", "Kyle", "Noah", "Ethan", "Jeremy", "Walter", "Christian", "Keith", "Roger", "Terry", "Austin", "Sean", "Gerald", "Carl", "Harold", "Dylan", "Arthur", "Lawrence", "Jordan", "Jesse", "Bryan", "Billy", "Bruce", "Gabriel", "Joe", "Logan", "Alan", "Juan", "Albert", "Willie", "Elijah", "Wayne", "Randy", "Vincent", "Mason", "Roy", "Ralph", "Bobby", "Russell", "Bradley", "Philip", "Eugene"], 168 | "woman": ["Mary", "Patricia", "Jennifer", "Linda", "Elizabeth", "Barbara", "Susan", "Jessica", "Sarah", "Karen", "Lisa", "Nancy", "Betty", "Sandra", "Margaret", "Ashley", "Kimberly", "Emily", "Donna", "Michelle", "Carol", "Amanda", "Melissa", "Deborah", "Stephanie", "Dorothy", "Rebecca", "Sharon", "Laura", "Cynthia", "Amy", "Kathleen", "Angela", "Shirley", "Brenda", "Emma", "Anna", "Pamela", "Nicole", "Samantha", "Katherine", "Christine", "Helen", "Debra", "Rachel", "Carolyn", "Janet", "Maria", "Catherine", "Heather", "Diane", "Olivia", "Julie", "Joyce", "Victoria", "Ruth", "Virginia", "Lauren", "Kelly", "Christina", "Joan", "Evelyn", "Judith", "Andrea", "Hannah", "Megan", "Cheryl", "Jacqueline", "Martha", "Madison", "Teresa", "Gloria", "Sara", "Janice", "Ann", "Kathryn", "Abigail", "Sophia", "Frances", "Jean", "Alice", "Judy", "Isabella", "Julia", "Grace", "Amber", "Denise", "Danielle", "Marilyn", "Beverly", "Charlotte", "Natalie", "Theresa", "Diana", "Brittany", "Doris", "Kayla", "Alexis", "Lori", "Marie"] 169 | } 170 | } 171 | } 172 | } -------------------------------------------------------------------------------- /data/templates/preposing_in_pp.json: -------------------------------------------------------------------------------- 1 | { 2 | "preposing_in_pp": { 3 | "templates": [ 4 | "{prefix}{filler} though {subj} {verb}" 5 | ], 6 | "label": "filler", 7 | "result_prepend_space": false, 8 | "labels": { 9 | "pp": [" happy", " sad", " anxious", " afraid", " nervous", " wary", " weary", 10 | " suspicious", " doubtful", " clever", " witty", " young", " old", 11 | " sharp", " bright", " intelligent" 12 | ], 13 | "pipp": ["."] 14 | }, 15 | "variables": { 16 | "prefix": ["The work continued,", "The plan was in motion,"], 17 | "subj": ["the mother", "the security guard", "the man", "the delivery boy", 18 | "the judge", "the reporter", "the accountant", "the secretary", 19 | "the investigator", "the businessman", "the friend", "the painter", 20 | "the neighbor", "the woman", "the politician", "the old man" 21 | ], 22 | 23 | "filler": { 24 | "pp": [""], 25 | "pipp": [" happy", " sad", " anxious", " afraid", " nervous", " wary", 
" weary", 26 | " suspicious", " doubtful", " clever", " witty", " young", " old", 27 | " sharp", " bright", " intelligent" 28 | ] 29 | }, 30 | "verb": ["seemed", "seems"] 31 | } 32 | }, 33 | "preposing_in_pp_embed_1": { 34 | "templates": [ 35 | "{prefix}{filler} though {subj1} {verb1} that {subj2} {verb2}" 36 | ], 37 | "label": "filler", 38 | "result_prepend_space": false, 39 | "labels": { 40 | "pp": [" happy", " sad", " anxious", " afraid", " nervous", " wary", " weary", 41 | " suspicious", " doubtful", " clever", " witty", " young", " old", 42 | " sharp", " bright", " intelligent" 43 | ], 44 | "pipp": ["."] 45 | }, 46 | "variables": { 47 | "prefix": ["The work continued,", "The plan was in motion,"], 48 | "subj1": ["the mother", "the security guard", "the man", "the delivery boy", 49 | "the judge", "the reporter", "the accountant", "the secretary", 50 | "the investigator", "the businessman", "the friend", "the painter", 51 | "the neighbor", "the woman", "the politician", "the old man" 52 | ], 53 | "verb1": ["said", "believed", "knew", "remarked", "heard", 54 | "thought", "stated", "said", "said", "thought", 55 | "believes", "believed", "said", "said", "said", 56 | "thinks", "said", "stated", "believed", "stated" 57 | ], 58 | "subj2": ["the friend", "the assistant", "the woman", "the worker", "the newspaper", 59 | "the cop", "the television host", "the colleague", "the journalist", 60 | "the banker", "the colleague", "the rival", "the reporter", "the associate", 61 | "the worker", "the cop", "the mother", "the secretary", "the friend", 62 | "the press secretary", "the woman" 63 | ], 64 | "filler": { 65 | "pp": [""], 66 | "pipp": [" happy", " sad", " anxious", " afraid", " nervous", " wary", " weary", 67 | " suspicious", " doubtful", " clever", " witty", " young", " old", 68 | " sharp", " bright", " intelligent" 69 | ] 70 | }, 71 | "verb2": ["seemed", "seems"] 72 | } 73 | } 74 | } 75 | -------------------------------------------------------------------------------- /data/templates/syntaxgym_failed.json: -------------------------------------------------------------------------------- 1 | { 2 | "passive_wh_extraction": { 3 | "templates": [ 4 | "{prefix} who the {subject} {hadwas} {verb}" 5 | ], 6 | "label": "hadwas", 7 | "label_prepend_space": true, 8 | "labels": { 9 | "had": ["."], 10 | "was": ["by"] 11 | }, 12 | "variables": { 13 | "prefix": ["Our neighbor reminded us", "Our neighbor said", "My sister told me", "The shop owner told me", "My friend reported", "My friend remembers", "My friend told me", "We all remember", "Our friend knew", "The mayor told me", "The newspaper said", "She told me", "She can guess", "I know", "I do not know", "You remember", "The newspaper reported", "I remember", "We recall", "She can not believe", "We remember", "My neighbor told me", "She knows"], 14 | "subject": ["farmer", "author", "taxi driver", "consultant", "executive", "actor", "teacher", "architect", "senator", "secretary", "clerk", "officer", "pilot", "manager", "doctor", "minister", "guard", "athlete", "customer"], 15 | "hadwas": { 16 | "had": ["had"], 17 | "was": ["was"] 18 | }, 19 | "verb": ["killed", "attacked", "admired", "amazed", "impressed", "disturbed", "injured", "hurt", "shot"] 20 | } 21 | }, 22 | "from_to": { 23 | "templates": [ 24 | "The {subject} {verb} {fromto} the {object}" 25 | ], 26 | "label": "fromto", 27 | "label_prepend_space": true, 28 | "labels": { 29 | "from": ["to"], 30 | "to": ["from"] 31 | }, 32 | "variables": { 33 | "subject": ["farmer", "author", "taxi driver", 
"consultant", "executive", "actor", "teacher", "architect", "senator", "secretary", "clerk", "officer", "pilot", "manager", "doctor", "minister", "guard", "athlete", "customer"], 34 | "verb": ["drove", "walked", "flew", "ran", "swam", "crawled", "sailed", "skated", "biked", "hiked", "skied", "climbed", "traveled", "jogged", "sprinted", "raced", "strolled", "marched", "ambled", "wandered"], 35 | "fromto": { 36 | "from": ["from"], 37 | "to": ["to"] 38 | }, 39 | "object": ["city", "town", "village", "beach", "north", "south"] 40 | } 41 | }, 42 | "filler_gap_time_extraction": { 43 | "templates": [ 44 | "{prefix} {comp} {np1} will {verb}" 45 | ], 46 | "label": "comp", 47 | "label_prepend_space": true, 48 | "labels": { 49 | "wh": ["."], 50 | "th": ["soon"] 51 | }, 52 | "variables": { 53 | "prefix": ["Our neighbor reminded us", "Our neighbor said", "My sister told me", "The shop owner told me", "My friend reported", "My friend remembers", "My friend told me", "We all remember", "Our friend knew", "The mayor told me", "The newspaper said", "She told me", "She can guess", "I know", "I do not know", "You remember", "The newspaper reported", "I remember", "We recall", "She can not believe", "We remember", "My neighbor told me", "She knows"], 54 | "comp": { 55 | "wh": ["when"], 56 | "th": ["that"] 57 | }, 58 | "np1": ["our new friend", "the star student", "the nurse", "the student", "the movie star", "the man", "the convict", "the collector", "our uncle", "her rival", "my good friend", "the suspect", "the businessman"], 59 | "verb": ["arrive", "depart", "leave", "come back", "return"] 60 | } 61 | }, 62 | "agreement_number_reflexive_obj-relc2": { 63 | "templates": [ 64 | "The {subject} that the {embed_np} {embed_vp} {matrix_v}" 65 | ], 66 | "label": ["subject", "embed_np"], 67 | "label_prepend_space": true, 68 | "labels": { 69 | "plural": ["themselves", "themselves"], 70 | "singular": ["herself", "himself"] 71 | }, 72 | "variables": { 73 | "subject": { 74 | "singular": ["farmer", "author", "taxi driver", "consultant", "executive", "actor", "teacher", "architect", "senator", "secretary", "clerk", "officer", "pilot", "manager", "doctor", "minister", "guard", "athlete", "customer"], 75 | "plural": ["managers", "farmers", "architects", "pilots", "doctors", "authors", "consultants", "taxi drivers", "customers", "secretaries", "officers", "actors", "guards", "teachers", "executives", "senators", "ministers", "clerks", "athletes"] 76 | }, 77 | "embed_np": { 78 | "singular": ["actors", "managers", "authors", "guards", "customers", "farmers", "architects", "pilots", "ministers", "clerks", "secretaries", "teachers", "doctors", "executives", "officers", "athletes", "senators"], 79 | "plural": ["doctor", "farmer", "minister", "officer", "guard", "author", "customer", "pilot", "architect", "senator", "athlete", "executive", "actor", "secretary", "clerk", "teacher", "manager"] 80 | }, 81 | "embed_vp": ["loved", "liked", "discussed", "met", "hated"], 82 | "matrix_v": ["embarrassed", "disguised", "injured", "suspected", "doubted", "hated", "hurt", "trusted"] 83 | } 84 | }, 85 | "fillergap_subj": { 86 | "templates": [ 87 | "{prefix} {comp}" 88 | ], 89 | "label": "comp", 90 | "label_prepend_space": true, 91 | "labels": { 92 | "wh": ["did"], 93 | "th": ["there"] 94 | }, 95 | "variables": { 96 | "prefix": ["Our neighbor reminded us", "Our neighbor said", "My sister told me", "The shop owner told me", "My friend reported", "My friend remembers", "My friend told me", "We all remember", "Our friend knew", "The mayor told 
me", "The newspaper said", "She told me", "She can guess", "I know", "I do not know", "You remember", "The newspaper reported", "I remember", "We recall", "She can not believe", "We remember", "My neighbor told me", "She knows"], 97 | "comp": { 98 | "wh": ["who"], 99 | "th": ["that"] 100 | } 101 | } 102 | }, 103 | "fillergap_obj-him": { 104 | "templates": [ 105 | "{prefix} {comp} {np1} {verb}" 106 | ], 107 | "label": "comp", 108 | "label_prepend_space": true, 109 | "labels": { 110 | "wh": ["when"], 111 | "th": ["him"] 112 | }, 113 | "variables": { 114 | "prefix": ["Our neighbor reminded us", "Our neighbor said", "My sister told me", "The shop owner told me", "My friend reported", "My friend remembers", "My friend told me", "We all remember", "Our friend knew", "The mayor told me", "The newspaper said", "She told me", "She can guess", "I know", "I do not know", "You remember", "The newspaper reported", "I remember", "We recall", "She can not believe", "We remember", "My neighbor told me", "She knows"], 115 | "comp": { 116 | "wh": ["who"], 117 | "th": ["that"] 118 | }, 119 | "np1": ["our new friend", "the star student", "the nurse", "the student", "the movie star", "the man", "the convict", "the collector", "our uncle", "her rival", "my good friend", "the suspect", "the businessman"], 120 | "verb": ["killed", "met", "attacked", "saw"] 121 | } 122 | }, 123 | "fillergap_obj-it": { 124 | "templates": [ 125 | "{prefix} {comp} {np1} {verb}" 126 | ], 127 | "label": "comp", 128 | "label_prepend_space": true, 129 | "labels": { 130 | "wh": ["when"], 131 | "th": ["it"] 132 | }, 133 | "variables": { 134 | "prefix": ["Our neighbor reminded us", "Our neighbor said", "My sister told me", "The shop owner told me", "My friend reported", "My friend remembers", "My friend told me", "We all remember", "Our friend knew", "The mayor told me", "The newspaper said", "She told me", "She can guess", "I know", "I do not know", "You remember", "The newspaper reported", "I remember", "We recall", "She can not believe", "We remember", "My neighbor told me", "She knows"], 135 | "comp": { 136 | "wh": ["what"], 137 | "th": ["that"] 138 | }, 139 | "np1": ["our new friend", "the star student", "the nurse", "the student", "the movie star", "the man", "the convict", "the collector", "our uncle", "her rival", "my good friend", "the suspect", "the businessman"], 140 | "verb": ["grabbed", "caught", "stole", "forged", "derailed", "will get", "will be awarded", "placed", "repaired", "dragged"] 141 | } 142 | }, 143 | "fillergap_passive_subj-pp": { 144 | "templates": [ 145 | "{prefix} {wh} the {subject} was {verb} by" 146 | ], 147 | "label": "wh", 148 | "label_prepend_space": true, 149 | "labels": { 150 | "why": ["them"], 151 | "who": ["."] 152 | }, 153 | "variables": { 154 | "prefix": ["Our neighbor reminded us", "Our neighbor said", "My sister told me", "The shop owner told me", "My friend reported", "My friend remembers", "My friend told me", "We all remember", "Our friend knew", "The mayor told me", "The newspaper said", "She told me", "She can guess", "I know", "I do not know", "You remember", "The newspaper reported", "I remember", "We recall", "She can not believe", "We remember", "My neighbor told me", "She knows"], 155 | "subject": ["farmer", "author", "taxi driver", "consultant", "executive", "actor", "teacher", "architect", "senator", "secretary", "clerk", "officer", "pilot", "manager", "doctor", "minister", "guard", "athlete", "customer"], 156 | "wh": { 157 | "why": ["why"], 158 | "who": ["who"] 159 | }, 160 | "verb": 
["killed", "attacked", "admired", "amazed", "impressed", "disturbed", "injured", "hurt", "shot"] 161 | } 162 | }, 163 | "fillergap_ditransitive_recipient": { 164 | "templates": [ 165 | "{prefix} {wh} the {subject} {verb} the {object} to" 166 | ], 167 | "label": "wh", 168 | "label_prepend_space": true, 169 | "labels": { 170 | "that": ["them"], 171 | "who": ["."] 172 | }, 173 | "variables": { 174 | "prefix": ["Our neighbor reminded us", "Our neighbor said", "My sister told me", "The shop owner told me", "My friend reported", "My friend remembers", "My friend told me", "We all remember", "Our friend knew", "The mayor told me", "The newspaper said", "She told me", "She can guess", "I know", "I do not know", "You remember", "The newspaper reported", "I remember", "We recall", "She can not believe", "We remember", "My neighbor told me", "She knows"], 175 | "subject": ["farmer", "author", "taxi driver", "consultant", "executive", "actor", "teacher", "architect", "senator", "secretary", "clerk", "officer", "pilot", "manager", "doctor", "minister", "guard", "athlete", "customer"], 176 | "wh": { 177 | "that": ["that"], 178 | "who": ["who"] 179 | }, 180 | "object": ["box", "toy", "present", "gift", "package", "letter", "card", "book", "note", "envelope", "package", "ball", "flower", "message", "email", "bill", "check", "money", "package", "parcel"], 181 | "verb": ["showed", "gave", "presented", "offered", "sent", "handed", "delivered", "sold"] 182 | } 183 | }, 184 | "fillergap_ditransitive_time": { 185 | "templates": [ 186 | "{prefix} {wh} the {subject} {verb} the {object} to them" 187 | ], 188 | "label": "wh", 189 | "label_prepend_space": true, 190 | "labels": { 191 | "that": ["today"], 192 | "when": ["."] 193 | }, 194 | "variables": { 195 | "prefix": ["Our neighbor reminded us", "Our neighbor said", "My sister told me", "The shop owner told me", "My friend reported", "My friend remembers", "My friend told me", "We all remember", "Our friend knew", "The mayor told me", "The newspaper said", "She told me", "She can guess", "I know", "I do not know", "You remember", "The newspaper reported", "I remember", "We recall", "She can not believe", "We remember", "My neighbor told me", "She knows"], 196 | "subject": ["farmer", "author", "taxi driver", "consultant", "executive", "actor", "teacher", "architect", "senator", "secretary", "clerk", "officer", "pilot", "manager", "doctor", "minister", "guard", "athlete", "customer"], 197 | "wh": { 198 | "that": ["that"], 199 | "when": ["when"] 200 | }, 201 | "object": ["box", "toy", "present", "gift", "package", "letter", "card", "book", "note", "envelope", "package", "ball", "flower", "message", "email", "bill", "check", "money", "package", "parcel"], 202 | "verb": ["showed", "gave", "presented", "offered", "sent", "handed", "delivered", "sold"] 203 | } 204 | }, 205 | "npi_obj-relc2": { 206 | "templates": [ 207 | "{det1} {np} that {det2} {rc_subj} {rc_verb} {matrix_v}" 208 | ], 209 | "label": ["det1", "det2"], 210 | "label_prepend_space": true, 211 | "labels": { 212 | "any": ["any"], 213 | "some": ["some"] 214 | }, 215 | "variables": { 216 | "det1": { 217 | "any": ["No"], 218 | "some": ["The"] 219 | }, 220 | "np": ["consultant", "taxi driver", "farmer", "architects", "pilots", "architect", "athlete", "authors", "journalists", "secretaries", "pilot", "minister", "clerks", "senators", "clerk", "ministers", "dancer", "officers", "athletes", "secretary", "senator", "farmers", "guard", "author", "manager", "doctors", "taxi drivers", "teacher", "customers", "executives", 
"managers", "teachers", "officer", "surgeon", "consultants", "executive", "guards", "customer"], 221 | "det2": { 222 | "any": ["the"], 223 | "some": ["no"] 224 | }, 225 | "rc_subj": ["consultant", "taxi driver", "architects", "farmer", "pilots", "architect", "journalists", "authors", "athlete", "secretaries", "clerks", "minister", "pilot", "senators", "clerk", "ministers", "dancer", "officers", "secretary", "senator", "farmers", "guard", "doctors", "manager", "taxi drivers", "teacher", "customers", "executives", "managers", "teachers", "officer", "surgeon", "consultants", "executive", "guards", "customer"], 226 | "rc_verb": ["helped", "liked", "admired", "contacted", "respected", "discussed", "impressed", "praised", "knew", "hated", "loved"], 227 | "matrix_v": ["has shown", "has fired", "has broken", "have planted", "has passed up", "have had", "has spent", "have missed", "has seen", "has had", "have landed", "has crashed", "have refused", "have passed", "has missed", "has known", "have failed", "has burned", "have advocated", "has completed", "have caught", "has failed", "has passed", "have purchased", "have broken", "have read", "have crashed", "have burned", "have arrested"] 228 | } 229 | }, 230 | "passive": { 231 | "templates": [ 232 | "The {subject} {hadwas} {verb}" 233 | ], 234 | "label": "hadwas", 235 | "label_prepend_space": true, 236 | "labels": { 237 | "had": ["him"], 238 | "was": ["by"] 239 | }, 240 | "variables": { 241 | "subject": ["farmer", "author", "taxi driver", "consultant", "executive", "actor", "teacher", "architect", "senator", "secretary", "clerk", "officer", "pilot", "manager", "doctor", "minister", "guard", "athlete", "customer"], 242 | "hadwas": { 243 | "had": ["had"], 244 | "was": ["was"] 245 | }, 246 | "verb": ["killed", "attacked", "admired", "amazed", "impressed", "disturbed", "injured", "hurt", "shot"] 247 | } 248 | } 249 | } -------------------------------------------------------------------------------- /data/test_suites/center_embedding.json: -------------------------------------------------------------------------------- 1 | 
{"items":[{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"painting","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"artist","region_number":5},{"content":"deteriorated","region_number":6},{"content":"painted","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"painting","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"artist","region_number":5},{"content":"painted","region_number":6},{"content":"deteriorated","region_number":7}]}],"item_number":1},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"storm","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"captain","region_number":5},{"content":"subsided","region_number":6},{"content":"feared","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"storm","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"captain","region_number":5},{"content":"feared","region_number":6},{"content":"subsided","region_number":7}]}],"item_number":2},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"girl","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"bug","region_number":5},{"content":"shouted","region_number":6},{"content":"frightened","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"girl","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"bug","region_number":5},{"content":"frightened","region_number":6},{"content":"shouted","region_number":7}]}],"item_number":3},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"baby","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"woman","region_number":5},{"content":"yelled","region_number":6},{"content":"held","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"baby","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"woman","region_number":5},{"content":"held","region_number":6},{"content":"yelled","region_number":7}]}],"item_number":4},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"soldier","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"bullet","region_number":5},{"content":"died","region_number":6},{"content":"wounded","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"soldier","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"bullet","region_number":5},{"content":"wounded","region_number":6},{"content":"died","region_number":7}]}],"item_number":5},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"storm","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"scientist","region_number":5},{"content":"intensified","region_number":6},{"content":"predicted","region_
number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"storm","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"scientist","region_number":5},{"content":"predicted","region_number":6},{"content":"intensified","region_number":7}]}],"item_number":6},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"convict","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"cop","region_number":5},{"content":"escaped","region_number":6},{"content":"arrested","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"convict","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"cop","region_number":5},{"content":"arrested","region_number":6},{"content":"escaped","region_number":7}]}],"item_number":7},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"computer","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"secretary","region_number":5},{"content":"crashed","region_number":6},{"content":"bought","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"computer","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"secretary","region_number":5},{"content":"bought","region_number":6},{"content":"crashed","region_number":7}]}],"item_number":8},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"floor","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"maid","region_number":5},{"content":"cracked","region_number":6},{"content":"swept","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"floor","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"maid","region_number":5},{"content":"swept","region_number":6},{"content":"cracked","region_number":7}]}],"item_number":9},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"yacht","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"millionaires","region_number":5},{"content":"sank","region_number":6},{"content":"bought","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"yacht","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"millionaires","region_number":5},{"content":"bought","region_number":6},{"content":"sank","region_number":7}]}],"item_number":10},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"shirt","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"man","region_number":5},{"content":"ripped","region_number":6},{"content":"bought","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"shirt","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"man","region_number":5},{"content":"bought","region_number":6},{"content":"ripped","region_number":7}]}],"item
_number":11},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"water","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"maid","region_number":5},{"content":"evaporated","region_number":6},{"content":"poured","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"water","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"maid","region_number":5},{"content":"poured","region_number":6},{"content":"evaporated","region_number":7}]}],"item_number":12},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"building","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"workers","region_number":5},{"content":"collapsed","region_number":6},{"content":"built","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"building","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"workers","region_number":5},{"content":"built","region_number":6},{"content":"collapsed","region_number":7}]}],"item_number":13},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"bones","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"doctor","region_number":5},{"content":"broke","region_number":6},{"content":"examined","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"bones","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"doctor","region_number":5},{"content":"examined","region_number":6},{"content":"broke","region_number":7}]}],"item_number":14},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"building","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"workers","region_number":5},{"content":"deteriorated","region_number":6},{"content":"repaired","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"building","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"workers","region_number":5},{"content":"repaired","region_number":6},{"content":"deteriorated","region_number":7}]}],"item_number":15},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"ship","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"workers","region_number":5},{"content":"sank","region_number":6},{"content":"built","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"ship","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"workers","region_number":5},{"content":"built","region_number":6},{"content":"sank","region_number":7}]}],"item_number":16},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"horse","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"boy","region_number":5},{"content":"bucked","region_number":6},{"content":"rode","r
egion_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"horse","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"boy","region_number":5},{"content":"rode","region_number":6},{"content":"bucked","region_number":7}]}],"item_number":17},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"water","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"chef","region_number":5},{"content":"evaporated","region_number":6},{"content":"needed","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"water","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"chef","region_number":5},{"content":"needed","region_number":6},{"content":"evaporated","region_number":7}]}],"item_number":18},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"tree","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"old man","region_number":5},{"content":"fell","region_number":6},{"content":"cut","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"tree","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"old man","region_number":5},{"content":"cut","region_number":6},{"content":"fell","region_number":7}]}],"item_number":19},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"letter","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"author","region_number":5},{"content":"arrived","region_number":6},{"content":"wrote","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"letter","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"author","region_number":5},{"content":"wrote","region_number":6},{"content":"arrived","region_number":7}]}],"item_number":20},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"glass","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"athlete","region_number":5},{"content":"cracked","region_number":6},{"content":"hit","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"glass","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"athlete","region_number":5},{"content":"hit","region_number":6},{"content":"cracked","region_number":7}]}],"item_number":21},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"bomb","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"terrorist","region_number":5},{"content":"exploded","region_number":6},{"content":"built","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"bomb","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"terrorist","region_number":5},{"content":"built","region_number":6},{"content":"exploded","region_number":7}]}],"item_number":22},{"conditi
ons":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"meat","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"man","region_number":5},{"content":"burned","region_number":6},{"content":"cooked","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"meat","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"man","region_number":5},{"content":"cooked","region_number":6},{"content":"burned","region_number":7}]}],"item_number":23},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"sugar","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"visitor","region_number":5},{"content":"dissolved","region_number":6},{"content":"bought","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"sugar","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"visitor","region_number":5},{"content":"bought","region_number":6},{"content":"dissolved","region_number":7}]}],"item_number":24},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"pants","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"woman","region_number":5},{"content":"ripped","region_number":6},{"content":"bought","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"pants","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"woman","region_number":5},{"content":"bought","region_number":6},{"content":"ripped","region_number":7}]}],"item_number":25},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"toilet","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"worker","region_number":5},{"content":"clogged","region_number":6},{"content":"fixed","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"toilet","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"worker","region_number":5},{"content":"fixed","region_number":6},{"content":"clogged","region_number":7}]}],"item_number":26},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"window","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"boy","region_number":5},{"content":"shattered","region_number":6},{"content":"wiped","region_number":7}]},{"condition_name":"plaus","regions":[{"content":"The","region_number":1},{"content":"window","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"boy","region_number":5},{"content":"wiped","region_number":6},{"content":"shattered","region_number":7}]}],"item_number":27},{"conditions":[{"condition_name":"implaus","regions":[{"content":"The","region_number":1},{"content":"child","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"shadow","region_number":5},{"content":"yelled","region_number":6},{"content":"frightened","region_number":7}]},{"condition_name":"plaus","reg
ions":[{"content":"The","region_number":1},{"content":"child","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"shadow","region_number":5},{"content":"frightened","region_number":6},{"content":"yelled","region_number":7}]}],"item_number":28}],"meta":{"author":"","comment":null,"metric":"sum","name":"center_embed","reference":"\"Wilcox E. Levy R. & Futrell R. (2019). Hierarchical representation in neural language models: Suppression and recovery of expectations.\""},"predictions":[{"formula":"( (6;%plaus%) + (7;%plaus%) ) < ( (6;%implaus%) + (7;%implaus%) )","type":"formula"}],"region_meta":{"1":"intro","2":"np_1","3":"that","4":"det_2","5":"np_2","6":"verb1","7":"verb2"}} -------------------------------------------------------------------------------- /data/test_suites/reflexive_number_agreement_feminine_object_relative.json: -------------------------------------------------------------------------------- 1 | {"items":[{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"authors","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"senator","region_number":5},{"content":"liked","region_number":6},{"content":"hurt","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"author","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"senators","region_number":5},{"content":"liked","region_number":6},{"content":"hurt","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"authors","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"senator","region_number":5},{"content":"liked","region_number":6},{"content":"hurt","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"author","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"senators","region_number":5},{"content":"liked","region_number":6},{"content":"hurt","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":1},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"pilots","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"teacher","region_number":5},{"content":"met","region_number":6},{"content":"injured","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"pilot","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"teachers","region_number":5},{"content":"met","region_number":6},{"content":"injured","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"pilots","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"teacher","region_number":5},{"content":"met","region_number":6},{"content":"injured","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"T
he","region_number":1},{"content":"pilot","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"teachers","region_number":5},{"content":"met","region_number":6},{"content":"injured","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":2},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"doctors","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"guard","region_number":5},{"content":"hated","region_number":6},{"content":"suspected","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"doctor","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"guards","region_number":5},{"content":"hated","region_number":6},{"content":"suspected","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"doctors","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"guard","region_number":5},{"content":"hated","region_number":6},{"content":"suspected","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"doctor","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"guards","region_number":5},{"content":"hated","region_number":6},{"content":"suspected","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":3},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"farmers","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"clerk","region_number":5},{"content":"discussed","region_number":6},{"content":"injured","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"farmer","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"clerks","region_number":5},{"content":"discussed","region_number":6},{"content":"embarrassed","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"farmers","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"clerk","region_number":5},{"content":"discussed","region_number":6},{"content":"injured","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"farmer","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"clerks","region_number":5},{"content":"discussed","region_number":6},{"content":"embarrassed","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":4},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"managers","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"architect","region_number":5},{"content":"loved","region_number":6},{"content":"suspected","region
_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"manager","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"architects","region_number":5},{"content":"loved","region_number":6},{"content":"disguised","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"managers","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"architect","region_number":5},{"content":"loved","region_number":6},{"content":"suspected","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"manager","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"architects","region_number":5},{"content":"loved","region_number":6},{"content":"disguised","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":5},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"customers","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"athlete","region_number":5},{"content":"liked","region_number":6},{"content":"embarrassed","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"customer","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"athletes","region_number":5},{"content":"liked","region_number":6},{"content":"hated","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"customers","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"athlete","region_number":5},{"content":"liked","region_number":6},{"content":"embarrassed","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"customer","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"athletes","region_number":5},{"content":"liked","region_number":6},{"content":"hated","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":6},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"officers","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"actor","region_number":5},{"content":"met","region_number":6},{"content":"disguised","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"officer","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"actors","region_number":5},{"content":"met","region_number":6},{"content":"doubted","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"officers","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"actor","r
egion_number":5},{"content":"met","region_number":6},{"content":"disguised","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"officer","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"actors","region_number":5},{"content":"met","region_number":6},{"content":"doubted","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":7},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"teachers","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"minister","region_number":5},{"content":"hated","region_number":6},{"content":"hated","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"teacher","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"ministers","region_number":5},{"content":"hated","region_number":6},{"content":"hurt","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"teachers","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"minister","region_number":5},{"content":"hated","region_number":6},{"content":"hated","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"teacher","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"ministers","region_number":5},{"content":"hated","region_number":6},{"content":"hurt","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":8},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"senators","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"actor","region_number":5},{"content":"discussed","region_number":6},{"content":"doubted","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"senator","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"actors","region_number":5},{"content":"discussed","region_number":6},{"content":"injured","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"senators","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"actor","region_number":5},{"content":"discussed","region_number":6},{"content":"doubted","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"senator","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"actors","region_number":5},{"content":"discussed","region_number":6},{"content":"injured","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":9},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"consultants","region_nu
mber":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"secretary","region_number":5},{"content":"loved","region_number":6},{"content":"hurt","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"consultant","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"secretaries","region_number":5},{"content":"loved","region_number":6},{"content":"suspected","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"consultants","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"secretary","region_number":5},{"content":"loved","region_number":6},{"content":"hurt","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"consultant","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"secretaries","region_number":5},{"content":"loved","region_number":6},{"content":"suspected","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":10},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"guards","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"executive","region_number":5},{"content":"liked","region_number":6},{"content":"injured","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"guard","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"executives","region_number":5},{"content":"liked","region_number":6},{"content":"embarrassed","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"guards","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"executive","region_number":5},{"content":"liked","region_number":6},{"content":"injured","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"guard","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"executives","region_number":5},{"content":"liked","region_number":6},{"content":"embarrassed","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":11},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"clerks","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"author","region_number":5},{"content":"met","region_number":6},{"content":"suspected","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"clerk","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"authors","region_number":5},{"content":"met","region_number":6},{"content":"disguised","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mism
atch_plural","regions":[{"content":"The","region_number":1},{"content":"clerks","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"author","region_number":5},{"content":"met","region_number":6},{"content":"suspected","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"clerk","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"authors","region_number":5},{"content":"met","region_number":6},{"content":"disguised","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":12},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"architects","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"pilot","region_number":5},{"content":"hated","region_number":6},{"content":"embarrassed","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"architect","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"pilots","region_number":5},{"content":"hated","region_number":6},{"content":"hated","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"architects","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"pilot","region_number":5},{"content":"hated","region_number":6},{"content":"embarrassed","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"architect","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"pilots","region_number":5},{"content":"hated","region_number":6},{"content":"hated","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":13},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"athletes","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"doctor","region_number":5},{"content":"discussed","region_number":6},{"content":"disguised","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"athlete","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"doctors","region_number":5},{"content":"discussed","region_number":6},{"content":"doubted","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"athletes","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"doctor","region_number":5},{"content":"discussed","region_number":6},{"content":"disguised","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"athlete","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"doctors","region_number":5},{"content":"discussed","region_number":6},{"content":"doubte
d","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":14},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"actors","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"farmer","region_number":5},{"content":"loved","region_number":6},{"content":"hated","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"actor","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"farmers","region_number":5},{"content":"loved","region_number":6},{"content":"hurt","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"actors","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"farmer","region_number":5},{"content":"loved","region_number":6},{"content":"hated","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"actor","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"farmers","region_number":5},{"content":"loved","region_number":6},{"content":"hurt","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":15},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"ministers","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"manager","region_number":5},{"content":"liked","region_number":6},{"content":"doubted","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"minister","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"managers","region_number":5},{"content":"liked","region_number":6},{"content":"injured","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"ministers","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"manager","region_number":5},{"content":"liked","region_number":6},{"content":"doubted","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"minister","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"managers","region_number":5},{"content":"liked","region_number":6},{"content":"injured","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":16},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"taxi drivers","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"customer","region_number":5},{"content":"met","region_number":6},{"content":"hurt","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"taxi 
driver","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"customers","region_number":5},{"content":"met","region_number":6},{"content":"suspected","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"taxi drivers","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"customer","region_number":5},{"content":"met","region_number":6},{"content":"hurt","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"taxi driver","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"customers","region_number":5},{"content":"met","region_number":6},{"content":"suspected","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":17},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"secretaries","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"officer","region_number":5},{"content":"hated","region_number":6},{"content":"injured","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"secretary","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"officers","region_number":5},{"content":"hated","region_number":6},{"content":"embarrassed","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"secretaries","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"officer","region_number":5},{"content":"hated","region_number":6},{"content":"injured","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"secretary","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"officers","region_number":5},{"content":"hated","region_number":6},{"content":"embarrassed","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":18},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"executives","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"teacher","region_number":5},{"content":"discussed","region_number":6},{"content":"suspected","region_number":7},{"content":"themselves","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"executive","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"teachers","region_number":5},{"content":"discussed","region_number":6},{"content":"disguised","region_number":7},{"content":"herself","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"executives","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"teacher","region_number":5},{"content":"discussed","region_number":6},{"content":"suspected","region_number":7},{"content":"her
self","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"executive","region_number":2},{"content":"that","region_number":3},{"content":"the","region_number":4},{"content":"teachers","region_number":5},{"content":"discussed","region_number":6},{"content":"disguised","region_number":7},{"content":"themselves","region_number":8}]}],"item_number":19}],"meta":{"author":"","comment":null,"metric":"sum","name":"reflexive_orc_fem","reference":"\"Marvin R. & Linzen T. (2018). Targeted syntactic evaluation of language models. \""},"predictions":[{"formula":"(8;%match_sing%) < (8;%mismatch_sing%)","type":"formula"},{"formula":"(8;%match_plural%) < (8;%mismatch_plural%)","type":"formula"}],"region_meta":{"1":"intro","2":"np_subject","3":"that","4":"the","5":"embed_np","6":"embed_vp","7":"matrix_v","8":"reflexive"}} -------------------------------------------------------------------------------- /data/test_suites/subject_verb_number_agreement_with_prepositional_phrase.json: -------------------------------------------------------------------------------- 1 | {"items":[{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"authors","region_number":2},{"content":"next to","region_number":3},{"content":"the","region_number":4},{"content":"senator","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"author","region_number":2},{"content":"next to","region_number":3},{"content":"the","region_number":4},{"content":"senators","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"authors","region_number":2},{"content":"next to","region_number":3},{"content":"the","region_number":4},{"content":"senator","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"author","region_number":2},{"content":"next to","region_number":3},{"content":"the","region_number":4},{"content":"senators","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]}],"item_number":1},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"pilots","region_number":2},{"content":"behind","region_number":3},{"content":"the","region_number":4},{"content":"teacher","region_number":5},{"content":"bring","region_number":6},{"content":"love to people","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"pilot","region_number":2},{"content":"behind","region_number":3},{"content":"the","region_number":4},{"content":"teachers","region_number":5},{"content":"brings","region_number":6},{"content":"love to people","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"pilots","region_number":2},{"content":"behind","region_number":3},{"content":"the","region_number":4},{"content":"teacher","region_number":5},{"content":"brings","region_number":6},{"content":"love to 
people","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"pilot","region_number":2},{"content":"behind","region_number":3},{"content":"the","region_number":4},{"content":"teachers","region_number":5},{"content":"bring","region_number":6},{"content":"love to people","region_number":7}]}],"item_number":2},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"doctors","region_number":2},{"content":"in front of","region_number":3},{"content":"the","region_number":4},{"content":"guard","region_number":5},{"content":"interest","region_number":6},{"content":"people","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"doctor","region_number":2},{"content":"in front of","region_number":3},{"content":"the","region_number":4},{"content":"guards","region_number":5},{"content":"interests","region_number":6},{"content":"people","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"doctors","region_number":2},{"content":"in front of","region_number":3},{"content":"the","region_number":4},{"content":"guard","region_number":5},{"content":"interests","region_number":6},{"content":"people","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"doctor","region_number":2},{"content":"in front of","region_number":3},{"content":"the","region_number":4},{"content":"guards","region_number":5},{"content":"interest","region_number":6},{"content":"people","region_number":7}]}],"item_number":3},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"farmers","region_number":2},{"content":"near","region_number":3},{"content":"the","region_number":4},{"content":"clerk","region_number":5},{"content":"know","region_number":6},{"content":"many people","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"farmer","region_number":2},{"content":"near","region_number":3},{"content":"the","region_number":4},{"content":"clerks","region_number":5},{"content":"knows","region_number":6},{"content":"many people","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"farmers","region_number":2},{"content":"near","region_number":3},{"content":"the","region_number":4},{"content":"clerk","region_number":5},{"content":"knows","region_number":6},{"content":"many people","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"farmer","region_number":2},{"content":"near","region_number":3},{"content":"the","region_number":4},{"content":"clerks","region_number":5},{"content":"know","region_number":6},{"content":"many people","region_number":7}]}],"item_number":4},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"managers","region_number":2},{"content":"to the side of","region_number":3},{"content":"the","region_number":4},{"content":"architect","region_number":5},{"content":"like","region_number":6},{"content":"to gamble","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"manager","region_number":2},{"content":"to the side 
of","region_number":3},{"content":"the","region_number":4},{"content":"architects","region_number":5},{"content":"likes","region_number":6},{"content":"to gamble","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"managers","region_number":2},{"content":"to the side of","region_number":3},{"content":"the","region_number":4},{"content":"architect","region_number":5},{"content":"likes","region_number":6},{"content":"to gamble","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"manager","region_number":2},{"content":"to the side of","region_number":3},{"content":"the","region_number":4},{"content":"architects","region_number":5},{"content":"like","region_number":6},{"content":"to gamble","region_number":7}]}],"item_number":5},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"customers","region_number":2},{"content":"across from","region_number":3},{"content":"the","region_number":4},{"content":"athlete","region_number":5},{"content":"enjoy","region_number":6},{"content":"playing tennis","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"customer","region_number":2},{"content":"across from","region_number":3},{"content":"the","region_number":4},{"content":"athletes","region_number":5},{"content":"enjoys","region_number":6},{"content":"playing tennis","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"customers","region_number":2},{"content":"across from","region_number":3},{"content":"the","region_number":4},{"content":"athlete","region_number":5},{"content":"enjoys","region_number":6},{"content":"playing tennis","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"customer","region_number":2},{"content":"across from","region_number":3},{"content":"the","region_number":4},{"content":"athletes","region_number":5},{"content":"enjoy","region_number":6},{"content":"playing tennis","region_number":7}]}],"item_number":6},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"officers","region_number":2},{"content":"next to","region_number":3},{"content":"the","region_number":4},{"content":"actor","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"officer","region_number":2},{"content":"next to","region_number":3},{"content":"the","region_number":4},{"content":"actors","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"officers","region_number":2},{"content":"next to","region_number":3},{"content":"the","region_number":4},{"content":"actor","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"officer","region_number":2},{"content":"next 
to","region_number":3},{"content":"the","region_number":4},{"content":"actors","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]}],"item_number":7},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"teachers","region_number":2},{"content":"behind","region_number":3},{"content":"the","region_number":4},{"content":"minister","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"teacher","region_number":2},{"content":"behind","region_number":3},{"content":"the","region_number":4},{"content":"ministers","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"teachers","region_number":2},{"content":"behind","region_number":3},{"content":"the","region_number":4},{"content":"minister","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"teacher","region_number":2},{"content":"behind","region_number":3},{"content":"the","region_number":4},{"content":"ministers","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]}],"item_number":8},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"senators","region_number":2},{"content":"in front of","region_number":3},{"content":"the","region_number":4},{"content":"actor","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"senator","region_number":2},{"content":"in front of","region_number":3},{"content":"the","region_number":4},{"content":"actors","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"senators","region_number":2},{"content":"in front of","region_number":3},{"content":"the","region_number":4},{"content":"actor","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"senator","region_number":2},{"content":"in front 
of","region_number":3},{"content":"the","region_number":4},{"content":"actors","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]}],"item_number":9},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"consultants","region_number":2},{"content":"near","region_number":3},{"content":"the","region_number":4},{"content":"secretary","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"consultant","region_number":2},{"content":"near","region_number":3},{"content":"the","region_number":4},{"content":"secretaries","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"consultants","region_number":2},{"content":"near","region_number":3},{"content":"the","region_number":4},{"content":"secretary","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"consultant","region_number":2},{"content":"near","region_number":3},{"content":"the","region_number":4},{"content":"secretaries","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]}],"item_number":10},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"guards","region_number":2},{"content":"to the side of","region_number":3},{"content":"the","region_number":4},{"content":"executive","region_number":5},{"content":"are","region_number":6},{"content":"playing tennis","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"guard","region_number":2},{"content":"to the side of","region_number":3},{"content":"the","region_number":4},{"content":"executives","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"guards","region_number":2},{"content":"to the side of","region_number":3},{"content":"the","region_number":4},{"content":"executive","region_number":5},{"content":"is","region_number":6},{"content":"playing tennis","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"guard","region_number":2},{"content":"to the side of","region_number":3},{"content":"the","region_number":4},{"content":"executives","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]}],"item_number":11},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"clerks","region_number":2},{"content":"across from","region_number":3},{"content":"the","region_number":4},{"content":"author","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"clerk","region_number":2},{"content":"across 
from","region_number":3},{"content":"the","region_number":4},{"content":"authors","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"clerks","region_number":2},{"content":"across from","region_number":3},{"content":"the","region_number":4},{"content":"author","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"clerk","region_number":2},{"content":"across from","region_number":3},{"content":"the","region_number":4},{"content":"authors","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]}],"item_number":12},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"architects","region_number":2},{"content":"next to","region_number":3},{"content":"the","region_number":4},{"content":"pilot","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"architect","region_number":2},{"content":"next to","region_number":3},{"content":"the","region_number":4},{"content":"pilots","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"architects","region_number":2},{"content":"next to","region_number":3},{"content":"the","region_number":4},{"content":"pilot","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"architect","region_number":2},{"content":"next to","region_number":3},{"content":"the","region_number":4},{"content":"pilots","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]}],"item_number":13},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"athletes","region_number":2},{"content":"behind","region_number":3},{"content":"the","region_number":4},{"content":"doctor","region_number":5},{"content":"bring","region_number":6},{"content":"good feelings","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"athlete","region_number":2},{"content":"behind","region_number":3},{"content":"the","region_number":4},{"content":"doctors","region_number":5},{"content":"brings","region_number":6},{"content":"good feelings","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"athletes","region_number":2},{"content":"behind","region_number":3},{"content":"the","region_number":4},{"content":"doctor","region_number":5},{"content":"brings","region_number":6},{"content":"good feelings","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"athlete","region_number":2},{"content":"behind","region_number":3},{"content":"the","region_number":4},{"content":"doctors","region_number":5},{"content":"bring","region_number":6},{"content":"good 
feelings","region_number":7}]}],"item_number":14},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"actors","region_number":2},{"content":"in front of","region_number":3},{"content":"the","region_number":4},{"content":"farmer","region_number":5},{"content":"interest","region_number":6},{"content":"people","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"actor","region_number":2},{"content":"in front of","region_number":3},{"content":"the","region_number":4},{"content":"farmers","region_number":5},{"content":"interests","region_number":6},{"content":"people","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"actors","region_number":2},{"content":"in front of","region_number":3},{"content":"the","region_number":4},{"content":"farmer","region_number":5},{"content":"interests","region_number":6},{"content":"people","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"actor","region_number":2},{"content":"in front of","region_number":3},{"content":"the","region_number":4},{"content":"farmers","region_number":5},{"content":"interest","region_number":6},{"content":"people","region_number":7}]}],"item_number":15},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"ministers","region_number":2},{"content":"near","region_number":3},{"content":"the","region_number":4},{"content":"manager","region_number":5},{"content":"know","region_number":6},{"content":"tennis","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"minister","region_number":2},{"content":"near","region_number":3},{"content":"the","region_number":4},{"content":"managers","region_number":5},{"content":"knows","region_number":6},{"content":"many people","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"ministers","region_number":2},{"content":"near","region_number":3},{"content":"the","region_number":4},{"content":"manager","region_number":5},{"content":"knows","region_number":6},{"content":"tennis","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"minister","region_number":2},{"content":"near","region_number":3},{"content":"the","region_number":4},{"content":"managers","region_number":5},{"content":"know","region_number":6},{"content":"many people","region_number":7}]}],"item_number":16},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"taxi drivers","region_number":2},{"content":"to the side of","region_number":3},{"content":"the","region_number":4},{"content":"customer","region_number":5},{"content":"like","region_number":6},{"content":"tennis","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"taxi driver","region_number":2},{"content":"to the side of","region_number":3},{"content":"the","region_number":4},{"content":"customers","region_number":5},{"content":"likes","region_number":6},{"content":"to gamble","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"taxi drivers","region_number":2},{"content":"to the side 
of","region_number":3},{"content":"the","region_number":4},{"content":"customer","region_number":5},{"content":"likes","region_number":6},{"content":"tennis","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"taxi driver","region_number":2},{"content":"to the side of","region_number":3},{"content":"the","region_number":4},{"content":"customers","region_number":5},{"content":"like","region_number":6},{"content":"to gamble","region_number":7}]}],"item_number":17},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"secretaries","region_number":2},{"content":"across from","region_number":3},{"content":"the","region_number":4},{"content":"officer","region_number":5},{"content":"enjoy","region_number":6},{"content":"tennis","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"secretary","region_number":2},{"content":"across from","region_number":3},{"content":"the","region_number":4},{"content":"officers","region_number":5},{"content":"enjoys","region_number":6},{"content":"playing tennis","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"secretaries","region_number":2},{"content":"across from","region_number":3},{"content":"the","region_number":4},{"content":"officer","region_number":5},{"content":"enjoys","region_number":6},{"content":"tennis","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"secretary","region_number":2},{"content":"across from","region_number":3},{"content":"the","region_number":4},{"content":"officers","region_number":5},{"content":"enjoy","region_number":6},{"content":"playing tennis","region_number":7}]}],"item_number":18},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"executives","region_number":2},{"content":"next to","region_number":3},{"content":"the","region_number":4},{"content":"teacher","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"executive","region_number":2},{"content":"next to","region_number":3},{"content":"the","region_number":4},{"content":"teachers","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"executives","region_number":2},{"content":"next to","region_number":3},{"content":"the","region_number":4},{"content":"teacher","region_number":5},{"content":"is","region_number":6},{"content":"good","region_number":7}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"executive","region_number":2},{"content":"next to","region_number":3},{"content":"the","region_number":4},{"content":"teachers","region_number":5},{"content":"are","region_number":6},{"content":"good","region_number":7}]}],"item_number":19}],"meta":{"author":"","comment":null,"metric":"sum","name":"number_prep","reference":"\"Marvin R. & Linzen T. (2018). Targeted syntactic evaluation of language models. 
\""},"predictions":[{"formula":"(6;%match_sing%) < (6;%mismatch_sing%)","type":"formula"},{"formula":"(6;%match_plural%) < (6;%mismatch_plural%)","type":"formula"}],"region_meta":{"1":"intro","2":"np_subject","3":"prep","4":"the","5":"prep_np","6":"matrix_v","7":"continuation"}} 2 | -------------------------------------------------------------------------------- /data/test_suites/subject_verb_number_agreement_with_subject_relative_clause.json: -------------------------------------------------------------------------------- 1 | {"items":[{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"authors","region_number":2},{"content":"that","region_number":3},{"content":"hurt","region_number":4},{"content":"the","region_number":5},{"content":"senator","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"author","region_number":2},{"content":"that","region_number":3},{"content":"hurt","region_number":4},{"content":"the","region_number":5},{"content":"senators","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"authors","region_number":2},{"content":"that","region_number":3},{"content":"hurt","region_number":4},{"content":"the","region_number":5},{"content":"senator","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"author","region_number":2},{"content":"that","region_number":3},{"content":"hurt","region_number":4},{"content":"the","region_number":5},{"content":"senators","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]}],"item_number":1},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"pilots","region_number":2},{"content":"that","region_number":3},{"content":"injured","region_number":4},{"content":"the","region_number":5},{"content":"teacher","region_number":6},{"content":"bring","region_number":7},{"content":"love to people","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"pilot","region_number":2},{"content":"that","region_number":3},{"content":"injured","region_number":4},{"content":"the","region_number":5},{"content":"teachers","region_number":6},{"content":"brings","region_number":7},{"content":"love to people","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"pilots","region_number":2},{"content":"that","region_number":3},{"content":"injured","region_number":4},{"content":"the","region_number":5},{"content":"teacher","region_number":6},{"content":"brings","region_number":7},{"content":"love to people","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"pilot","region_number":2},{"content":"that","region_number":3},{"content":"injured","region_number":4},{"content":"the","region_number":5},{"content":"teachers","region_number":6},{"content":"bring","region_number":7},{"content":"love to 
people","region_number":8}]}],"item_number":2},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"doctors","region_number":2},{"content":"that","region_number":3},{"content":"ignored","region_number":4},{"content":"the","region_number":5},{"content":"guard","region_number":6},{"content":"interest","region_number":7},{"content":"people","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"doctor","region_number":2},{"content":"that","region_number":3},{"content":"ignored","region_number":4},{"content":"the","region_number":5},{"content":"guards","region_number":6},{"content":"interests","region_number":7},{"content":"people","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"doctors","region_number":2},{"content":"that","region_number":3},{"content":"ignored","region_number":4},{"content":"the","region_number":5},{"content":"guard","region_number":6},{"content":"interests","region_number":7},{"content":"people","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"doctor","region_number":2},{"content":"that","region_number":3},{"content":"ignored","region_number":4},{"content":"the","region_number":5},{"content":"guards","region_number":6},{"content":"interest","region_number":7},{"content":"people","region_number":8}]}],"item_number":3},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"farmers","region_number":2},{"content":"that","region_number":3},{"content":"embarrassed","region_number":4},{"content":"the","region_number":5},{"content":"clerk","region_number":6},{"content":"know","region_number":7},{"content":"many people","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"farmer","region_number":2},{"content":"that","region_number":3},{"content":"embarrassed","region_number":4},{"content":"the","region_number":5},{"content":"clerks","region_number":6},{"content":"knows","region_number":7},{"content":"many people","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"farmers","region_number":2},{"content":"that","region_number":3},{"content":"embarrassed","region_number":4},{"content":"the","region_number":5},{"content":"clerk","region_number":6},{"content":"knows","region_number":7},{"content":"many people","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"farmer","region_number":2},{"content":"that","region_number":3},{"content":"embarrassed","region_number":4},{"content":"the","region_number":5},{"content":"clerks","region_number":6},{"content":"know","region_number":7},{"content":"many people","region_number":8}]}],"item_number":4},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"managers","region_number":2},{"content":"that","region_number":3},{"content":"disguised","region_number":4},{"content":"the","region_number":5},{"content":"architect","region_number":6},{"content":"like","region_number":7},{"content":"to 
gamble","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"manager","region_number":2},{"content":"that","region_number":3},{"content":"disguised","region_number":4},{"content":"the","region_number":5},{"content":"architects","region_number":6},{"content":"likes","region_number":7},{"content":"to gamble","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"managers","region_number":2},{"content":"that","region_number":3},{"content":"disguised","region_number":4},{"content":"the","region_number":5},{"content":"architect","region_number":6},{"content":"likes","region_number":7},{"content":"to gamble","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"manager","region_number":2},{"content":"that","region_number":3},{"content":"disguised","region_number":4},{"content":"the","region_number":5},{"content":"architects","region_number":6},{"content":"like","region_number":7},{"content":"to gamble","region_number":8}]}],"item_number":5},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"customers","region_number":2},{"content":"that","region_number":3},{"content":"hated","region_number":4},{"content":"the","region_number":5},{"content":"athlete","region_number":6},{"content":"enjoy","region_number":7},{"content":"playing tennis","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"customer","region_number":2},{"content":"that","region_number":3},{"content":"hated","region_number":4},{"content":"the","region_number":5},{"content":"athletes","region_number":6},{"content":"enjoys","region_number":7},{"content":"playing tennis","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"customers","region_number":2},{"content":"that","region_number":3},{"content":"hated","region_number":4},{"content":"the","region_number":5},{"content":"athlete","region_number":6},{"content":"enjoys","region_number":7},{"content":"playing tennis","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"customer","region_number":2},{"content":"that","region_number":3},{"content":"hated","region_number":4},{"content":"the","region_number":5},{"content":"athletes","region_number":6},{"content":"enjoy","region_number":7},{"content":"playing 
tennis","region_number":8}]}],"item_number":6},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"officers","region_number":2},{"content":"that","region_number":3},{"content":"liked","region_number":4},{"content":"the","region_number":5},{"content":"actor","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"officer","region_number":2},{"content":"that","region_number":3},{"content":"liked","region_number":4},{"content":"the","region_number":5},{"content":"actors","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"officers","region_number":2},{"content":"that","region_number":3},{"content":"liked","region_number":4},{"content":"the","region_number":5},{"content":"actor","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"officer","region_number":2},{"content":"that","region_number":3},{"content":"liked","region_number":4},{"content":"the","region_number":5},{"content":"actors","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]}],"item_number":7},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"teachers","region_number":2},{"content":"that","region_number":3},{"content":"hurt","region_number":4},{"content":"the","region_number":5},{"content":"minister","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"teacher","region_number":2},{"content":"that","region_number":3},{"content":"hurt","region_number":4},{"content":"the","region_number":5},{"content":"ministers","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"teachers","region_number":2},{"content":"that","region_number":3},{"content":"hurt","region_number":4},{"content":"the","region_number":5},{"content":"minister","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"teacher","region_number":2},{"content":"that","region_number":3},{"content":"hurt","region_number":4},{"content":"the","region_number":5},{"content":"ministers","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]}],"item_number":8},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"senators","region_number":2},{"content":"that","region_number":3},{"content":"injured","region_number":4},{"content":"the","region_number":5},{"content":"actor","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"senator","region_number":2},{"content":"that","region_number":3},{"content":"injured","region_number":4},{"content":"the","region_number":5},{"content":"actors","region_number":6},{"content":"is","region_number":7},{"conte
nt":"good","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"senators","region_number":2},{"content":"that","region_number":3},{"content":"injured","region_number":4},{"content":"the","region_number":5},{"content":"actor","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"senator","region_number":2},{"content":"that","region_number":3},{"content":"injured","region_number":4},{"content":"the","region_number":5},{"content":"actors","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]}],"item_number":9},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"consultants","region_number":2},{"content":"that","region_number":3},{"content":"ignored","region_number":4},{"content":"the","region_number":5},{"content":"secretary","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"consultant","region_number":2},{"content":"that","region_number":3},{"content":"ignored","region_number":4},{"content":"the","region_number":5},{"content":"secretaries","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"consultants","region_number":2},{"content":"that","region_number":3},{"content":"ignored","region_number":4},{"content":"the","region_number":5},{"content":"secretary","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"consultant","region_number":2},{"content":"that","region_number":3},{"content":"ignored","region_number":4},{"content":"the","region_number":5},{"content":"secretaries","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]}],"item_number":10},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"guards","region_number":2},{"content":"that","region_number":3},{"content":"embarrassed","region_number":4},{"content":"the","region_number":5},{"content":"executive","region_number":6},{"content":"are","region_number":7},{"content":"playing tennis","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"guard","region_number":2},{"content":"that","region_number":3},{"content":"embarrassed","region_number":4},{"content":"the","region_number":5},{"content":"executives","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"guards","region_number":2},{"content":"that","region_number":3},{"content":"embarrassed","region_number":4},{"content":"the","region_number":5},{"content":"executive","region_number":6},{"content":"is","region_number":7},{"content":"playing 
tennis","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"guard","region_number":2},{"content":"that","region_number":3},{"content":"embarrassed","region_number":4},{"content":"the","region_number":5},{"content":"executives","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]}],"item_number":11},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"clerks","region_number":2},{"content":"that","region_number":3},{"content":"disguised","region_number":4},{"content":"the","region_number":5},{"content":"author","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"clerk","region_number":2},{"content":"that","region_number":3},{"content":"disguised","region_number":4},{"content":"the","region_number":5},{"content":"authors","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"clerks","region_number":2},{"content":"that","region_number":3},{"content":"disguised","region_number":4},{"content":"the","region_number":5},{"content":"author","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"clerk","region_number":2},{"content":"that","region_number":3},{"content":"disguised","region_number":4},{"content":"the","region_number":5},{"content":"authors","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]}],"item_number":12},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"architects","region_number":2},{"content":"that","region_number":3},{"content":"hated","region_number":4},{"content":"the","region_number":5},{"content":"pilot","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"architect","region_number":2},{"content":"that","region_number":3},{"content":"hated","region_number":4},{"content":"the","region_number":5},{"content":"pilots","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"architects","region_number":2},{"content":"that","region_number":3},{"content":"hated","region_number":4},{"content":"the","region_number":5},{"content":"pilot","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"architect","region_number":2},{"content":"that","region_number":3},{"content":"hated","region_number":4},{"content":"the","region_number":5},{"content":"pilots","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]}],"item_number":13},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"athletes","region_number":2},{"content":"that","region_number":3},{"content":"admired","region_number":4},{"content":"the","region_number":5},{"content":"doctor","region_number":6},{"content":"bring
","region_number":7},{"content":"good feelings","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"athlete","region_number":2},{"content":"that","region_number":3},{"content":"admired","region_number":4},{"content":"the","region_number":5},{"content":"doctors","region_number":6},{"content":"brings","region_number":7},{"content":"good feelings","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"athletes","region_number":2},{"content":"that","region_number":3},{"content":"admired","region_number":4},{"content":"the","region_number":5},{"content":"doctor","region_number":6},{"content":"brings","region_number":7},{"content":"good feelings","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"athlete","region_number":2},{"content":"that","region_number":3},{"content":"admired","region_number":4},{"content":"the","region_number":5},{"content":"doctors","region_number":6},{"content":"bring","region_number":7},{"content":"good feelings","region_number":8}]}],"item_number":14},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"actors","region_number":2},{"content":"that","region_number":3},{"content":"hurt","region_number":4},{"content":"the","region_number":5},{"content":"farmer","region_number":6},{"content":"interest","region_number":7},{"content":"people","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"actor","region_number":2},{"content":"that","region_number":3},{"content":"hurt","region_number":4},{"content":"the","region_number":5},{"content":"farmers","region_number":6},{"content":"interests","region_number":7},{"content":"people","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"actors","region_number":2},{"content":"that","region_number":3},{"content":"hurt","region_number":4},{"content":"the","region_number":5},{"content":"farmer","region_number":6},{"content":"interests","region_number":7},{"content":"people","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"actor","region_number":2},{"content":"that","region_number":3},{"content":"hurt","region_number":4},{"content":"the","region_number":5},{"content":"farmers","region_number":6},{"content":"interest","region_number":7},{"content":"people","region_number":8}]}],"item_number":15},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"ministers","region_number":2},{"content":"that","region_number":3},{"content":"injured","region_number":4},{"content":"the","region_number":5},{"content":"manager","region_number":6},{"content":"know","region_number":7},{"content":"tennis","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"minister","region_number":2},{"content":"that","region_number":3},{"content":"injured","region_number":4},{"content":"the","region_number":5},{"content":"managers","region_number":6},{"content":"knows","region_number":7},{"content":"many 
people","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"ministers","region_number":2},{"content":"that","region_number":3},{"content":"injured","region_number":4},{"content":"the","region_number":5},{"content":"manager","region_number":6},{"content":"knows","region_number":7},{"content":"tennis","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"minister","region_number":2},{"content":"that","region_number":3},{"content":"injured","region_number":4},{"content":"the","region_number":5},{"content":"managers","region_number":6},{"content":"know","region_number":7},{"content":"many people","region_number":8}]}],"item_number":16},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"taxi drivers","region_number":2},{"content":"that","region_number":3},{"content":"ignored","region_number":4},{"content":"the","region_number":5},{"content":"customer","region_number":6},{"content":"like","region_number":7},{"content":"tennis","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"taxi driver","region_number":2},{"content":"that","region_number":3},{"content":"ignored","region_number":4},{"content":"the","region_number":5},{"content":"customers","region_number":6},{"content":"likes","region_number":7},{"content":"to gamble","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"taxi drivers","region_number":2},{"content":"that","region_number":3},{"content":"ignored","region_number":4},{"content":"the","region_number":5},{"content":"customer","region_number":6},{"content":"likes","region_number":7},{"content":"tennis","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"taxi driver","region_number":2},{"content":"that","region_number":3},{"content":"ignored","region_number":4},{"content":"the","region_number":5},{"content":"customers","region_number":6},{"content":"like","region_number":7},{"content":"to gamble","region_number":8}]}],"item_number":17},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"secretaries","region_number":2},{"content":"that","region_number":3},{"content":"embarrassed","region_number":4},{"content":"the","region_number":5},{"content":"officer","region_number":6},{"content":"enjoy","region_number":7},{"content":"tennis","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"secretary","region_number":2},{"content":"that","region_number":3},{"content":"embarrassed","region_number":4},{"content":"the","region_number":5},{"content":"officers","region_number":6},{"content":"enjoys","region_number":7},{"content":"playing 
tennis","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"secretaries","region_number":2},{"content":"that","region_number":3},{"content":"embarrassed","region_number":4},{"content":"the","region_number":5},{"content":"officer","region_number":6},{"content":"enjoys","region_number":7},{"content":"tennis","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"secretary","region_number":2},{"content":"that","region_number":3},{"content":"embarrassed","region_number":4},{"content":"the","region_number":5},{"content":"officers","region_number":6},{"content":"enjoy","region_number":7},{"content":"playing tennis","region_number":8}]}],"item_number":18},{"conditions":[{"condition_name":"match_plural","regions":[{"content":"The","region_number":1},{"content":"executives","region_number":2},{"content":"that","region_number":3},{"content":"disguised","region_number":4},{"content":"the","region_number":5},{"content":"teacher","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"match_sing","regions":[{"content":"The","region_number":1},{"content":"executive","region_number":2},{"content":"that","region_number":3},{"content":"disguised","region_number":4},{"content":"the","region_number":5},{"content":"teachers","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_plural","regions":[{"content":"The","region_number":1},{"content":"executives","region_number":2},{"content":"that","region_number":3},{"content":"disguised","region_number":4},{"content":"the","region_number":5},{"content":"teacher","region_number":6},{"content":"is","region_number":7},{"content":"good","region_number":8}]},{"condition_name":"mismatch_sing","regions":[{"content":"The","region_number":1},{"content":"executive","region_number":2},{"content":"that","region_number":3},{"content":"disguised","region_number":4},{"content":"the","region_number":5},{"content":"teachers","region_number":6},{"content":"are","region_number":7},{"content":"good","region_number":8}]}],"item_number":19}],"meta":{"author":"","comment":null,"metric":"sum","name":"number_src","reference":"\"Marvin R. & Linzen T. (2018). Targeted syntactic evaluation of language models. 
\""},"predictions":[{"formula":"(7;%match_sing%) < (7;%mismatch_sing%)","type":"formula"},{"formula":"(7;%match_plural%) < (7;%mismatch_plural%)","type":"formula"}],"region_meta":{"1":"intro","2":"np_subject","3":"that","4":"embed_vp","5":"the","6":"embed_np","7":"matrix_v","8":"continuation"}} 2 | -------------------------------------------------------------------------------- /diff_methods.py: -------------------------------------------------------------------------------- 1 | from numpy import add 2 | import torch 3 | from collections import defaultdict 4 | 5 | from sklearn.cluster import KMeans 6 | from sklearn.decomposition import PCA 7 | from sklearn.linear_model import LogisticRegression 8 | from sklearn.discriminant_analysis import LinearDiscriminantAnalysis 9 | from sklearn.utils._testing import ignore_warnings 10 | 11 | 12 | # mean diff 13 | def mean_diff(activations, labels, eval_activations, eval_labels): 14 | means, counts = {}, defaultdict(int) 15 | 16 | # accumulate 17 | for activation, label in zip(activations, labels): 18 | if label not in means: 19 | means[label] = torch.zeros_like(activation) 20 | means[label] += activation 21 | counts[label] += 1 22 | 23 | # calc means 24 | for k in means: 25 | means[k] /= counts[k] 26 | 27 | # make vector 28 | vecs = list(means.values()) 29 | vec = vecs[1] - vecs[0] 30 | return vec / torch.norm(vec), None 31 | 32 | 33 | @ignore_warnings(category=Warning) 34 | def kmeans_diff(activations, labels, eval_activations, eval_labels): 35 | # fit kmeans 36 | kmeans = KMeans(n_clusters=2, random_state=0, n_init=10).fit(activations) 37 | 38 | # make vector 39 | vecs = kmeans.cluster_centers_ 40 | vec = torch.tensor(vecs[0] - vecs[1], dtype=torch.float32) 41 | return vec / torch.norm(vec), None 42 | 43 | 44 | def pca_diff(n_components=1): 45 | def diff_func(activations, labels, eval_activations, eval_labels): 46 | # fit pca 47 | pca = PCA(n_components=n_components).fit(activations) 48 | explained_variance = sum(pca.explained_variance_ratio_) 49 | 50 | # average all components 51 | vec = torch.tensor(pca.components_.mean(axis=0), dtype=torch.float32) 52 | return vec / torch.norm(vec), explained_variance 53 | return diff_func 54 | 55 | 56 | def probe_diff(fit_intercept=False, penalty='l2', solver="lbfgs", C=1.0) -> callable: 57 | @ignore_warnings(category=Warning) 58 | def diff_func(activations, labels, eval_activations, eval_labels): 59 | # fit lr 60 | lr = LogisticRegression(random_state=0, max_iter=1000, l1_ratio=0.5, 61 | fit_intercept=fit_intercept, C=C, 62 | penalty=penalty, solver=solver).fit(activations, labels) 63 | accuracy = lr.score(eval_activations, eval_labels) 64 | 65 | # extract weight 66 | vec = torch.tensor(lr.coef_[0], dtype=torch.float32) 67 | return vec / torch.norm(vec), accuracy 68 | return diff_func 69 | 70 | 71 | def lda_diff(activations, labels, eval_activations, eval_labels): 72 | # fit lda 73 | lda = LinearDiscriminantAnalysis(n_components=1).fit(activations, labels) 74 | accuracy = lda.score(eval_activations, eval_labels) 75 | 76 | # extract weight 77 | vec = torch.tensor(lda.coef_[0], dtype=torch.float32) 78 | return vec / torch.norm(vec), accuracy 79 | 80 | 81 | def random_diff(activations, labels, eval_activations, eval_labels): 82 | vec = torch.randn_like(activations[0]) 83 | return vec / torch.norm(vec), None 84 | 85 | 86 | method_mapping = { 87 | "mean": mean_diff, 88 | "kmeans": kmeans_diff, 89 | "pca": pca_diff(n_components=1), 90 | "lda": lda_diff, 91 | "random": random_diff, 92 | } 93 | 94 | probe_mapping = { 
95 | "EleutherAI/pythia-14m": [probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-1)], 96 | "EleutherAI/pythia-31m": [probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-2)], 97 | "EleutherAI/pythia-70m": [probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-3)], 98 | "EleutherAI/pythia-160m": [ 99 | probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-4), 100 | probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-5) 101 | ], 102 | "EleutherAI/pythia-410m": [ 103 | probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-4), 104 | probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-5) 105 | ], 106 | "EleutherAI/pythia-1b": [ 107 | probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-5), 108 | probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-6) 109 | ], 110 | "EleutherAI/pythia-1.4b": [ 111 | probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-5), 112 | probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-6) 113 | ], 114 | "EleutherAI/pythia-2.8b": [ 115 | probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-5), 116 | probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-6) 117 | ], 118 | "EleutherAI/pythia-6.9b": [ 119 | probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-6), 120 | probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-7) 121 | ], 122 | "EleutherAI/pythia-12b": [ 123 | probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-6), 124 | probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1e-7) 125 | ], 126 | } 127 | 128 | additional_method_mapping = { 129 | # various pca components (up to 5) 130 | "pca_2": pca_diff(n_components=2), 131 | "pca_3": pca_diff(n_components=3), 132 | "pca_4": pca_diff(n_components=4), 133 | "pca_5": pca_diff(n_components=5), 134 | 135 | # various linear probe types 136 | "probe_noreg_noint": probe_diff(fit_intercept=False, penalty=None, solver="saga", C=1.0), 137 | "probe_noreg_int": probe_diff(fit_intercept=True, penalty=None, solver="saga", C=1.0), 138 | 139 | "probe_l1_noint_1": probe_diff(fit_intercept=False, penalty='l1', solver="saga", C=1.0), 140 | "probe_l2_noint_1": probe_diff(fit_intercept=False, penalty='l2', solver="saga", C=1.0), 141 | "probe_elastic_noint_1": probe_diff(fit_intercept=False, penalty="elasticnet", solver="saga", C=1.0), 142 | "probe_l1_int_1": probe_diff(fit_intercept=True, penalty='l1', solver="saga", C=1.0), 143 | "probe_l2_int_1": probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=1.0), 144 | "probe_elastic_int_1": probe_diff(fit_intercept=True, penalty="elasticnet", solver="saga", C=1.0), 145 | 146 | "probe_l1_noint_0.1": probe_diff(fit_intercept=False, penalty='l1', solver="saga", C=0.1), 147 | "probe_l2_noint_0.1": probe_diff(fit_intercept=False, penalty='l2', solver="saga", C=0.1), 148 | "probe_elastic_noint_0.1": probe_diff(fit_intercept=False, penalty="elasticnet", solver="saga", C=0.1), 149 | "probe_l1_int_0.1": probe_diff(fit_intercept=True, penalty='l1', solver="saga", C=0.1), 150 | "probe_l2_int_0.1": probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=0.1), 151 | "probe_elastic_int_0.1": probe_diff(fit_intercept=True, penalty="elasticnet", solver="saga", C=0.1), 152 | 153 | "probe_l1_noint_0.001": probe_diff(fit_intercept=False, penalty='l1', solver="saga", C=0.001), 154 | "probe_l2_noint_0.001": probe_diff(fit_intercept=False, penalty='l2', solver="saga", C=0.001), 155 | 
"probe_elastic_noint_0.001": probe_diff(fit_intercept=False, penalty="elasticnet", solver="saga", C=0.001), 156 | "probe_l1_int_0.001": probe_diff(fit_intercept=True, penalty='l1', solver="saga", C=0.001), 157 | "probe_l2_int_0.001": probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=0.001), 158 | "probe_elastic_int_0.001": probe_diff(fit_intercept=True, penalty="elasticnet", solver="saga", C=0.001), 159 | 160 | "probe_l2_int_0.01": probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=0.01), 161 | "probe_l2_int_0.0001": probe_diff(fit_intercept=True, penalty='l2', solver="saga", C=0.0001), 162 | } -------------------------------------------------------------------------------- /eval.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from torch.nn import CrossEntropyLoss 3 | from utils import get_last_token 4 | import pyvene as pv 5 | from data import Batch 6 | 7 | loss_fct = CrossEntropyLoss() 8 | 9 | 10 | def calculate_loss(logits: torch.tensor, label: torch.tensor) -> torch.tensor: 11 | """Calculate cross entropy between logits and a single target label (can be batched)""" 12 | shift_logits = logits.contiguous() 13 | shift_labels = label.to(shift_logits.device) 14 | loss = loss_fct(shift_logits, shift_labels) 15 | return loss 16 | 17 | 18 | @torch.no_grad() 19 | def eval(intervenable: pv.IntervenableModel, evalset: list[Batch], 20 | layer_i: int, pos_i: int, strategy: str) -> tuple[list[dict], dict, list[tuple]]: 21 | """Evaluate an intervention on an evalset.""" 22 | 23 | data, activations = [], [] 24 | for batch in evalset: 25 | 26 | # inference 27 | pos_interv = [[x[pos_i] for x in y] for y in batch.compute_pos(strategy)] 28 | base_outputs, counterfactual_outputs = intervenable( 29 | batch.base, 30 | [None, batch.src], 31 | {"sources->base": ([None, pos_interv[1]], pos_interv)}, 32 | output_original_output=True 33 | ) 34 | 35 | # store activations/labels for training non-causal methods 36 | for batch_i in range(len(batch.pairs)): 37 | for unit_i in range(base_outputs[-1][batch_i].shape[0]): 38 | activation = base_outputs[-1][batch_i][unit_i].detach().cpu() 39 | activations.append((activation, batch.base_types[batch_i])) 40 | 41 | # get last token probs 42 | logits = get_last_token(counterfactual_outputs.logits, batch.base['attention_mask']) 43 | probs = logits.log_softmax(dim=-1) 44 | base_logits = get_last_token(base_outputs[0].logits, batch.base['attention_mask']) 45 | base_probs = base_logits.log_softmax(dim=-1) 46 | loss = calculate_loss(logits, batch.src_labels) 47 | 48 | # get probs 49 | for batch_i in range(len(batch.pairs)): 50 | src_label = batch.src_labels[batch_i] 51 | base_label = batch.base_labels[batch_i] 52 | # riia = 1 if logits[batch_i][src_label].item() > logits[batch_i][base_label].item() else 0 53 | # odds_ratio = (base_probs[batch_i][base_label] - base_probs[batch_i][src_label]) + (probs[batch_i][src_label] - probs[batch_i][base_label]) 54 | 55 | # store stats 56 | data.append({ 57 | "src_label": src_label.item(), 58 | "base_label": base_label.item(), 59 | "loss": loss.item(), 60 | "p_base": probs[batch_i][base_label].item(), 61 | "p_src": probs[batch_i][src_label].item(), 62 | "base_p_base": base_probs[batch_i][base_label].item(), 63 | "base_p_src": base_probs[batch_i][src_label].item(), 64 | "layer": layer_i, 65 | "pos": pos_i 66 | }) 67 | 68 | # summary metrics 69 | summary = { 70 | "iia": sum([d['p_src'] > d['p_base'] for d in data]) / len(data), 71 | "iia-flip": sum([d['p_src'] > 
d['p_base'] for d in data if d['base_p_base'] > d['base_p_src']]) / len(data), 72 |         "odds_ratio": sum([d['base_p_base'] - d['base_p_src'] + d['p_src'] - d['p_base'] for d in data]) / len(data), 73 |         "eval_loss": sum([d['loss'] for d in data]) / len(data), 74 |     } 75 | 76 |     # return results 77 |     return data, summary, activations 78 | 79 | 80 | def augment_data(data: list[dict], information: dict) -> list[dict]: 81 |     """Add information to a list of dicts.""" 82 |     for d in data: 83 |         d.update(information) 84 |     return data 85 | -------------------------------------------------------------------------------- /interventions.py: -------------------------------------------------------------------------------- 1 | import pyvene as pv 2 | from pyvene.models.layers import LowRankRotateLayer 3 | from pyvene.models.modeling_utils import b_sd_to_bsd, bsd_to_b_sd 4 | import torch 5 | 6 | class PooledLowRankRotatedSpaceIntervention(pv.TrainableIntervention, pv.DistributedRepresentationIntervention): 7 | 8 |     """Intervention in the rotated space.""" 9 | 10 |     def __init__(self, **kwargs): 11 |         super().__init__(**kwargs) 12 |         rotate_layer = LowRankRotateLayer(self.embed_dim, kwargs["low_rank_dimension"]) 13 |         self.rotate_layer = torch.nn.utils.parametrizations.orthogonal(rotate_layer) 14 |         # TODO: put them into a parent class 15 |         self.register_buffer('embed_dim', torch.tensor(self.embed_dim)) 16 |         self.register_buffer('interchange_dim', torch.tensor(self.embed_dim)) 17 | 18 |     def forward(self, base, source, subspaces=None): 19 |         num_unit = (base.shape[1] // int(self.embed_dim)) 20 |         base = b_sd_to_bsd(base, num_unit) 21 |         source = b_sd_to_bsd(source, num_unit) 22 |         rotated_base = self.rotate_layer(base) 23 |         rotated_source = self.rotate_layer(source).mean(dim=1).unsqueeze(1).repeat(1, base.shape[1], 1) 24 |         output = base + torch.matmul( 25 |             (rotated_source - rotated_base), self.rotate_layer.weight.T 26 |         ) 27 |         output = bsd_to_b_sd(output) 28 |         return output.to(base.dtype) 29 | 30 |     def __str__(self): 31 |         return f"PooledLowRankRotatedSpaceIntervention(embed_dim={self.embed_dim})" 32 | 33 | 34 | class CollectIntervention(pv.CollectIntervention): 35 | 36 |     """Collect activations.""" 37 | 38 |     def __init__(self, **kwargs): 39 |         super().__init__(**kwargs) 40 | 41 |     def forward(self, base, source=None, subspaces=None): 42 |         return base 43 | 44 |     def __str__(self): 45 |         return f"CollectIntervention(embed_dim={self.embed_dim})" 46 | 47 | 48 | def intervention_config(intervention_site, intervention_type, layer, num_dims): 49 |     """Generate intervention config.""" 50 |     intervenable_config = pv.IntervenableConfig([ 51 |         { 52 |             "layer": layer, 53 |             "component": intervention_site, 54 |             "intervention_type": CollectIntervention 55 |         }, 56 |         { 57 |             "layer": layer, 58 |             "component": intervention_site, 59 |             "intervention_type": intervention_type, 60 |             "low_rank_dimension": num_dims 61 |         } 62 |     ]) 63 |     return intervenable_config -------------------------------------------------------------------------------- /prompt.py: -------------------------------------------------------------------------------- 1 | from transformers import AutoTokenizer, AutoModelForCausalLM 2 | from utils import WEIGHTS, top_vals, format_token 3 | import torch 4 | 5 | with torch.no_grad(): 6 |     # load model 7 |     model = input("Model: ") 8 |     device = "cuda:0" if torch.cuda.is_available() else "cpu" 9 |     tokenizer = AutoTokenizer.from_pretrained(model) 10 |     tokenizer.pad_token = tokenizer.eos_token 11 |     gpt = AutoModelForCausalLM.from_pretrained( 12 |         model, 13 | 
revision="main", 14 | torch_dtype=WEIGHTS.get(model, torch.bfloat16) if device == "cuda:0" else torch.float32, 15 | ).to(device) 16 | 17 | # make data 18 | while True: 19 | text = input("Text: ") 20 | text = tokenizer(text, return_tensors="pt").to(device) 21 | print([format_token(tokenizer, i) for i in text.input_ids[0]]) 22 | logits = gpt(**text).logits[0, -1] 23 | probs = logits.softmax(-1) 24 | top_vals(tokenizer, probs) -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | numpy==1.26.4 2 | pandas==2.2.2 3 | plotnine==0.14.1 4 | pyvene==0.1.6 5 | scikit_learn==1.5.2 6 | scipy==1.13.1 7 | torch==2.5.1 8 | tqdm==4.66.6 9 | transformers==4.46.2 10 | -------------------------------------------------------------------------------- /test_all.py: -------------------------------------------------------------------------------- 1 | from data import list_datasets 2 | from das import experiment 3 | import argparse 4 | from transformers import AutoTokenizer, AutoModelForCausalLM 5 | from utils import WEIGHTS 6 | import torch 7 | import os 8 | 9 | def run_command( 10 | tokenizer: AutoTokenizer, 11 | gpt: AutoModelForCausalLM, 12 | model_name: str, 13 | dataset: str, 14 | lr: float, 15 | only_das: bool, 16 | hparam_non_das: bool, 17 | das_label: str, 18 | revision: str, 19 | folder: str, 20 | manipulate: str, 21 | ): 22 | # command = f"python das.py --model EleutherAI/pythia-70m --intervention {method} --dataset {dataset} --position each --num-tokens 1 --num-dims 1 --steps {steps}" 23 | print(dataset) 24 | experiment( 25 | model=model_name, 26 | dataset=dataset, 27 | steps=100, 28 | eval_steps=100, 29 | grad_steps=1, 30 | batch_size=4, 31 | intervention_site="block_output", 32 | strategy="last", 33 | lr=lr, 34 | only_das=only_das, 35 | hparam_non_das=hparam_non_das, 36 | das_label=das_label, 37 | revision=revision, 38 | log_folder=folder, 39 | manipulate=manipulate, 40 | tokenizer=tokenizer, 41 | gpt=gpt, 42 | ) 43 | 44 | def main( 45 | model: str, lr: float=5e-3, hparam_non_das: bool=False, only_das: bool=False, 46 | das_label: str=None, start: int=None, end: int=None, folder: str="das", revision: str="main", 47 | manipulate: str=False, datasets: str=None): 48 | 49 | # load model + tokenizer 50 | device = "cuda:0" if torch.cuda.is_available() else "cpu" 51 | tokenizer = AutoTokenizer.from_pretrained(model) 52 | tokenizer.pad_token = tokenizer.eos_token 53 | gpt = AutoModelForCausalLM.from_pretrained( 54 | model, 55 | revision=revision, 56 | torch_dtype=WEIGHTS.get(model, torch.bfloat16) if device == "cuda:0" else torch.float32, 57 | ).to(device) 58 | 59 | # run commands 60 | if datasets is None: 61 | datasets = [d for d in list_datasets() if d.startswith("syntaxgym/")] 62 | print(len(datasets)) 63 | 64 | # start/end 65 | if start is None: 66 | start = 0 67 | if end is None: 68 | end = len(datasets) 69 | 70 | # make folder 71 | if not os.path.exists(f"logs/{folder}"): 72 | os.makedirs(f"logs/{folder}") 73 | 74 | for dataset in datasets[start:end]: 75 | run_command(tokenizer, gpt, model, dataset, lr, only_das, hparam_non_das, das_label, revision, folder, manipulate) 76 | 77 | if __name__ == "__main__": 78 | parser = argparse.ArgumentParser() 79 | parser.add_argument("--model", type=str, default="EleutherAI/pythia-70m") 80 | parser.add_argument("--lr", type=float, default=5e-3) 81 | parser.add_argument("--only-das", action="store_true") 82 | 
parser.add_argument("--hparam_non_das", action="store_true") 83 | parser.add_argument("--das-label", type=str, default=None) 84 | parser.add_argument("--start", type=int, default=None) 85 | parser.add_argument("--end", type=int, default=None) 86 | parser.add_argument("--folder", type=str, default="das") 87 | parser.add_argument("--revision", type=str, default="main") 88 | parser.add_argument("--manipulate", type=str, default=None) 89 | parser.add_argument("--datasets", nargs='+', default=None) 90 | args = parser.parse_args() 91 | main(**vars(args)) 92 | -------------------------------------------------------------------------------- /train.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from transformers import get_linear_schedule_with_warmup 3 | 4 | from eval import augment_data, calculate_loss, eval 5 | from utils import get_last_token 6 | from interventions import intervention_config, PooledLowRankRotatedSpaceIntervention 7 | import pyvene as pv 8 | from data import Batch 9 | 10 | def train_das( 11 | intervenable: pv.IntervenableModel, trainset: list[Batch], evalset: list[Batch], 12 | layer_i: int, pos_i: int, strategy: str, eval_steps: int, grad_steps: int, lr: float, 13 | epochs: int=1, das_label: str="das"): 14 | """Train DAS or Boundless DAS on a model.""" 15 | 16 | # setup 17 | data, activations, eval_activations, stats = [], [], [], {} 18 | total_steps = len(trainset) * epochs 19 | warm_up_steps = 0.1 * total_steps 20 | 21 | # optimizer 22 | optimizer_params = [] 23 | for k, v in intervenable.interventions.items(): 24 | if isinstance(v[0], pv.LowRankRotatedSpaceIntervention) or isinstance(v[0], PooledLowRankRotatedSpaceIntervention): 25 | optimizer_params.append({"params": v[0].rotate_layer.parameters()}) 26 | elif isinstance(v[0], pv.BoundlessRotatedSpaceIntervention): 27 | optimizer_params.append({"params": v[0].rotate_layer.parameters()}) 28 | optimizer_params.append({"params": v[0].intervention_boundaries, "lr": 1e-2}) 29 | optimizer = torch.optim.Adam(optimizer_params, lr=lr) 30 | # print("model trainable parameters: ", count_parameters(intervenable.model)) 31 | # print("intervention trainable parameters: ", intervenable.count_parameters()) 32 | 33 | # scheduler 34 | scheduler = get_linear_schedule_with_warmup( 35 | optimizer, 36 | num_warmup_steps=warm_up_steps, 37 | num_training_steps=total_steps, 38 | ) 39 | 40 | # temperature for boundless 41 | total_step = 0 42 | temperature_start = 50.0 43 | temperature_end = 0.1 44 | temperature_schedule = ( 45 | torch.linspace(temperature_start, temperature_end, total_steps) 46 | .to(torch.bfloat16) 47 | .to(intervenable.get_device()) 48 | ) 49 | intervenable.set_temperature(temperature_schedule[total_step]) 50 | 51 | # train 52 | iterator = trainset * epochs 53 | total_loss = torch.tensor(0.0).to(intervenable.get_device()) 54 | 55 | for step, batch in enumerate(iterator): 56 | 57 | # inference 58 | pos_interv = [[x[pos_i] for x in y] for y in batch.compute_pos(strategy)] 59 | base_outputs, counterfactual_outputs = intervenable( 60 | base=batch.base, 61 | sources=[None, batch.src], 62 | unit_locations={"sources->base": ([None, pos_interv[1]], pos_interv)}, 63 | ) 64 | 65 | # store activations/labels for training non-causal methods 66 | for batch_i in range(len(batch.pairs)): 67 | for unit_i in range(base_outputs[-1][batch_i].shape[0]): 68 | activation = base_outputs[-1][batch_i][unit_i].detach().cpu() 69 | activations.append((activation, batch.base_types[batch_i])) 70 | 71 | # 
get last token logits 72 |         logits = get_last_token(counterfactual_outputs.logits, batch.base['attention_mask']) 73 | 74 |         # loss and backprop 75 |         loss = calculate_loss(logits, batch.src_labels) 76 |         total_loss += loss 77 | 78 |         # gradient accumulation 79 |         if total_step % grad_steps == 0: 80 | 81 |             # print stats 82 |             stats["lr"] = scheduler.optimizer.param_groups[0]['lr'] 83 |             stats["loss"] = total_loss.item() 84 |             for k, v in intervenable.interventions.items(): 85 |                 if isinstance(v[0], pv.BoundlessRotatedSpaceIntervention): 86 |                     stats["bound"] = v[0].intervention_boundaries.sum() * v[0].embed_dim 87 | 88 |             # backward 89 |             if not (grad_steps > 1 and total_step == 0): 90 |                 total_loss.backward() 91 |                 total_loss = torch.tensor(0.0).to(intervenable.get_device()) 92 |                 optimizer.step() 93 |                 scheduler.step() 94 |                 intervenable.set_zero_grad() 95 |                 intervenable.set_temperature(temperature_schedule[total_step]) 96 | 97 |         # eval 98 |         if (step % eval_steps == 0 or step == total_steps - 1) and step != 0: 99 |             more_data, summary, eval_activation = eval(intervenable, evalset, layer_i, pos_i, strategy) 100 |             if eval_activations == []: 101 |                 eval_activations = eval_activation 102 |             stats.update(summary) 103 |             print(step, stats) 104 |             data.extend(augment_data(more_data, {"method": das_label, "step": step})) 105 | 106 |         total_step += 1 107 | 108 |     # return data 109 |     diff_vector = None 110 |     for k, v in intervenable.interventions.items(): 111 |         if isinstance(v[0], pv.LowRankRotatedSpaceIntervention) or isinstance(v[0], PooledLowRankRotatedSpaceIntervention): 112 |             diff_vector = v[0].rotate_layer.weight.detach().cpu().tolist() 113 |             break 114 |     intervenable._cleanup_states() 115 |     return intervenable, data, activations, eval_activations, diff_vector 116 | 117 | 118 | def train_feature_direction( 119 |     method: str, intervenable: pv.IntervenableModel, activations: list[tuple[torch.tensor, str]], 120 |     eval_activations: list[tuple[torch.tensor, str]], evalset: list[Batch], layer_i: int, 121 |     pos_i: int, strategy: str, intervention_site: str, method_mapping: dict[str, callable]) -> tuple[list[dict], dict]: 122 |     """Train/compute and evaluate an intervention direction on some activations.""" 123 | 124 |     # get diff vector based on method 125 |     labels = [label for _, label in activations] 126 |     activations = [activation.type(torch.float32) for activation, _ in activations] 127 |     eval_labels = [label for _, label in eval_activations] 128 |     eval_activations = [activation.type(torch.float32) for activation, _ in eval_activations] 129 | 130 |     diff_vector, accuracy = method_mapping[method](activations, labels, eval_activations, eval_labels) 131 |     diff_vector = diff_vector.to(intervenable.get_device()).unsqueeze(1) 132 | 133 |     # new config 134 |     eval_config = intervention_config( 135 |         intervention_site, 136 |         pv.LowRankRotatedSpaceIntervention if strategy != "all" else PooledLowRankRotatedSpaceIntervention, 137 |         layer_i, 1 138 |     ) 139 |     intervenable2 = pv.IntervenableModel(eval_config, intervenable.model) 140 |     intervenable2.set_device(intervenable.get_device()) 141 |     intervenable2.disable_model_gradients() 142 |     for k, v in intervenable2.interventions.items(): 143 |         if isinstance(v[0], pv.LowRankRotatedSpaceIntervention) or isinstance(v[0], PooledLowRankRotatedSpaceIntervention): 144 |             v[0].rotate_layer.weight = diff_vector 145 | 146 |     # eval 147 |     data, summary, _ = eval(intervenable2, evalset, layer_i, pos_i, strategy) 148 |     if accuracy is not None: 149 |         summary["accuracy"] = accuracy 150 | 151 |     # done 
152 | intervenable2._cleanup_states() 153 | return augment_data(data, {"method": method, "step": -1, "accuracy": accuracy}), summary, diff_vector.detach().cpu().tolist() -------------------------------------------------------------------------------- /utils.py: -------------------------------------------------------------------------------- 1 | from torch import float32, bfloat16, float16, topk, arange 2 | from collections import namedtuple 3 | import random 4 | from transformers import AutoTokenizer 5 | import csv 6 | 7 | 8 | # models and weight format 9 | MODELS = [ 10 | # "gpt2", 11 | # "gpt2-medium", 12 | # "gpt2-large", 13 | # "gpt2-xl", 14 | "EleutherAI/pythia-14m", 15 | "EleutherAI/pythia-31m", 16 | "EleutherAI/pythia-70m", 17 | "EleutherAI/pythia-160m", 18 | "EleutherAI/pythia-410m", 19 | "EleutherAI/pythia-1b", 20 | "EleutherAI/pythia-1.4b", 21 | "EleutherAI/pythia-2.8b", 22 | "EleutherAI/pythia-6.9b", 23 | "EleutherAI/pythia-12b", 24 | ] 25 | 26 | 27 | WEIGHTS = { 28 | # "gpt2": float32, 29 | # "gpt2-medium": float32, 30 | # "gpt2-large": float32, 31 | # "gpt2-xl": float32, 32 | "EleutherAI/pythia-14m": float32, 33 | "EleutherAI/pythia-31m": float32, 34 | "EleutherAI/pythia-70m": float32, 35 | "EleutherAI/pythia-160m": float32, 36 | "EleutherAI/pythia-410m": float32, 37 | "EleutherAI/pythia-1b": bfloat16, 38 | "EleutherAI/pythia-1.4b": float16, 39 | "EleutherAI/pythia-2.8b": float16, 40 | "EleutherAI/pythia-6.9b": float16, 41 | "EleutherAI/pythia-12b": float16, 42 | } 43 | 44 | 45 | parameters = { 46 | "pythia-12b": 11846072320, 47 | "pythia-6.9b": 6857302016, 48 | "pythia-2.8b": 2775208960, 49 | "pythia-1.4b": 1414647808, 50 | "pythia-1b": 1011781632, 51 | "pythia-410m": 405334016, 52 | "pythia-160m": 162322944, 53 | "pythia-70m": 70426624, 54 | "pythia-31m": 31000000, 55 | "pythia-14m": 14000000, 56 | } 57 | 58 | 59 | def format_token(tokenizer, tok): 60 | """Format the token for some path patching experiment to show decoding diff""" 61 | return tokenizer.decode(tok).replace(" ", "_").replace("\n", "\\n") 62 | 63 | def top_vals(tokenizer, res, highlight=[], n=10): 64 | """Pretty print the top n values of a distribution over the vocabulary""" 65 | _, top_indices = topk(res, n) 66 | top_indices = top_indices.tolist() + highlight 67 | for i in range(len(top_indices)): 68 | val = top_indices[i] 69 | tok = format_token(tokenizer, val) 70 | if val in highlight: 71 | tok = f"\x1b[6;30;42m{tok}\x1b[0m" 72 | print(f"{tok:<34} {val:>5} {res[top_indices[i]].item():>10.4%}") 73 | else: 74 | print(f"{tok:<20} {val:>5} {res[top_indices[i]].item():>10.4%}") 75 | 76 | def get_last_token(logits, attention_mask): 77 | last_token_indices = attention_mask.sum(1) - 1 78 | batch_indices = arange(logits.size(0)).unsqueeze(1) 79 | return logits[batch_indices, last_token_indices.unsqueeze(1)].squeeze(1) --------------------------------------------------------------------------------
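The files above are easiest to understand through their shared interfaces, so a few hedged usage sketches follow; they are illustrative only, with toy inputs standing in for real model activations. Every method in `diff_methods.py` has the same signature: it takes training activations and labels (plus held-out activations and labels, which the probe- and LDA-based methods use to report accuracy) and returns a unit-norm direction along with an optional diagnostic:

```python
# Sketch of the shared diff-method interface; toy data, not real activations.
import torch
from diff_methods import method_mapping

# two classes of 64-dim "activations", separated along the first basis direction
e0 = torch.zeros(64)
e0[0] = 1.0
acts = [torch.randn(64) + (5.0 if i % 2 == 0 else -5.0) * e0 for i in range(100)]
labels = ["plural" if i % 2 == 0 else "singular" for i in range(100)]

# each method returns (unit-norm direction, optional diagnostic such as accuracy)
vec, diag = method_mapping["mean"](acts, labels, acts, labels)
print(vec.shape, torch.norm(vec))  # torch.Size([64]), norm ~1.0
```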
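`interventions.intervention_config` always pairs a `CollectIntervention` (which caches the base activations that later feed the methods above) with a trainable intervention at the same layer and site. A minimal wiring sketch under the pinned pyvene 0.1.6; the layer index here is arbitrary, and the model is the smallest one used in the repo:

```python
# Sketch: a 1-dimensional DAS intervention at block_output of layer 3 (arbitrary).
import pyvene as pv
from transformers import AutoModelForCausalLM
from interventions import intervention_config

gpt = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-70m")
config = intervention_config(
    intervention_site="block_output",  # the site used throughout test_all.py
    intervention_type=pv.LowRankRotatedSpaceIntervention,
    layer=3,
    num_dims=1,
)
intervenable = pv.IntervenableModel(config, gpt)
intervenable.disable_model_gradients()  # only the rotation layer should train
```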
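Finally, `utils.get_last_token` gathers the logits at each sequence's last non-padding position, which is how `train.py` and `eval.py` read off next-token predictions (assuming right-padded batches, as produced when the pad token is set to EOS). A small self-contained check with hypothetical values:

```python
# Sketch: get_last_token selects the last unmasked position per sequence.
import torch
from utils import get_last_token

logits = torch.arange(24, dtype=torch.float32).reshape(2, 3, 4)  # (batch, seq, vocab)
mask = torch.tensor([[1, 1, 1],
                     [1, 1, 0]])  # second sequence is padded at the end
print(get_last_token(logits, mask))
# row 0: logits at position 2 -> [ 8.,  9., 10., 11.]
# row 1: logits at position 1 -> [16., 17., 18., 19.]
```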