├── .gitignore ├── LICENSE ├── README.md ├── bin ├── download_glue_data.py └── eval_squad.py ├── data ├── brains │ └── .gitkeep └── sentences │ └── .gitkeep ├── main.nf ├── nextflow.config ├── nextflow.slurm.config ├── notebooks ├── decoding_rank_data.csv ├── encoding_distances.ipynb ├── pca_check.ipynb ├── predictions.ipynb ├── quantitative_dynamic.ipynb ├── quantitative_gross.ipynb ├── quantitative_roger.ipynb ├── rsa.py ├── structural-probes.ipynb ├── t-sne.ipynb └── within-subject.png ├── src ├── dependency_graph.py ├── heatmap.py ├── learn_decoder.py ├── nearest_neighbors.py └── util.py └── structural-probes ├── en_ewt-ud ├── en_ewt-ud-dev.conllu ├── en_ewt-ud-dev.txt ├── en_ewt-ud-test.conllu ├── en_ewt-ud-test.txt ├── en_ewt-ud-train.conllu └── en_ewt-ud-train.txt └── spec.yaml /.gitignore: -------------------------------------------------------------------------------- 1 | __pycache__ 2 | perf.*.csv 3 | encodings/*.npy 4 | .ipynb_checkpoints 5 | .nextflow* 6 | data/brains/* 7 | data/sentences/* 8 | 9 | flow/bert 10 | flow/data 11 | flow/tasks 12 | work 13 | slurm-* 14 | encodings.* 15 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright 2018 Jon Gauthier 2 | 3 | Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: 4 | 5 | The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. 
6 | 7 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 8 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Neural network brain decoding 2 | 3 | [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) 4 | 5 | This repository contains analysis code for the paper: 6 | 7 | [**Linking human and artificial neural representations of language.**][3]
8 | Jon Gauthier and Roger P. Levy.
9 | [2019 Conference on Empirical Methods in Natural Language Processing][2]. 10 | 11 | This repository is open-source under the MIT License. If you would like to 12 | reuse our code or otherwise extend our work, please cite our paper: 13 | 14 | @inproceedings{gauthier2019linking, 15 | title={Linking human and artificial neural representations of language}, 16 | author={Gauthier, Jon and Levy, Roger P.}, 17 | booktitle={Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing}, 18 | year={2019} 19 | } 20 | 21 | ## About the codebase 22 | 23 | We structure our data analysis pipeline, from model fine-tuning to 24 | representation analysis, using [Nextflow][4]. Our entire data analysis pipeline 25 | is specified in the file [`main.nf`](main.nf). 26 | 27 | Visualizations and statistical tests are done in Jupyter notebooks stored in 28 | the [`notebooks`](notebooks) directory. 29 | 30 | ## Running the code 31 | 32 | ### Hardware requirements 33 | 34 | - ~2 TB disk space (for storing brain images, model checkpoints, etc.) 35 | - 8 GB RAM or more 36 | - 1 GPU with > 4 GB RAM (for fine-tuning BERT models) 37 | 38 | We strongly suggest running this pipeline on a distributed computing cluster to 39 | save time. The full pipeline completes in several days on an MIT 40 | high-performance computing cluster. 41 | 42 | If you don't have a GPU or this much disk space to spare but still wish to run 43 | the pipeline, please ping me and we can make special resource-saving 44 | arrangements. 45 | 46 | ### Software requirements 47 | 48 | There are only two software requirements: 49 | 50 | 1. [Nextflow][4] is used to manage the data processing pipeline. Installing 51 | Nextflow is as simple as running the following command: 52 | 53 | ```bash 54 | wget -qO- https://get.nextflow.io | bash 55 | ``` 56 | 57 | This installation script will put a binary `nextflow` in your working 58 | directory. 
The later commands in this README assume that this binary is on 59 | your `PATH`. 60 | 2. [Singularity][5] retrieves and runs the software containers necessary for 61 | the pipeline. It is likely already available on your computing cluster. If 62 | not, please see the [Singularity installation instructions][6]. 63 | 64 | The pipeline is otherwise fully automated, so all other dependencies 65 | (data, BERT, etc.) will be automatically retrieved. 66 | 67 | ### Starting the pipeline 68 | 69 | Check out the repository by downloading the [**`emnlp2019-final`**](https://github.com/hans/nn-decoding/tree/emnlp2019-final) 70 | tag and run the following command in the root directory: 71 | 72 | ```bash 73 | nextflow run main.nf 74 | ``` 75 | 76 | ### Configuring the pipeline 77 | 78 | For **technical configuration** (e.g. customizing how this pipeline will be 79 | deployed on a cluster), see the file [`nextflow.config`](nextflow.config). The 80 | pipeline is configured by default to run locally, but can be easily farmed out 81 | across a computing cluster. 82 | 83 | A configuration for the [`SLURM`][6] framework is given in 84 | [`nextflow.slurm.config`](nextflow.slurm.config). If your cluster uses a 85 | framework other than SLURM, adapting to it may be as simple as changing a few 86 | settings in that file. See the [Nextflow documentation on cluster computing][7] 87 | for more information. 88 | 89 | For **model configuration** (e.g. customizing hyperparameters), see the header 90 | of the main pipeline in [`main.nf`](main.nf). Each parameter, written as `params.X`, 91 | can be overwritten with a command line flag of the same name. 
For example, if 92 | we wanted to run the whole pipeline with BERT models trained for 500 steps 93 | rather than 250 steps, we could simply execute 94 | 95 | ```bash 96 | nextflow run main.nf --finetune_steps 500 97 | ``` 98 | 99 | ### Analysis and visualization 100 | 101 | The `notebooks` directory contains Jupyter notebooks for producing the 102 | visualizations and statistical analyses in the paper (and much more): 103 | 104 | - [`quantitative_dynamic.ipynb`](notebooks/quantitative_dynamic.ipynb) is used 105 | to produce the majority of the plots in the paper, studying brain decoding 106 | across fine-tuning time in different models. 107 | - [`structural-probes.ipynb`](notebooks/structural-probes.ipynb) visualizes the 108 | structural probe results. 109 | - [`predictions.ipynb`](notebooks/predictions.ipynb) produces, among many other 110 | things, the RSA analysis on model representations. 111 | 112 | After the Nextflow pipeline completes, you can load and run these notebooks by 113 | starting a Jupyter notebook session in the same directory where you launched 114 | the pipeline. The notebooks require TensorFlow and general Python data science 115 | tools to function. I recommend using my `tensorflow` Singularity image as 116 | follows: 117 | 118 | ```bash 119 | singularity run library://jon/default/tensorflow:1.12.0-cpu jupyter lab 120 | ``` 121 | 122 | 123 | [1]: https://doi.org/10.1038/s41467-018-03068-4 124 | [2]: https://www.emnlp-ijcnlp2019.org 125 | [3]: https://arxiv.org/abs/1910.01244 126 | [4]: https://www.nextflow.io 127 | [5]: https://sylabs.io/singularity/ 128 | [6]: https://slurm.schedmd.com/overview.html 129 | [7]: https://www.nextflow.io/docs/latest/executor.html 130 | -------------------------------------------------------------------------------- /bin/download_glue_data.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | ''' Script for downloading all GLUE data.
3 | 4 | Note: for legal reasons, we are unable to host MRPC. 5 | You can either use the version hosted by the SentEval team, which is already tokenized, 6 | or you can download the original data from (https://download.microsoft.com/download/D/4/6/D46FF87A-F6B9-4252-AA8B-3604ED519838/MSRParaphraseCorpus.msi) and extract the data from it manually. 7 | For Windows users, you can run the .msi file. For Mac and Linux users, consider an external library such as 'cabextract' (see below for an example). 8 | You should then rename and place specific files in a folder (see below for an example). 9 | 10 | mkdir MRPC 11 | cabextract MSRParaphraseCorpus.msi -d MRPC 12 | cat MRPC/_2DEC3DBE877E4DB192D17C0256E90F1D | tr -d $'\r' > MRPC/msr_paraphrase_train.txt 13 | cat MRPC/_D7B391F9EAFF4B1B8BCE8F21B20B1B61 | tr -d $'\r' > MRPC/msr_paraphrase_test.txt 14 | rm MRPC/_* 15 | rm MSRParaphraseCorpus.msi 16 | ''' 17 | 18 | import os 19 | import sys 20 | import shutil 21 | import argparse 22 | import tempfile 23 | import urllib 24 | import io 25 | if sys.version_info >= (3, 0): 26 | import urllib.request 27 | import zipfile 28 | 29 | URLLIB=urllib 30 | if sys.version_info >= (3, 0): 31 | URLLIB=urllib.request 32 | 33 | TASKS = ["CoLA", "SST", "MRPC", "QQP", "STS", "MNLI", "SNLI", "QNLI", "RTE", "WNLI", "diagnostic"] 34 | TASK2PATH = {"CoLA":'https://firebasestorage.googleapis.com/v0/b/mtl-sentence-representations.appspot.com/o/data%2FCoLA.zip?alt=media&token=46d5e637-3411-4188-bc44-5809b5bfb5f4', 35 | "SST":'https://firebasestorage.googleapis.com/v0/b/mtl-sentence-representations.appspot.com/o/data%2FSST-2.zip?alt=media&token=aabc5f6b-e466-44a2-b9b4-cf6337f84ac8', 36 | "MRPC":'https://firebasestorage.googleapis.com/v0/b/mtl-sentence-representations.appspot.com/o/data%2Fmrpc_dev_ids.tsv?alt=media&token=ec5c0836-31d5-48f4-b431-7480817f1adc', 37 | 
"QQP":'https://firebasestorage.googleapis.com/v0/b/mtl-sentence-representations.appspot.com/o/data%2FQQP.zip?alt=media&token=700c6acf-160d-4d89-81d1-de4191d02cb5', 38 | "STS":'https://firebasestorage.googleapis.com/v0/b/mtl-sentence-representations.appspot.com/o/data%2FSTS-B.zip?alt=media&token=bddb94a7-8706-4e0d-a694-1109e12273b5', 39 | "MNLI":'https://firebasestorage.googleapis.com/v0/b/mtl-sentence-representations.appspot.com/o/data%2FMNLI.zip?alt=media&token=50329ea1-e339-40e2-809c-10c40afff3ce', 40 | "SNLI":'https://firebasestorage.googleapis.com/v0/b/mtl-sentence-representations.appspot.com/o/data%2FSNLI.zip?alt=media&token=4afcfbb2-ff0c-4b2d-a09a-dbf07926f4df', 41 | "QNLI":'https://firebasestorage.googleapis.com/v0/b/mtl-sentence-representations.appspot.com/o/data%2FQNLI.zip?alt=media&token=c24cad61-f2df-4f04-9ab6-aa576fa829d0', 42 | "RTE":'https://firebasestorage.googleapis.com/v0/b/mtl-sentence-representations.appspot.com/o/data%2FRTE.zip?alt=media&token=5efa7e85-a0bb-4f19-8ea2-9e1840f077fb', 43 | "WNLI":'https://firebasestorage.googleapis.com/v0/b/mtl-sentence-representations.appspot.com/o/data%2FWNLI.zip?alt=media&token=068ad0a0-ded7-4bd7-99a5-5e00222e0faf', 44 | "diagnostic":'https://storage.googleapis.com/mtl-sentence-representations.appspot.com/tsvsWithoutLabels%2FAX.tsv?GoogleAccessId=firebase-adminsdk-0khhl@mtl-sentence-representations.iam.gserviceaccount.com&Expires=2498860800&Signature=DuQ2CSPt2Yfre0C%2BiISrVYrIFaZH1Lc7hBVZDD4ZyR7fZYOMNOUGpi8QxBmTNOrNPjR3z1cggo7WXFfrgECP6FBJSsURv8Ybrue8Ypt%2FTPxbuJ0Xc2FhDi%2BarnecCBFO77RSbfuz%2Bs95hRrYhTnByqu3U%2FYZPaj3tZt5QdfpH2IUROY8LiBXoXS46LE%2FgOQc%2FKN%2BA9SoscRDYsnxHfG0IjXGwHN%2Bf88q6hOmAxeNPx6moDulUF6XMUAaXCSFU%2BnRO2RDL9CapWxj%2BDl7syNyHhB7987hZ80B%2FwFkQ3MEs8auvt5XW1%2Bd4aCU7ytgM69r8JDCwibfhZxpaa4gd50QXQ%3D%3D'} 45 | 46 | MRPC_TRAIN = 'https://s3.amazonaws.com/senteval/senteval_data/msr_paraphrase_train.txt' 47 | MRPC_TEST = 'https://s3.amazonaws.com/senteval/senteval_data/msr_paraphrase_test.txt' 48 | 
49 | def download_and_extract(task, data_dir): 50 | print("Downloading and extracting %s..." % task) 51 | data_file = "%s.zip" % task 52 | URLLIB.urlretrieve(TASK2PATH[task], data_file) 53 | with zipfile.ZipFile(data_file) as zip_ref: 54 | zip_ref.extractall(data_dir) 55 | os.remove(data_file) 56 | print("\tCompleted!") 57 | 58 | def format_mrpc(data_dir, path_to_data): 59 | print("Processing MRPC...") 60 | mrpc_dir = os.path.join(data_dir, "MRPC") 61 | if not os.path.isdir(mrpc_dir): 62 | os.mkdir(mrpc_dir) 63 | if path_to_data: 64 | mrpc_train_file = os.path.join(path_to_data, "msr_paraphrase_train.txt") 65 | mrpc_test_file = os.path.join(path_to_data, "msr_paraphrase_test.txt") 66 | else: 67 | mrpc_train_file = os.path.join(mrpc_dir, "msr_paraphrase_train.txt") 68 | mrpc_test_file = os.path.join(mrpc_dir, "msr_paraphrase_test.txt") 69 | URLLIB.urlretrieve(MRPC_TRAIN, mrpc_train_file) 70 | URLLIB.urlretrieve(MRPC_TEST, mrpc_test_file) 71 | assert os.path.isfile(mrpc_train_file), "Train data not found at %s" % mrpc_train_file 72 | assert os.path.isfile(mrpc_test_file), "Test data not found at %s" % mrpc_test_file 73 | URLLIB.urlretrieve(TASK2PATH["MRPC"], os.path.join(mrpc_dir, "dev_ids.tsv")) 74 | 75 | dev_ids = [] 76 | with io.open(os.path.join(mrpc_dir, "dev_ids.tsv"), encoding='utf-8') as ids_fh: 77 | for row in ids_fh: 78 | dev_ids.append(row.strip().split('\t')) 79 | 80 | with io.open(mrpc_train_file, encoding='utf-8') as data_fh, \ 81 | io.open(os.path.join(mrpc_dir, "train.tsv"), 'w', encoding='utf-8') as train_fh, \ 82 | io.open(os.path.join(mrpc_dir, "dev.tsv"), 'w', encoding='utf-8') as dev_fh: 83 | header = data_fh.readline() 84 | train_fh.write(header) 85 | dev_fh.write(header) 86 | for row in data_fh: 87 | label, id1, id2, s1, s2 = row.strip().split('\t') 88 | if [id1, id2] in dev_ids: 89 | dev_fh.write("%s\t%s\t%s\t%s\t%s\n" % (label, id1, id2, s1, s2)) 90 | else: 91 | train_fh.write("%s\t%s\t%s\t%s\t%s\n" % (label, id1, id2, s1, s2)) 92 | 93 | with 
io.open(mrpc_test_file, encoding='utf-8') as data_fh, \ 94 | io.open(os.path.join(mrpc_dir, "test.tsv"), 'w', encoding='utf-8') as test_fh: 95 | header = data_fh.readline() 96 | test_fh.write("index\t#1 ID\t#2 ID\t#1 String\t#2 String\n") 97 | for idx, row in enumerate(data_fh): 98 | label, id1, id2, s1, s2 = row.strip().split('\t') 99 | test_fh.write("%d\t%s\t%s\t%s\t%s\n" % (idx, id1, id2, s1, s2)) 100 | print("\tCompleted!") 101 | 102 | def download_diagnostic(data_dir): 103 | print("Downloading and extracting diagnostic...") 104 | if not os.path.isdir(os.path.join(data_dir, "diagnostic")): 105 | os.mkdir(os.path.join(data_dir, "diagnostic")) 106 | data_file = os.path.join(data_dir, "diagnostic", "diagnostic.tsv") 107 | URLLIB.urlretrieve(TASK2PATH["diagnostic"], data_file) 108 | print("\tCompleted!") 109 | return 110 | 111 | def get_tasks(task_names): 112 | task_names = task_names.split(',') 113 | if "all" in task_names: 114 | tasks = TASKS 115 | else: 116 | tasks = [] 117 | for task_name in task_names: 118 | assert task_name in TASKS, "Task %s not found!" 
% task_name 119 | tasks.append(task_name) 120 | return tasks 121 | 122 | def main(arguments): 123 | parser = argparse.ArgumentParser() 124 | parser.add_argument('-d', '--data_dir', help='directory to save data to', type=str, default='glue_data') 125 | parser.add_argument('-t', '--tasks', help='tasks to download data for as a comma separated string', 126 | type=str, default='all') 127 | parser.add_argument('--path_to_mrpc', help='path to directory containing extracted MRPC data, msr_paraphrase_train.txt and msr_paraphrase_test.txt', 128 | type=str, default='') 129 | args = parser.parse_args(arguments) 130 | 131 | if not os.path.isdir(args.data_dir): 132 | os.mkdir(args.data_dir) 133 | tasks = get_tasks(args.tasks) 134 | 135 | for task in tasks: 136 | if task == 'MRPC': 137 | format_mrpc(args.data_dir, args.path_to_mrpc) 138 | elif task == 'diagnostic': 139 | download_diagnostic(args.data_dir) 140 | else: 141 | download_and_extract(task, args.data_dir) 142 | 143 | 144 | if __name__ == '__main__': 145 | sys.exit(main(sys.argv[1:])) 146 | -------------------------------------------------------------------------------- /bin/eval_squad.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | """Official evaluation script for SQuAD version 2.0. 3 | 4 | In addition to basic functionality, we also compute additional statistics and 5 | plot precision-recall curves if an additional na_prob.json file is provided. 6 | This file is expected to map question ID's to the model's predicted probability 7 | that a question is unanswerable.
8 | """ 9 | import argparse 10 | import collections 11 | import json 12 | import numpy as np 13 | import os 14 | import re 15 | import string 16 | import sys 17 | 18 | OPTS = None 19 | 20 | def parse_args(): 21 | parser = argparse.ArgumentParser('Official evaluation script for SQuAD version 2.0.') 22 | parser.add_argument('data_file', metavar='data.json', help='Input data JSON file.') 23 | parser.add_argument('pred_file', metavar='pred.json', help='Model predictions.') 24 | parser.add_argument('--out-file', '-o', metavar='eval.json', 25 | help='Write accuracy metrics to file (default is stdout).') 26 | parser.add_argument('--na-prob-file', '-n', metavar='na_prob.json', 27 | help='Model estimates of probability of no answer.') 28 | parser.add_argument('--na-prob-thresh', '-t', type=float, default=1.0, 29 | help='Predict "" if no-answer probability exceeds this (default = 1.0).') 30 | parser.add_argument('--out-image-dir', '-p', metavar='out_images', default=None, 31 | help='Save precision-recall curves to directory.') 32 | parser.add_argument('--verbose', '-v', action='store_true') 33 | if len(sys.argv) == 1: 34 | parser.print_help() 35 | sys.exit(1) 36 | return parser.parse_args() 37 | 38 | def make_qid_to_has_ans(dataset): 39 | qid_to_has_ans = {} 40 | for article in dataset: 41 | for p in article['paragraphs']: 42 | for qa in p['qas']: 43 | qid_to_has_ans[qa['id']] = bool(qa['answers']) 44 | return qid_to_has_ans 45 | 46 | def normalize_answer(s): 47 | """Lower text and remove punctuation, articles and extra whitespace.""" 48 | def remove_articles(text): 49 | regex = re.compile(r'\b(a|an|the)\b', re.UNICODE) 50 | return re.sub(regex, ' ', text) 51 | def white_space_fix(text): 52 | return ' '.join(text.split()) 53 | def remove_punc(text): 54 | exclude = set(string.punctuation) 55 | return ''.join(ch for ch in text if ch not in exclude) 56 | def lower(text): 57 | return text.lower() 58 | return white_space_fix(remove_articles(remove_punc(lower(s)))) 59 | 60 | def 
get_tokens(s): 61 | if not s: return [] 62 | return normalize_answer(s).split() 63 | 64 | def compute_exact(a_gold, a_pred): 65 | return int(normalize_answer(a_gold) == normalize_answer(a_pred)) 66 | 67 | def compute_f1(a_gold, a_pred): 68 | gold_toks = get_tokens(a_gold) 69 | pred_toks = get_tokens(a_pred) 70 | common = collections.Counter(gold_toks) & collections.Counter(pred_toks) 71 | num_same = sum(common.values()) 72 | if len(gold_toks) == 0 or len(pred_toks) == 0: 73 | # If either is no-answer, then F1 is 1 if they agree, 0 otherwise 74 | return int(gold_toks == pred_toks) 75 | if num_same == 0: 76 | return 0 77 | precision = 1.0 * num_same / len(pred_toks) 78 | recall = 1.0 * num_same / len(gold_toks) 79 | f1 = (2 * precision * recall) / (precision + recall) 80 | return f1 81 | 82 | def get_raw_scores(dataset, preds): 83 | exact_scores = {} 84 | f1_scores = {} 85 | for article in dataset: 86 | for p in article['paragraphs']: 87 | for qa in p['qas']: 88 | qid = qa['id'] 89 | gold_answers = [a['text'] for a in qa['answers'] 90 | if normalize_answer(a['text'])] 91 | if not gold_answers: 92 | # For unanswerable questions, only correct answer is empty string 93 | gold_answers = [''] 94 | if qid not in preds: 95 | print('Missing prediction for %s' % qid) 96 | continue 97 | a_pred = preds[qid] 98 | # Take max over all gold answers 99 | exact_scores[qid] = max(compute_exact(a, a_pred) for a in gold_answers) 100 | f1_scores[qid] = max(compute_f1(a, a_pred) for a in gold_answers) 101 | return exact_scores, f1_scores 102 | 103 | def apply_no_ans_threshold(scores, na_probs, qid_to_has_ans, na_prob_thresh): 104 | new_scores = {} 105 | for qid, s in scores.items(): 106 | pred_na = na_probs.get(qid, 0.0) > na_prob_thresh 107 | if pred_na: 108 | new_scores[qid] = float(not qid_to_has_ans[qid]) 109 | else: 110 | new_scores[qid] = s 111 | return new_scores 112 | 113 | def make_eval_dict(exact_scores, f1_scores, qid_list=None): 114 | if not qid_list: 115 | total = 
len(exact_scores) 116 | return collections.OrderedDict([ 117 | ('exact', 100.0 * sum(exact_scores.values()) / total), 118 | ('f1', 100.0 * sum(f1_scores.values()) / total), 119 | ('total', total), 120 | ]) 121 | else: 122 | total = len(qid_list) 123 | return collections.OrderedDict([ 124 | ('exact', 100.0 * sum(exact_scores[k] for k in qid_list) / total), 125 | ('f1', 100.0 * sum(f1_scores[k] for k in qid_list) / total), 126 | ('total', total), 127 | ]) 128 | 129 | def merge_eval(main_eval, new_eval, prefix): 130 | for k in new_eval: 131 | main_eval['%s_%s' % (prefix, k)] = new_eval[k] 132 | 133 | def plot_pr_curve(precisions, recalls, out_image, title): 134 | plt.step(recalls, precisions, color='b', alpha=0.2, where='post') 135 | plt.fill_between(recalls, precisions, step='post', alpha=0.2, color='b') 136 | plt.xlabel('Recall') 137 | plt.ylabel('Precision') 138 | plt.xlim([0.0, 1.05]) 139 | plt.ylim([0.0, 1.05]) 140 | plt.title(title) 141 | plt.savefig(out_image) 142 | plt.clf() 143 | 144 | def make_precision_recall_eval(scores, na_probs, num_true_pos, qid_to_has_ans, 145 | out_image=None, title=None): 146 | qid_list = sorted(na_probs, key=lambda k: na_probs[k]) 147 | true_pos = 0.0 148 | cur_p = 1.0 149 | cur_r = 0.0 150 | precisions = [1.0] 151 | recalls = [0.0] 152 | avg_prec = 0.0 153 | for i, qid in enumerate(qid_list): 154 | if qid_to_has_ans[qid]: 155 | true_pos += scores[qid] 156 | cur_p = true_pos / float(i+1) 157 | cur_r = true_pos / float(num_true_pos) 158 | if i == len(qid_list) - 1 or na_probs[qid] != na_probs[qid_list[i+1]]: 159 | # i.e., if we can put a threshold after this point 160 | avg_prec += cur_p * (cur_r - recalls[-1]) 161 | precisions.append(cur_p) 162 | recalls.append(cur_r) 163 | if out_image: 164 | plot_pr_curve(precisions, recalls, out_image, title) 165 | return {'ap': 100.0 * avg_prec} 166 | 167 | def run_precision_recall_analysis(main_eval, exact_raw, f1_raw, na_probs, 168 | qid_to_has_ans, out_image_dir): 169 | if out_image_dir and 
not os.path.exists(out_image_dir): 170 | os.makedirs(out_image_dir) 171 | num_true_pos = sum(1 for v in qid_to_has_ans.values() if v) 172 | if num_true_pos == 0: 173 | return 174 | pr_exact = make_precision_recall_eval( 175 | exact_raw, na_probs, num_true_pos, qid_to_has_ans, 176 | out_image=os.path.join(out_image_dir, 'pr_exact.png'), 177 | title='Precision-Recall curve for Exact Match score') 178 | pr_f1 = make_precision_recall_eval( 179 | f1_raw, na_probs, num_true_pos, qid_to_has_ans, 180 | out_image=os.path.join(out_image_dir, 'pr_f1.png'), 181 | title='Precision-Recall curve for F1 score') 182 | oracle_scores = {k: float(v) for k, v in qid_to_has_ans.items()} 183 | pr_oracle = make_precision_recall_eval( 184 | oracle_scores, na_probs, num_true_pos, qid_to_has_ans, 185 | out_image=os.path.join(out_image_dir, 'pr_oracle.png'), 186 | title='Oracle Precision-Recall curve (binary task of HasAns vs. NoAns)') 187 | merge_eval(main_eval, pr_exact, 'pr_exact') 188 | merge_eval(main_eval, pr_f1, 'pr_f1') 189 | merge_eval(main_eval, pr_oracle, 'pr_oracle') 190 | 191 | def histogram_na_prob(na_probs, qid_list, image_dir, name): 192 | if not qid_list: 193 | return 194 | x = [na_probs[k] for k in qid_list] 195 | weights = np.ones_like(x) / float(len(x)) 196 | plt.hist(x, weights=weights, bins=20, range=(0.0, 1.0)) 197 | plt.xlabel('Model probability of no-answer') 198 | plt.ylabel('Proportion of dataset') 199 | plt.title('Histogram of no-answer probability: %s' % name) 200 | plt.savefig(os.path.join(image_dir, 'na_prob_hist_%s.png' % name)) 201 | plt.clf() 202 | 203 | def find_best_thresh(preds, scores, na_probs, qid_to_has_ans): 204 | num_no_ans = sum(1 for k in qid_to_has_ans if not qid_to_has_ans[k]) 205 | cur_score = num_no_ans 206 | best_score = cur_score 207 | best_thresh = 0.0 208 | qid_list = sorted(na_probs, key=lambda k: na_probs[k]) 209 | for i, qid in enumerate(qid_list): 210 | if qid not in scores: continue 211 | if qid_to_has_ans[qid]: 212 | diff = 
scores[qid] 213 | else: 214 | if preds[qid]: 215 | diff = -1 216 | else: 217 | diff = 0 218 | cur_score += diff 219 | if cur_score > best_score: 220 | best_score = cur_score 221 | best_thresh = na_probs[qid] 222 | return 100.0 * best_score / len(scores), best_thresh 223 | 224 | def find_all_best_thresh(main_eval, preds, exact_raw, f1_raw, na_probs, qid_to_has_ans): 225 | best_exact, exact_thresh = find_best_thresh(preds, exact_raw, na_probs, qid_to_has_ans) 226 | best_f1, f1_thresh = find_best_thresh(preds, f1_raw, na_probs, qid_to_has_ans) 227 | main_eval['best_exact'] = best_exact 228 | main_eval['best_exact_thresh'] = exact_thresh 229 | main_eval['best_f1'] = best_f1 230 | main_eval['best_f1_thresh'] = f1_thresh 231 | 232 | def main(): 233 | with open(OPTS.data_file) as f: 234 | dataset_json = json.load(f) 235 | dataset = dataset_json['data'] 236 | with open(OPTS.pred_file) as f: 237 | preds = json.load(f) 238 | if OPTS.na_prob_file: 239 | with open(OPTS.na_prob_file) as f: 240 | na_probs = json.load(f) 241 | else: 242 | na_probs = {k: 0.0 for k in preds} 243 | qid_to_has_ans = make_qid_to_has_ans(dataset) # maps qid to True/False 244 | has_ans_qids = [k for k, v in qid_to_has_ans.items() if v] 245 | no_ans_qids = [k for k, v in qid_to_has_ans.items() if not v] 246 | exact_raw, f1_raw = get_raw_scores(dataset, preds) 247 | exact_thresh = apply_no_ans_threshold(exact_raw, na_probs, qid_to_has_ans, 248 | OPTS.na_prob_thresh) 249 | f1_thresh = apply_no_ans_threshold(f1_raw, na_probs, qid_to_has_ans, 250 | OPTS.na_prob_thresh) 251 | out_eval = make_eval_dict(exact_thresh, f1_thresh) 252 | if has_ans_qids: 253 | has_ans_eval = make_eval_dict(exact_thresh, f1_thresh, qid_list=has_ans_qids) 254 | merge_eval(out_eval, has_ans_eval, 'HasAns') 255 | if no_ans_qids: 256 | no_ans_eval = make_eval_dict(exact_thresh, f1_thresh, qid_list=no_ans_qids) 257 | merge_eval(out_eval, no_ans_eval, 'NoAns') 258 | if OPTS.na_prob_file: 259 | find_all_best_thresh(out_eval, preds, 
exact_raw, f1_raw, na_probs, qid_to_has_ans) 260 | if OPTS.na_prob_file and OPTS.out_image_dir: 261 | run_precision_recall_analysis(out_eval, exact_raw, f1_raw, na_probs, 262 | qid_to_has_ans, OPTS.out_image_dir) 263 | histogram_na_prob(na_probs, has_ans_qids, OPTS.out_image_dir, 'hasAns') 264 | histogram_na_prob(na_probs, no_ans_qids, OPTS.out_image_dir, 'noAns') 265 | if OPTS.out_file: 266 | with open(OPTS.out_file, 'w') as f: 267 | json.dump(out_eval, f) 268 | else: 269 | print(json.dumps(out_eval, indent=2)) 270 | 271 | if __name__ == '__main__': 272 | OPTS = parse_args() 273 | if OPTS.out_image_dir: 274 | import matplotlib 275 | matplotlib.use('Agg') 276 | import matplotlib.pyplot as plt 277 | main() 278 | 279 | -------------------------------------------------------------------------------- /data/brains/.gitkeep: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hans/nn-decoding/2d2cc639f650b6911cb1de7b8ecb7a872f75b36d/data/brains/.gitkeep -------------------------------------------------------------------------------- /data/sentences/.gitkeep: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hans/nn-decoding/2d2cc639f650b6911cb1de7b8ecb7a872f75b36d/data/sentences/.gitkeep -------------------------------------------------------------------------------- /main.nf: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env nextflow 2 | 3 | import org.yaml.snakeyaml.Yaml 4 | 5 | // Finetune parameters 6 | params.finetune_runs = 1 7 | params.finetune_steps = 250 8 | params.finetune_checkpoint_steps = 5 9 | params.finetune_learning_rate = "2e-5" 10 | params.finetune_squad_learning_rate = "3e-5" 11 | // CLI params shared across GLUE and SQuAD tasks 12 | finetune_cli_params = """--do_train=true --do_eval=true \ 13 | --bert_config_file=\$BERT_MODEL/bert_config.json \ 14 | 
--vocab_file=\$BERT_MODEL/vocab.txt \ 15 | --init_checkpoint=\$BERT_MODEL/bert_model.ckpt \ 16 | --num_train_steps=${params.finetune_steps} \ 17 | --save_checkpoints_steps=${params.finetune_checkpoint_steps} \ 18 | --output_dir .""" 19 | 20 | // Encoding extraction parameters. 21 | params.extract_encoding_layers = "-1" 22 | params.extract_encoding_cls = true 23 | 24 | // Decoder learning parameters 25 | params.decoder_projection = 256 26 | params.brain_projection = 256 27 | params.decoder_n_jobs = 5 28 | params.decoder_n_folds = 8 29 | 30 | // Structural probe parameters 31 | params.structural_probe_layers = "11" 32 | structural_probe_layers = params.structural_probe_layers.split(",") 33 | params.structural_probe_spec = "structural-probes/spec.yaml" 34 | structural_probe_spec = new Yaml().load((params.structural_probe_spec as File).text) 35 | 36 | // TODO generalize 37 | params.structural_probe_train_path = "structural-probes/en_ewt-ud/en_ewt-ud-train.txt" 38 | params.structural_probe_dev_path = "structural-probes/en_ewt-ud/en_ewt-ud-dev.txt" 39 | params.structural_probe_train_conll_path = "structural-probes/en_ewt-ud/en_ewt-ud-train.conllu" 40 | params.structural_probe_dev_conll_path = "structural-probes/en_ewt-ud/en_ewt-ud-dev.conllu" 41 | 42 | 43 | ///////// 44 | 45 | params.outdir = "output" 46 | 47 | def get_checkpoint_num = { checkpoint_f -> 48 | (checkpoint_f.name =~ /-step(\d+)/)[0][1] as int 49 | } 50 | 51 | // Given a channel of checkpoints grouped by `(model, run) => fs`, where fs are 52 | // per-step checkpoint files, flatten to a channel of checkpoints grouped by 53 | // `(model, run, step) => f`, where `f` is an individual checkpoint file. 54 | def flatten_checkpoint_channel = { ch -> 55 | ch.flatMap { 56 | output -> 57 | run_id = output[0] 58 | files = (output[1] instanceof Collection ? 
output[1] : [output[1]]) 59 | files.collect { f -> 60 | step_num = get_checkpoint_num(f) 61 | step_id = run_id + [step_num] 62 | [step_id, f] 63 | } 64 | }.groupTuple() 65 | } 66 | 67 | ///////// 68 | 69 | glue_tasks = Channel.from("MNLI", "SST", "QQP") 70 | brain_images = Channel.fromPath([ 71 | // Download images for all subjects participating in experiment 2. 72 | "https://www.dropbox.com/s/5umg2ktdxvautci/P01.tar?dl=1", 73 | "https://www.dropbox.com/s/parmzwl327j0xo4/M02.tar?dl=1", 74 | "https://www.dropbox.com/s/4p9sbd0k9sq4t5o/M04.tar?dl=1", 75 | "https://www.dropbox.com/s/4gcrrxmg86t5fe2/M07.tar?dl=1", 76 | "https://www.dropbox.com/s/3q6xhtmj611ibmo/M08.tar?dl=1", 77 | "https://www.dropbox.com/s/kv1wm2ovvejt9pg/M09.tar?dl=1", 78 | "https://www.dropbox.com/s/8i0r88n3oafvsv5/M14.tar?dl=1", 79 | "https://www.dropbox.com/s/swc5tvh1ccx81qo/M15.tar?dl=1", 80 | ]) 81 | 82 | /** 83 | * Uncompress brain image data. 84 | */ 85 | process extractBrainData { 86 | label "small" 87 | publishDir "${params.outdir}/brains" 88 | 89 | input: 90 | file("*.tar*") from brain_images.collect() 91 | 92 | output: 93 | file("*") into brain_images_uncompressed 94 | 95 | """ 96 | #!/usr/bin/env bash 97 | find . -name '*tar*' | while read -r path; do 98 | newpath="\${path%.*}" 99 | mv "\$path" "\$newpath" 100 | tar xf "\$newpath" 101 | rm "\$newpath" 102 | done 103 | """ 104 | } 105 | 106 | sentence_data = Channel.fromPath("https://www.dropbox.com/s/jtqnvzg3jz6dctq/stimuli_384sentences.txt?dl=1") 107 | sentence_data.into { sentence_data_for_extraction; sentence_data_for_decoder } 108 | 109 | /** 110 | * Fetch GLUE task data (except SQuAD). 111 | */ 112 | process fetchGLUEData { 113 | label "small" 114 | 115 | output: 116 | file("GLUE") into glue_data 117 | 118 | """ 119 | #!/usr/bin/env bash 120 | download_glue_data.py -d GLUE -t SST,QQP,MNLI 121 | cd GLUE && ln -s SST-2 SST 122 | """ 123 | } 124 | 125 | /** 126 | * Fetch the SQuAD dataset. 
127 | */ 128 | Channel.fromPath("https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v2.0.json").set { squad_train_ch } 129 | Channel.fromPath("https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v2.0.json").into { 130 | squad_dev_for_train_ch; squad_dev_for_eval_ch 131 | } 132 | 133 | /** 134 | * Fine-tune and evaluate the BERT model on the GLUE datasets (except SQuAD). 135 | */ 136 | process finetuneGlue { 137 | label "gpu_large" 138 | container params.bert_container 139 | publishDir "${params.outdir}/bert/${run_id_str}" 140 | tag "${run_id_str}" 141 | 142 | input: 143 | val glue_task from glue_tasks 144 | each file(glue_dir) from glue_data 145 | each run from Channel.from(1..params.finetune_runs) 146 | 147 | output: 148 | set run_id, file("model.ckpt-step*") into model_ckpt_files_glue 149 | set run_id, file("eval_results.txt"), file("eval") into model_eval_glue 150 | set run_id, file("events.out*") into model_events_glue 151 | 152 | script: 153 | run_id = [glue_task, run] 154 | run_id_str = run_id.join("-") 155 | // TODO assert that glue_task exists in glue_dir 156 | 157 | """ 158 | #!/usr/bin/env bash 159 | python /opt/bert/run_classifier.py --task_name=$glue_task \ 160 | ${finetune_cli_params} \ 161 | --data_dir=${glue_dir}/${glue_task} \ 162 | --learning_rate ${params.finetune_learning_rate} \ 163 | --max_seq_length 128 \ 164 | --train_batch_size 32 165 | 166 | # Rename model checkpoints from model.ckpt-<N>* to model.ckpt-step<N>* 167 | for f in model.ckpt*; do 168 | newname=\$(echo "\$f" | sed 's/ckpt-\\([[:digit:]]\\+\\)/ckpt-step\\1/') 169 | mv "\$f" "\$newname" 170 | done 171 | """ 172 | } 173 | 174 | /** 175 | * Fine-tune the BERT model on the SQuAD dataset. 
176 | */ 177 | process finetuneSquad { 178 | label "gpu_large" 179 | container params.bert_container 180 | publishDir "${params.outdir}/bert/${run_id_str}" 181 | tag "${run_id_str}" 182 | 183 | input: 184 | file("train.json") from squad_train_ch 185 | file("dev.json") from squad_dev_for_train_ch 186 | each run from Channel.from(1..params.finetune_runs) 187 | 188 | output: 189 | set val(run_id), file("model.ckpt-step*") into model_ckpt_files_squad 190 | set run_id, file("events.out*") into model_events_squad 191 | 192 | script: 193 | run_id = ["SQuAD", run] 194 | run_id_str = run_id.join("-") 195 | 196 | """ 197 | #!/usr/bin/env bash 198 | python /opt/bert/run_squad.py \ 199 | ${finetune_cli_params} \ 200 | --train_file=train.json \ 201 | --predict_file=dev.json \ 202 | --max_seq_length 384 \ 203 | --train_batch_size 12 \ 204 | --doc_stride 128 \ 205 | --learning_rate ${params.finetune_squad_learning_rate} \ 206 | --version_2_with_negative=True 207 | 208 | # Rename model checkpoints from model.ckpt-<N>* to model.ckpt-step<N>* 209 | for f in model.ckpt*; do 210 | newname=\$(echo "\$f" | sed 's/ckpt-\\([[:digit:]]\\+\\)/ckpt-step\\1/') 211 | mv "\$f" "\$newname" 212 | done 213 | """ 214 | } 215 | 216 | model_ckpt_files_squad.into { squad_for_eval; squad_for_extraction } 217 | 218 | /** 219 | * Run evaluation for the SQuAD fine-tuned models. 220 | */ 221 | // Group SQuAD checkpoints based on their run-step. 
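// Illustration only (the file names and step numbers here are hypothetical,
// not taken from an actual run). Given an input item of the form
//   [["SQuAD", 1], [model.ckpt-step100.index, model.ckpt-step100.meta, model.ckpt-step200.index]]
// flatten_checkpoint_channel is expected to emit one item per step, with the
// step number appended to the key and the files for that step grouped together:
//   [["SQuAD", 1, 100], [model.ckpt-step100.index, model.ckpt-step100.meta]]
//   [["SQuAD", 1, 200], [model.ckpt-step200.index]]
// The order of files within each group is not guaranteed.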
222 | squad_eval_ckpts = flatten_checkpoint_channel(squad_for_eval) 223 | 224 | process evalSquad { 225 | label "gpu_medium" 226 | container params.bert_container 227 | publishDir "${params.outdir}/eval_squad/${ckpt_id_str}" 228 | tag "${ckpt_id_str}" 229 | 230 | input: 231 | set ckpt_id, file(ckpt_files) from squad_eval_ckpts 232 | each file("dev.json") from squad_dev_for_eval_ch 233 | 234 | output: 235 | set ckpt_id, file("predictions.json"), file("null_odds.json"), file("results.json") into squad_eval_results 236 | 237 | script: 238 | ckpt_id_str = ckpt_id.join("-") 239 | ckpt_step = ckpt_id.last() 240 | 241 | """ 242 | #!/usr/bin/env bash 243 | 244 | # Output a dummy checkpoint metadata file. 245 | echo "model_checkpoint_path: \"model.ckpt-step${ckpt_step}\"" > checkpoint 246 | 247 | # Run prediction. 248 | python /opt/bert/run_squad.py --do_predict \ 249 | --vocab_file=\$BERT_MODEL/vocab.txt \ 250 | --bert_config_file=\$BERT_MODEL/bert_config.json \ 251 | --init_checkpoint=model.ckpt-step${ckpt_step} \ 252 | --predict_file=dev.json \ 253 | --doc_stride 128 --version_2_with_negative=True \ 254 | --predict_batch_size 32 \ 255 | --output_dir . 256 | 257 | # Evaluate using SQuAD tools. 258 | eval_squad.py dev.json \ 259 | predictions.json --na-prob-file null_odds.json > results.json 260 | """ 261 | } 262 | 263 | /** 264 | * Prepare a dummy model checkpoint from the pretrained BERT model. 
265 | */ 266 | process prepareBaselineCheckpoint { 267 | label "small" 268 | container params.bert_container 269 | publishDir "${params.outdir}/bert/${run_id_str}" 270 | tag "${run_id_str}" 271 | 272 | output: 273 | set run_id, file("model.ckpt-step*") into model_ckpt_files_baseline 274 | 275 | script: 276 | run_id = ["baseline", 1] 277 | run_id_str = run_id.join("-") 278 | 279 | ''' 280 | #!/usr/bin/env bash 281 | 282 | for ckpt_file in $BERT_MODEL/bert_model.ckpt*; do 283 | newname=$(basename "$ckpt_file" | sed 's/\\(.\\+\\).ckpt./model.ckpt-step0./') 284 | cp "$ckpt_file" "$newname" 285 | done 286 | ''' 287 | } 288 | 289 | 290 | // Concatenate GLUE results with SQuAD and baseline results. 291 | // Each channel item is grouped by key `(model, run)`. 292 | model_ckpt_files_glue.concat(squad_for_extraction).concat(model_ckpt_files_baseline) \ 293 | .into { model_ckpts_for_decoder; model_ckpts_for_sprobe } 294 | 295 | /* // Group model checkpoints by keys `(model, step)`. */ 296 | /* model_ckpt_files.flatMap { output -> */ 297 | /* run_id = output[0] */ 298 | /* files = output[1] */ 299 | /* files.collect { f -> */ 300 | /* tuple(tuple(ckpt_id[0], (file.name =~ /^model.ckpt-(\d+)/)[0][1]), */ 301 | /* file) } } */ 302 | /* .groupTuple() */ 303 | /* .into { model_ckpts_for_decoder; model_ckpts_for_sprobe } */ 304 | 305 | /** 306 | * Extract .jsonl sentence encodings from each fine-tuned model. 
307 | */ 308 | process extractEncoding { 309 | label "gpu_medium" 310 | container params.bert_container 311 | 312 | input: 313 | set run_id, file(ckpt_files) from model_ckpts_for_decoder 314 | each file(sentences) from sentence_data_for_extraction 315 | 316 | output: 317 | set run_id, file("encodings*.jsonl") into encodings_jsonl 318 | 319 | tag "${run_id_str}" 320 | 321 | script: 322 | run_id_str = run_id.join("-") 323 | 324 | all_ckpts = ckpt_files.target.collect(get_checkpoint_num).unique() 325 | all_ckpts_str = all_ckpts.join(" ") 326 | 327 | """ 328 | #!/usr/bin/env bash 329 | 330 | for ckpt in ${all_ckpts_str}; do 331 | python /opt/bert/extract_features.py \ 332 | --input_file=${sentences} \ 333 | --output_file=encodings-step\$ckpt.jsonl \ 334 | --vocab_file=\$BERT_MODEL/vocab.txt \ 335 | --bert_config_file=\$BERT_MODEL/bert_config.json \ 336 | --init_checkpoint=model.ckpt-step\$ckpt \ 337 | --layers="${params.extract_encoding_layers}" \ 338 | --max_seq_length=128 \ 339 | --batch_size=64 340 | done 341 | """ 342 | } 343 | 344 | // Expand jsonl encodings into individual identifier + jsonl files 345 | // (one item per task-run-step) 346 | encodings_jsonl_flat = flatten_checkpoint_channel(encodings_jsonl) 347 | 348 | /** 349 | * Convert .jsonl encodings to easier-to-use numpy arrays, saved as .npy 350 | */ 351 | process convertEncoding { 352 | label "medium" 353 | container params.bert_container 354 | tag "${ckpt_id_str}" 355 | publishDir "${params.outdir}/encodings/${ckpt_id_str}" 356 | 357 | input: 358 | set ckpt_id, file(encoding_jsonl) from encodings_jsonl_flat 359 | 360 | output: 361 | set ckpt_id, file("*.npy") into encodings 362 | 363 | script: 364 | ckpt_id_str = ckpt_id.join("-") 365 | 366 | if (params.extract_encoding_cls) { 367 | modifier_flag = "-c" 368 | } else { 369 | modifier_flag = "-l ${params.extract_encoding_layers}" 370 | } 371 | 372 | """ 373 | #!/usr/bin/env bash 374 | python /opt/bert/process_encodings.py \ 375 | -i ${encoding_jsonl} \ 376 
| ${modifier_flag} \ 377 | -o ${ckpt_id_str}.npy 378 | """ 379 | } 380 | 381 | encodings.combine(brain_images_uncompressed.flatten()).set { encodings_brains } 382 | 383 | /** 384 | * Learn regression models mapping between brain images and model encodings. 385 | */ 386 | process learnDecoder { 387 | label "medium" 388 | container params.decoding_container 389 | 390 | publishDir "${params.outdir}/decoders/${tag_str}" 391 | cpus params.decoder_n_jobs 392 | 393 | input: 394 | set ckpt_id, file(encoding), file(brain_dir) from encodings_brains 395 | each file(sentences) from sentence_data_for_decoder 396 | 397 | output: 398 | set file("decoder.csv"), file("decoder.pred.npy") 399 | 400 | tag "${tag_str}" 401 | 402 | script: 403 | ckpt_id_str = ckpt_id.join("-") 404 | tag_str = "${ckpt_id_str}-${brain_dir.name}" 405 | """ 406 | #!/usr/bin/env bash 407 | python /opt/nn-decoding/src/learn_decoder.py ${sentences} \ 408 | ${brain_dir} ${encoding} \ 409 | --n_jobs ${params.decoder_n_jobs} \ 410 | --n_folds ${params.decoder_n_folds} \ 411 | --out_prefix decoder \ 412 | --encoding_project ${params.decoder_projection} \ 413 | --image_project ${params.brain_projection} 414 | """ 415 | } 416 | 417 | sprobe_train_ch = Channel.fromPath(params.structural_probe_train_path) 418 | sprobe_dev_ch = Channel.fromPath(params.structural_probe_dev_path) 419 | 420 | /** 421 | * Extract encodings for structural probe analysis (expects hdf5 format). 
422 | */ 423 | process extractEncodingForStructuralProbe { 424 | label "gpu_medium" 425 | container params.bert_container 426 | tag "${run_id_str}" 427 | 428 | input: 429 | set run_id, file(ckpt_files) from model_ckpts_for_sprobe 430 | each file("train.txt") from sprobe_train_ch 431 | each file("dev.txt") from sprobe_dev_ch 432 | 433 | output: 434 | set run_id, file("encodings-*.hdf5") into encodings_sprobe 435 | 436 | script: 437 | run_id_str = run_id.join("-") 438 | 439 | all_ckpts = ckpt_files.collect(get_checkpoint_num).unique() 440 | all_ckpts_str = all_ckpts.join(" ") 441 | sprobe_layers = structural_probe_layers.join(",") 442 | 443 | """ 444 | #!/usr/bin/env bash 445 | for ckpt in ${all_ckpts_str}; do 446 | for split in train dev; do 447 | python /opt/bert/extract_features.py \ 448 | --input_file=\$split.txt \ 449 | --output_file=encodings-step\$ckpt-\$split.hdf5 \ 450 | --vocab_file=\$BERT_MODEL/vocab.txt \ 451 | --bert_config_file=\$BERT_MODEL/bert_config.json \ 452 | --init_checkpoint=model.ckpt-step\$ckpt \ 453 | --layers="${sprobe_layers}" \ 454 | --max_seq_length=96 \ 455 | --batch_size=64 \ 456 | --output_format=hdf5 457 | done 458 | done 459 | """ 460 | } 461 | 462 | // Expand hdf5 encodings (grouped per model run) into individual hdf5 file pairs 463 | // (train and dev), grouped by model-run-step 464 | encodings_sprobe_flat = flatten_checkpoint_channel(encodings_sprobe) 465 | 466 | // Now within each channel, order hdf5 by train / dev / etc. 
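// Illustration only (the key and file names here are hypothetical). An item such as
//   [["MNLI", 1, 5], [encodings-step5-train.hdf5, encodings-step5-dev.hdf5]]
// is first keyed by the split name captured from the file name ("train", "dev"),
// and then emitted with the train files before the dev files:
//   [["MNLI", 1, 5], [encodings-step5-train.hdf5], [encodings-step5-dev.hdf5]]
// Each split is a list, because groupBy collects all files sharing a split name.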
467 | encodings_sprobe_flat.map { 468 | el -> [el[0], el[1].groupBy { f -> (f.name =~ /-(\w+).hdf5/)[0][1] }] 469 | }.map { 470 | el -> 471 | [el[0], el[1].train, el[1].dev] 472 | }.set { encodings_sprobe_readable } 473 | 474 | sprobe_train_conll_ch = Channel.fromPath(params.structural_probe_train_conll_path) 475 | sprobe_dev_conll_ch = Channel.fromPath(params.structural_probe_dev_conll_path) 476 | sprobe_test_conll_ch = Channel.fromPath(params.structural_probe_dev_conll_path) 477 | 478 | /** 479 | * Train and evaluate structural probe for each checkpoint and each layer. 480 | */ 481 | process runStructuralProbe { 482 | label "medium" 483 | container params.structural_probes_container 484 | tag "${ckpt_id_str}" 485 | publishDir "${params.outdir}/structural-probe/${ckpt_id_str}" 486 | 487 | input: 488 | set ckpt_id, file("encodings-train.hdf5"), file("encodings-dev.hdf5") \ 489 | from encodings_sprobe_readable 490 | each file(train_conll) from sprobe_train_conll_ch 491 | each file(dev_conll) from sprobe_dev_conll_ch 492 | each layer from Channel.from(structural_probe_layers) 493 | 494 | output: 495 | set ckpt_id, file("dev.*") into sprobe_results 496 | 497 | script: 498 | ckpt_id_str = ckpt_id.join("-") 499 | 500 | // Copy YAML template 501 | spec = new Yaml().load(new Yaml().dump(structural_probe_spec)) 502 | 503 | spec.model.model_layer = layer as int 504 | 505 | spec.dataset.corpus.root = "." 506 | spec.dataset.corpus.train_path = train_conll.getName() 507 | spec.dataset.corpus.dev_path = dev_conll.getName() 508 | spec.dataset.corpus.test_path = dev_conll.getName() 509 | 510 | spec.dataset.embeddings.train_path = "encodings-train.hdf5" 511 | spec.dataset.embeddings.dev_path = "encodings-dev.hdf5" 512 | spec.dataset.embeddings.test_path = "encodings-dev.hdf5" 513 | 514 | // Prepare to save to temporary file. 
515 | yaml_spec_text = new Yaml().dump(spec) 516 | 517 | """ 518 | #!/usr/bin/env bash 519 | cat <<EOF > spec.yaml 520 | ${yaml_spec_text} 521 | EOF 522 | 523 | /opt/conda/bin/python /opt/structural-probes/structural-probes/run_experiment.py \ 524 | --train-probe 1 --results-dir . spec.yaml 525 | """ 526 | } 527 | -------------------------------------------------------------------------------- /nextflow.config: -------------------------------------------------------------------------------- 1 | /** 2 | * This configuration file specifies a default pipeline setup for running brain 3 | * decoding on a local machine with a GPU. (See the project README for minimal 4 | * computing specs to run the pipeline.) 5 | * 6 | * While this pipeline does *work* running on a local machine, we recommend 7 | * deploying on a high-performance cluster. See the file 8 | * `nextflow.slurm.config` for an example deployment configuration for SLURM 9 | * clusters. 10 | */ 11 | process { 12 | executor = "local" 13 | 14 | /** 15 | * Pipeline processes are assigned "labels" according to their 16 | * computational requirements. Here we can specify the actual effects of 17 | * each label when the pipeline runs on your system. 18 | * 19 | * This first label, `small`, describes simple tasks which can easily run 20 | * on a single CPU with minimal RAM -- e.g. downloading a file. 21 | */ 22 | withLabel: 'small' { 23 | executor = 'local' 24 | memory = '1G' 25 | } 26 | 27 | /** 28 | * `medium` describes tasks which require moderate host memory, e.g. 29 | * loading brain images and learning linear regression models. 30 | */ 31 | withLabel: 'medium' { 32 | time = '1d' 33 | memory = '8G' 34 | } 35 | 36 | /** 37 | * `gpu_medium` describes tasks which require a GPU with moderate memory 38 | * and moderate host RAM, for e.g. holding a dataset in memory and running 39 | * neural network feed-forward inference. 
40 | */ 41 | withLabel: 'gpu_medium' { 42 | containerOptions = "--nv" 43 | } 44 | 45 | /** 46 | * `gpu_large` describes tasks which require a GPU with lots of memory and 47 | * large host RAM, for e.g. holding a training dataset in memory and 48 | * running neural network training. 49 | */ 50 | withLabel: 'gpu_large' { 51 | containerOptions = "--nv" 52 | time = '1d' 53 | memory = '16G' 54 | containerOptions = "--nv" 55 | } 56 | } 57 | 58 | /** 59 | * You can limit the maximum number of parallel executing processes using 60 | * the variable below. This may be relevant when running with a single GPU, 61 | * for example. 62 | */ 63 | executor { 64 | queueSize = 1 65 | } 66 | 67 | // There should be no need to edit below this line. 68 | ////////////////////////////////////////// 69 | 70 | params.bert_container = "library://jon/default/bert:base-gpu" 71 | params.structural_probes_container = "library://jon/default/structural-probes:latest" 72 | params.decoding_container = "library://jon/default/nn-decoding:emnlp2019" 73 | 74 | singularity { 75 | enabled = true 76 | envWhitelist = "CUDA_VISIBLE_DEVICES" 77 | autoMounts = true 78 | } 79 | report.enabled = true 80 | 81 | -------------------------------------------------------------------------------- /nextflow.slurm.config: -------------------------------------------------------------------------------- 1 | /** 2 | * This configuration file specifies an example deployment of the pipeline for 3 | * a SLURM HPC cluster. This example can be repurposed to fit other HPC setups, 4 | * from PBS to Kubernetes. See the Nextflow docs for more information: 5 | * https://www.nextflow.io/docs/latest/executor.html 6 | */ 7 | process { 8 | /* Specify an HPC queue. */ 9 | /* queue = "cpl" */ 10 | 11 | /** 12 | * Pipeline processes are assigned "labels" according to their 13 | * computational requirements. Here we can specify the actual effects of 14 | * each label when the pipeline runs on your system. 
15 | * 16 | * This first label, `small`, describes simple tasks which can easily run 17 | * on a single CPU with minimal RAM -- e.g. downloading a file. 18 | */ 19 | withLabel: 'small' { 20 | executor = 'local' 21 | time = '1h' 22 | } 23 | 24 | /** 25 | * `medium` describes tasks which require moderate host memory, e.g. 26 | * loading brain images and learning linear regression models. 27 | */ 28 | withLabel: 'medium' { 29 | executor = 'slurm' 30 | 31 | time = '1d' 32 | memory = '8G' 33 | } 34 | 35 | /** 36 | * `gpu_medium` describes tasks which require a GPU with moderate memory 37 | * and moderate host RAM, for e.g. holding a dataset in memory and running 38 | * neural network feed-forward inference. 39 | */ 40 | withLabel: 'gpu_medium' { 41 | executor = 'slurm' 42 | containerOptions = "--nv" 43 | clusterOptions = "--gres=gpu:tesla-k80:1" 44 | 45 | time = '1h' 46 | memory = '8G' 47 | } 48 | 49 | /** 50 | * `gpu_large` describes tasks which require a GPU with lots of memory and 51 | * large host RAM, for e.g. holding a training dataset in memory and 52 | * running neural network training. 53 | */ 54 | withLabel: 'gpu_large' { 55 | executor = 'slurm' 56 | containerOptions = "--nv" 57 | clusterOptions = '--gres=gpu:GEFORCEGTX1080TI:1' 58 | 59 | time = '1d' 60 | memory = '8G' 61 | } 62 | } 63 | 64 | executor { 65 | $slurm { 66 | // Limit number of parallel SLURM jobs to 16. 67 | queueSize = 16 68 | } 69 | } 70 | 71 | // There should be no need to edit below this line. 
72 | ////////////////////////////////////////// 73 | 74 | params.bert_container = "library://jon/default/bert:base-gpu" 75 | params.structural_probes_container = "library://jon/default/structural-probes:latest" 76 | params.decoding_container = "library://jon/default/nn-decoding:emnlp2019" 77 | 78 | singularity { 79 | enabled = true 80 | envWhitelist = "CUDA_VISIBLE_DEVICES" 81 | autoMounts = true 82 | } 83 | report.enabled = true 84 | 85 | -------------------------------------------------------------------------------- /notebooks/pca_check.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 5, 6 | "metadata": {}, 7 | "outputs": [ 8 | { 9 | "name": "stdout", 10 | "output_type": "stream", 11 | "text": [ 12 | "The autoreload extension is already loaded. To reload it, use:\n", 13 | " %reload_ext autoreload\n" 14 | ] 15 | } 16 | ], 17 | "source": [ 18 | "from functools import partial\n", 19 | "import itertools\n", 20 | "import json\n", 21 | "from pathlib import Path\n", 22 | "import re\n", 23 | "import sys\n", 24 | "sys.path.append(\"../src\")\n", 25 | "\n", 26 | "import matplotlib\n", 27 | "import matplotlib.pyplot as plt\n", 28 | "import numpy as np\n", 29 | "import pandas as pd\n", 30 | "import seaborn as sns\n", 31 | "import scipy.io as io\n", 32 | "import scipy.stats as st\n", 33 | "from sklearn.decomposition import PCA\n", 34 | "import statsmodels.formula.api as smf\n", 35 | "from tqdm import tqdm, tqdm_notebook\n", 36 | "\n", 37 | "%matplotlib inline\n", 38 | "sns.set(style=\"whitegrid\", context=\"paper\", font_scale=3, rc={\"lines.linewidth\": 2})\n", 39 | "from IPython.display import set_matplotlib_formats\n", 40 | "set_matplotlib_formats('png')\n", 41 | "#set_matplotlib_formats('svg')\n", 42 | "\n", 43 | "%load_ext autoreload\n", 44 | "%autoreload 2\n", 45 | "import util" 46 | ] 47 | }, 48 | { 49 | "cell_type": "code", 50 | "execution_count": 2, 51 | 
"metadata": {}, 52 | "outputs": [], 53 | "source": [ 54 | "brains_path = Path(\"../data/brains\")" 55 | ] 56 | }, 57 | { 58 | "cell_type": "code", 59 | "execution_count": 3, 60 | "metadata": {}, 61 | "outputs": [], 62 | "source": [ 63 | "PCA_DIM = 256" 64 | ] 65 | }, 66 | { 67 | "cell_type": "code", 68 | "execution_count": 6, 69 | "metadata": {}, 70 | "outputs": [ 71 | { 72 | "data": { 73 | "application/vnd.jupyter.widget-view+json": { 74 | "model_id": "5e4ac43d358643e6817d00355a6890ac", 75 | "version_major": 2, 76 | "version_minor": 0 77 | }, 78 | "text/plain": [ 79 | "HBox(children=(IntProgress(value=0, max=10), HTML(value='')))" 80 | ] 81 | }, 82 | "metadata": {}, 83 | "output_type": "display_data" 84 | }, 85 | { 86 | "name": "stdout", 87 | "output_type": "stream", 88 | "text": [ 89 | "\n" 90 | ] 91 | } 92 | ], 93 | "source": [ 94 | "pca_results = []\n", 95 | "for brain_el in tqdm_notebook(list(brains_path.iterdir())):\n", 96 | " if not brain_el.is_dir(): continue\n", 97 | " \n", 98 | " images = io.loadmat(brain_el / \"examples_384sentences.mat\")[\"examples\"]\n", 99 | " pca = PCA(PCA_DIM).fit(images)\n", 100 | " \n", 101 | " subject_name = brain_el.name\n", 102 | " pca_results.append((subject_name, sum(pca.explained_variance_ratio_)))" 103 | ] 104 | }, 105 | { 106 | "cell_type": "code", 107 | "execution_count": 7, 108 | "metadata": {}, 109 | "outputs": [ 110 | { 111 | "data": { 112 | "text/html": [ 113 | "


" 179 | ], 180 | "text/plain": [ 181 | " subject explained_variance\n", 182 | "0 M02 0.963260\n", 183 | "1 M04 0.970155\n", 184 | "2 M07 0.965320\n", 185 | "3 M08 0.978509\n", 186 | "4 M09 0.974115\n", 187 | "5 M14 0.982325\n", 188 | "6 M15 0.975107\n", 189 | "7 P01 0.950888" 190 | ] 191 | }, 192 | "execution_count": 7, 193 | "metadata": {}, 194 | "output_type": "execute_result" 195 | } 196 | ], 197 | "source": [ 198 | "pca_results = pd.DataFrame(pca_results, columns=[\"subject\", \"explained_variance\"])\n", 199 | "pca_results" 200 | ] 201 | }, 202 | { 203 | "cell_type": "code", 204 | "execution_count": 9, 205 | "metadata": {}, 206 | "outputs": [ 207 | { 208 | "data": { 209 | "text/plain": [ 210 | "" 211 | ] 212 | }, 213 | "execution_count": 9, 214 | "metadata": {}, 215 | "output_type": "execute_result" 216 | }, 217 | { 218 | "data": { 219 | "image/png": "iVBORw0KGgoAAAANSUhEUgAAAbUAAAFJCAYAAAAc+rO/AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAIABJREFUeJzt3XlYVGX/P/D3GTZlVURBU5BSUUnN3M3KfckNyC3NJTPF3efxMTW11PpmPk+LpmmKlkuuKWCuuWGCoSguIOAOLpgji4AsygDn9we/Oc3AzDCcGRbp/bourmuYc59zfxgGPnPf514EURRFEBERVQGKig6AiIjIXJjUiIioymBSIyKiKoNJjYiIqgwmNSIiqjKY1IiIqMpgUiMioiqDSY2IiKoMJjUiIqoymNSIiKjKYFIjIqIqw9IcF7l06RJ2796Nixcv4vHjx3j+/Dl+++03NGrUSCpz8eJF3LlzB3Z2dujXr585qiUiItJiUlJTqVRYvHgxAgMDAQDqtZEFQShW9vnz51i4cCEUCgVatGiB+vXrm1I1ERFRMSZ1Py5atAiBgYEQRREuLi7o3bu33rKdOnWCh4cHRFHE8ePHTamWiIhIJ9lJLSIiAsHBwQCAcePGISQkBN9//73Bc3r16gVRFBERESG3WiIiIr1kdz/u3r0bANC+fXvMmzfPqHNatGgBALh9+7bcaomIiPSS3VK7dOkSBEHA8OHDjT7Hzc0NAJCUlCS3WiIiIr1kJ7Xk5GQAQMOGDY0+x9raGgCQm5srt1oiIiK9ZCc1S8vCnsvMzEyjz1EnQgcHB7nVEhER6SU7qdWuXRsAcP/+faPPOX/+PABwOD8REZUJ2UmtXbt2EEVRGgFZktTUVOzevRuCIKB9+/ZyqyUiItJLdlLz8/MDAFy4cAGHDh0yWDY1NRVTpkxBWloaFAoFhgwZIrdaIiIivWQP6W/dujUGDBiAAwcOYM6cOTh37hz69+8vHVcqlUhKSkJoaCj27NmDp0+fQhAEjBgxAp6enmYJnoiISJMgqte2kuH58+eYNGkSzp49q3NpLDV1FW+++SbWrl0rDTIhIiIyJ5OSGgAUFBQgICAAP//8M9LS0nSWsbe
3x/jx4+Hv7w+FghsDEBFR2TA5qak9e/YM58+fR1RUFFJTU5GXlwdnZ2d4e3ujU6dOsLOzM0c1REREepktqREREVU09gUSEVGVwaRGRERVhuykdvPmTbRv3x6dOnXCo0ePSiz/6NEjdOzYER06dMC9e/fkVktERKSX7KT2+++/IyMjAy1atJBW3zfEzc0NLVu2REZGBg4fPiy3WiIiIr1kJ7U///wTgiDg7bffNvqcbt26QRRFhIWFya2WiIhIL9lJTd3l6OXlZfQ5jRs3BgA8fPhQbrVERER6yU5q6o0+7e3tjT5HXVa9BQ0REZE5yV6vytbWFhkZGXjy5InR56jLWllZyapTFEXcuXMHUVFR0tf169ehUqkAACdOnDDbtjbXr1/H5s2bER4ejuTkZDg5OcHb2xsjRoxAt27dzFIHERGZl+ykVq9ePWRkZCAyMhKdOnUy6pwLFy4AAOrWrSurzsTERLzzzjuyzi2NoKAgLFq0SEqWQGHL9NSpUzh16hTee+89LF68uMzjICKi0pHd/di+fXuIoojt27cjPT29xPJPnjzB9u3bzbafmqurK3r16oW2bduafC1NkZGRWLhwIVQqFZo0aYKNGzciPDwcgYGB6NmzJwBgx44dCAgIMGu9RERkOtnLZN2+fRsDBw6EKIpo0aIFVq1aBVdXV51llUolpk2bhujoaCgUCgQFBZVqgIlaZmYmzp49i1atWkk7b69atQqrV68GYJ7ux6FDhyIqKgouLi44cOAAatasKR0TRREffvghzpw5A1tbW5w4cQLOzs5GXzsyMtKk2IiI/onatGljdFnZ3Y+vvPIKRo4ciV9++QXR0dHo27cv+vXrhw4dOqBOnToQBAFKpRLnzp3D4cOH8ezZM2k/NTkJDSgcaKJuLZWF6OhoREVFAQAmTJigldAAQBAEzJ49G2fOnEF2djb27duHDz74oFR1lOaXQ0T0T1faxoBJG5vNnz8fSUlJ+P3335GTk4OgoCAEBQUVK6duDPbt2xcLFiwwpcoyFRISIj3u16+fzjLe3t5wd3fHvXv3cPLkyVInNSIiKjsmrf1oYWGBlStXYunSpXjppZcgiqLOrwYNGuCLL77AihUrYGFhYa7YzS4mJgZA4f06Q6uktGrVSqs8ERFVDmbZgnrYsGEYNmwYrl27hpiYGKSmpgIAnJ2d8eqrr8rubixv8fHxAIAGDRoYLKe+b5eVlQWlUqn3XiIREZUvsyQ1taZNm6Jp06bmvGS5Us+jq1WrlsFymsfT0tKY1IiIKgluPaMhJycHAGBtbW2wXLVq1aTH2dnZZRoTEREZz6wttapCEASDx03ZLDwuLk72uUREZJhZkpooirh16xbu37+PzMxMFBQUlHiOj4+POao2q+rVq0OlUuH58+cGy2ket7W1LVUdzZo1kxUbEdE/UbkO6c/Ly8P69euxfft2pKSkGH2eIAiVMqnVrFkTGRkZJf4smsdr1KhR1mEREZGRZCe1vLw8TJw4EeHh4SZ1x1Umnp6euHv3Lu7fv2+w3IMHDwAAdnZ2HCRCVIFysrOQn6cquWAZsrC0QnVbuwqNgf4mO6nt2LEDf/75J4DCFo6fnx9atmwJJycnKBQv5vgTb29vnDp1Ckql0uBQ/StXrkjliUojNzMTBaqK/ScMAAorK1iXYtuoyio/T4X1y+ZWaAwT5y+v0PpJm+yktn//fgCAh4cHduzYUao1ECurbt264YcffgAAHD58GOPGjStWJjY2Fvfu3QMAdO/evTzDoyqgQKXCyZn/rugw0H3ltwaP52Q/R15eyffGy5qlpQLVbW0qOgyT5T1TQcyv2NdTsFDAspq8bb9eJLKT2u3btyEIAiZPnlwlEhoAtGjRAi1btkRUVBQ2bNgAHx8frXtmoijim2++AVA4QGTw4MEVFWq5ysvJREFeXoXGoLC0hGX1F79l8aLIyyvAquUHKzoMTJ/bv6JDMAsxvwCxm8MrNIbmY43bIuxFJzupqUc4Nmn
SxGzBGOPWrVvIzMyUvn/06JH0OC4uTmtXbXd3d62EGxgYiPnz5wMAli1bBj8/v2LXnzdvHsaMGYOkpCSMHj0a8+bNQ7NmzaBUKrFmzRqEhYUBAKZMmWJyMs/Mfo5cVb5J1zCVtZUF7Ev4JFyQl4fotf8pp4h0azH5a4PHM59lQVVQsYkXAKwUlrCvxvsrRBXFpE1C79y5U+6Tj5csWYKIiAidx6ZNm6b1vb7EZUibNm3wxRdfYNGiRbhx4wbGjx9frMyIESPw0Ucfleq6uuSq8jHt/4ovAF2eVi/wrdD6zUVVkId/7fq0osPAd8OXVnQIRP9oskd09OjRAwBw/vx5swVTWfj6+mLv3r3w8/NDvXr1YGVlBRcXF7z99tv48ccfsWTJkooOkYiIdJDdUhs3bhwCAwOxefNm+Pr6GlzV3py2bt0q+1w/Pz+jW25eXl5YtmyZ7LqIiF40OTk5yM+v2FsiFhYWqF69uuzzZSc1Z2dnrFmzBpMnT8bIkSOxaNEidOvWTXYgRERUsfLz87Fu3boKjWHSpEkmnS87qY0ZMwYA4ODggISEBEyZMgUODg5o2LCh1oK/ugiCgM2bN8utmoiISCfZSS0iIkJa+FcQBIiiiIyMDERHRxs8TxTFEhcMJiIiksOk0Y9ERESVieykdvLkSXPGQUREZLIXc5FGIiIiHZjUiIioymBSIyKiKoNJjYiIqgyTdr5Wi4yMxNGjRxEbG4u0tDTk5OQY3DhUEAQcP37cHFUTERFJTEpqGRkZmDNnDk6fPg0AOhOZeg5b0eeIiIjMzaStZ6ZOnYoLFy5AFEXUrFkTbm5uiIuLgyAIaNOmDdLT0xEfH4+8vDwIggBPT0/UqlXLnPETERFJZCe1I0eO4Pz58xAEAVOnTsXUqVNx69YtDBo0CADwyy+/AACysrKwa9cufP/990hPT8eyZcvQqlUr80RPRESkQfZAkUOHDgEo3C16+vTpUCgUOrsV7ezsMH78eGzYsAHp6emYNm0aUlNT5UdMRESkh+ykdvXqVQiCgKFDhxpVvm3bthgyZAiSkpKwbds2udUSERHpJTupPXnyBADQoEGDvy+m+Ptyubm5xc5RbyzKJbaIiKgsyE5q6hGNNWrUkJ6zt7eXHqekpBQ7x9nZGQDw8OFDudUSERHpJTupqUcxZmRkSM85OzvD0rJw7MnNmzeLnaNUKgEA2dnZcqslIiLSS3ZSa9y4MQAgPj5ees7KygqNGjUCABw+fLjYOcHBwQAAV1dXudUSERHpJTupde7cGaIoIiIiQuv53r17QxRFBAcH47vvvsONGzcQFRWFTz/9FEePHoUgCOjcubPJgRMRERUlO6l169YNAHDq1ClkZWVJz48ZMwa1a9cGAKxfvx6DBw/G8OHD8euvvwIAbG1t8dFHH5kSMxERkU6yk5qHhwdWr16NL7/8Es+ePZOet7e3x8aNG+Hh4QFRFLW+ateujbVr12qNmCQiIjIXk9Z+7Nmzp87nmzRpggMHDuDcuXO4ceMG8vLy4OnpiTfffBM2NjamVElERKSXWVbp13lhS0u88cYbeOONN8qqCiIiIi3cT42IiKoMJjUiIqoymNSIiKjKKPGeWrNmzQAUbuwZGxtb7Hk5il6LiIjIHEpMarp2szb0PBERUUUpMan5+vqW6nkiIqKKUmJSW7ZsWameJyIiqigcKEJERFWG7MnX6pZau3bt9K4sQkREVJ5kJ7XNmzdDEAR07NjRnPEQERHJJrv7Ub3jtZubm9mCISIiMoXspObu7g4ASElJMVswREREppCd1Hr16gVRFHHs2DFzxkNERCSb7KT2/vvvw8PDA3v37sWpU6fMGBIREZE8spNa9erVsXHjRjRq1AhTpkzB/PnzER4ejrS0NK42QkREFUL26EfNtR9FUURwcDCCg4ONOpdrPxIRUVmQndSKtsbYOiMiooomO6lx7UciIqpsTF5RhIiIqLLg2o9ERFRlMKkREVGVwaRGRERVhux7arr
k5+cjIyMDz549K3E0ZL169cxZNRERkelJLSMjA9u2bcPRo0dx8+ZN5Ofnl3gO56kREVFZMCmpXbt2Df7+/lAqlZynRkREFU52UsvKyoK/vz8ePXoEhUKBHj16wNnZGbt374YgCJg8eTIyMjIQHR2NK1euQBAEtG7dGp07dzZn/ERERBLZSW337t149OgRLCwssGHDBnTq1Ak3b97E7t27AQAzZsyQykZHR2POnDm4cuUKfHx8MGzYMNMjJyIiKkL26MdTp05BEAT07NkTnTp1Mli2RYsW2Lx5MxwdHfH555/j5s2bcqslIiLSS3ZSu3XrFgCgT58+Oo8Xvcfm6uqK0aNHQ6VSYdu2bXKrJSIi0kt2UktPTwegPTTfyspKepyTk1PsnPbt2wMAwsPD5VZLRESkl+ykpk5gmonMzs5Oevz48eNi51hbW+s9RkREZCrZSc3NzQ0AkJqaKj3n4uKC6tWrAwBiYmKKnZOQkCC3OiIiohLJTmrqTUI1B30IgoAWLVpAFEXs2LFDq3xubi42bdoEAHB3d5dbLRERkV6yk9pbb70FURRx5swZred9fHwAAJGRkRg5ciR++eUXBAQEYOjQoYiNjYUgCOjVq5dpURMREekge57aW2+9BUEQcO7cOfz111+oW7cugMLNQ/fs2YOLFy/i0qVLuHTpktZ5Hh4e+OCDD0yLmoiISAfZLTVnZ2ecPXsWoaGhcHFxkZ4XBAHr16+Hn58fLC0tIYoiRFGEIAjo0aMHtm7dqjWghIiIyFxMWvvRyclJ5/P29vb48ssvsWDBAiQkJCA/Px/u7u6oUaOGKdUREREZZNatZ4qys7ODt7d3WVZBREQkkd39qGtyNRERUUWSndTeeOMNzJ8/H2fPnjVnPERERLLJ7n7Mzs5GcHAwgoODUbduXQwaNAiDBw+Gp6enOeMjIiIymuyWWrt27QAULlz88OFDrFu3Du+88w6GDx+OnTt3IiMjw2xBEhERGUN2Utu6dStOnjyJmTNnwtPTUxq6HxUVhSVLlqBLly6YMWMGTp48ifz8fHPGTEREpJPspAYAdevWxeTJk3H48GHs3r0bI0eOhJOTE0RRRG5uLo4dO4apU6fizTffxJdffonY2FhzxU1ERFSMSUlNU8uWLfHpp58iLCwMq1evRq9evaTJ16mpqdi6dSveffddDBw4EBs3bjRXtURERBKzJTU1S0tL9OzZE6tWrUJYWBgWLVqEli1bSt2TN2/exNdff23uaomIiMyf1DQ5OTlh1KhR2L17N77//ns4OjqWZXVERPQPV6YriqSmpmL//v3Yt28f4uLiyrIqIiIi8ye13NxcHD9+HL/99hvCwsKkkY+iKAIAWrRoIW1PQ0REZE5mS2oXLlzAvn37cOTIEWRmZgL4O5G5ublh0KBB8PHxwcsvv2yuKomIiLSYlNTu37+P4OBg/Pbbb3jw4AGAvxNZ9erV0atXL/j6+qJjx44QBMH0aImIiAyQndTee+89XL58GcDfiUwQBHTo0AE+Pj7o06cPbG1tzRMlERGREWQnNc0drRs2bAgfHx8MHjxY2gGbiIiovMlOao6OjnjnnXfg6+uLVq1amTMmIiIiWWQntbCwMFhbW5stkJycHFy9ehXA34slExERlYbspGbOhAYADx48wOjRo6FQKLhGJBERyVKmK4rIoR50QkREVFqVLqkRERHJxaRGRERVBpMaERFVGUxqRERUZZTpKv1lKSQkBDt37kRMTAzS09Ph4uKCTp06YezYsfDy8pJ93e7duyMxMbHEcitXrkTfvn1l10NEROb3QrbUPvvsM/j7++PUqVNISkpCbm4uHj58iL1792LIkCEIDg6u6BCJiKgCvHAttYCAAOzcuRMA0LNnT0yZMgV169ZFbGwsli9fjhs3bmDBggVo0KAB2rRpI7ueSZMmYdKkSXqPV6tWTfa1iYiobLxQSS01NRVr1qwBAHTp0gWrV6+WVv/v0qULvL29MWDAACQnJ2P58uX
", 220 | "text/plain": [ 221 | "

" 222 | ] 223 | }, 224 | "metadata": {}, 225 | "output_type": "display_data" 226 | } 227 | ], 228 | "source": [ 229 | "sns.barplot(data=pca_results.sort_values(\"subject\"), x=\"subject\", y=\"explained_variance\")" 230 | ] 231 | } 232 | ], 233 | "metadata": { 234 | "kernelspec": { 235 | "display_name": "Python 3", 236 | "language": "python", 237 | "name": "python3" 238 | }, 239 | "language_info": { 240 | "codemirror_mode": { 241 | "name": "ipython", 242 | "version": 3 243 | }, 244 | "file_extension": ".py", 245 | "mimetype": "text/x-python", 246 | "name": "python", 247 | "nbconvert_exporter": "python", 248 | "pygments_lexer": "ipython3", 249 | "version": "3.6.8" 250 | } 251 | }, 252 | "nbformat": 4, 253 | "nbformat_minor": 2 254 | } 255 | -------------------------------------------------------------------------------- /notebooks/quantitative_dynamic.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "from copy import copy\n", 10 | "from functools import partial\n", 11 | "import itertools\n", 12 | "import json\n", 13 | "from pathlib import Path\n", 14 | "import re\n", 15 | "import sys\n", 16 | "sys.path.append(\"../src\")\n", 17 | "\n", 18 | "import matplotlib\n", 19 | "import matplotlib.pyplot as plt\n", 20 | "import numpy as np\n", 21 | "import pandas as pd\n", 22 | "import seaborn as sns\n", 23 | "import scipy.stats as st\n", 24 | "import statsmodels.formula.api as smf\n", 25 | "from tqdm import tqdm, tqdm_notebook\n", 26 | "\n", 27 | "%matplotlib inline\n", 28 | "sns.set(style=\"whitegrid\", context=\"paper\", font_scale=3.5, rc={\"lines.linewidth\": 2.5})\n", 29 | "from IPython.display import set_matplotlib_formats\n", 30 | "set_matplotlib_formats('png')\n", 31 | "#set_matplotlib_formats('svg')\n", 32 | "\n", 33 | "%load_ext autoreload\n", 34 | "%autoreload 2\n", 35 | "import util" 36 | 
] 37 | }, 38 | { 39 | "cell_type": "markdown", 40 | "metadata": {}, 41 | "source": [ 42 | "## Data preparation" 43 | ] 44 | }, 45 | { 46 | "cell_type": "code", 47 | "execution_count": null, 48 | "metadata": {}, 49 | "outputs": [], 50 | "source": [ 51 | "output_path = Path(\"../output\")\n", 52 | "decoder_path = output_path / \"decoders\"\n", 53 | "bert_encoding_path = output_path / \"encodings\"\n", 54 | "model_path = output_path / \"bert\"" 55 | ] 56 | }, 57 | { 58 | "cell_type": "code", 59 | "execution_count": null, 60 | "metadata": {}, 61 | "outputs": [], 62 | "source": [ 63 | "checkpoints = [util.get_encoding_ckpt_id(dir_entry) for dir_entry in bert_encoding_path.iterdir()]" 64 | ] 65 | }, 66 | { 67 | "cell_type": "code", 68 | "execution_count": null, 69 | "metadata": {}, 70 | "outputs": [], 71 | "source": [ 72 | "models = [model for model, _, _ in checkpoints]\n", 73 | "\n", 74 | "baseline_model = \"baseline\"\n", 75 | "if baseline_model not in models:\n", 76 | " raise ValueError(\"Missing baseline model. This is necessary to compute performance deltas in the analysis of fine-tuning models. 
Stop.\")\n", 77 | "\n", 78 | "standard_models = [model for model in models if not model.startswith(\"LM_\") and not model == baseline_model]\n", 79 | "custom_models = [model for model in models if model.startswith(\"LM_\") and not model == baseline_model]\n", 80 | "\n", 81 | "runs = sorted(set(run for _, run, _ in checkpoints))\n", 82 | "checkpoint_steps = sorted(set(step for _, _, step in checkpoints))\n", 83 | "\n", 84 | "# Models which should appear in the final report figures\n", 85 | "report_models = [\"SQuAD\", \"QQP\", \"MNLI\", \"SST\", \"LM\", \"LM_scrambled\", \"LM_scrambled_para\", \"LM_pos\", \"glove\"]\n", 86 | "\n", 87 | "# Model subsets to render in different report figures\n", 88 | "report_model_sets = [\n", 89 | " (\"all\", set(report_models)),\n", 90 | " (\"standard\", set(report_models) & set(standard_models)),\n", 91 | " (\"custom\", set(report_models) & set(custom_models)),\n", 92 | "]\n", 93 | "report_model_sets = [(name, model_set) for name, model_set in report_model_sets\n", 94 | " if len(model_set) > 0]" 95 | ] 96 | }, 97 | { 98 | "cell_type": "code", 99 | "execution_count": null, 100 | "metadata": {}, 101 | "outputs": [], 102 | "source": [ 103 | "RENDER_FINAL = True\n", 104 | "figure_path = Path(\"../reports/figures\")\n", 105 | "figure_path.mkdir(exist_ok=True, parents=True)\n", 106 | "\n", 107 | "report_hues = dict(zip(sorted(report_models), sns.color_palette()))" 108 | ] 109 | }, 110 | { 111 | "cell_type": "markdown", 112 | "metadata": {}, 113 | "source": [ 114 | "### Decoder performance metrics" 115 | ] 116 | }, 117 | { 118 | "cell_type": "code", 119 | "execution_count": null, 120 | "metadata": {}, 121 | "outputs": [], 122 | "source": [ 123 | "# Load decoder performance data.\n", 124 | "decoding_perfs = util.load_decoding_perfs(decoder_path)" 125 | ] 126 | }, 127 | { 128 | "cell_type": "code", 129 | "execution_count": null, 130 | "metadata": {}, 131 | "outputs": [], 132 | "source": [ 133 | "# Save perf data.\n", 134 | 
"decoding_perfs.to_csv(output_path / \"decoder_perfs.csv\")" 135 | ] 136 | }, 137 | { 138 | "cell_type": "code", 139 | "execution_count": null, 140 | "metadata": {}, 141 | "outputs": [], 142 | "source": [ 143 | "# # Load comparison model data.\n", 144 | "# for other_model in other_models:\n", 145 | "# other_perf_paths = list(Path(\"../models/decoders\").glob(\"encodings.%s-*.csv\" % other_model))\n", 146 | "# for other_perf_path in tqdm_notebook(other_perf_paths, desc=other_model):\n", 147 | "# subject, = re.findall(r\"-([\\w\\d]+)\\.csv$\", other_perf_path.name)\n", 148 | "# perf = pd.read_csv(other_perf_path,\n", 149 | "# usecols=[\"mse\", \"r2\", \"rank_median\", \"rank_mean\", \"rank_min\", \"rank_max\"])\n", 150 | "# decoding_perfs.loc[other_model, 1, 250, subject] = perf.iloc[0]" 151 | ] 152 | }, 153 | { 154 | "cell_type": "markdown", 155 | "metadata": {}, 156 | "source": [ 157 | "### Model performance metrics" 158 | ] 159 | }, 160 | { 161 | "cell_type": "code", 162 | "execution_count": null, 163 | "metadata": {}, 164 | "outputs": [], 165 | "source": [ 166 | "# For each model, load checkpoint data: global step, gradient norm information\n", 167 | "model_metadata = {}\n", 168 | "for model, run, step in tqdm_notebook(checkpoints): \n", 169 | " run_dir = model_path / (\"%s-%i\" % (model, run))\n", 170 | " \n", 171 | " # Fetch corresponding fine-tuning metadata.\n", 172 | " ckpt_path = run_dir / (\"model.ckpt-step%i\" % step)\n", 173 | "\n", 174 | " try:\n", 175 | " metadata = util.load_bert_finetune_metadata(run_dir, step)\n", 176 | " except Exception as e:\n", 177 | " pass\n", 178 | " else:\n", 179 | " if metadata[\"steps\"]:\n", 180 | " model_metadata[model, run] = pd.DataFrame.from_dict(metadata[\"steps\"], orient=\"index\")\n", 181 | " \n", 182 | " # SQuAD eval results need to be loaded separately, since they run offline.\n", 183 | " if model == \"SQuAD\":\n", 184 | " pred_dir = output_path / \"eval_squad\" / (\"SQuAD-%i-%i\" % (run, step))\n", 185 | " 
try:\n", 186 | " with (pred_dir / \"results.json\").open(\"r\") as results_f:\n", 187 | " results = json.load(results_f)\n", 188 | " model_metadata[model, run].loc[step, \"eval_accuracy\"] = results[\"best_f1\"] / 100.\n", 189 | " except Exception:\n", 190 | " print(\"Failed to retrieve eval data for SQuAD-%i-%i\" % (run, step))\n", 191 | "\n", 192 | "model_metadata = pd.concat(model_metadata, names=[\"model\", \"run\", \"step\"], sort=True)" 193 | ] 194 | }, 195 | { 196 | "cell_type": "markdown", 197 | "metadata": {}, 198 | "source": [ 199 | "### Putting it all together" 200 | ] 201 | }, 202 | { 203 | "cell_type": "code", 204 | "execution_count": null, 205 | "metadata": {}, 206 | "outputs": [], 207 | "source": [ 208 | "# Join decoding data, post-hoc rank evaluation data, and model training metadata into a single df.\n", 209 | "old_index = decoding_perfs.index\n", 210 | "df = decoding_perfs.reset_index().join(model_metadata, on=[\"model\", \"run\", \"step\"]).set_index(old_index.names)\n", 211 | "df.head()" 212 | ] 213 | }, 214 | { 215 | "cell_type": "markdown", 216 | "metadata": {}, 217 | "source": [ 218 | "-----------" 219 | ] 220 | }, 221 | { 222 | "cell_type": "code", 223 | "execution_count": null, 224 | "metadata": {}, 225 | "outputs": [], 226 | "source": [ 227 | "all_subjects = df.index.get_level_values(\"subject\").unique()\n", 228 | "all_subjects" 229 | ] 230 | }, 231 | { 232 | "cell_type": "code", 233 | "execution_count": null, 234 | "metadata": {}, 235 | "outputs": [], 236 | "source": [ 237 | "try:\n", 238 | " subjects_with_baseline = set(decoding_perfs.loc[baseline_model, :, :].index.get_level_values(\"subject\"))\n", 239 | "except KeyError:\n", 240 | " subjects_with_baseline = set()\n", 241 | " \n", 242 | "if not subjects_with_baseline == set(all_subjects): \n", 243 | " raise ValueError(\"Cannot proceed. 
Missing base decoder evaluation for subjects: \" + str(set(all_subjects) - subjects_with_baseline))" 244 | ] 245 | }, 246 | { 247 | "cell_type": "markdown", 248 | "metadata": {}, 249 | "source": [ 250 | "### Synthetic columns" 251 | ] 252 | }, 253 | { 254 | "cell_type": "code", 255 | "execution_count": null, 256 | "metadata": {}, 257 | "outputs": [], 258 | "source": [ 259 | "df[\"eval_accuracy_delta\"] = df.groupby([\"model\", \"run\"]).eval_accuracy.transform(lambda xs: xs - xs.iloc[0])\n", 260 | "df[\"eval_accuracy_norm\"] = df.groupby([\"model\", \"run\"]).eval_accuracy.transform(lambda accs: (accs - accs.min()) / (accs.max() - accs.min()))" 261 | ] 262 | }, 263 | { 264 | "cell_type": "code", 265 | "execution_count": null, 266 | "metadata": {}, 267 | "outputs": [], 268 | "source": [ 269 | "def decoding_perf_delta(xs, metric=\"mse\"):\n", 270 | " subject = xs.index[0][3]\n", 271 | " base_metric = df.loc[baseline_model, 1, 0, subject][metric]\n", 272 | " return xs - base_metric.item()\n", 273 | "\n", 274 | "df[\"decoding_mse_delta\"] = df.groupby([\"model\", \"run\", \"subject\"]).mse.transform(partial(decoding_perf_delta, metric=\"mse\"))\n", 275 | "df[\"rank_mean_delta\"] = df.groupby([\"model\", \"run\", \"subject\"]).rank_mean.transform(partial(decoding_perf_delta, metric=\"rank_mean\"))\n", 276 | "df[\"rank_median_delta\"] = df.groupby([\"model\", \"run\", \"subject\"]).rank_median.transform(partial(decoding_perf_delta, metric=\"rank_median\"))" 277 | ] 278 | }, 279 | { 280 | "cell_type": "code", 281 | "execution_count": null, 282 | "metadata": {}, 283 | "outputs": [], 284 | "source": [ 285 | "NUM_BINS = 50\n", 286 | "def bin(xs):\n", 287 | " if xs.isnull().values.any(): return np.nan\n", 288 | " return pd.cut(xs, np.linspace(xs.min(), xs.max() + 1e-5, NUM_BINS), labels=False)\n", 289 | "df[\"eval_accuracy_bin\"] = df.groupby([\"model\"]).eval_accuracy.transform(bin)\n", 290 | "df[\"decoding_mse_bin\"] = 
df.groupby([\"subject\"]).decoding_mse_delta.transform(bin)\n", 291 | "df[\"total_global_norms_bin\"] = df.groupby([\"model\"]).total_global_norms.transform(bin)" 292 | ] 293 | }, 294 | { 295 | "cell_type": "code", 296 | "execution_count": null, 297 | "metadata": {}, 298 | "outputs": [], 299 | "source": [ 300 | "ROLLING_WINDOW_SIZE = 5\n", 301 | "grouped = df.groupby([\"model\", \"run\", \"subject\"])\n", 302 | "for col in [\"mse\", \"decoding_mse_delta\", \"eval_accuracy\", \"train_loss\", \"rank_mean\", \"rank_mean_delta\"]:\n", 303 | " df[\"%s_rolling\" % col] = grouped[col].transform(lambda rows: rows.rolling(ROLLING_WINDOW_SIZE, min_periods=1).mean())" 304 | ] 305 | }, 306 | { 307 | "cell_type": "code", 308 | "execution_count": null, 309 | "metadata": {}, 310 | "outputs": [], 311 | "source": [ 312 | "df.tail()" 313 | ] 314 | }, 315 | { 316 | "cell_type": "code", 317 | "execution_count": null, 318 | "metadata": {}, 319 | "outputs": [], 320 | "source": [ 321 | "df.head()" 322 | ] 323 | }, 324 | { 325 | "cell_type": "code", 326 | "execution_count": null, 327 | "metadata": {}, 328 | "outputs": [], 329 | "source": [ 330 | "dfi = df.reset_index()" 331 | ] 332 | }, 333 | { 334 | "cell_type": "markdown", 335 | "metadata": {}, 336 | "source": [ 337 | "## Model training analysis\n", 338 | "\n", 339 | "Let's verify that each model is not overfitting; if it is overfitting, restrict our analysis to just the region before overfitting begins." 
340 | ] 341 | }, 342 | { 343 | "cell_type": "code", 344 | "execution_count": null, 345 | "metadata": {}, 346 | "outputs": [], 347 | "source": [ 348 | "# g = sns.FacetGrid(df.reset_index().melt(id_vars=[\"model\", \"run\", \"step\"],\n", 349 | "# value_vars=[\"train_loss_rolling\", \"eval_accuracy_rolling\"]),\n", 350 | "# row=\"variable\", col=\"model\", sharex=True, sharey=False, height=4)\n", 351 | "# g.map(sns.lineplot, \"step\", \"value\", \"run\", ci=None)\n", 352 | "# g.add_legend()" 353 | ] 354 | }, 355 | { 356 | "cell_type": "code", 357 | "execution_count": null, 358 | "metadata": {}, 359 | "outputs": [], 360 | "source": [ 361 | "%matplotlib agg\n", 362 | "\n", 363 | "if RENDER_FINAL:\n", 364 | " # models which appear on left edge of subfigs in paper\n", 365 | " LEFT_EDGE_MODELS = [\"QQP\", \"LM\"]\n", 366 | " \n", 367 | " training_fig_path = figure_path / \"training\"\n", 368 | " training_fig_path.mkdir(exist_ok=True)\n", 369 | " shared_kwargs = {\"legend\": False, \"ci\": None}\n", 370 | "\n", 371 | " for model in tqdm_notebook(report_models):\n", 372 | " f, (loss_fig, acc_fig) = plt.subplots(2, 1, figsize=(10,15), sharex=True)\n", 373 | " try:\n", 374 | " local_data = df.loc[model].reset_index()\n", 375 | " except KeyError:\n", 376 | " print(f\"Missing training data for {model}\")\n", 377 | " continue\n", 378 | " \n", 379 | " ax = sns.lineplot(data=local_data, x=\"step\", y=\"train_loss_rolling\", hue=\"run\", ax=loss_fig, **shared_kwargs)\n", 380 | " ax.set_ylabel(\"Training loss\\n(rolling window)\" if model in LEFT_EDGE_MODELS else \"\")\n", 381 | " ax.set_xlabel(\"Training step\")\n", 382 | " \n", 383 | " ax = sns.lineplot(data=local_data, x=\"step\", y=\"eval_accuracy_rolling\", hue=\"run\", ax=acc_fig, **shared_kwargs)\n", 384 | " ax.set_ylabel(\"Validation set accuracy\\n(rolling window)\" if model in LEFT_EDGE_MODELS else \"\")\n", 385 | " ax.set_xlabel(\"Training step\")\n", 386 | " \n", 387 | " sns.despine()\n", 388 | " \n", 389 | " 
plt.tight_layout()\n", 390 | " plt.savefig(training_fig_path / (\"%s.pdf\" % model))\n", 391 | " plt.close()\n", 392 | "%matplotlib inline" 393 | ] 394 | }, 395 | { 396 | "cell_type": "markdown", 397 | "metadata": {}, 398 | "source": [ 399 | "## Decoding analyses" 400 | ] 401 | }, 402 | { 403 | "cell_type": "code", 404 | "execution_count": null, 405 | "metadata": {}, 406 | "outputs": [], 407 | "source": [ 408 | "MSE_DELTA_LABEL = r\"$\\Delta$(MSE)\"\n", 409 | "MAR_DELTA_LABEL = r\"$\\Delta$(MAR)\"" 410 | ] 411 | }, 412 | { 413 | "cell_type": "markdown", 414 | "metadata": {}, 415 | "source": [ 416 | "### Final state analysis" 417 | ] 418 | }, 419 | { 420 | "cell_type": "code", 421 | "execution_count": null, 422 | "metadata": {}, 423 | "outputs": [], 424 | "source": [ 425 | "%matplotlib agg\n", 426 | "\n", 427 | "if RENDER_FINAL:\n", 428 | " final_state_fig_path = figure_path / \"final_state\"\n", 429 | " final_state_fig_path.mkdir(exist_ok=True)\n", 430 | " metrics = [(\"decoding_mse_delta\", MSE_DELTA_LABEL, None, None),\n", 431 | " (\"rank_mean_delta\", MAR_DELTA_LABEL, None, None),\n", 432 | " (\"mse\", \"Mean squared error\", 0.00335, 0.00385),\n", 433 | " (\"rank_mean\", \"Mean average rank\", 20, 95)]\n", 434 | " \n", 435 | " for model_set_name, model_set in report_model_sets:\n", 436 | " final_df = dfi[(dfi.step == checkpoint_steps[-1]) & (dfi.model.isin(model_set))]\n", 437 | " if final_df.empty:\n", 438 | " continue\n", 439 | "\n", 440 | " for metric, label, ymin, ymax in tqdm_notebook(metrics, desc=model_set_name):\n", 441 | " fig, ax = plt.subplots(figsize=(15, 10))\n", 442 | "\n", 443 | " # Plot BERT baseline performance.\n", 444 | " if \"delta\" not in metric:\n", 445 | " # TODO error region instead -- plt.fill_between\n", 446 | " ax.axhline(dfi[dfi.model == baseline_model][metric].mean(),\n", 447 | " linestyle=\"--\", color=\"gray\")\n", 448 | "\n", 449 | " sns.barplot(data=final_df, x=\"model\", y=metric,\n", 450 | " 
order=final_df.groupby(\"model\")[metric].mean().sort_values().index,\n", 451 | " palette=report_hues, ax=ax)\n", 452 | "\n", 453 | " padding = final_df[metric].var() * 0.005\n", 454 | " plt.ylim((ymin or (final_df[metric].min() - padding), ymax or (final_df[metric].max() + padding)))\n", 455 | " plt.xlabel(\"Model\")\n", 456 | " plt.ylabel(label)\n", 457 | " plt.xticks(rotation=45, ha=\"right\")\n", 458 | "\n", 459 | " plt.tight_layout()\n", 460 | " plt.savefig(final_state_fig_path / (f\"{metric}.{model_set_name}.pdf\"))\n", 461 | " plt.close(fig)\n", 462 | " \n", 463 | "%matplotlib inline" 464 | ] 465 | }, 466 | { 467 | "cell_type": "code", 468 | "execution_count": null, 469 | "metadata": {}, 470 | "outputs": [], 471 | "source": [ 472 | "%matplotlib agg\n", 473 | "\n", 474 | "if RENDER_FINAL:\n", 475 | " final_state_fig_path = figure_path / \"final_state_within_subject\"\n", 476 | " final_state_fig_path.mkdir(exist_ok=True)\n", 477 | " metrics = [(\"decoding_mse_delta\", MSE_DELTA_LABEL),\n", 478 | " (\"rank_mean_delta\", MAR_DELTA_LABEL),\n", 479 | " (\"mse\", \"Mean squared error\"),\n", 480 | " (\"rank_mean\", \"Mean average rank\")]\n", 481 | " \n", 482 | " for model_set_name, model_set in report_model_sets:\n", 483 | " final_df = dfi[(dfi.step == checkpoint_steps[-1]) & (dfi.model.isin(model_set))]\n", 484 | "\n", 485 | " for metric, label in tqdm_notebook(metrics, desc=model_set_name):\n", 486 | " fig = plt.figure(figsize=(25, 10))\n", 487 | " sns.barplot(data=final_df, x=\"model\", y=metric, hue=\"subject\",\n", 488 | " order=final_df.groupby(\"model\")[metric].mean().sort_values().index)\n", 489 | " plt.ylabel(label)\n", 490 | " plt.xticks(rotation=30, ha=\"right\")\n", 491 | " plt.legend(loc=\"center left\", bbox_to_anchor=(1,0.5))\n", 492 | " plt.tight_layout()\n", 493 | " plt.savefig(final_state_fig_path / f\"{metric}.{model_set_name}.pdf\")\n", 494 | " plt.close(fig)\n", 495 | " \n", 496 | "%matplotlib inline" 497 | ] 498 | }, 499 | { 500 | 
"cell_type": "code", 501 | "execution_count": null, 502 | "metadata": {}, 503 | "outputs": [], 504 | "source": [ 505 | "%matplotlib agg\n", 506 | "\n", 507 | "if RENDER_FINAL:\n", 508 | " final_state_fig_path = figure_path / \"final_state_within_model\"\n", 509 | " final_state_fig_path.mkdir(exist_ok=True)\n", 510 | " metrics = [(\"decoding_mse_delta\", MSE_DELTA_LABEL, None, None),\n", 511 | " (\"rank_mean_delta\", MAR_DELTA_LABEL, None, None),\n", 512 | " (\"mse\", \"Mean squared error\", None, None),\n", 513 | " (\"rank_mean\", \"Mean average rank\", None, None)]\n", 514 | " \n", 515 | " subj_order = dfi[(dfi.step == checkpoint_steps[-1]) & (dfi.model.isin(report_model_sets[0][1]))] \\\n", 516 | " .groupby(\"subject\")[metrics[0][0]].mean().sort_values().index\n", 517 | " \n", 518 | " for model_set_name, model_set in report_model_sets:\n", 519 | " final_df = dfi[(dfi.step == checkpoint_steps[-1]) & (dfi.model.isin(model_set))]\n", 520 | "\n", 521 | " for metric, label, ymin, ymax in tqdm_notebook(metrics, desc=model_set_name):\n", 522 | " fig = plt.figure(figsize=(25, 10))\n", 523 | " sns.barplot(data=final_df, x=\"subject\", y=metric, hue=\"model\",\n", 524 | " order=subj_order)\n", 525 | " \n", 526 | " padding = final_df[metric].var() * 0.005\n", 527 | " plt.ylim((ymin or (final_df[metric].min() - padding), ymax or (final_df[metric].max() + padding)))\n", 528 | " plt.xlabel(\"Subject\")\n", 529 | " plt.ylabel(label)\n", 530 | " \n", 531 | " plt.legend(loc=\"center left\", bbox_to_anchor=(1,0.5))\n", 532 | " plt.tight_layout()\n", 533 | " plt.savefig(final_state_fig_path / f\"{metric}.{model_set_name}.pdf\")\n", 534 | " plt.close(fig)\n", 535 | " \n", 536 | "%matplotlib inline" 537 | ] 538 | }, 539 | { 540 | "cell_type": "markdown", 541 | "metadata": {}, 542 | "source": [ 543 | "### Step analysis" 544 | ] 545 | }, 546 | { 547 | "cell_type": "code", 548 | "execution_count": null, 549 | "metadata": { 550 | "slideshow": { 551 | "slide_type": "-" 552 | } 553 | }, 
554 | "outputs": [], 555 | "source": [ 556 | "# g = sns.FacetGrid(dfi, col=\"run\", size=6)\n", 557 | "# g.map(sns.lineplot, \"step\", \"decoding_mse_delta\", \"model\").add_legend()\n", 558 | "\n", 559 | "# plt.xlabel(\"Fine-tuning step\")\n", 560 | "# plt.ylabel(MSE_DELTA_LABEL)" 561 | ] 562 | }, 563 | { 564 | "cell_type": "code", 565 | "execution_count": null, 566 | "metadata": {}, 567 | "outputs": [], 568 | "source": [ 569 | "# g = sns.FacetGrid(dfi, col=\"run\", size=6)\n", 570 | "# g.map(sns.lineplot, \"step\", \"rank_mean_delta\", \"model\").add_legend()\n", 571 | "\n", 572 | "# plt.xlabel(\"Fine-tuning step\")\n", 573 | "# plt.ylabel(MAR_DELTA_LABEL)" 574 | ] 575 | }, 576 | { 577 | "cell_type": "code", 578 | "execution_count": null, 579 | "metadata": {}, 580 | "outputs": [], 581 | "source": [ 582 | "f, ax = plt.subplots(figsize=(15, 10))\n", 583 | "sns.lineplot(data=dfi, x=\"step\", y=\"decoding_mse_delta_rolling\", hue=\"model\", ax=ax)\n", 584 | "\n", 585 | "plt.xlabel(\"Fine-tuning step\")\n", 586 | "plt.ylabel(MSE_DELTA_LABEL)\n", 587 | "plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)" 588 | ] 589 | }, 590 | { 591 | "cell_type": "code", 592 | "execution_count": null, 593 | "metadata": {}, 594 | "outputs": [], 595 | "source": [ 596 | "f, ax = plt.subplots(figsize=(15, 10))\n", 597 | "sns.lineplot(data=dfi, x=\"step\", y=\"rank_mean_delta_rolling\", hue=\"model\", ax=ax)\n", 598 | "\n", 599 | "plt.xlabel(\"Fine-tuning step\")\n", 600 | "plt.ylabel(MAR_DELTA_LABEL)\n", 601 | "plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)" 602 | ] 603 | }, 604 | { 605 | "cell_type": "code", 606 | "execution_count": null, 607 | "metadata": {}, 608 | "outputs": [], 609 | "source": [ 610 | "%matplotlib agg\n", 611 | "\n", 612 | "if RENDER_FINAL:\n", 613 | " trajectory_fig_dir = figure_path / \"trajectories\"\n", 614 | " trajectory_fig_dir.mkdir(exist_ok=True)\n", 615 | " metrics = [(\"decoding_mse_delta\", MSE_DELTA_LABEL),\n", 616 | " 
(\"rank_mean_delta\", MAR_DELTA_LABEL),\n", 617 | " (\"decoding_mse_delta_rolling\", MSE_DELTA_LABEL),\n", 618 | " (\"rank_mean_delta_rolling\", MAR_DELTA_LABEL)]\n", 619 | "\n", 620 | " for model_set_name, model_set in report_model_sets:\n", 621 | " for metric, label in tqdm_notebook(metrics, desc=model_set_name):\n", 622 | " fig = plt.figure(figsize=(18, 10))\n", 623 | " sns.lineplot(data=dfi[dfi.model.isin(model_set)],\n", 624 | " x=\"step\", y=metric, hue=\"model\", palette=report_hues)\n", 625 | " plt.xlim((0, checkpoint_steps[-1]))\n", 626 | " plt.xlabel(\"Fine-tuning step\")\n", 627 | " plt.ylabel(label)\n", 628 | " plt.legend(loc=\"center left\", bbox_to_anchor=(1, 0.5))\n", 629 | " plt.tight_layout()\n", 630 | " plt.savefig(trajectory_fig_dir / f\"{metric}.{model_set_name}.pdf\")\n", 631 | " plt.close(fig)\n", 632 | " \n", 633 | "%matplotlib inline" 634 | ] 635 | }, 636 | { 637 | "cell_type": "code", 638 | "execution_count": null, 639 | "metadata": {}, 640 | "outputs": [], 641 | "source": [ 642 | "# g = sns.FacetGrid(dfi[dfi.model != baseline_model], col=\"model\", row=\"run\", size=6)\n", 643 | "# g.map(sns.lineplot, \"step\", \"decoding_mse_delta\", \"subject\", ci=None).add_legend()" 644 | ] 645 | }, 646 | { 647 | "cell_type": "code", 648 | "execution_count": null, 649 | "metadata": {}, 650 | "outputs": [], 651 | "source": [ 652 | "# g = sns.FacetGrid(dfi, col=\"model\", row=\"run\", size=6)\n", 653 | "# g.map(sns.lineplot, \"step\", \"rank_median_delta\", \"subject\", ci=None).add_legend()" 654 | ] 655 | }, 656 | { 657 | "cell_type": "markdown", 658 | "metadata": {}, 659 | "source": [ 660 | "### Gradient norm analysis" 661 | ] 662 | }, 663 | { 664 | "cell_type": "code", 665 | "execution_count": null, 666 | "metadata": {}, 667 | "outputs": [], 668 | "source": [ 669 | "# f, ax = plt.subplots(figsize=(10, 8))\n", 670 | "# sns.lineplot(data=dfi, y=\"decoding_mse_delta\", x=\"total_global_norms_bin\", hue=\"model\", ax=ax)\n", 671 | "# 
ax.set_title(\"Decoding performance delta vs. binned total global gradient norm\")\n", 672 | "# ax.set_xlabel(\"Cumulative global gradient norm bin\")\n", 673 | "# ax.set_ylabel(MSE_DELTA_LABEL)" 674 | ] 675 | }, 676 | { 677 | "cell_type": "code", 678 | "execution_count": null, 679 | "metadata": {}, 680 | "outputs": [], 681 | "source": [ 682 | "#g = sns.FacetGrid(dfi, col=\"model\", row=\"run\", size=6, sharex=False, sharey=True)\n", 683 | "#g.map(sns.lineplot, \"total_global_norms\", \"decoding_mse_delta\", \"subject\", ci=None).add_legend()" 684 | ] 685 | }, 686 | { 687 | "cell_type": "markdown", 688 | "metadata": {}, 689 | "source": [ 690 | "### Eval accuracy analysis" 691 | ] 692 | }, 693 | { 694 | "cell_type": "code", 695 | "execution_count": null, 696 | "metadata": {}, 697 | "outputs": [], 698 | "source": [ 699 | "#g = sns.FacetGrid(dfi, col=\"model\", row=\"run\", sharex=False, sharey=True, size=7)\n", 700 | "#g.map(sns.lineplot, \"eval_accuracy\", \"decoding_mse_delta\", \"subject\", ci=None).add_legend()" 701 | ] 702 | }, 703 | { 704 | "cell_type": "markdown", 705 | "metadata": {}, 706 | "source": [ 707 | "## Per-subject analysis" 708 | ] 709 | }, 710 | { 711 | "cell_type": "code", 712 | "execution_count": null, 713 | "metadata": {}, 714 | "outputs": [], 715 | "source": [ 716 | "f, ax = plt.subplots(figsize=(14, 9))\n", 717 | "dff = pd.DataFrame(dfi[dfi.step == checkpoint_steps[-1]].groupby([\"model\", \"run\"]).apply(lambda xs: xs.groupby(\"subject\").decoding_mse_delta.mean()).stack()).reset_index()\n", 718 | "sns.barplot(data=dff, x=\"model\", hue=\"subject\", y=0, ax=ax)\n", 719 | "plt.title(\"subject final decoding mse delta, averaging across runs\")" 720 | ] 721 | }, 722 | { 723 | "cell_type": "code", 724 | "execution_count": null, 725 | "metadata": {}, 726 | "outputs": [], 727 | "source": [ 728 | "f, ax = plt.subplots(figsize=(14, 9))\n", 729 | "dff = pd.DataFrame(dfi[dfi.step == checkpoint_steps[-1]].groupby([\"model\", \"run\"]).apply(lambda xs: 
xs.groupby(\"subject\").rank_mean_delta.mean()).stack()).reset_index()\n", 730 | "sns.barplot(data=dff, x=\"model\", hue=\"subject\", y=0, ax=ax)\n", 731 | "plt.title(\"subject final rank mean delta, averaging across runs\")" 732 | ] 733 | }, 734 | { 735 | "cell_type": "code", 736 | "execution_count": null, 737 | "metadata": {}, 738 | "outputs": [], 739 | "source": [ 740 | "f, ax = plt.subplots(figsize=(14, 9))\n", 741 | "dff = pd.DataFrame(dfi.groupby([\"model\", \"run\"]).apply(lambda xs: xs.groupby(\"subject\").decoding_mse_delta.max()).stack()).reset_index()\n", 742 | "sns.violinplot(data=dff, x=\"subject\", y=0)\n", 743 | "sns.stripplot(data=dff, x=\"subject\", y=0, edgecolor=\"white\", linewidth=1, alpha=0.7, ax=ax)\n", 744 | "plt.title(\"subject max decoding mse delta, averaging across models and runs\")" 745 | ] 746 | }, 747 | { 748 | "cell_type": "code", 749 | "execution_count": null, 750 | "metadata": {}, 751 | "outputs": [], 752 | "source": [ 753 | "f, ax = plt.subplots(figsize=(14, 9))\n", 754 | "dff = pd.DataFrame(dfi.groupby([\"model\", \"run\"]).apply(lambda xs: xs.groupby(\"subject\").decoding_mse_delta.min()).stack()).reset_index()\n", 755 | "sns.violinplot(data=dff, x=\"subject\", y=0)\n", 756 | "sns.stripplot(data=dff, x=\"subject\", y=0, edgecolor=\"white\", linewidth=1, alpha=0.7, ax=ax)\n", 757 | "plt.title(\"subject min decoding mse delta, averaging across models and runs\")" 758 | ] 759 | }, 760 | { 761 | "cell_type": "markdown", 762 | "metadata": {}, 763 | "source": [ 764 | "## Statistical analyses\n", 765 | "\n", 766 | "First, some data prep for comparing final vs. 
start states:" 767 | ] 768 | }, 769 | { 770 | "cell_type": "code", 771 | "execution_count": null, 772 | "metadata": {}, 773 | "outputs": [], 774 | "source": [ 775 | "perf_comp = df.query(\"step == %i\" % checkpoint_steps[-1]).reset_index(level=\"step\", drop=True).sort_index()\n", 776 | "# Join data from baseline\n", 777 | "perf_comp = perf_comp.join(df.loc[baseline_model, 1, 0].rename(columns=lambda c: \"start_%s\" % c))\n", 778 | "if \"glove\" in perf_comp.index.levels[0]:\n", 779 | " perf_comp = perf_comp.join(df.loc[\"glove\", 1, 250].rename(columns=lambda c: \"glove_%s\" % c))\n", 780 | "perf_comp.head()" 781 | ] 782 | }, 783 | { 784 | "cell_type": "code", 785 | "execution_count": null, 786 | "metadata": {}, 787 | "outputs": [], 788 | "source": [ 789 | "(perf_comp.mse - perf_comp.start_mse).plot.hist()" 790 | ] 791 | }, 792 | { 793 | "cell_type": "code", 794 | "execution_count": null, 795 | "metadata": {}, 796 | "outputs": [], 797 | "source": [ 798 | "perf_compi = perf_comp.reset_index()" 799 | ] 800 | }, 801 | { 802 | "cell_type": "markdown", 803 | "metadata": {}, 804 | "source": [ 805 | "Quantitative tests:\n", 806 | " \n", 807 | "1. for any GLUE task g, MSE(g after 250) > MSE(LM)\n", 808 | "2. for any LM_scrambled_para task t, MSE(t after 250) < MSE(LM)\n", 809 | "3. for any GLUE task g, MAR(g after 250) > MAR(LM)\n", 810 | "4. for any LM_scrambled_para task t, MAR(t after 250) < MAR(LM)\n", 811 | "5. MSE(LM after 250) =~ MSE(LM)\n", 812 | "6. MAR(LM after 250) =~ MAR(LM)\n", 813 | "7. for any LM_scrambled_para task t, MSE(t after 250) < MSE(glove)\n", 814 | "8. for any LM_scrambled_para task t, MAR(t after 250) < MAR(glove)\n", 815 | "9. for any LM_pos task t, MSE(t after 250) > MSE(LM)\n", 816 | "10. 
for any LM_pos task t, MAR(t after 250) > MAR(LM)" 817 | ] 818 | }, 819 | { 820 | "cell_type": "markdown", 821 | "metadata": {}, 822 | "source": [ 823 | "### test 1" 824 | ] 825 | }, 826 | { 827 | "cell_type": "code", 828 | "execution_count": null, 829 | "metadata": {}, 830 | "outputs": [], 831 | "source": [ 832 | "sample = perf_compi[~perf_compi.model.str.startswith((baseline_model, \"LM\", \"glove\"))]" 833 | ] 834 | }, 835 | { 836 | "cell_type": "code", 837 | "execution_count": null, 838 | "metadata": {}, 839 | "outputs": [], 840 | "source": [ 841 | "sample.mse.hist()" 842 | ] 843 | }, 844 | { 845 | "cell_type": "code", 846 | "execution_count": null, 847 | "metadata": {}, 848 | "outputs": [], 849 | "source": [ 850 | "sample.start_mse.hist()" 851 | ] 852 | }, 853 | { 854 | "cell_type": "code", 855 | "execution_count": null, 856 | "metadata": {}, 857 | "outputs": [], 858 | "source": [ 859 | "st.ttest_rel(sample.mse, sample.start_mse)" 860 | ] 861 | }, 862 | { 863 | "cell_type": "markdown", 864 | "metadata": {}, 865 | "source": [ 866 | "### test 1 (split across models)" 867 | ] 868 | }, 869 | { 870 | "cell_type": "code", 871 | "execution_count": null, 872 | "metadata": {}, 873 | "outputs": [], 874 | "source": [ 875 | "results = []\n", 876 | "for model in standard_models:\n", 877 | " if model in [\"LM\", \"glove\"]: continue\n", 878 | " sample = perf_compi[perf_compi.model == model]\n", 879 | " results.append((model,) + st.ttest_rel(sample.mse, sample.start_mse))\n", 880 | " \n", 881 | "pd.DataFrame(results, columns=[\"model\", \"tval\", \"pval\"])" 882 | ] 883 | }, 884 | { 885 | "cell_type": "markdown", 886 | "metadata": {}, 887 | "source": [ 888 | "### test 2" 889 | ] 890 | }, 891 | { 892 | "cell_type": "code", 893 | "execution_count": null, 894 | "metadata": {}, 895 | "outputs": [], 896 | "source": [ 897 | "sample = perf_compi[perf_compi.model == \"LM_scrambled_para\"]" 898 | ] 899 | }, 900 | { 901 | "cell_type": "code", 902 | "execution_count": null, 903 | 
"metadata": {}, 904 | "outputs": [], 905 | "source": [ 906 | "sample.mse.hist()" 907 | ] 908 | }, 909 | { 910 | "cell_type": "code", 911 | "execution_count": null, 912 | "metadata": {}, 913 | "outputs": [], 914 | "source": [ 915 | "sample.start_mse.hist()" 916 | ] 917 | }, 918 | { 919 | "cell_type": "code", 920 | "execution_count": null, 921 | "metadata": {}, 922 | "outputs": [], 923 | "source": [ 924 | "st.ttest_rel(sample.mse, sample.start_mse)" 925 | ] 926 | }, 927 | { 928 | "cell_type": "markdown", 929 | "metadata": {}, 930 | "source": [ 931 | "### test 3" 932 | ] 933 | }, 934 | { 935 | "cell_type": "code", 936 | "execution_count": null, 937 | "metadata": {}, 938 | "outputs": [], 939 | "source": [ 940 | "sample = perf_compi[~perf_compi.model.str.startswith((baseline_model, \"LM\", \"glove\"))]" 941 | ] 942 | }, 943 | { 944 | "cell_type": "code", 945 | "execution_count": null, 946 | "metadata": {}, 947 | "outputs": [], 948 | "source": [ 949 | "sample.rank_mean.hist()" 950 | ] 951 | }, 952 | { 953 | "cell_type": "code", 954 | "execution_count": null, 955 | "metadata": {}, 956 | "outputs": [], 957 | "source": [ 958 | "sample.start_rank_mean.hist()" 959 | ] 960 | }, 961 | { 962 | "cell_type": "code", 963 | "execution_count": null, 964 | "metadata": {}, 965 | "outputs": [], 966 | "source": [ 967 | "st.ttest_rel(sample.rank_mean, sample.start_rank_mean)" 968 | ] 969 | }, 970 | { 971 | "cell_type": "markdown", 972 | "metadata": {}, 973 | "source": [ 974 | "### test 3 (split across models)" 975 | ] 976 | }, 977 | { 978 | "cell_type": "code", 979 | "execution_count": null, 980 | "metadata": {}, 981 | "outputs": [], 982 | "source": [ 983 | "results = []\n", 984 | "for model in standard_models:\n", 985 | " if model in [\"LM\", \"glove\"]: continue\n", 986 | " sample = perf_compi[perf_compi.model == model]\n", 987 | " results.append((model,) + st.ttest_rel(sample.rank_mean, sample.start_rank_mean))\n", 988 | " \n", 989 | "pd.DataFrame(results, columns=[\"model\", \"tval\", 
\"pval\"])" 990 | ] 991 | }, 992 | { 993 | "cell_type": "markdown", 994 | "metadata": {}, 995 | "source": [ 996 | "### test 4" 997 | ] 998 | }, 999 | { 1000 | "cell_type": "code", 1001 | "execution_count": null, 1002 | "metadata": {}, 1003 | "outputs": [], 1004 | "source": [ 1005 | "sample = perf_compi[perf_compi.model == \"LM_scrambled_para\"]" 1006 | ] 1007 | }, 1008 | { 1009 | "cell_type": "code", 1010 | "execution_count": null, 1011 | "metadata": {}, 1012 | "outputs": [], 1013 | "source": [ 1014 | "sample.rank_mean.hist()" 1015 | ] 1016 | }, 1017 | { 1018 | "cell_type": "code", 1019 | "execution_count": null, 1020 | "metadata": {}, 1021 | "outputs": [], 1022 | "source": [ 1023 | "sample.start_rank_mean.hist()" 1024 | ] 1025 | }, 1026 | { 1027 | "cell_type": "code", 1028 | "execution_count": null, 1029 | "metadata": {}, 1030 | "outputs": [], 1031 | "source": [ 1032 | "st.ttest_rel(sample.rank_mean, sample.start_rank_mean)" 1033 | ] 1034 | }, 1035 | { 1036 | "cell_type": "markdown", 1037 | "metadata": {}, 1038 | "source": [ 1039 | "### test 5" 1040 | ] 1041 | }, 1042 | { 1043 | "cell_type": "code", 1044 | "execution_count": null, 1045 | "metadata": {}, 1046 | "outputs": [], 1047 | "source": [ 1048 | "sample = perf_compi[perf_compi.model == \"LM\"]" 1049 | ] 1050 | }, 1051 | { 1052 | "cell_type": "code", 1053 | "execution_count": null, 1054 | "metadata": {}, 1055 | "outputs": [], 1056 | "source": [ 1057 | "sample.mse.hist()" 1058 | ] 1059 | }, 1060 | { 1061 | "cell_type": "code", 1062 | "execution_count": null, 1063 | "metadata": {}, 1064 | "outputs": [], 1065 | "source": [ 1066 | "sample.start_mse.hist()" 1067 | ] 1068 | }, 1069 | { 1070 | "cell_type": "code", 1071 | "execution_count": null, 1072 | "metadata": {}, 1073 | "outputs": [], 1074 | "source": [ 1075 | "st.ttest_rel(sample.mse, sample.start_mse)" 1076 | ] 1077 | }, 1078 | { 1079 | "cell_type": "markdown", 1080 | "metadata": {}, 1081 | "source": [ 1082 | "### test 6" 1083 | ] 1084 | }, 1085 | { 1086 | 
"cell_type": "code", 1087 | "execution_count": null, 1088 | "metadata": {}, 1089 | "outputs": [], 1090 | "source": [ 1091 | "sample = perf_compi[perf_compi.model == \"LM\"]" 1092 | ] 1093 | }, 1094 | { 1095 | "cell_type": "code", 1096 | "execution_count": null, 1097 | "metadata": {}, 1098 | "outputs": [], 1099 | "source": [ 1100 | "sample.rank_mean.hist()" 1101 | ] 1102 | }, 1103 | { 1104 | "cell_type": "code", 1105 | "execution_count": null, 1106 | "metadata": {}, 1107 | "outputs": [], 1108 | "source": [ 1109 | "sample.start_rank_mean.hist()" 1110 | ] 1111 | }, 1112 | { 1113 | "cell_type": "code", 1114 | "execution_count": null, 1115 | "metadata": {}, 1116 | "outputs": [], 1117 | "source": [ 1118 | "st.ttest_rel(sample.rank_mean, sample.start_rank_mean)" 1119 | ] 1120 | }, 1121 | { 1122 | "cell_type": "markdown", 1123 | "metadata": {}, 1124 | "source": [ 1125 | "### test 7" 1126 | ] 1127 | }, 1128 | { 1129 | "cell_type": "code", 1130 | "execution_count": null, 1131 | "metadata": {}, 1132 | "outputs": [], 1133 | "source": [ 1134 | "sample = perf_compi[perf_compi.model == \"LM_scrambled_para\"]" 1135 | ] 1136 | }, 1137 | { 1138 | "cell_type": "code", 1139 | "execution_count": null, 1140 | "metadata": {}, 1141 | "outputs": [], 1142 | "source": [ 1143 | "sample.mse.hist()" 1144 | ] 1145 | }, 1146 | { 1147 | "cell_type": "code", 1148 | "execution_count": null, 1149 | "metadata": {}, 1150 | "outputs": [], 1151 | "source": [ 1152 | "sample.glove_mse.hist()" 1153 | ] 1154 | }, 1155 | { 1156 | "cell_type": "code", 1157 | "execution_count": null, 1158 | "metadata": {}, 1159 | "outputs": [], 1160 | "source": [ 1161 | "st.ttest_rel(sample.mse, sample.glove_mse)" 1162 | ] 1163 | }, 1164 | { 1165 | "cell_type": "markdown", 1166 | "metadata": {}, 1167 | "source": [ 1168 | "### test 8" 1169 | ] 1170 | }, 1171 | { 1172 | "cell_type": "code", 1173 | "execution_count": null, 1174 | "metadata": {}, 1175 | "outputs": [], 1176 | "source": [ 1177 | "sample = perf_compi[perf_compi.model 
== \"LM_scrambled_para\"]" 1178 | ] 1179 | }, 1180 | { 1181 | "cell_type": "code", 1182 | "execution_count": null, 1183 | "metadata": {}, 1184 | "outputs": [], 1185 | "source": [ 1186 | "sample.rank_mean.hist()" 1187 | ] 1188 | }, 1189 | { 1190 | "cell_type": "code", 1191 | "execution_count": null, 1192 | "metadata": {}, 1193 | "outputs": [], 1194 | "source": [ 1195 | "sample.glove_rank_mean.hist()" 1196 | ] 1197 | }, 1198 | { 1199 | "cell_type": "code", 1200 | "execution_count": null, 1201 | "metadata": {}, 1202 | "outputs": [], 1203 | "source": [ 1204 | "st.ttest_rel(sample.rank_mean, sample.glove_rank_mean)" 1205 | ] 1206 | }, 1207 | { 1208 | "cell_type": "markdown", 1209 | "metadata": {}, 1210 | "source": [ 1211 | "### test 9" 1212 | ] 1213 | }, 1214 | { 1215 | "cell_type": "code", 1216 | "execution_count": null, 1217 | "metadata": {}, 1218 | "outputs": [], 1219 | "source": [ 1220 | "sample = perf_compi[perf_compi.model == \"LM_pos\"]" 1221 | ] 1222 | }, 1223 | { 1224 | "cell_type": "code", 1225 | "execution_count": null, 1226 | "metadata": {}, 1227 | "outputs": [], 1228 | "source": [ 1229 | "sample.mse.hist()" 1230 | ] 1231 | }, 1232 | { 1233 | "cell_type": "code", 1234 | "execution_count": null, 1235 | "metadata": {}, 1236 | "outputs": [], 1237 | "source": [ 1238 | "sample.start_mse.hist()" 1239 | ] 1240 | }, 1241 | { 1242 | "cell_type": "code", 1243 | "execution_count": null, 1244 | "metadata": {}, 1245 | "outputs": [], 1246 | "source": [ 1247 | "st.ttest_rel(sample.mse, sample.start_mse)" 1248 | ] 1249 | }, 1250 | { 1251 | "cell_type": "code", 1252 | "execution_count": null, 1253 | "metadata": {}, 1254 | "outputs": [], 1255 | "source": [ 1256 | "f = plt.figure(figsize=(20,20))\n", 1257 | "sns.barplot(data=pd.melt(sample, id_vars=[\"subject\"], value_vars=[\"mse\", \"start_mse\"]),\n", 1258 | " x=\"subject\", y=\"value\", hue=\"variable\")\n", 1259 | "plt.ylim((0.0033, 0.0038))" 1260 | ] 1261 | }, 1262 | { 1263 | "cell_type": "markdown", 1264 | "metadata": 
{}, 1265 | "source": [ 1266 | "### test 10" 1267 | ] 1268 | }, 1269 | { 1270 | "cell_type": "code", 1271 | "execution_count": null, 1272 | "metadata": {}, 1273 | "outputs": [], 1274 | "source": [ 1275 | "sample = perf_compi[perf_compi.model == \"LM_pos\"]" 1276 | ] 1277 | }, 1278 | { 1279 | "cell_type": "code", 1280 | "execution_count": null, 1281 | "metadata": {}, 1282 | "outputs": [], 1283 | "source": [ 1284 | "sample.rank_mean.hist()" 1285 | ] 1286 | }, 1287 | { 1288 | "cell_type": "code", 1289 | "execution_count": null, 1290 | "metadata": {}, 1291 | "outputs": [], 1292 | "source": [ 1293 | "sample.start_rank_mean.hist()" 1294 | ] 1295 | }, 1296 | { 1297 | "cell_type": "code", 1298 | "execution_count": null, 1299 | "metadata": {}, 1300 | "outputs": [], 1301 | "source": [ 1302 | "st.ttest_rel(sample.rank_mean, sample.start_rank_mean)" 1303 | ] 1304 | } 1305 | ], 1306 | "metadata": { 1307 | "kernelspec": { 1308 | "display_name": "Python 3", 1309 | "language": "python", 1310 | "name": "python3" 1311 | }, 1312 | "language_info": { 1313 | "codemirror_mode": { 1314 | "name": "ipython", 1315 | "version": 3 1316 | }, 1317 | "file_extension": ".py", 1318 | "mimetype": "text/x-python", 1319 | "name": "python", 1320 | "nbconvert_exporter": "python", 1321 | "pygments_lexer": "ipython3", 1322 | "version": "3.6.8" 1323 | } 1324 | }, 1325 | "nbformat": 4, 1326 | "nbformat_minor": 2 1327 | } 1328 | -------------------------------------------------------------------------------- /notebooks/rsa.py: -------------------------------------------------------------------------------- 1 | import itertools 2 | 3 | from scipy import stats as st 4 | from scipy.spatial.distance import pdist 5 | from tqdm import tqdm 6 | 7 | import pandas as pd 8 | 9 | 10 | def rsa_encodings(encodings_dict, pairs=None, collapse_fn=None): 11 | """ 12 | Compute representational similarity metrics on the given encodings. 13 | 14 | Arguments: 15 | pairs: encoding pairs (keys of `encodings_dict`) to compare. 
If `None`, all possible pairs are evaluated. 16 | collapse_fn: if not `None`, store the results of each pairwise analysis not under the key `(model1, model2)` (where `model1`, `model2` are the keys in each pair), but rather `(collapse_fn(model1), collapse_fn(model2))`. 17 | """ 18 | 19 | if pairs is None: 20 | pairs = list(itertools.combinations(encodings_dict.keys(), 2)) 21 | 22 | # Cache distance matrices. 23 | dist_matrices = {} 24 | 25 | rsa_sims = [] 26 | for m1_key, m2_key in tqdm(pairs): 27 | dists1 = dist_matrices.get(m1_key) 28 | if dists1 is None: 29 | dists1 = pdist(encodings_dict[m1_key]) 30 | dist_matrices[m1_key] = dists1 31 | 32 | dists2 = dist_matrices.get(m2_key) 33 | if dists2 is None: 34 | dists2 = pdist(encodings_dict[m2_key]) 35 | dist_matrices[m2_key] = dists2 36 | # Spearman rank correlation between the two condensed distance vectors. 37 | spearman_coef, _ = st.spearmanr(dists1, dists2) 38 | 39 | if collapse_fn is not None: 40 | m1_key = collapse_fn(m1_key) 41 | m2_key = collapse_fn(m2_key) 42 | 43 | rsa_sims.append((m1_key, m2_key, spearman_coef)) 44 | # NOTE: the column name "pearsonr" is left unchanged so existing callers keep working; the values are Spearman coefficients. 45 | rsa_sims = pd.DataFrame(rsa_sims, columns=["model1", "model2", "pearsonr"]) 46 | return rsa_sims -------------------------------------------------------------------------------- /notebooks/structural-probes.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "from functools import partial\n", 10 | "import itertools\n", 11 | "import json\n", 12 | "from pathlib import Path\n", 13 | "import re\n", 14 | "import sys\n", 15 | "sys.path.append(\"../src\")\n", 16 | "\n", 17 | "import matplotlib\n", 18 | "import matplotlib.pyplot as plt\n", 19 | "import numpy as np\n", 20 | "import pandas as pd\n", 21 | "import seaborn as sns\n", 22 | "import statsmodels.formula.api as smf\n", 23 | "from tqdm import tqdm, tqdm_notebook\n", 24 | "\n", 25 | "%matplotlib inline\n", 26 | "sns.set(style=\"whitegrid\", 
context=\"paper\", font_scale=3.5, rc={\"lines.linewidth\": 2.5})\n", 27 | "from IPython.display import set_matplotlib_formats\n", 28 | "set_matplotlib_formats('png')\n", 29 | "#set_matplotlib_formats('svg')\n", 30 | "\n", 31 | "%load_ext autoreload\n", 32 | "%autoreload 2\n", 33 | "import util" 34 | ] 35 | }, 36 | { 37 | "cell_type": "markdown", 38 | "metadata": {}, 39 | "source": [ 40 | "## Data preparation" 41 | ] 42 | }, 43 | { 44 | "cell_type": "code", 45 | "execution_count": null, 46 | "metadata": {}, 47 | "outputs": [], 48 | "source": [ 49 | "output_path = Path(\"../output\")\n", 50 | "bert_encoding_path = output_path / \"encodings\"\n", 51 | "sprobe_results_path = output_path / \"structural-probe\"" 52 | ] 53 | }, 54 | { 55 | "cell_type": "code", 56 | "execution_count": null, 57 | "metadata": {}, 58 | "outputs": [], 59 | "source": [ 60 | "checkpoints = [util.get_encoding_ckpt_id(dir_entry) for dir_entry in bert_encoding_path.iterdir()]" 61 | ] 62 | }, 63 | { 64 | "cell_type": "code", 65 | "execution_count": null, 66 | "metadata": {}, 67 | "outputs": [], 68 | "source": [ 69 | "models = [model for model, _, _ in checkpoints]\n", 70 | "\n", 71 | "baseline_model = \"baseline\"\n", 72 | "if baseline_model not in models:\n", 73 | " raise ValueError(\"Missing baseline model. This is necessary to compute performance deltas in the analysis of fine-tuning models. 
Stop.\")\n", 74 | "\n", 75 | "standard_models = [model for model in models if not model.startswith(\"LM_\") and not model == baseline_model]\n", 76 | "custom_models = [model for model in models if model.startswith(\"LM_\") and not model == baseline_model]\n", 77 | "\n", 78 | "runs = sorted(set(run for _, run, _ in checkpoints))\n", 79 | "checkpoint_steps = sorted(set(step for _, _, step in checkpoints))\n", 80 | "\n", 81 | "# Models which should appear in the final report figures\n", 82 | "report_models = [\"SQuAD\", \"QQP\", \"MNLI\", \"SST\", \"LM\", \"LM_scrambled\", \"LM_scrambled_para\", \"LM_pos\", \"glove\"]\n", 83 | "\n", 84 | "# Model subsets to render in different report figures\n", 85 | "report_model_sets = [\n", 86 | " (\"all\", set(report_models)),\n", 87 | " (\"standard\", set(report_models) & set(standard_models)),\n", 88 | " (\"custom\", set(report_models) & set(custom_models)),\n", 89 | "]\n", 90 | "report_model_sets = [(name, model_set) for name, model_set in report_model_sets\n", 91 | " if len(model_set) > 0]" 92 | ] 93 | }, 94 | { 95 | "cell_type": "code", 96 | "execution_count": null, 97 | "metadata": {}, 98 | "outputs": [], 99 | "source": [ 100 | "RENDER_FINAL = True\n", 101 | "figure_path = Path(\"../reports/figures\")\n", 102 | "figure_path.mkdir(exist_ok=True)\n", 103 | "\n", 104 | "report_hues = dict(zip(sorted(report_models), sns.color_palette()))" 105 | ] 106 | }, 107 | { 108 | "cell_type": "markdown", 109 | "metadata": {}, 110 | "source": [ 111 | "## Collect results" 112 | ] 113 | }, 114 | { 115 | "cell_type": "code", 116 | "execution_count": null, 117 | "metadata": {}, 118 | "outputs": [], 119 | "source": [ 120 | "eval_results = {}\n", 121 | "for eval_dir in tqdm_notebook(list(sprobe_results_path.iterdir())):\n", 122 | " if not eval_dir.is_dir(): continue\n", 123 | " model, run, step = util.get_encoding_ckpt_id(eval_dir)\n", 124 | " \n", 125 | " try:\n", 126 | " uuas_file = list(eval_dir.glob(\"**/dev.uuas\"))[0]\n", 127 | " with 
uuas_file.open(\"r\") as f:\n", 128 | " uuas = float(f.read().strip())\n", 129 | " except: continue\n", 130 | " \n", 131 | " try:\n", 132 | " spearman_file = list(eval_dir.glob(\"**/dev.spearmanr-*-mean\"))[0]\n", 133 | " with spearman_file.open(\"r\") as f:\n", 134 | " spearman = float(f.read().strip())\n", 135 | " except: continue\n", 136 | " \n", 137 | " eval_results[model, run, step] = pd.Series({\"uuas\": uuas, \"spearman\": spearman})" 138 | ] 139 | }, 140 | { 141 | "cell_type": "markdown", 142 | "metadata": {}, 143 | "source": [ 144 | "### Add non-BERT results" 145 | ] 146 | }, 147 | { 148 | "cell_type": "code", 149 | "execution_count": null, 150 | "metadata": {}, 151 | "outputs": [], 152 | "source": [ 153 | "nonbert_models = []" 154 | ] 155 | }, 156 | { 157 | "cell_type": "code", 158 | "execution_count": null, 159 | "metadata": {}, 160 | "outputs": [], 161 | "source": [ 162 | "# GloVe\n", 163 | "# for glove_dir in tqdm_notebook(list(sprobe_glove_path.glob(\"*\"))):\n", 164 | "# if not glove_dir.is_dir(): continue\n", 165 | "# model = glove_dir.name\n", 166 | " \n", 167 | "# try:\n", 168 | "# uuas_file = list(glove_dir.glob(\"**/dev.uuas\"))[0]\n", 169 | "# with uuas_file.open(\"r\") as f:\n", 170 | "# uuas = float(f.read().strip())\n", 171 | "# except: continue\n", 172 | " \n", 173 | "# try:\n", 174 | "# spearman_file = list(glove_dir.glob(\"**/dev.spearmanr-*-mean\"))[0]\n", 175 | "# with spearman_file.open(\"r\") as f:\n", 176 | "# spearman = float(f.read().strip())\n", 177 | "# except: continue\n", 178 | " \n", 179 | "# nonbert_models.append(model)\n", 180 | "# eval_results[model, 1, 250, 0] = pd.Series({\"uuas\": uuas, \"spearman\": spearman})" 181 | ] 182 | }, 183 | { 184 | "cell_type": "markdown", 185 | "metadata": {}, 186 | "source": [ 187 | "### Aggregate" 188 | ] 189 | }, 190 | { 191 | "cell_type": "code", 192 | "execution_count": null, 193 | "metadata": {}, 194 | "outputs": [], 195 | "source": [ 196 | "eval_results = 
pd.DataFrame(pd.concat(eval_results, names=[\"model\", \"run\", \"step\", \"metric\"]))" 197 | ] 198 | }, 199 | { 200 | "cell_type": "code", 201 | "execution_count": null, 202 | "metadata": {}, 203 | "outputs": [], 204 | "source": [ 205 | "eval_results.tail(20)" 206 | ] 207 | }, 208 | { 209 | "cell_type": "code", 210 | "execution_count": null, 211 | "metadata": {}, 212 | "outputs": [], 213 | "source": [ 214 | "# Only use spaCy results\n", 215 | "nonbert_models_to_graph = [(\"spaCy-en_vectors_web_lg\", \"GloVe\")]\n", 216 | "nonbert_models_to_graph = [(name, label) for name, label in nonbert_models_to_graph if name in nonbert_models]" 217 | ] 218 | }, 219 | { 220 | "cell_type": "markdown", 221 | "metadata": {}, 222 | "source": [ 223 | "## Graph" 224 | ] 225 | }, 226 | { 227 | "cell_type": "code", 228 | "execution_count": null, 229 | "metadata": {}, 230 | "outputs": [], 231 | "source": [ 232 | "graph_data = eval_results.reset_index()\n", 233 | "graph_data = graph_data[~graph_data.model.isin(nonbert_models + [baseline_model])]" 234 | ] 235 | }, 236 | { 237 | "cell_type": "code", 238 | "execution_count": null, 239 | "metadata": {}, 240 | "outputs": [], 241 | "source": [ 242 | "g = sns.FacetGrid(data=graph_data, col=\"metric\", height=7, sharex=True, sharey=True)\n", 243 | "g.map(sns.lineplot, \"step\", 0, \"model\")\n", 244 | "\n", 245 | "for uuas_ax in g.axes[:, 0]:\n", 246 | " for nonbert_model, label in nonbert_models_to_graph:\n", 247 | " uuas_ax.axhline(eval_results.loc[nonbert_model, 1, 250, 0, \"uuas\"][0], linestyle='--', label=label)\n", 248 | "for spearman_ax in g.axes[:, 1]:\n", 249 | " for nonbert_model, label in nonbert_models_to_graph:\n", 250 | " spearman_ax.axhline(eval_results.loc[nonbert_model, 1, 250, 0, \"spearman\"][0], linestyle='--', label=label)\n", 251 | " \n", 252 | "g.add_legend()\n", 253 | "g" 254 | ] 255 | }, 256 | { 257 | "cell_type": "code", 258 | "execution_count": null, 259 | "metadata": {}, 260 | "outputs": [], 261 | "source": [ 262 | 
"g = sns.FacetGrid(data=graph_data, col=\"metric\", row=\"model\", height=7, sharex=True, sharey=True)\n", 263 | "g.map(sns.lineplot, \"step\", 0).add_legend()" 264 | ] 265 | }, 266 | { 267 | "cell_type": "code", 268 | "execution_count": null, 269 | "metadata": {}, 270 | "outputs": [], 271 | "source": [ 272 | "%matplotlib agg\n", 273 | "\n", 274 | "if RENDER_FINAL:\n", 275 | " dir = figure_path / \"structural_probe\"\n", 276 | " dir.mkdir(exist_ok=True)\n", 277 | " \n", 278 | " for metric, label in [(\"uuas\", \"UUAS\"), (\"spearman\", \"Spearman correlation\")]:\n", 279 | " fig = plt.figure(figsize=(15, 9))\n", 280 | " ax = sns.lineplot(data=graph_data[(graph_data.metric == metric)], x=\"step\", y=0,\n", 281 | " hue=\"model\", palette=report_hues)\n", 282 | " for nonbert_model, nonbert_label in nonbert_models_to_graph:\n", 283 | " ax.axhline(eval_results.loc[nonbert_model, 1, 0, metric][0],\n", 284 | " linestyle='--', label=nonbert_label, linewidth=3)\n", 285 | " \n", 286 | " plt.legend(loc=\"center left\", bbox_to_anchor=(1, 0.5))\n", 287 | " plt.xlim((0, checkpoint_steps[-1]))\n", 288 | " plt.ylabel(label)\n", 289 | " plt.xlabel(\"Training step\")\n", 290 | " plt.tight_layout()\n", 291 | " plt.savefig(dir / (\"%s.pdf\" % metric))\n", 292 | " plt.close()\n", 293 | " \n", 294 | "%matplotlib inline" 295 | ] 296 | } 297 | ], 298 | "metadata": { 299 | "kernelspec": { 300 | "display_name": "Python 3", 301 | "language": "python", 302 | "name": "python3" 303 | }, 304 | "language_info": { 305 | "codemirror_mode": { 306 | "name": "ipython", 307 | "version": 3 308 | }, 309 | "file_extension": ".py", 310 | "mimetype": "text/x-python", 311 | "name": "python", 312 | "nbconvert_exporter": "python", 313 | "pygments_lexer": "ipython3", 314 | "version": "3.6.8" 315 | } 316 | }, 317 | "nbformat": 4, 318 | "nbformat_minor": 2 319 | } 320 | -------------------------------------------------------------------------------- /notebooks/within-subject.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/hans/nn-decoding/2d2cc639f650b6911cb1de7b8ecb7a872f75b36d/notebooks/within-subject.png -------------------------------------------------------------------------------- /src/dependency_graph.py: -------------------------------------------------------------------------------- 1 | from argparse import ArgumentParser 2 | import itertools 3 | 4 | import matplotlib 5 | matplotlib.use("Agg") 6 | import matplotlib.pyplot as plt 7 | import numpy as np 8 | import networkx as nx 9 | import pandas as pd 10 | 11 | 12 | p = ArgumentParser() 13 | p.add_argument("heatmap_path") 14 | 15 | args = p.parse_args() 16 | 17 | heatmap = pd.read_csv(args.heatmap_path, index_col=0) 18 | assert heatmap.index.equals(heatmap.columns) 19 | encodings = heatmap.columns 20 | 21 | G = nx.Graph() 22 | edges = [] 23 | for enc1, enc2 in itertools.product(list(range(len(encodings))), repeat=2): 24 | if enc1 == enc2: 25 | continue 26 | enc1_name = encodings[enc1] 27 | enc2_name = encodings[enc2] 28 | 29 | score_forward = heatmap.loc[enc1_name, enc2_name] 30 | score_reverse = heatmap.loc[enc2_name, enc1_name] 31 | 32 | # Extrema along rows, ignoring diagonal element 33 | enc1_min = min(heatmap.iloc[enc1, 0:enc1].min() if enc1 > 0 else np.inf, heatmap.iloc[enc1, enc1+1:].min()) 34 | enc2_max = max(heatmap.iloc[enc2, 0:enc2].max() if enc2 > 0 else -np.inf, heatmap.iloc[enc2, enc2+1:].max()) 35 | if score_forward > score_reverse and enc1_min >= enc2_max: 36 | edges.append((enc1_name, enc2_name)) 37 | 38 | print(edges) 39 | G.add_edges_from(edges) 40 | pos = nx.spring_layout(G) 41 | nx.draw_networkx_nodes(G, pos) 42 | nx.draw_networkx_labels(G, pos) 43 | nx.draw_networkx_edges(G, pos, arrowstyle="->", arrowsize=10, arrows=True) 44 | plt.savefig("graph.png") 45 | -------------------------------------------------------------------------------- /src/heatmap.py: 
-------------------------------------------------------------------------------- 1 | """ 2 | Render a heat-map describing the relationship between different encodings. 3 | """ 4 | 5 | from argparse import ArgumentParser 6 | import itertools 7 | import logging 8 | import multiprocessing 9 | import os.path 10 | logging.basicConfig(level=logging.DEBUG) 11 | logger = logging.getLogger(__name__) 12 | 13 | import matplotlib 14 | matplotlib.use("Agg") 15 | import matplotlib.pyplot as plt 16 | import numpy as np 17 | import pandas as pd 18 | from scipy.linalg import sqrtm 19 | from scipy.spatial.distance import pdist, squareform 20 | from scipy.stats import spearmanr 21 | from sklearn.decomposition import PCA 22 | from sklearn.linear_model import RidgeCV 23 | from sklearn.model_selection import KFold 24 | import seaborn as sns 25 | from tqdm import tqdm, trange 26 | 27 | 28 | def eval_encodings_cca(enc1, enc2): 29 | cv = KFold(n_splits=4) 30 | corr_results = [] 31 | 32 | for train_idxs, test_idxs in tqdm(cv.split(enc1), total=cv.get_n_splits(enc1), 33 | desc="CV splits"): 34 | from rcca import CCA 35 | # TODO sanity check regularization constant s.t. CCA on self yields reasonable numbers 36 | cca = CCA(kernelcca=False, reg=1e-6, numCC=128, verbose=False) 37 | cca.train([enc1[train_idxs], enc2[train_idxs]]) 38 | 39 | print(np.mean(cca.validate([enc1[train_idxs], enc2[train_idxs]])[0])) 40 | enc1_pred_corrs, enc2_pred_corrs = cca.validate([enc1[test_idxs], enc2[test_idxs]]) 41 | # TODO projection weighting 42 | corr_results.append(np.mean(enc2_pred_corrs)) 43 | 44 | return np.mean(corr_results) 45 | 46 | 47 | def eval_encodings_rdm(encodings, enc1_key, enc2_key, 48 | n_bootstrap_samples=100, sentences=None): 49 | """ 50 | Evaluate the similarity between two encodings `e1, e2` as follows: 51 | 52 | 1. Align the paired representations of `e1` and `e2` to have maximal 53 | similarity (maximal dot product) via regularized CCA. 
(This CCA is 54 | cross-validated to prevent overfitting.) 55 | 2. Estimate a Spearman correlation coefficient relating the pairwise 56 | similarity judgments predicted by the two encodings by a bootstrap. (In 57 | other words, bootstrap-estimate a representational similarity analysis; 58 | that is, bootstrap-estimate the difference between the 59 | representational dissimilarity matrices (RDMs) of the two aligned 60 | encodings.) 61 | 62 | Args: 63 | encodings: Dictionary mapping from encoding name to n_examples * d matrices 64 | enc1_key: string key into `encodings` 65 | enc2_key: string key into `encodings` 66 | n_bootstrap_samples: Number of samples to take in estimating bootstrap. 67 | sentences: Optional `n_examples` array of sentences, for debugging 68 | 69 | Returns: 70 | spearman_coefs: Bootstrap estimates of the Spearman coefficient relating 71 | the pairwise similarity rankings predicted by CCA-aligned forms of `enc1` 72 | and `enc2` 73 | """ 74 | enc1 = encodings[enc1_key] 75 | enc2 = encodings[enc2_key] 76 | assert enc1.shape[0] == enc2.shape[0] 77 | 78 | # First align with CCA. 79 | from rcca import CCA, CCACrossValidate 80 | cca = CCACrossValidate(kernelcca=False, regs=[1e-2,1e-1,1.0,10.0,20.0], numCCs=[32,64,128,256]) 81 | cca.train([enc1, enc2]) 82 | print("Best reg: %f; best CC: %i" % (cca.best_reg, cca.best_numCC)) 83 | 84 | enc1_aligned, enc2_aligned = cca.comps 85 | # Calculate pairwise distances. 
86 | dists_X = pdist(enc1_aligned, "correlation") 87 | dists_Y = pdist(enc2_aligned, "correlation") 88 | 89 | dists_X_square = squareform(dists_X) 90 | dists_Y_square = squareform(dists_Y) 91 | 92 | if sentences is not None: 93 | # DEBUG: List some of the most similar inputs 94 | sent_combinations = list(itertools.combinations(range(len(enc1)), 2)) 95 | high_sim_X = np.argsort(dists_X) 96 | high_sim_Y = np.argsort(dists_Y) 97 | 98 | out_path = "sim_%s_%s.csv" % (enc1_key, enc2_key) 99 | with open(out_path, "w") as out_f: 100 | for i, high_sim_X_idx in enumerate(high_sim_X): 101 | sent1, sent2 = sent_combinations[high_sim_X_idx] 102 | out_f.write("%s,%d,%f,\"%s\",\"%s\"\n" % (enc1_key, i, dists_X_square[sent1, sent2], 103 | sentences[sent1], sentences[sent2])) 104 | for i, high_sim_Y_idx in enumerate(high_sim_Y): 105 | sent1, sent2 = sent_combinations[high_sim_Y_idx] 106 | out_f.write("%s,%d,%f,\"%s\",\"%s\"\n" % (enc2_key, i, dists_Y_square[sent1, sent2], 107 | sentences[sent1], sentences[sent2])) 108 | 109 | # # Bootstrap estimate the Spearman coefficient. 110 | # spearman_coefs = [] 111 | # for _ in trange(n_bootstrap_samples): 112 | # idxs = np.random.choice(len(enc1), size=len(enc1), replace=True) 113 | # dists_X_sample = dists_X_square[np.ix_(idxs, idxs)] 114 | # dists_Y_sample = dists_Y_square[np.ix_(idxs, idxs)] 115 | 116 | # # Compute Spearman coefficient on condensed / non-redundant form. 117 | # sample_coef, _ = spearmanr(squareform(dists_X_sample), squareform(dists_Y_sample)) 118 | # spearman_coefs.append(sample_coef) 119 | 120 | spearman_coef, _ = spearmanr(dists_X, dists_Y) 121 | print("\t", enc1_key, enc2_key, spearman_coef) 122 | return [spearman_coef] 123 | 124 | 125 | def eval_pair(inputs): 126 | enc1, enc2, encodings, sentences = inputs 127 | # Multiprocessing task function. 
128 | if enc1 == enc2: 129 | return enc1, enc2, (1.0, 1.0) 130 | else: 131 | coefs = eval_encodings_rdm(encodings, enc1, enc2, sentences=sentences) 132 | # Calculate 95% CI bounds (np.percentile expects percentages in [0, 100]) 133 | lower_bound, upper_bound = np.percentile(coefs, (2.5, 97.5)) 134 | return enc1, enc2, (lower_bound, upper_bound) 135 | 136 | 137 | def main(args): 138 | encodings, encoding_keys = {}, [] 139 | for encoding_path in args.encodings: 140 | encodings_i = np.load(encoding_path) 141 | encoding_key = os.path.basename(encoding_path) 142 | encoding_key = encoding_key[:encoding_key.rindex(".")] 143 | 144 | if args.encoding_project is not None and args.encoding_project < encodings_i.shape[1]: 145 | logger.info("Projecting %s to dimension %i with PCA", encoding_path, args.encoding_project) 146 | pca = PCA(args.encoding_project).fit(encodings_i) 147 | logger.info("PCA explained variance: %f", sum(pca.explained_variance_ratio_) * 100) 148 | encodings_i = pca.transform(encodings_i) 149 | 150 | encodings[encoding_key] = encodings_i 151 | encoding_keys.append(encoding_key) 152 | 153 | sentences = None 154 | if args.sentences_path is not None: 155 | with open(args.sentences_path, "r") as sentences_f: 156 | sentences = [line.strip() for line in sentences_f] 157 | 158 | # Prepare output structures 159 | assert len(set(enc.shape[0] for enc in encodings.values())) == 1 160 | # Make sure to maintain ordering of the encodings given in the CLI arguments. 
161 | heatmap_mat_lower_bound = np.zeros((len(encodings), len(encodings))) 162 | heatmap_mat_upper_bound = np.zeros_like(heatmap_mat_lower_bound) 163 | 164 | # Prepare multiprocessing jobs 165 | pool = multiprocessing.Pool(processes=args.num_processes) 166 | job_inputs = [(enc1, enc2, encodings, sentences) for enc1, enc2 167 | in itertools.combinations(encoding_keys, 2)] 168 | jobs = pool.imap_unordered(eval_pair, job_inputs) 169 | 170 | # Join jobs and update matrices 171 | with tqdm(total=len(job_inputs)) as pbar: 172 | for enc1, enc2, val in tqdm(jobs): 173 | pbar.update() 174 | lower_bound, upper_bound = val 175 | if lower_bound is None: 176 | continue 177 | 178 | enc1_idx = encoding_keys.index(enc1) 179 | enc2_idx = encoding_keys.index(enc2) 180 | heatmap_mat_lower_bound[enc1_idx, enc2_idx] = lower_bound 181 | heatmap_mat_upper_bound[enc1_idx, enc2_idx] = upper_bound 182 | 183 | print(heatmap_mat_lower_bound) 184 | if args.names is not None: 185 | names = args.names.strip().split(",") 186 | assert len(names) == len(args.encodings) 187 | else: 188 | names = list(map(str, range(1, len(args.encodings) + 1))) 189 | 190 | # Calculate heatmap statistics / render figures. 191 | for heatmap_mat, heatmap_name in zip([heatmap_mat_lower_bound, heatmap_mat_upper_bound], 192 | ["lower_bound", "upper_bound"]): 193 | # Copy upper triangle of matrix to lower triangle. 194 | heatmap_mat[np.tril_indices(len(heatmap_mat), -1)] = \ 195 | heatmap_mat.T[np.tril_indices(len(heatmap_mat), -1)] 196 | 197 | np.fill_diagonal(heatmap_mat, 1.0) 198 | 199 | df = pd.DataFrame(heatmap_mat, index=names, columns=names) 200 | df.mean(axis=1).to_csv("averages_%s.csv" % heatmap_name) 201 | df.to_csv("heatmap_%s.csv" % heatmap_name) 202 | 203 | # Only plot lower triangle. 
204 | mask = np.zeros_like(heatmap_mat, dtype=bool) 205 | mask[np.triu_indices_from(mask, 1)] = True 206 | fig = plt.figure(figsize=(6, 5)) 207 | sns.heatmap(data=df, annot=True, square=True, mask=mask) 208 | plt.xticks(weight="bold") 209 | plt.yticks(rotation=0, weight="bold") 210 | plt.tight_layout() 211 | fig.savefig("heatmap_%s.svg" % heatmap_name) 212 | 213 | 214 | if __name__ == '__main__': 215 | p = ArgumentParser() 216 | p.add_argument("encodings", nargs="+") 217 | p.add_argument("--encoding_project", type=int) 218 | p.add_argument("--names") 219 | p.add_argument("--sentences_path") 220 | p.add_argument("-p", "--num_processes", default=1, type=int) 221 | main(p.parse_args()) 222 | -------------------------------------------------------------------------------- /src/learn_decoder.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | """ 3 | Learn a decoder mapping from functional imaging data to target model 4 | representations. 5 | """ 6 | from argparse import ArgumentParser 7 | from collections import defaultdict 8 | import itertools 9 | import logging 10 | from pathlib import Path 11 | import time 12 | 13 | import numpy as np 14 | import pandas as pd 15 | from sklearn.decomposition import PCA 16 | from sklearn.linear_model import Ridge 17 | from sklearn.metrics import mean_squared_error, r2_score 18 | from sklearn.model_selection import KFold, cross_val_predict, GridSearchCV 19 | import scipy.io 20 | from scipy.spatial import distance 21 | from tqdm import tqdm 22 | 23 | import util 24 | 25 | logging.basicConfig(level=logging.INFO) 26 | L = logging.getLogger(__name__) 27 | 28 | # Candidate ridge regression regularization parameters. 
29 | ALPHAS = [1, 1e-1, 1e-2, 1e-3, 1e-4, 1e-5, 1e-6, 1e1] 30 | 31 | 32 | def main(args): 33 | print(args) 34 | 35 | sentences = util.load_sentences(args.sentences_path) 36 | encodings = util.load_encodings(args.encoding_paths, project=args.encoding_project) 37 | encodings_normed = encodings / np.linalg.norm(encodings, axis=1, keepdims=True) 38 | 39 | assert len(encodings) == len(sentences) 40 | 41 | ######### Prepare to process subject. 42 | 43 | # Load subject data. 44 | subject = args.subject_name or args.brain_path.name 45 | L.info("Loading subject %s data.", subject) 46 | subject_images = util.load_brain_data(str(args.brain_path / args.mat_name), 47 | project=args.image_project) 48 | assert len(subject_images) == len(sentences) 49 | 50 | ######### Prepare learning setup. 51 | 52 | # Track within-subject performance. 53 | metrics = pd.DataFrame(columns=["mse", "r2"]) 54 | 55 | # Prepare nested CV. 56 | # Inner CV is responsible for hyperparameter optimization; 57 | # outer CV is responsible for prediction. 58 | state = int(time.time()) 59 | inner_cv = KFold(n_splits=args.n_folds, shuffle=True, random_state=state) 60 | outer_cv = KFold(n_splits=args.n_folds, shuffle=True, random_state=state) 61 | 62 | # Final data prep: normalize. 63 | X = subject_images - subject_images.mean(axis=0) 64 | X = X / np.linalg.norm(X, axis=1, keepdims=True) 65 | Y = encodings - encodings.mean(axis=0) 66 | Y = Y / np.linalg.norm(Y, axis=1, keepdims=True) 67 | 68 | ######## Run learning. 69 | 70 | # Run inner CV. 71 | gs = GridSearchCV(Ridge(fit_intercept=False, normalize=False), 72 | {"alpha": ALPHAS}, cv=inner_cv, n_jobs=args.n_jobs, verbose=10) 73 | # Run outer CV. 74 | decoder_predictions = cross_val_predict(gs, X, Y, cv=outer_cv) 75 | 76 | ######### Evaluate. 77 | 78 | metrics.loc[subject, "mse"] = mean_squared_error(Y, decoder_predictions) 79 | metrics.loc[subject, "r2"] = r2_score(Y, decoder_predictions) 80 | 81 | # Rank evaluation. 
82 | _, rank_of_correct = util.eval_ranks(decoder_predictions, np.arange(len(decoder_predictions)), Y) 83 | rank_stats = pd.Series(rank_of_correct).agg(["mean", "median", "min", "max"]) 84 | metrics = metrics.join(pd.concat([rank_stats], keys=[subject]).unstack().rename(columns=lambda c: "rank_%s" % c)) 85 | 86 | ######### Save results. 87 | 88 | csv_path = "%s.csv" % args.out_prefix 89 | metrics.to_csv(csv_path) 90 | L.info("Wrote decoding results to %s" % csv_path) 91 | 92 | # Save per-sentence outputs. 93 | npy_path = "%s.pred.npy" % args.out_prefix 94 | np.save(npy_path, decoder_predictions) 95 | L.info("Wrote decoder predictions to %s" % npy_path) 96 | 97 | 98 | if __name__ == '__main__': 99 | p = ArgumentParser() 100 | 101 | p.add_argument("sentences_path", type=Path) 102 | p.add_argument("brain_path", type=Path) 103 | p.add_argument("encoding_paths", type=Path, nargs="+") 104 | p.add_argument("--encoding_project", type=int) 105 | p.add_argument("--image_project", type=int) 106 | p.add_argument("--n_folds", type=int, default=12) 107 | p.add_argument("--mat_name", default="examples_384sentences.mat") 108 | p.add_argument("--out_prefix", default="decoder_perf") 109 | p.add_argument("--subject_name", help="By default, basename of brain_path") 110 | p.add_argument("--n_jobs", type=int, default=1) 111 | 112 | main(p.parse_args()) 113 | -------------------------------------------------------------------------------- /src/nearest_neighbors.py: -------------------------------------------------------------------------------- 1 | from argparse import ArgumentParser 2 | from pathlib import Path 3 | 4 | import numpy as np 5 | from scipy.spatial import distance 6 | import util 7 | 8 | 9 | def eval_quant(encoding, metric="cosine"): 10 | # Compute pairwise cosine similarities. 
11 | similarities = 1 - distance.pdist(encoding, metric=metric) 12 | 13 | return similarities 14 | 15 | 16 | def main(args): 17 | sentences = util.load_sentences(args.sentences_path) 18 | encoding = np.load(args.encoding_path) 19 | 20 | if args.mode == "quant": 21 | eval_quant(encoding) 22 | elif args.mode == "qual": 23 | pass 24 | 25 | 26 | if __name__ == '__main__': 27 | p = ArgumentParser() 28 | p.add_argument("sentences_path", type=Path) 29 | p.add_argument("encoding_path") 30 | p.add_argument("--mode", choices=["quant", "qual"], default="quant") 31 | main(p.parse_args()) 32 | -------------------------------------------------------------------------------- /src/util.py: -------------------------------------------------------------------------------- 1 | """ 2 | Data analysis tools shared across scripts and notebooks. 3 | """ 4 | 5 | from collections import defaultdict 6 | import itertools 7 | import logging 8 | from pathlib import Path 9 | import re 10 | 11 | import matplotlib 12 | matplotlib.use("Agg", warn=False) 13 | import numpy as np 14 | import pandas as pd 15 | import seaborn as sns 16 | import scipy.io as io 17 | import scipy.stats as st 18 | from sklearn.decomposition import PCA 19 | from tqdm import tqdm 20 | 21 | L = logging.getLogger(__name__) 22 | 23 | 24 | def load_sentences(sentence_path="data/sentences/stimuli_384sentences.txt"): 25 | with open(sentence_path, "r") as f: 26 | sentences = [line.strip() for line in f] 27 | return sentences 28 | 29 | 30 | def load_encodings(paths, project=None): 31 | encodings = [] 32 | for encoding_path in paths: 33 | encodings_i = np.load(encoding_path) 34 | L.info("%s: Loaded encodings of size %s.", encoding_path, encodings_i.shape) 35 | 36 | if project is not None: 37 | L.info("Projecting encodings to dimension %i with PCA", project) 38 | 39 | if encodings_i.shape[1] < project: 40 | L.warn("Encodings are already below requested dimensionality: %i < %i" 41 | % (encodings_i.shape[1], project)) 42 | else: 43 | pca = PCA(project).fit(encodings_i) 
44 | L.info("PCA explained variance: %f", sum(pca.explained_variance_ratio_) * 100) 45 | encodings_i = pca.transform(encodings_i) 46 | 47 | encodings.append(encodings_i) 48 | 49 | encodings = np.concatenate(encodings, axis=1) 50 | return encodings 51 | 52 | 53 | def load_brain_data(path, project=None): 54 | subject_data = io.loadmat(path) 55 | subject_images = subject_data["examples"] 56 | if project is not None: 57 | L.info("Projecting brain images to dimension %i with PCA", project) 58 | if subject_images.shape[1] < project: 59 | L.warn("Images are already below requested dimensionality: %i < %i" 60 | % (subject_images.shape[1], project)) 61 | else: 62 | pca = PCA(project).fit(subject_images) 63 | L.info("PCA explained variance: %f", sum(pca.explained_variance_ratio_) * 100) 64 | subject_images = pca.transform(subject_images) 65 | 66 | return subject_images 67 | 68 | 69 | def load_decoding_perfs(results_dir): 70 | """ 71 | Load and render a DataFrame describing decoding performance across models, 72 | model runs, and subjects. 
73 | 74 | Args: 75 | results_dir: path to pipeline decoder output directory 76 | """ 77 | 78 | results = {} 79 | result_keys = ["model", "run", "step", "subject"] 80 | for csv in tqdm(list(Path(results_dir).glob("**/*.csv")), 81 | desc="Loading perf files"): 82 | key = get_decoder_id(csv.parent.name) 83 | try: 84 | df = pd.read_csv(csv, usecols=["mse", "r2", 85 | "rank_median", "rank_mean", 86 | "rank_min", "rank_max"]) 87 | except ValueError: 88 | continue 89 | 90 | results[key] = df 91 | 92 | if len(results) == 0: 93 | raise ValueError("No valid csv outputs found.") 94 | 95 | ret = pd.concat(results, names=result_keys) 96 | # drop irrelevant CSV row ID level 97 | ret.index = ret.index.droplevel(-1) 98 | return ret 99 | 100 | 101 | def load_decoding_preds(results_dir, glob_prefix=None): 102 | """ 103 | Load decoder predictions into a dictionary organized by decoder properties: 104 | decoder target model, target model run, target model run training step, 105 | and source subject image. 106 | """ 107 | decoder_re = re.compile(r"\.(\w+)-run(\d+)-(\d+)-([\w\d]+)\.pred\.npy$") 108 | 109 | results = {} 110 | for npy in tqdm(list(Path(results_dir).glob("%s*.pred.npy" % (glob_prefix or ""))), 111 | desc="Loading prediction files"): 112 | model, run, step, subject = decoder_re.findall(npy.name)[0] 113 | results[model, int(run), int(step), subject] = np.load(npy) 114 | 115 | if len(results) == 0: 116 | raise ValueError("No valid npy pred files found.") 117 | 118 | return results 119 | 120 | 121 | def get_encoding_ckpt_id(encoding_dir): 122 | """ 123 | Get information about a model encoding from its output directory name. 
124 | """ 125 | encoding_dir = encoding_dir.name if isinstance(encoding_dir, Path) else encoding_dir 126 | try: 127 | model, run, step = re.findall(r"^([\w_]+)-(\d+)-(\d+)$", encoding_dir)[0] 128 | except IndexError: 129 | raise ValueError("Failed to extract checkpoint information from encoding directory %s" % encoding_dir) 130 | 131 | return model, int(run), int(step) 132 | 133 | 134 | def get_decoder_id(decoder_dir): 135 | """ 136 | Get information about a learned decoder from its output directory name. 137 | """ 138 | decoder_dir = decoder_dir.name if isinstance(decoder_dir, Path) else decoder_dir 139 | model, run, step, subject = re.findall(r"^([\w_]+)-(\d+)-(\d+)-([\w\d]+)$", decoder_dir)[0] 140 | return model, int(run), int(step), subject 141 | 142 | 143 | def eval_ranks(Y_pred, idxs, encodings, encodings_normed=True): 144 | """ 145 | Run a rank evaluation on predicted encodings `Y_pred` with dataset indices 146 | `idxs`. 147 | 148 | Args: 149 | Y_pred: `N_test * n_dim`-matrix of predicted encodings for some 150 | `N_test`-subset of sentences 151 | idxs: `N_test`-length array of dataset indices generating each of `Y_pred` 152 | encodings: `M * n_dim`-matrix of dataset encodings. The perfect decoder 153 | would predict `Y_pred == encodings[idxs]`. 154 | 155 | Returns: 156 | ranks: `N_test * M` integer matrix. Each row specifies a 157 | ranking over sentences computed using the decoding model, given the 158 | brain image corresponding to each entry of `idxs`. 159 | rank_of_correct: `N_test` array indicating the rank of the target 160 | concept for each test input. 161 | """ 162 | N_test = len(Y_pred) 163 | assert N_test == len(idxs) 164 | 165 | # TODO implicitly coupled to decoder normalization -- best to factor this 166 | # out! 
167 | if encodings_normed: 168 | Y_pred -= Y_pred.mean(axis=0) 169 | Y_pred /= np.linalg.norm(Y_pred, axis=1, keepdims=True) 170 | 171 | # For each Y_pred, evaluate rank of corresponding Y_test example among the 172 | # entire collection of Ys (not just Y_test), where rank is established by 173 | # cosine distance. 174 | # n_Y_test * n_sentences 175 | similarities = np.dot(Y_pred, encodings.T) 176 | 177 | # Calculate distance ranks across rows. 178 | orders = (-similarities).argsort(axis=1) 179 | ranks = orders.argsort(axis=1) 180 | # Find the rank of the desired vectors. 181 | ranks_test = ranks[np.arange(len(idxs)), idxs] 182 | 183 | return ranks, ranks_test 184 | 185 | 186 | def wilcoxon_rank_preds(models, correct_bonferroni=True, pairs=None): 187 | """ 188 | Run Wilcoxon rank tests comparing the ranks of correct sentence representations in predictions 189 | from two or more models. 190 | """ 191 | if pairs is None: 192 | pairs = list(itertools.combinations(models.keys(), 2)) 193 | 194 | model_preds = {model: pd.read_csv("perf.384sentences.%s.pred.csv" % path).sort_index() 195 | for model, path in models.items()} 196 | 197 | subjects = next(iter(model_preds.values())).subject.unique() 198 | 199 | results = {} 200 | for model1, model2 in pairs: 201 | m1_preds, m2_preds = model_preds[model1], model_preds[model2] 202 | m_preds = m1_preds.join(m2_preds["rank"], rsuffix="_m2") 203 | pair_results = m_preds.groupby("subject").apply(lambda xs: st.wilcoxon(xs["rank"], xs["rank_m2"])) \ 204 | .apply(lambda ys: pd.Series(ys, index=("w_stat", "p_val"))) 205 | 206 | results[model1, model2] = pair_results 207 | 208 | results = pd.concat(results, names=["model1", "model2"]).sort_index() 209 | 210 | if correct_bonferroni: 211 | correction = len(results) 212 | print(0.01 / correction, len(results)) 213 | results["p_val_corrected"] = results.p_val * correction 214 | 215 | return results 216 | 217 | 218 | def load_bert_finetune_metadata(savedir, checkpoint_step=None): 219 | """ 
220 | Load metadata for an instance of a finetuned BERT model. 221 | """ 222 | savedir = Path(savedir) 223 | 224 | import tensorflow as tf 225 | from tensorflow.python.pywrap_tensorflow import NewCheckpointReader 226 | try: 227 | ckpt = NewCheckpointReader(str(savedir / "model.ckpt")) 228 | except tf.errors.NotFoundError: 229 | if checkpoint_step is None: 230 | raise 231 | ckpt = NewCheckpointReader(str(savedir / ("model.ckpt-step%i" % checkpoint_step))) 232 | 233 | ret = {} 234 | try: 235 | ret["global_steps"] = ckpt.get_tensor("global_step") 236 | ret["output_dims"] = ckpt.get_tensor("output_bias").shape[0] 237 | except tf.errors.NotFoundError: 238 | ret.setdefault("global_steps", np.nan) 239 | ret.setdefault("output_dims", np.nan) 240 | 241 | ret["steps"] = defaultdict(dict) 242 | 243 | # Load training events data. 244 | try: 245 | events_file = next(savedir.glob("events.*")) 246 | except StopIteration: 247 | # no events data -- skip 248 | print("Missing training events file in savedir:", savedir) 249 | pass 250 | else: 251 | total_global_norm = 0. 252 | first_loss, cur_loss = None, None 253 | tags = set() 254 | for e in tf.train.summary_iterator(str(events_file)): 255 | for v in e.summary.value: 256 | tags.add(v.tag) 257 | if v.tag == "grads/global_norm": 258 | total_global_norm += v.simple_value 259 | elif v.tag in ["loss_1", "loss"]: 260 | # SQuAD output stores loss in `loss` key; 261 | # classifier stores in `loss_1` key. 262 | 263 | if e.step == 1: 264 | first_loss = v.simple_value 265 | cur_loss = v.simple_value 266 | 267 | if checkpoint_step is None or e.step == checkpoint_step: 268 | ret["steps"][e.step].update({ 269 | "total_global_norms": total_global_norm, 270 | "train_loss": cur_loss, 271 | "train_loss_norm": cur_loss / ret["output_dims"] 272 | }) 273 | 274 | ret["first_train_loss"] = first_loss 275 | ret["first_train_loss_norm"] = first_loss / ret["output_dims"] 276 | 277 | # Load eval events data. 
278 | try: 279 | eval_events_file = next(savedir.glob("eval/events.*")) 280 | except StopIteration: 281 | # no eval events data -- skip 282 | print("Missing eval events data in savedir:", savedir) 283 | pass 284 | else: 285 | tags = set() 286 | eval_loss, eval_accuracy = None, None 287 | for e in tf.train.summary_iterator(str(eval_events_file)): 288 | for v in e.summary.value: 289 | tags.add(v.tag) 290 | if v.tag == "eval_loss": 291 | eval_loss = v.simple_value 292 | elif v.tag == "eval_accuracy": 293 | eval_accuracy = v.simple_value 294 | elif v.tag == "masked_lm_accuracy": 295 | eval_accuracy = v.simple_value 296 | 297 | if checkpoint_step is None or e.step == checkpoint_step: 298 | ret["steps"][e.step].update({ 299 | "eval_accuracy": eval_accuracy, 300 | "eval_loss": eval_loss, 301 | }) 302 | 303 | return ret 304 | -------------------------------------------------------------------------------- /structural-probes/en_ewt-ud/en_ewt-ud-test.txt: -------------------------------------------------------------------------------- 1 | AMAZING 2 | By the way we now have a " forum " in the post link 3 | Also , I have an extra ticket for the Comets game on Sat. you said you wanted to go ? 4 | Obviously , he should have been arrested and jailed - imagine making a statue or painting on the same subject as another artist - clearly insulting and disrespecting - wasted time that could have been used making a statue with clothes on . 5 | Possibly a freshwater tank with a ton of different species in there . 6 | Any particular shop that you know of AND their number . 7 | KK 8 | image_jpg_part 9 | " NASA then plans to develop a new 100 - metric - ton - class launch vehicle derived from existing capabilities with the space shuttle external tanks and solid rocket boosters for future missions to the moon , " the letter said . 10 | Thank you though . 11 | There is no lower rating for Noonan 's Liquor , owners and employees . 12 | I shoot a t1i and have n't had such an issue . 
13 | What should I do ? 14 | This place is great ! 15 | Sanctuary is amazing ! 16 | Please note that neither the e-mail address nor name of the sender have been verified . 17 | email them at address below 18 | They also had a special connection to some extremists in Jordan and Germany . 19 | It is a place in Argentina lol 20 | good outside , bad inside 21 | Ideally , we would like a fast turnaround . 22 | Worst experience ever like a sardine can and the bartender downstairs is the rudest person I have ever met . 23 | " ... to gaze at Wei 's art is like entering a floating world of dreams . " 24 | Al Arfsten 713 965 2158 25 | - Ram Tackett ( E-mail ) .vcf 4222 26 | So does anyone know what the difference is ? 27 | We 're at war with Islamic fascists . 28 | Umar Islam , 28 , ( born Brian Young ) High Wycombe 29 | yeah 30 | Rich was here before the scheduled time . 31 | I have no complaints about the service I received . 32 | Vince : 33 | ( Space.com ) 34 | Hi David : 35 | WHAT A GREAT DEAL THANK YOU 36 | i am sure i could have persuaded you to give me some action . 37 | You company and services will be recommended by us to everyone . " 38 | Airfare alone will be incredibly expensive so make sure you have the money and of course free time to take your time and have a great time . 39 | =) 40 | Looking for something on the casual side and we want it to be fun . 41 | I think this location is no longer in business . 42 | The sushi is great , and they have a great selection . 43 | The correction to the working gas includes TWO corrections . 44 | i know you remember the bet . 45 | Has that gone anywhere or are the other possibilities you had better ? 46 | By using the word " Islamic " as an adjective Bush was purposely not associating Muslims with fascism , hence the qualifier . 47 | Email : n3td3v < xploita...@gmail.com > 48 | - Joe Namath 49 | they are great dogs . 50 | They were playing Captain Ahab to Saddam 's great white whale . 51 | We have this report ? 
52 | I 'm not driving tonite , but I bet that we could hitch a ride back with Anil . 53 | exelent Job 54 | After this weekend , we will no longer have access to the estate files , these people will be able to help you with any of your questions . 55 | Susan : 56 | Deep tissue massage helps with pain in neck and shoulders 57 | Fast and friendly service , they know my order when I walk in the door ! 58 | You will always find fascinating links at : Extreme Web Surfs http://extremewebsurfs.blogspot.com/ ( nice urban wildlife post today ) & Me and the Web http://maartenvt.blogspot.com/ Arts , History , Animals , Music , Games , Politics , Technology , Fun and more ! 59 | A big rally then took the Dow ( unconfirmed ) to a record high of 1051 in January of '73 , turning everyone bullish . 60 | weather in december in Tremblant ? 61 | AEP $ 19,250,000 $ 38,750,000 62 | Well , he launched today . 63 | that deal is making like it wants to close and the traders scheduled a 430 call to wrap it up 64 | Thank you (: 65 | Food is often expired so check the dates every time ! 66 | Sent by : Janette Elbertson 67 | i had a blast that night . 68 | Thanks , 69 | The employees do n't really seem to enjoy what they are doing and it shows . 70 | - Winston Churchill 71 | ( Most Salafis are not militant or violent , though they tend to be rather narrow - minded in my experience , on the order of Protestant Pietists ) . 72 | Mine does . 73 | This place had the worst tasting pizza I have ever tasted it was possible the worst food I 've ever eaten . 74 | a staple ! 75 | Yes . 76 | I had a rose named after me and I was very flattered . 77 | - Socrates 78 | Got to love this place . 79 | I have never hated a man enough to give his diamonds back . 80 | you guys have any job opening for ex natural gas traders that made their now almost defunctc ompany over 40 million in the last two years ? 81 | The owner was very friendly and helpful . 82 | Geoff , 83 | ( ZD Net ) 84 | Not enough seating . 
85 | It 's impossible to understand how this place has survived . 86 | They specialize in financial institutions , medical , and retail projects . 87 | Do You Yahoo! ? 88 | Thank you . 89 | Expect either undercooked or mushy food and lackluster service . 90 | Michael helped shoot the majority of my firm 's website and we could not have been happier . 91 | Of THESE three , it 's a toss - up between Royal and Carnival . 92 | We got upgraded to a corner suite ! 93 | I will not be there at 7:30 , but will see you arond 9:30 on Tuesday . 94 | My wife and I would love for you to come and visit our page 95 | DO Nt Go here 96 | Job Title : Attorney 97 | Poor Taste 98 | Martin , 99 | Any information would help . 100 | Did you have a chance to take a look at the resume I sent you ? 101 | Money ca n't buy you happiness ... but it does bring you a more pleasant form of misery . 102 | " Inhibitory systems are essential for controlling the pattern of activity in the cortex , which has important implications for the mechanisms of cortical operation , according to a Yale School of Medicine study in Neuron .... 103 | extend , and if you want to end it , you just say bye , 104 | Twinkle Twinkle lazy star Kitna soyega uthja yaar , up above the world so high , sun has risen in the sky , uthke jaldi pee le chai , then call me up and say " HI " 105 | SCALIA filed a dissenting opinion , in which THOMAS and ALITO joined . 106 | An Hour Of Prego Bliss ! 107 | In any event , my presentation should give you a starting point . 108 | One can suspect the Iranian Government . 109 | Her flexibility and accessibility made for an easy closing . 110 | He is now lecturing in USA . 111 | Totals $ 22,750,000 $ 40,000,000 112 | Good Morning * * 113 | Our client is a small law firm that is looking for an individual to join their team handling toxic tort with some minor PI defense . 114 | Very professional and great results . 115 | I wish I had the capital to open my own shop ? 
116 | Zakaria Amara , 20 , Mississauga , Ont. ; 117 | Sent by : Janette Elbertson 118 | If you enjoy amazing things , you must go to World 's Finest Donair . 119 | it s cheap and it s good ! 120 | you must be thinking of someone else . 121 | At that time , the gurus and geniuses of Wall Street were predicting a 250 Dow and many were talking openly about the end of capitalism . 122 | If you do not wish to receive such e-mails in the future or want to know more about the BBC 's Email a Friend service , please read our frequently asked questions . 123 | By the time a man is wise enough to watch his step , he 's too old to go anywhere . 124 | Give me a few days , and I 'll be in touch . 125 | Thank you for your help in tracking these invoices . 126 | Stephanie 127 | Look on the debenhams website 128 | In the world of " Wei 's magic cubes , " all seem to be ingeniously planned and tricky . 129 | Linda 130 | I 'll search him out before class or after that break and see if I can set it up . 131 | I had no problem with my delivery . 132 | And a portion of each package or memorial purchased goes to a charity on their database . 133 | I am not sure how reliable that is , though . 134 | A real pleasure training with Natasha . 135 | prime ribs have very good food but it s super expensive 136 | I heard that more may be going up for sale in the next month or do . 137 | Great job on my roof and the pricing was fair . 138 | in n out of the chicago area ? 139 | I own a property management firm and need a contractor with the credentials that Farrell Electric has . 140 | Friendly service . 141 | EY4096.1 PERFORMANCE 01-Feb-02 P 6,363,217 - $ 250,393 142 | Thanks . 143 | < http://www.bbc.co.uk/dailyemail/ > 144 | Did a great job of removing my tree in Conyers . 145 | why are there two statues of David ? 146 | Horrible ! 147 | It was all sorted with no hassle at all and I 'm really grateful - they were fab . 148 | Thanks 149 | Women 's rain coat ... where can I find one ? 
150 | I want to go to the cafeteria for vegetables . 151 | The President has also said he would like to see Israel wiped off the map which he could n't even begin to try without nuclear weapons . 152 | 732-657-3416 153 | Why not put together a bottle of champagne , a picnic and have a date on Treasure Island . 154 | Thanks ! 155 | Vince : 156 | There are deals in the Aruba book so I 'm not sure why you are n't picking those up . 157 | yep 158 | So delightful . 159 | Love Hop City 160 | Vince 161 | Not me sorry . 162 | The answer is , " Yes ! " 163 | These have been sold . 164 | I ca nt find any information about it 165 | But will diplomacy work ? 166 | really , i have no idea what you 're talking about . 167 | Google is probably making this move to counter Microsoft Search using Encarta ( it's online dictionary ) . 168 | I am bringing two of my girlfriends from LJ . 169 | Cast . 170 | It sounds like a firmware issue and the camera requires a re-boot just like what happens in a computer - needs a re-start from time to time but it should n't be happening in a camera . 171 | the bast cab in minneapolis 172 | Thanks 173 | Overpriced and the doctor acted arrogant and rushed at a time when there was very few clients in the facility . 174 | We would like to thank our emergency plumbers who visted our shop in Morningside Road today . 175 | hahaha 176 | And international donors have given only half of the relief aid that Darfur needs , according to the local UN officials . 177 | Nacho Libre is suppose to be inspired in Mexicans , not in Argentineans . 178 | And there 's nothing distinctly Irish about them . 179 | why is enron blowing up ? 180 | do n't forget to use a calcium supplement twice a week ; captive reptiles are prone to calcium deficiency . 181 | Highly recommended landscaper !!! 182 | These guys really know their stuff .. they have almost anything you could want in terms of spy and surviellance equipment . 183 | CLH 184 | It taste better than In and Out .... 
185 | Name of specific Hibachi restaurant in Chicago ? 186 | Syria has agreed to withdraw under the conditions set forth in UNSC Resolution 1559 , which has already begun . 187 | Good selection . 188 | Or how about visiting the Chicago Botanical Gardens and see the change of colors and enjoy the air , They also have many inside exhibits you might enjoy , food is pretty good to . 189 | I know saltwater is a possibility , can you give me a possible stocking option for that too ? 190 | Best Limo Limousine service in all of Dallas 191 | They were from all accounts marginalized and not listened to . 192 | I highly recommend Garage Pros to my friends . 193 | ------ 194 | M 195 | the attitude of some staff is terrible , did not solve anything only say i can do nothing . 196 | Or more if you have drinks . 197 | Rodgers 198 | never response the phone call 199 | Sara , 200 | I would n't want any other company in my time of need . 201 | that is how i want you to refer to me as " the king " 202 | Shamin Mohammed Uddin , 36 , Stoke Newington 203 | Compact 's Corona dryers remove at least twice as much water as the previous dryers , allowing a production increase of over 10 % and a significant energy saving . 204 | " Thank you so much for the superior job well done . 205 | spanish 206 | Fast Service Called them one hour ago and they just left my house five minutes ago . 207 | hopefully she does n't hose you . 208 | ------------------------------ 209 | I have a friend out in Chicago this week , and I am trying to remember the name of an awesome hibachi style restaurant i visited while out there a couple years ago . 210 | Please let us know if you need any additional information . 211 | The waiting staff is really friendly , it s like every one knows each other , the manager is really sweet and the food .. well no complaints from me . 212 | I met you at the Risk conference last week in Houston . 213 | did anyone have this issue ? 214 | I 'm not sure how the market will react . 
215 | That 's because of the buffer that holds the data until it 's ready to be recorded to the memory card . 216 | someplace that is like $ 30 an entree . 217 | He is at his best when he is doing his Nerd impression ... 218 | Absoul is the greatest donair man on the planet . 219 | how fare of kolkatta ? 220 | I have a Kodak Camera ( 10.2 Megapixels ) ... Kodak AF 5x OPTICAL LENS ... how do I pause it while recording ? 221 | Let aggressive ( American ) leaders and soldiers know that we are capable of protecting the city 's security and safety , and ask them to lift their hands from the city . " 222 | EY4108.I PERFORMANCE 01-Feb-02 P - 10,274,494 - $ 166,960 223 | sorry again 224 | No problem . 225 | I had to go to the BBC for this report . 226 | My canon t2i stops working at times as in the power bottom is switched to " on " but the camera does not respond to any function . 227 | I enjoyed working with you and wish you the best of everything . 228 | I hope that this would mean that you would remain involved at some level . 229 | Cafeteria is fine . 230 | Friendly service . 231 | it is a cute little nice and quiet library 232 | Fresh and unic ! 233 | Chris Abel Manager , Risk Controls Global Risk Operations chris.abel@enron.com < mailto:chris.abel@enron.com > 713.853.3102 234 | Bike ride in the park , followed by coffee . 235 | We still have the traders and books that you provided last week , but need to know if there are any changes to this . 236 | ------ 237 | UN Secretary - General Kofi Annan has indicated it is time to " recognize Hezbollah " after easily being duped by " the message on the placards they are using " . 238 | - Bob Hope 239 | Michael Olsen@ENRON 240 | Greg Couch will be taking over the responsibility for the estate risk group and will be able to assist you with your requests going forward . 241 | Tracy , Do we have concerns here . 242 | Can you use the ' find my phone ' feature to track someone else 's phone ? 
243 | What do you think of Air France ? 244 | ALL OF THE TEACHERS THERE ARE SO MEAN THEY GET MAD AT YOU FOR NOTHING !!!!!!!!!!!!!!!!!!! 245 | When the French returned to Indochina at the end of WW II the Viet Minh were in control of the Red River Delta . 246 | They sell these kits in most hobby and craft stores . 247 | - Jimmy Durante 248 | The gods were n't with us on that one . 249 | I 'm more than happy to help people with the site or answer any questions about Action Network - just drop me a message . 250 | While Tanya is reviewing credit , can you please send a " blank form Paragraph 13 " for this master . 251 | Will use again in the future . 252 | Where can I go on a first date ( adults ) ? 253 | While our established schedules of Tuesday and Friday DPR's would have us reporting tomorrow 's activity on Monday , we will change that for month end . 254 | ??? 255 | We honestly can not think of even 1 thing we did n't like ! 256 | Say after I finished those 2 years and I found a job . 257 | http://www.google.com/search?hl=en&rlz=1G1GGLQ_ENUS359&q=gunther+uecker+biography&gs_sm=c&gs_upl=484l484l0l5093l1l1l0l0l0l0l328l328l3-1l1l0&um=1&ie=UTF-8&tbm=isch&source=og&sa=N&tab=wi&biw=1221&bih=756&sei=XVG5TqrrGoXK2AXG0ry-Bw#um=1&hl=en&rlz=1G1GGLQ_ENUS359&tbm=isch&sa=1&q=gunther+uecker+artist&pbx=1&oq=gunther+uecker+&aq=1S&aqi=g1g-S3&aql=&gs_sm=c&gs_upl=10219l10219l0l13797l1l1l0l0l0l0l125l125l0.1l1l0&bav=on.2,or.r_gc.r_pw.,cf.osb&fp=13ddcdc64cbf5fd&biw=1221&bih=756 258 | I just wanted to check with you regarding the consulting arrangement we discussed a couple of weeks ago . 259 | Should I be concerned ? 260 | ( Space.com ) 261 | Onion Rings are great and the fries are endless . 262 | I think it was in the Lincoln Square area but do n't quote me on that . 263 | I just wanted to follow up on whether you will have a chance to send a draft Credit Support Annex ( similar in form to the one previously executed with ENA ) . 264 | like what ? 
265 | I need something reliable and good looking . 266 | Do you prefer ham , bacon or sausages with your breakfast ? 267 | EY4106.6 PERFORMANCE 01-Feb-02 P - 1,993,045 - $ 32,387 268 | Pam the Pom 269 | The credit guys are currently assuming that there is no correlation and may consequently be double dipping the credit reserve on this basis too . 270 | I 'm considering taking a job with Steiner and noticed I have to pay for all my travel . 271 | Elizabeth 36349 272 | But there is no proof . 273 | Amazing customer service 274 | This is the very best in the Gables . 275 | Bay of Plenty - Are you even old enough to vote ? 276 | Michael L. Beatty & Associates , PC ; # 10461 , # 10469 & # 10468 dated 5/28/00 . 277 | Martin 278 | - Groucho Marx 279 | - Herbert Henry Asquith 280 | Yasim Abdi Mohamed , 24 , Kingston ; 281 | No minimum order amount . 282 | U.S. officials have said the plot , thwarted by Britain , to blow up several aircraft over the Atlantic bore many of the hallmarks of al Qaeda . 283 | Which is why he did n't say we 're at war with Islamic people . 284 | Hi Kevin , 285 | Many thanks from myself and all of our wedding guests ! 286 | - Phyllis Diller 287 | We just did a deal for the rest of the month for 10,000 / d at meter # 1552 QE - 1 @ $ 4.355 .... can you let me and Robert Lloyd know what the sitara # is ? 288 | Clewlow / Strickland book is out . 289 | She makes every item fit you perfectly . 290 | Vince 291 | Has another store in the st. charles mall . 292 | As we discussed , here is a copy of the draft memo . 293 | Thanks a lot . 294 | Vincent , 295 | Further to your call attached is a presentation I gave at the Canadian Risk Managers Conference in Edmonton in the fall of 2000 . 296 | houston wo n't be too affected b/c most of the layoffs affect satelite offices . 297 | The people at Gulf Coast Siding were very easy and clear to work with . 
298 | ** and i can upload my pictures and videos on the computer ( facebook ) 299 | No , but I do believe some Koreans reside in the country of HA - ha ! 300 | i always thought there s no custom charges for gifts . 301 | Swetha 302 | A Look at Slogans - http://www.small-business-software.net/look-at-slogans.htm 303 | Excellent energy efficiency 304 | wow wow wow . 305 | Pretty spendy for really not great quality 306 | The Lunar Transporatation Systems ( LTS ) is actually being funded by two space businessmen , Walter Kistler and Bob Citron . 307 | Nearby what ? 308 | i am in portland . 309 | Click here To view it . 310 | I 'll ask around ? 311 | Jeff 312 | if i preorder a game at gamestop can someone else pick it up for me ? 313 | mazzoni 's deli best italian food in phila pa ? 314 | Regards 315 | regards , 316 | Google is a nice search engine . 317 | I 've been a regular customer at this store since it opened , and love the fact that all of the employees are friendly locals . 318 | Frank 319 | Bush is in Santiago for the annual Asia - Pacific Economic Cooperation ( APEC ) leaders meeting . 320 | The IIP had also been the main force urging Sunni Arabs to participate in the elections scheduled for January , and had been opposed in this stance by the Association of Muslim Scholars . 321 | Let me join the chorus of annoyance over Google 's new toolbar , which , as noted in the linked article , commits just about every sin an online marketer could commit , and makes up a few new ones besides . 322 | Here is a revised draft of the CDWR risk memo . 323 | He listens and is excellent in diagnosing , addressing and explaining the specific issues and suggesting exercises to use . 324 | Compensation : $ 60000 - 70000 325 | Does 5 make a chain ? 326 | Expensive for the level of food and the quality of service . 327 | Farrell Electric is a very good electrical contractor . 328 | What 's going on with the UBS weather position ? 329 | You also need Pakistani air space . 
330 | Not so good 331 | The clerics demanded talks with local US commanders . 332 | Well there s Mc. Donald s , Taco Bell , Burger King ..... 333 | i.e . 334 | very reasonable prices . 335 | That is Flat Top Grill 336 | they recovered the pics geeksquad deleted . 337 | Listened to my problem and took care of it . 338 | Has anyone ever worked for steiner leisure cruises ? 339 | It could notionally be expanded to encompass the 5,000 - strong " 55th Brigade " of the Taliban regime , though this is not the technical definition . 340 | 01/24/2001 03:51 PM 341 | Thank you 342 | I 'll be back on Monday . 343 | ------ 344 | FYI , 345 | The finest German bedding and linens store . 346 | Please start using the ENA DPR 0102 file rather than the EWS DPR 2002 file to send to Chris . 347 | Best fried shrimp in the state ! 348 | When I tried to return it they refused , so I had to leave without a refund and still hungry . 349 | 732-657-3416 350 | It looks like The Lunar Transportation Systems , Inc. is visualizing a " space highway " going from the moon to Earth ( and back again ) . 351 | They basically buy daily deals from Groupon , Living Social , and all sorts of other places . 352 | There must be a better mexican place in Rockland . 353 | Susan 354 | Average food and deathly slow service 355 | Hope you will be sorted . 356 | ** 357 | sounds exciting . 358 | Al , 359 | Ifunny.com 360 | I was wondering if you could give me some references regarding the calculation of correlation coefficients from a GARCH model . 361 | Get great service , fantastic menu , and relax . 362 | My nails looked great for the better part of 2 weeks ! 363 | I 've looked and looked , but can not find one anywhere ! 364 | As in the old days , varnish is often used as a protective film against years of dirt , grease , smoke , etc . 365 | Delhi police chief K K Paul named the man as Tariq Ahmed Dar , and said police were hunting for four accomplices . 
366 | Thanks - 367 | Walgreens on University 368 | looking for a surprise spot to take my bf . 369 | * ... * 370 | However , the request below is to " replenish " the CASH that was drawn down ... please advise . 371 | Prosperity POS makes the best pos systems . 372 | have a look at sony wx10 373 | Microsoft is 4 - 0 ( they took down Netscape , Suns Systems , MAC and IBM ) and Google may be their next target . 374 | Magali Van Belle Consultant PHB Hagler Bailly MANAGEMENT AND ECONOMIC CONSULTANTS PHB Hagler Bailly , Inc. ( 202 ) 828-3933 direct dial 1776 Eye Street , N.W. ( 202 ) 296-3858 facsimile Washington , D.C. 20006-3700 mvanbell@haglerbailly.com e-mail 375 | The deals are listed below . 376 | http://reflectioncafe.blogspot.com/2005/09/unnatural-disasterthe-less... 377 | Kristen , 378 | Was wondering if anyone knew a rough estimate of how much it costs with travel and training 379 | I 'll post highlights from the opinion and dissents when I 'm finished . 380 | Tanvir Hussain , 24 , London E10 381 | i got her number though . 382 | - REDLINE GPSA Guaranty.doc 383 | PS - we also have more cats coming in for re-homing see our ' Homes Wanted ' page 384 | You have to see these slides .... they are amazing . 385 | Very professional , talented , unic and fresh work . 386 | hard to forgive such an awful margarita and steep prices but the food can be good 387 | Richard Harper and Mary Nell Browning may have some ideas here but I am sure you 've already gone through it with them . 388 | I 've tried bland white rice but he wo nt eat anything . 389 | Kam 390 | Your suggestion to introduce the concept discussed with one of the Lays is welcomed . 391 | The decision to sidestep the obvious to satisfy the need to avoid confrontation does not bring peace , but only delays the eventual conflict as the predators of Hamas and Hezbollah exploit the inherent weakness of the internationals and the media . 392 | The best pilates on the Gold Coast ! 
393 | I 've never kept cichlids though . 394 | I do see myself as a conservative 395 | I 'm working hard for you ! 396 | THE TEACHING THERE SUCKS !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 397 | But my resturant is way better than all of them ... and it 's quite close . 398 | Great job ! 399 | I need creative art ideas ? 400 | Wei 's Magic Cubes 401 | I have used Bright Futures for the last 7 years . 402 | I do n't know how much it will help however . 403 | Yes , you must pay customs duties . 404 | Zarqawi is a Jordanian , and his Monotheism and Holy War group in Afghanistan probably had a distinctive coloration as mainly Jordanian , Palestinian and Syrian . 405 | I love Air France ! 406 | Did you get in as much trouble as I did this weekend - Lori seems to think I need to get help - I told her it 's normal to drink all day at a bar , then go to dinner where you do n't know half the people there and proceed to get extremely fucked up 407 | I love them !!! 408 | Great work ! 409 | Unfortunately , I will be in Plano that weekend . 410 | All - you can - eat style deal . 411 | Any of the tip - top places have great ice - cream , get them to mix it up . 412 | I felt as if I was in an over priced Olive Garden . 413 | The company gets busy but you never have to wait long because they ARE orginizied , so you are in , out , and paid well for your scrap 414 | Besides eating good foods , what else do people do in Miramar ? 415 | Staten Island Computers 416 | Musharraf told Clinton he could n't use Pakistani soil or air space to send the team in against Bin Laden . 417 | I wo n't return . 418 | Daren 419 | Well , they have a variety of sports that they play like basketball , soccer , etc . 420 | I should have asked for a jury . 421 | " We believe this is an ill - advised term and we believe that it is counterproductive to associate Islam or Muslims with fascism , " said Nihad Awad , executive director of the Council on American - Islamic Relations advocacy group . 
422 | I have been using Steele Electric for years . 423 | Melissa , 424 | Along with an area on the page for those to join the mailing list . 425 | -----== Over 100,000 Newsgroups - 19 Different Servers ! =----- 426 | Thanks 427 | Bush came out today and said that if he had known what was coming , he would have expended every effort to stop it , and that so would have Clinton . 428 | Highly recomended . 429 | I doubt you will get a sensible answer in the " TRAVEL " section . 430 | thumbs down ??? 431 | Also very friendly and the stylists are not in the " been there / done that " mood ! 432 | Call me if you have time . 433 | I prefer Royal Caribbean out of all these . 434 | However , I did not find her very helpful and her receptionist was rude . 435 | yeah , i was thinking somewhere like mcdonald 's . 436 | 01/24/2001 11:21 AM 437 | It was pretty epic as I remember and would love to send my friend there . 438 | 42299 439 | Wolfowitz contradicted counter-terrorism czar Richard Clarke when the latter spoke of the al - Qaeda threat , insisting that the preeminent threat of terrorism against the US came from Iraq , and indicating he accepted Laurie Mylroie 's crackpot conspiracy theory that Saddam was behind the 1993 World Trade Towers bombing . 440 | I would highly recommend Landscape by Hiro . 441 | WASHINGTON ( Reuters ) - 442 | I 'm free any day but Tuesday . 443 | Green 444 | Please return an executed copy of confirm to me . 445 | Hooray for Craggy . 446 | Hospitality .! 447 | Green Tea . 448 | This group does sound pretty interesting though . 449 | kolkatta is an Indian State . 450 | many PCs have sleep & charge now , that allow the PC to go into sleep mode , and still allow the USB ports to charge things like phones . 451 | Will definitely go back when I need medical care . 452 | I have use them four times for fixing items from pushing out a dent in a bumper to fixing the fender on my beloved Miata . 
453 | We 've got a Steven , the one word that did n't crash my spell - check , despite it being followed by a Vikash Chand Abdul Shakur . 454 | I have a Nacho Libre question .? 455 | Natasha 456 | crab 457 | Everyone is relaxed and having fun !!! 458 | i was very pleased with the service . 459 | yeah i got yelled at also . 460 | house of pies here i come . 461 | Installed Biometrics and Got Excellent Service . 462 | The guaranty for the PPA is blacklined against the Enron EPC Contract guaranty in favor of the banks which was granted in connection with our Cabazon Wind Power Project ( the project most recently financed by EWC ) and the GPSA guaranty is blacklined against the PPA guaranty . 463 | Great Service , Thanks Don . 464 | Email : " Ian " < ian.gilb...@btinternet.com > 465 | It 's well cool . :) 466 | This is a beautiful site and a wonderful idea . 467 | Does anyone know any good restaurants in cordoba ? 468 | What is this Miramar ? 469 | We 've got a page dedicated to issues around hospitals ( http://www.bbc.co.uk/dna/actionnetwork/C55153 ) and a group called NHS SOS have already put up a campaign about ward closures in Cumbria . 470 | This consolidation is obviously a result of Bush 's aggressive invasion of Iraq and of the botching of the aftermath . 471 | * Ireland 472 | I 've been fuming over this fact for a few weeks now , ever since some organizations and governments suggested we need to accept the fact that Hezbollah will get involved in running Lebanon . 473 | I like music very loud and with a lot of bass . 474 | ? 475 | well , I do n't ask questions here because I have no clue what " Iguazu " is ... 476 | I do n't feel anything until noon . 477 | Best count on $ 50 per person no matter what . 478 | it was a little to high dollar for me 479 | this is the worst Sam s club I 've ever been to 480 | My pharmacy order is always correct and promptly delivered but the pharmacy staff are always very short with me and do n't seem to like answering questions . 
481 | Even the least discriminating diner would know not to eat at Sprecher 's . 482 | Dick could never thereafter get any real cooperation from the cabinet officers , who outranked him , and he could not convince them to go to battle stations in the summer of 2001 when George Tenet 's hair was " on fire " about the excited chatter the CIA was picking up from radical Islamist terrorists . 483 | I usually use ZebraKlub . 484 | ---- cgy 485 | i would like to have one of those super random surprisingly nice nights out ... suggestions ? 486 | Arriving in Auckland on a direct flight from Canada .? 487 | -- 488 | well since i do nt know your budget , i recommend Hakka Restaurant for chinese food 489 | We discussed a few days ago a consulting arrangement with Prof. Sheridan Titman from UT . 490 | Vince 491 | Now at 83.5 . 492 | All you can do is take each section ( individual video ) and edit them together on software . 493 | Drove all the way over from the highway ... closed at 7 . 494 | Highly recommended 495 | All of those ;D 496 | If you get a good wife , you 'll become happy ; if you get a bad one , you 'll become a philosopher . 497 | Grateful for any help or suggestions you could provide . 498 | Because he liked making statues of David ! :D 499 | He mentions his wife 's death having an effect on him . 500 | obviously take her to the vet 501 | Recommend you call in for a look . 502 | and hopefully you do not know the same people because he tells others about you payment status . 503 | The African Union is clearly not up to the task of keeping the peace , pledging 300 troops to an area that will need 15,000 , according to analysts . 504 | .. 505 | What if Google Morphed Into GoogleOS ? 506 | it is to late for me to add changes . 507 | We do n't have to believe him . 508 | I do nt go there anymore 509 | These guys took Customer Service 101 from a Neanderthal . 
510 | complete with original Magnavox tubes - all tubes have been tested they are all good - stereo amp 511 | kudos to Allentown Post Office staff 512 | And from a place that specializes in high quality meat , too . 513 | Okay , FIRST , you have posted a question about an American movie , set in Mexico , in the Dining Out in Argentina category . 514 | Very Mediocre donuts ! 515 | David , 516 | 02/28/2001 03:16 PM 517 | Enron could be an ideal environment from which to the concept enhancement through to commercialization could be successfully accomplished . 518 | Ruth Penycate www.bbc.co.uk/actionnetwork 519 | Animal News Center Webmaster 520 | What 's the difference between Indian and African ringnecks and alexandrine parrots ? ? 521 | Who does that ?! 522 | I think it will help me very much in my role . 523 | Please be brief ! 524 | Key Delhi blast suspect arrested 525 | there might be bigger and more well known bagel places in the area but Family Bagels are nice people , small shop and incredibly friendly . 526 | Darrell Duffie mail GSB Stanford CA 94305-5015 USA phone 650 723 1976 fax 650 725 7979 email duffie@stanford.edu web http://www.stanford.edu/~duffie/ 527 | Hi Sara , 528 | Otherwise , hope to see you there ! 529 | Thanks 530 | i did n't want you to go . 531 | If you believe crackpot theories instead of focusing on the reality -- that was an al - Qaeda operation mainly carried out by al - Gamaa al - Islamiyyah , an Egyptian terrorist component allied with Bin Laden -- then you will concentrate on the wrong threat . 532 | Thank you . 533 | Neither did Cheney , Rumsfeld , or Wolfowitz . 534 | not sure , but i assume that the bluegrass songbook is mine . 535 | Great place 536 | Best Electrician in Florence 537 | Sheridan 538 | 02/28/2001 04:41 PM 539 | Michael 540 | It looks like the war between Microsoft and Google is quickly brewing on the horizon . 
541 | Abu Musab al - Zarqawi and his group are said to have been bitter rivals of al - Qaeda during the Afghan resistance days . 542 | The findings demonstrate the inhibitory network is central to controlling not only the amplitude , extent and duration of activation of recurrent excitatory cortical networks , but also the precise timing of action potentials , and , thus , network synchronization ... " 543 | I really want to go to andiamo s for my birthday and i was just wondering how much it would cost for the four of us to eat there 544 | This BuzzMachine post argues that Google 's rush toward ubiquity might backfire -- which we 've all heard before , but it 's particularly well - put in this post . 545 | Rumsfeld initially rejected an attack on al - Qaeda bases in Afghanistan , saying there were " no good targets " in Afghanistan . 546 | Extremely greasy . 547 | i love The Script and know the re from iraland . 548 | 03/26/2001 08:58 PM 549 | http://www.theadvocate.com/sports/story.asp?StoryID=16475 550 | Really great service and kind staff . 551 | Deb Price 552 | Chicago 's a big area . 553 | Amazing service ! 554 | Thanks 555 | Are you free for lunch today . 556 | See what DD & D showed at the original place on Harms Rd in Glenview : 557 | I live in the neighborhood and this place is one of my favorites for a tasty , quick and inexpensive meal . 558 | No ... that 's all . 559 | Depends of what . 560 | best burger chain in the Chicago area ? 561 | Iran says it is creating nuclear energy without wanting nuclear weapons . 562 | I 'm looking for websites like flickr.com tumblr.com and autocorrects.com but ones that are n't very common and have a variety of funny pictures ! 563 | Teco Tap 85.000 / HPL IFERC ; 20.000 / Enron 564 | He 's not giving 85 % away , he 's giving a number of shares each year that decrease in number at the rate of 5 % a year ( until gone ? ) . 565 | Becky A. Stephens Litigation Unit , Enron Corp. 
713/853-5025 EB 4809 566 | image001.jpg 567 | Please verify receipt at your earliest convenience . 568 | could not find any info online 569 | will i have to pay customs in NZ . 570 | Although these new rockets are probably more expensive , they will be able to go at a much greater range than it's shuttle cousins , as they can not only break free from the atmosphere but reach the moon as well . 571 | The Sunni AMS told Iraqis , " You sinned when you participated with occupation forces in the assault on Najaf , and beware lest you repeat this same sin in Fallujah . 572 | Would love for you to join us . 573 | Best ceviche that I 'd had so far ! :) 574 | Jeff 575 | out of carnival , royal caribbean , and norweigan ( cruises ) which is the best and why ? 576 | 09/08/2000 09:36 AM 577 | Description : 578 | -- 579 | Just wanted you to know that Eric came by as scheduled today and sprayed our house for scorpions . 580 | see you there on court 10 581 | thanks 582 | thick cut bacon or really good sausages 583 | They are like family . 584 | Tired of spam ? 585 | ALITO filed a dissenting opinion , in which SCALIA and THOMAS joined as to Parts I through III . 586 | I appreciate the quick , good service and the reasonable prices and will definitely use American Pride Irrigation & Landscaping again . 587 | I started collecting animations & jokes just to help with my boredom and depression . 588 | no they wo nt be able to some gamestops do nt check ID for the pre-order so some ppl can get away with doing this but a lot of store want to see your id and / or your reciept of the pre-order 589 | you live in NZ and you eat McDonald s ice cream ? 590 | [ http://www.space.com/missionlaunches/ft_050829_ksc_spacefuture.html ] 591 | Ladies room , Open Sundays 592 | I 'm free any day but Tuesday . 593 | So hear we are , two weeks later , after that dazzling PR display two weeks ago by Powell and Annan , and the situation on the ground in Darfur appears basically unchanged . 
594 | Vince , 595 | or has acquired some type of disease and that too needs to be attended to ... 596 | The Supreme Court announced its ruling today in Hamdan v. Rumsfeld divided along idelogical lines with John Roberts abstaining due to his involvement at the D.C. Circuit level and Anthony Kennedy joining the liberals in a 5 - 3 decision that is 185 pages long . 597 | 11/15/2000 11:58 AM 598 | A reminder . 599 | colorado beat texas a&m . 600 | This was a risk that we had but we did have assurances from Phillips regarding performance . 601 | has life like animal wholesale figurines made from rabbit and goat fur , feathers and sheep s wool . 602 | Angry crowds chanted anti-American slogans in the western city of Falluja ( pop. 256,000 ) as the security police killed in a friendly fire incident by US troops were buried on Saturday . 603 | so i m a little confused , why is there two statues of David ? 604 | --- The Art of Calligraphy in Modern China ( British Museum Press , 2002 ) 605 | Out of business ? 606 | EY4096.7 PERFORMANCE 01-Feb-02 P - 6,363,217 - $ 55,678 607 | It will be interesting to see whether or not Google will finally slay the Microsoft Goliath , who has known no major defeat and seeks to vanquish all competition . 608 | The latest spot for a real Hackney 's is Printers ' Row : 609 | Email : franz371...@gmail.com 610 | What if Google expanded on its search - engine ( and now e-mail ) wares into a full - fledged operating system ? 611 | I shall send you a copy today . 612 | now i will have really straight teeth . 613 | D 614 | Wolfowitz lied to him and said that there was a 10 to 50 % chance that Iraq was behind them . 615 | The actual vote is a little confusing . 616 | Meat Kabob 617 | People love to buy these cute cuddly little animals for gifts and collectables . 618 | These guys do great work at VERY reasonable prices . 619 | Ahmad Mustafa Ghany , 21 , Mississauga ; 620 | Love this place !! 621 | Are you free for lunch some day this week ? 
622 | Well , would n't you know it . 623 | 3 thumbs up . 624 | are you lying ? 625 | A very nice park . 626 | YUM 627 | Does anyone know of any good food in iguazu ? 628 | ca n't believe you left last night . 629 | Groups : alt.animals , alt.animals.cat , alt.animals.ethics.vegetarian , talk.politics.animals 630 | I 'm looking for a camera that has really good zoom during a video and pictures ; and good quality pictures / videos 631 | Absolutely rude . 632 | The convergence of views among the more militant Sunni Muslim clerics of AMS and the radical Shiites of the Sadr movement has been seen before , last spring during the initial US assault on Fallujah and during the US attack on Mahdi Army militiamen in Najaf . 633 | EY4106.7 PERFORMANCE 01-Feb-02 P 1,993,045 - $ 43,548 634 | They are very well made and realistic . 635 | If you own a Retail Store or are a Professional Vendor who exhibits at Sport , Hunting , or Craft Shows and are interested in selling our products , please give us a call ! 636 | I just got your email and I certainly concur with Jeff making the call . 637 | To summarize : Enron 's pad gas will now be 3,993,310 MMBTU , instead of 4,223,000 MMBTU . 638 | Winning Attorney ! 639 | Please give me lots of links and places to look ! 640 | Umir Hussain , 24 , London E14 641 | But not so . 642 | Company : 643 | Is that a money maker ? 644 | Media , Software , Fun and Games , Website design , Web Promotion , B2B , Business Promotion , Search Engine Optimization . 645 | Does anybody use it for anything else ? 646 | Rajendra 647 | Channel Guide 648 | I was married by a judge . 649 | Choose the news and sport headlines you want - when you want them , all in one daily e-mail 650 | KENNEDY filed an opinion concurring in part , in which SOUTER , GINSBURG , and BREYER joined as to Parts I and II . 651 | 09/20/2000 03:22 PM 652 | Do n't give these guys a penny . 
653 | Seth provides deep tissue massage which has significantly reduced the pain in my neck and shoulders and added flexibility and movement back to the area . 654 | My fries were n't fully cooked last time I went there . 655 | I 've thought about you a few times in the last few months , did n't want to intrude upon an already bad situation with my bullshit questions . 656 | Even after the attacks on September 11 , Bush was obsessing about Iraq . 657 | great , we look forward to seeing you . 658 | We at R&L Plumbing Services are pleased with your professionalism and the extra mile you went to get out computers working correctly , you will be our first call if anything happens again and we will refer you to other people with computer issues . 659 | Strip mall asian it is not ! 660 | NASA is planning on using these new shuttles to replace the current models , with industry forecasters predicting a launch as early as 2014 . 661 | Hackney 's has a great burger formula that started about 80 years ago . 662 | Like I 'm legitimately concerned at this point ... lol 663 | Noticed a few of these Cookie cutter places opening in Summit and New Providence . 664 | The thing about The Script is they do not sound that Irish , I was surprised to hear they were from Dublin . 665 | i was thinking somewhere that requires a jacket , like tony 's . 666 | Lest you be lame !!! 667 | Thanks for the great care !!!! 668 | Why certain slogans work and why some do n't . 669 | image_gif_part 670 | Louise , 671 | Of course , you could just go in by main force . 672 | Are you going to be able to make the power VAR meeting on Thursday ? 673 | 20 fluid ounces in a Pint in Ireland 674 | Sara 675 | Also more often than not you end up with a healthy dose of nasty rude attitude from the employees ! 676 | A Top Quality Sandwich made to artistic standards . 677 | I would not hesitate to use him again or refer him to my family or friends . 
678 | Steven Vikash Chand alias Abdul Shakur , 25 , Toronto ; 679 | Finally a convenient place close to home . 680 | Clean store , friendly check - out staff up front . 681 | I have Chronic Lyme disease , so I 'm stuck at home . 682 | Abdul Muneem Patel , 17 , London E5 683 | The food is amazing , and the prices can not be beat . 684 | I have two upcoming events one is for 200 and another is for 21 . 685 | i need to now get a job at house of pies b/c that is the only way to pay the bills . 686 | ' Everything is for the best in the best of all possible worlds if only no artificial hindrances are put in the way of free exchange , for demand and supply will regulate everything better than any Government would be able to . ' 687 | yuk . 688 | they are the best orthodontics in the world . 689 | Cheapest airline ticket from Raleigh to Philippines ? 690 | Universities will take you whatever age you are . 691 | Definetely going back 692 | Do n't worry about avoiding temptation ... as you grow older , it will avoid you . 693 | http://news.bbc.co.uk/1/hi/help/4162471.stm 694 | Elizabeth 36349 695 | Feel good 696 | Starting in February , you will be able to export the data , as opposed to using the spreadsheets . 697 | Privileged / Confidential Information may be contained in this message . 698 | Green Tea Or White Tea ? 699 | They own blogger , of course . 700 | This place is awesome 701 | Any good suggestions would really be appreciated . 702 | ** Disclaimer ** 703 | Any help ? 704 | As we discussed , here is a first effort at a revised TVA offer letter . 705 | Just our standard . 706 | " ... there is no companion quite so devoted , so communicative , so loving and so mesmerizing as a rat . " 707 | __________________________________________________ 708 | The actual word " MAD " has to be on the cover and incorporated into the image . 709 | buy them in any good photography supplies shop . 710 | I have had several dentists in my life , but Dr. Deters is by far my favorite . 
711 | These people were so helpful this week and did everything to sort out my windscreen and insurance . 712 | tttthhhhh Madonna ! 713 | home team - thanks 4 playin !!! 714 | On the other hand , it looks pretty cool . 715 | can ever & never forget the training undergone here which made my life step onto the successful job without any hurdles . 716 | I would prefer a simple , fitted black one . 717 | We look forward to your active participation to make this forum an exciting meeting place for like minded individuals . 718 | Today is good 12:30 ? 719 | Please advise immediately if you or your employer do not consent to Internet email for messages of this kind . 720 | Today is good 12:30 ? 721 | I thought that since Chonawee has an optimization background , he would be good to have him go to dinner with Dr. Lasdon on Thrusday as well . 722 | By April of '71 the Dow had climbed back to 950 , only to fall to 869 in February of '72 . 723 | In such case , you should destroy this message and kindly notify the sender by reply email . 724 | I decided to get a 150 gal aquarium , what can I fill it with ? 725 | -- 726 | I need a new lawnmower , so I 'll try to bump it up a little more . 727 | REUTERS / Jason Reed 728 | Know this well because I remember an ' irish ' pub in the town in canada i grew up in used to advertise the cheapest pints of guinness in town , but they served them in american sized pints . 729 | Looks like the kids had a great time ! 730 | In addition , there is a reduction of 22,101 MMBTU which is the difference between the SCADA values ( Best Available ) that Anita showed on the February 29th Storage Sheet and the " official " February 29th values that Gary Wilson received from MIPS . 731 | any format url ? 732 | enron is blowing up . 733 | U.S. President George W. Bush shakes hands with Chinese President Hu Jintao in a bilateral meeting in Santiago . 734 | I 'm hearing some pretty depressing stuff from the people I know at ENE . 
735 | Thanks and Regard 736 | FYI . 737 | Kind regards 738 | We accecpt : Visa , MasterCard , Amex , Dinner s Club / Carte Blanche , & Personal Checks / Money Orders . 739 | The food was incredibly bland . 740 | Edward Terry 741 | After tomorrow , I will no longer have access to the estate files . 742 | We are still trying to work the PSE swap transaction , now that the forex desk has been able to find a fix for CPI in the market . 743 | Debra Perlingiere 744 | Monkey Brain . 745 | I need suggestions for San Francisco restaurants with good food and good catering service .? 746 | Someone had to be first . 747 | On the other hand , this is essentially a statement that the company is overpriced from the guy who knows it best -- and happens to be the best investor of the last century . 748 | Kyle with Bullwark 749 | I just had the best experience at this Kal Tire location . 750 | Bland and over cooked . 751 | Highly recommended . 752 | I 'll probably start looking next weekend . 753 | Dinner and dancing in Chicago ? 754 | http://www.newsfeeds.com - The # 1 Newsgroup Service in the World ! 755 | EY4106.9 PERFORMANCE 01-Feb-02 P 27,886 $ 27,361 756 | so i live in Invercargill New Zealand and i want to know if there are any good places to buy an ice - cream sundae from other than mc donald s lol 757 | Police in the Indian capital Delhi say they have arrested the suspected co-ordinator and financier of last month 's deadly bomb blasts in the city . 758 | I have ordered Bose Headfones worth 300 USD . 759 | Bryan , 760 | Osman Adam Khatib , 20 , London E17 761 | Any suggestions would be really helpful , thanks ! 762 | it s a gift from my brother . 763 | Ian - Webmaster www.southbhamcats.org.uk 764 | Would do business with them again . 765 | i flew here last night . 766 | Thanks 767 | In a timid voice , he says : " If an airplane carrying Winston Peters was blown up by a bomb , THAT would be a tragedy " . 768 | Hope you 're doing good . 
769 | Ram Tackett , ( mailto:rtackett@abacustech.net ) Owner , Abacus Technologies 17611 Loring Lane , Spring , TX 77388-5746 ( 281 ) 651-7106 ; Fax ( 281 ) 528-8636 Web : http://www.abacustech.net 770 | The door is easy to use and it keeps the cold out during the winter . 771 | different generations , the donatello is of a boy david as a young sheep Herder , the Michelangelo is the grown up man david as slayer and king 772 | ( 713 ) 853-7408 773 | Yes , they all have secret locator chips , just like gps 774 | His work 775 | I have never been disappointed . 776 | Great meats that are already cooked , easy to take home for dinner . 777 | ------ 778 | I would highly recommend her services . 779 | ------ 780 | " Our new lunar transportation system utilizes a unique architecture that will establish the equivalent of a two - way highway between the Earth and the Moon , " Kistler told SPACE.com . 781 | Following up on your and Ken Lay 's conversation with Gary Cohn , I would like to forward the following proposal , acting for each of Goldman Sachs Capital Markets and J. Aron . 782 | Friendliest place I have ever stayed ! 783 | If the PX comes back again , I will call their in - house attys . 784 | Very Informative website with a lot of good work 785 | Stayed here for 2 nights . 786 | Portia 787 | Mohammed Dirie , 22 , Kingston , Ont. ; 788 | Daren , 789 | Rubbish 790 | You 'd need their Apple ID and password , if you had that then yes you can track any iPhone . 791 | " Well , " says the boy , " because it would n't be an accident , and it certainly would n't be a great loss ! " 792 | I 've been looking at the bose sound dock 10 i ve currently got a jvc mini hifi system , i was wondering what would be a good set of speakers . 793 | Al - Qaeda in Afghanistan was a group of only a few hundred " Afghan Arabs " who pledged personal loyalty to Usamah Bin Laden . 794 | I have 3 children there and they are the Best . 
795 | I am going to be the Senior Regulatory Counsel at ISO New England starting on April 9 , 2001 . 796 | i can think of a few things 797 | A girl raises her hand . 798 | SS 799 | If I went into the " pre-university " direction with business administration in mind . 800 | here s the link : 801 | Or background stands 802 | Studying in Quebec , Canada ? 803 | fyi 804 | my bad . 805 | My results were just AWFUL . 806 | They know that the American advent implies for them a demotion , and an elevation of the Shiites and Kurds , and they refuse to go quietly . 807 | the camera only begins to work again when i take out the battery and put it back in . 808 | A thoroughly comprehensive service ; excellent communication and best of they are transparent with their fee ( ie nothing is simply implied or assumed ) . 809 | More below . 810 | Mike 811 | The best climbing club around . 812 | Opinions , conclusions and other information in this message that do not relate to the official business of my firm shall be understood as neither given nor endorsed by it . 813 | Marvel Consultants , Inc. 28601 Chagrin Blvd. Cleveland , Ohio 44122 USA Email : recruiters@marvelconsultants.com < mailto:recruiters@marvelconsultants.com > Phone : 216-292-2855 Fax : 216-292-7207 814 | No , technically they do not need a UVB light ; they are nocturnal . 815 | I enjoyed your presentations very much . 816 | Launching this way will hopefully avoid future disasters , giving more support towards NASA revisiting the stars . 817 | http://www.google.co.uk/search?q=backdrop+frame&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a&safe=active&sout=1 818 | his clinic is very very dirty he is a real disaster to go totally not organized for every step he take . 819 | you know , whatever . 820 | He was very clean , very nice to work with and gave a very reasonable price . 
821 | Currently we have a blank " sample " for our Paragraph 13s which are attached to our sample ISDAs for ( a ) US Corporate , ( b ) Hedge Funds , ( c ) Municipal . 822 | OK Food , Slow service 823 | Old time grocery , best steaks I have ever had ! 824 | He 's worth every penny . 825 | The workers sped up and down the street with no mind to the small children playing . 826 | Rest was too oily . 827 | I refer to VNHH often and love you guys . 828 | The results of the February 26th test reduces the working gas by 398,487 MMBTU . 829 | Shareef Abdelhaleen , 30 , Mississauga ; 830 | The September 11 Panel will issue its findings on Thursday . 831 | The overwhelming human and financial impacts of Hurricane Katrina are powerful evidence that political and economic decisions made in the United States and other countries have failed to account for our dependence on a healthy resource base , according to an assessment released today by the Worldwatch Institute . 832 | [ via Microsoft Watch from Mary Jo Foley ] 833 | http://www.nola.com/lsu/t-p/football/index.ssf?/lsustory/lsunotes08.html 834 | None of the above . 835 | You should really ask this in the art section . 836 | this kebab shop is one of the best around the meat is good and fresh and the chilly sauce is the best , keep them lovely kebabs coming and a happy new year to all the staff 837 | Wonderful Wonderful People ! 838 | Holland & Hart , LLP ; # 432785 dated 4/14/00 839 | 06/02/2001 10:53 AM 840 | http://www.theadvocate.com/sports/story.asp?StoryID=16473 841 | But we ca n't prove it . 842 | I give this place 11 / 10 . 843 | Travelled 40 mins after calling to see if a product was in stock . 844 | Which wonderful contact of mine is thumbs upping all my best answers ^^ ? 845 | Hidden Treasure . 846 | Hi I ´m from Brazil and I want to know of book 06 . 847 | I never wait in the waiting room more than two minutes and the cleanings are quick and painless . 848 | Thank you though . 
849 | We have updated our site to include a LOST and FOUND page and you can now join our branch and make a secure on - line donation to the charity . 850 | roflmao 851 | My assistant Joanne Rozycki has cell , car numbers to reach me . 852 | Yeah you got ta burn it .. that s the only way 853 | They picked my car up in Yarmouth and towed to Bath for a great price . 854 | I have just checked with RAC ( David Gorte ) and we have a green light to go ahead with the project . 855 | Please update daily 856 | I plan on going again . 857 | The haircut was inexpensive and so were the salon services ( eyebrows were cheap ! ) . 858 | Wonderful Atmosphere 859 | " What ? " asks Winston , " is n't there any one here who can give me an example of a tragedy ? " 860 | There may or may not be snow , depending on local weather conditions . 861 | economy should be good here . 862 | I have spoken with Mark Lay and he is interested . 863 | Further to my voicemail , our colleagues in credit are calculating the reserve on the PSE swap . 864 | By Deb Price / The Detroit News 865 | Is Hank Green Awesome ? 866 | The rooms were very clean and the breakfast was excellent . 867 | Bush did not have his eye on the ball . 868 | Posted by Anthony Beavers to Cognitive Science News at 8/28/2005 07:18:20 AM 869 | i used to have one . 870 | here is to getting back on track after thanksgiving 871 | Name something you find at a carnival that comes on a stick ? 872 | - Alex Levine 873 | i must have had you messed up with some other girl i made the bet with . 874 | I use their limo services for all of my airport car services and airport transportation needs 875 | em ... no ... the Gates foundation mainly invests in medical research and education , that means donating now adds a tremendous value compared to donating in ten years . 876 | green curry and red curry is awesome ! 877 | Mike Curry 878 | The Donuts were very over proofed , making them stale and bready . 879 | Very fast and efficient service . 
880 | Inford Media 881 | A fast service , saved a bad situation getting a lot worse . 882 | Do nt go to the one by pepco , I got confused !!! 883 | Someone told me that Chase is planning a shitload of layoffs . 884 | Most troubling , however , is the fact that the political will to end the crisis expressed a few short weeks ago seems to have ebbed . 885 | In one class , he asks the students if anyone can give him an example of a " tragedy " . 886 | i do n't even like a&m , i would n't bet on them . 887 | thank you 888 | I better pass on the Comets game . 889 | Atal Pharmacy , Karol Bag . 890 | you can buy me dinner when we get back . 891 | Thanks , 892 | Sure Google , although this would put it on coarse for global domination of the internet by 2014 . 893 | Debra Perlingiere 894 | Red Robin . 895 | My weekends seem to be taken up with condo matters , house hunting . 896 | It does n't change the company 's intrinsic worth , and as the article notes , the company might be added to a major index once the shares get more liquid . 897 | Do n't waste your money on the jukebox 898 | which is the best burger chain in the chicago metro area like for example burger king portillo s white castle which one do like the best ? 899 | Do n't judge a book by its cover 900 | - Victor Borge 901 | ca n't go to any more lsu games unless i get a free ticket . 902 | to make convo , it 's short , so if things go great , you can 903 | then i told her i felt i should be able to screw missy just once . 904 | there will be talent and opportunity a plenty on the market soon . 905 | But I was not pleased to read the description in the catalog : " No good in a bed , but fine against a wall . " 906 | Daren 907 | Events change everyday . 908 | " If a school bus carrying fifty children drove off a cliff , killing everyone involved ... that would be a tragedy " . 909 | Groups : alt.animals.cat 910 | Thanks 911 | I will go there again . 912 | This is by far the best run dealership in Miami . 
913 | I could go on and on ! 914 | On my last 6 trips here from the states I used " Fly genesis " based in Seattle , Wa . 915 | not sure how much longer ene is going to be around and i 'm checking out my options ! 916 | ** 917 | They have always done a great job at a reasonable price . 918 | August 12 , 2000 919 | I was very upset when I went to Mother Plucker , they had NO FEATHERS and the quality is TERRIBLE . 920 | slow service 921 | Vijay K. Suchdev Vice President Equity Derivatives First Union Securities , Inc. Telephone : ( 212 ) 909-0951 Facsimile : ( 212 ) 891-5042 email : vijay.suchdev@funb.com < mailto:vijay.suchdev@funb.com > 922 | B & w . 923 | I hope you do n't mind , but I 've taken liberty to turn them into a web photo album at http://24.27.98.30/pictures/08-05_Garrett_Gayle_Bday . 924 | The Pentagon did not even have a plan for dealing with Afghanistan or al - Qaeda that it could pull off the shelf , according to Bob Woodward . 925 | Sheridan Titman < titman@mail.utexas.edu > on 01/24/2001 02:45:50 PM 926 | Over three years after 9-11 , the United Nations , despite their attempts to project strength in fighting terrorism , still can not properly define the word " terrorist " , waffling over the issue of whether the murder of innocent civilians are terrorist acts . 927 | If you took out the clown loach it would make a nice 150 gallon tank . 928 | Stick to Hop Hing , 20 year + resident . 929 | Send the revised report by e-mail . 930 | -- 931 | After friday , I will no longer have access to the estate , so if you could shoot this off over night so I could have something in the morning to work with I would appreciate it . 932 | Cheap , great view , time together . 933 | See http://www.gulf-news.com/Articles/news.asp?ArticleID=97508 934 | Posted by Hidden Nook to Hidden Nook at 2/14/2005 07:03:00 PM 935 | i have stronger will than you think . 936 | Thanks again , Directv . 
937 | not sure yet 938 | Well , I 'm about to graduate in less then a year , and I 'm planning to study medical school . 939 | Please confidentially share matters as you think best and advise me of the interest generated . 940 | Restaurant on top was renovated , food was decent , price was way to high for Duluth for quality , new decor seems tacky 941 | I 'm planning on buying a compact system camera at best buy ; so please list the one ( s ) I should purchase . 942 | canon t2i stops working ? 943 | Compare compare compare - that 's the key to getting the best deal . 944 | Use Travelocity or Expedia and see what you come up with . 945 | By September of that year the Dow had tumbled to 744 . 946 | " No , " Winston says , " That would be an ACCIDENT . " 947 | I like my Monkey Brain on a stick for sure . 948 | http://www.hackneys.net/ 949 | I gave Dr. Rohatgi 2 stars because her assistant was very pleasant . 950 | Hilary E. Ackermann Goldman Sachs Credit Risk Management & Advisory Phone : 212-902-3724 Fax : 212-428-1181 E-Mail : hilary.ackermann@gs.com 951 | Of course , that was the bottom 952 | Relish 953 | I started this page to help with my boredom . 954 | Just ask American Express 955 | We 'll see you on ' Border Patrol ' 956 | On the same day Palestinians protest in support of Hezbollah and Syria , the terrorist group Hamas has indicated it will participate in the scheduled upcoming Parliamentary elections . 957 | The best customer service I 've come across for long time . 958 | EY4108.H PERFORMANCE 01-Feb-02 P - 2,239,879 - $ 36,398 959 | this dentist want to pull the tooth out always .. always wants to do the cheapest for his benefit .. not unless he knows you . 960 | It 's a little hard to parse , but at this point his ostensible view is that the Gateses are very good money - redistributors , and he wants them to have the money as soon as possible . 961 | And can you tell me WHY that would be a tragedy ? 
" 962 | Wendi has worked for them have a look at her blog . 963 | Original Margin Call Margin Due Today 964 | I am amazed how the details get fuzzy on an old project . 965 | If you want a CD copy of this web site , give me a yell . 966 | Amy :-) 967 | The positions needed to be divided to reflect BCF 's . 968 | Hi , i 'm looking to take myself and my best friend and his girl friend and this girl i really like out to dinner for my birthday . 969 | and the people are sweet :) 970 | They chased the Communists out of the capital ( Hanoi ) and retook control . 971 | Maybe Labour . 972 | The employees are really friendly . 973 | We plan to use the same basic form for the Enron guaranty that will be made in favor of EPMI with respect to the seller 's obligations under the PPA . 974 | Great Place ! 975 | Since work has gone to hell , I am hoping to find some excitement in the possibility that LSU may play in the Cotton Bowl ( if Rohan " Alabama " Davey shows up for the next 3 games . ) 976 | we can go somewhere nice . 977 | i put $ 5 bucks down for it too . 978 | American Food , Soul Food , Mexican , Italian , and Chinese are the options . 979 | depends on the computer 980 | Photo from Technology News 981 | Michelangelo made the marble one but why did he do another if Donatello had already made one ? 982 | You should use the same spreadsheet format used for the 1/29/02 DPR . 983 | Sheridan , 984 | We will have to correct them after the churn . 985 | ~ CGoehring 986 | EY4096.3 PERFORMANCE 01-Feb-02 P 202,989 $ 195,610 987 | It is now threatening to pull out of the Allawi caretaker government . 988 | Very knowledgeable and friendly design build firm . 989 | My last day in the Portland area will be March 31 , 2001 . 990 | Seth K . 991 | Rude and Untrustworthy 992 | Dawn 993 | Close to my house , this is the only reason I would go to this particular QT . 994 | i wish the other utilities i had to set up had people to work with like this .. 
995 | He is regarded as one of the leading Avant - Garde artist of modern calligraphy .
996 | Quality has fallen over the years , but still the best go - to burger place on the East Bay .
997 | Look at a map and you try to figure out how , in fall of 1999 , you could possibly pull off such an operation without Pakistani facilities .
998 | Best ,
999 | We now have over 5000 addresses .
1000 | WE AT HOME LOVE IT AT $ 80 +++
1001 |
--------------------------------------------------------------------------------
/structural-probes/spec.yaml:
--------------------------------------------------------------------------------
1 | dataset:
2 |   observation_fieldnames:
3 |     - index
4 |     - sentence
5 |     - lemma_sentence
6 |     - upos_sentence
7 |     - xpos_sentence
8 |     - morph
9 |     - head_indices
10 |     - governance_relations
11 |     - secondary_relations
12 |     - extra_info
13 |     - embeddings
14 |   corpus:
15 |     root: en_ewt-ud/
16 |     train_path: en_ewt-ud-train.conllu
17 |     dev_path: en_ewt-ud-dev.conllu
18 |     test_path: en_ewt-ud-test.conllu
19 |   embeddings:
20 |     type: token #{token,subword}
21 |     root: .
22 |     train_path: encodings-train.hdf5
23 |     dev_path: encodings-dev.hdf5
24 |     test_path: encodings-test.hdf5
25 |   batch_size: 40
26 | model:
27 |   hidden_dim: 768
28 |   model_type: BERT-disk # BERT-disk, ELMo-disk,
29 |   use_disk: True
30 |   model_layer:
31 | probe:
32 |   task_signature: word_pair # word, word_pair
33 |   task_name: parse-distance
34 |   maximum_rank: 32
35 |   psd_parameters: True
36 |   diagonal: False
37 |   params_path: predictor.params
38 | probe_training:
39 |   epochs: 30
40 |   loss: L1
41 | reporting:
42 |   root: .
43 |   observation_paths:
44 |     train_path: train.observations
45 |     dev_path: dev.observations
46 |     test_path: test.observations
47 |   prediction_paths:
48 |     train_path: train.predictions
49 |     dev_path: dev.predictions
50 |     test_path: test.predictions
51 |   reporting_methods:
52 |     - spearmanr
53 |     - uuas
54 |     - root_acc
55 |
--------------------------------------------------------------------------------
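The `probe` and `probe_training` settings in `spec.yaml` (`task_name: parse-distance`, `maximum_rank: 32`, `loss: L1`) can be illustrated with a small sketch. It assumes the spec is consumed by a structural probe of the usual form, where a linear map `B` of shape `maximum_rank x hidden_dim` is learned so that the squared distance `||B(h_i - h_j)||^2` between two word vectors approximates their distance in the parse tree. The function names below are hypothetical and not taken from this repository, the toy dimensions (3 and 2) stand in for `hidden_dim: 768` and `maximum_rank: 32`, and the per-sentence normalization by the squared sentence length is an assumption.

```python
from typing import List, Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def project(B: Matrix, h: Vector) -> List[float]:
    """Apply the rank-limited linear map B (rank x hidden_dim) to one word vector."""
    return [sum(b * x for b, x in zip(row, h)) for row in B]


def probe_distances(B: Matrix, H: Sequence[Vector]) -> List[List[float]]:
    """Squared distance ||B(h_i - h_j)||^2 for every pair of words in a sentence."""
    projected = [project(B, h) for h in H]
    n = len(projected)
    return [
        [sum((a - b) ** 2 for a, b in zip(projected[i], projected[j])) for j in range(n)]
        for i in range(n)
    ]


def l1_loss(pred: Matrix, gold: Matrix) -> float:
    """Mean absolute difference between predicted and gold tree distances.

    Assumption: the sum is normalized by the squared sentence length.
    """
    n = len(gold)
    total = sum(abs(pred[i][j] - gold[i][j]) for i in range(n) for j in range(n))
    return total / (n * n)


# Toy example: 3 words, hidden_dim = 3, rank = 2 (B simply drops the last dimension).
B = [[1, 0, 0], [0, 1, 0]]
H = [[0, 0, 5], [1, 0, 7], [1, 2, 9]]
# Gold tree distances for a chain-shaped parse: word0 - word1 - word2.
gold = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]

pred = probe_distances(B, H)
loss = l1_loss(pred, gold)
```

In this reading, `maximum_rank` bounds the number of rows of `B`, so the probe can only use a low-dimensional subspace of the 768-dimensional encodings when fitting tree distances.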