├── .gitignore
├── LICENSE
├── README.md
├── bin
│   ├── download_glue_data.py
│   └── eval_squad.py
├── data
│   ├── brains
│   │   └── .gitkeep
│   └── sentences
│       └── .gitkeep
├── main.nf
├── nextflow.config
├── nextflow.slurm.config
├── notebooks
│   ├── decoding_rank_data.csv
│   ├── encoding_distances.ipynb
│   ├── pca_check.ipynb
│   ├── predictions.ipynb
│   ├── quantitative_dynamic.ipynb
│   ├── quantitative_gross.ipynb
│   ├── quantitative_roger.ipynb
│   ├── rsa.py
│   ├── structural-probes.ipynb
│   ├── t-sne.ipynb
│   └── within-subject.png
├── src
│   ├── dependency_graph.py
│   ├── heatmap.py
│   ├── learn_decoder.py
│   ├── nearest_neighbors.py
│   └── util.py
└── structural-probes
    ├── en_ewt-ud
    │   ├── en_ewt-ud-dev.conllu
    │   ├── en_ewt-ud-dev.txt
    │   ├── en_ewt-ud-test.conllu
    │   ├── en_ewt-ud-test.txt
    │   ├── en_ewt-ud-train.conllu
    │   └── en_ewt-ud-train.txt
    └── spec.yaml
/.gitignore:
--------------------------------------------------------------------------------
1 | __pycache__
2 | perf.*.csv
3 | encodings/*.npy
4 | .ipynb_checkpoints
5 | .nextflow*
6 | data/brains/*
7 | data/sentences/*
8 |
9 | flow/bert
10 | flow/data
11 | flow/tasks
12 | work
13 | slurm-*
14 | encodings.*
15 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | Copyright 2018 Jon Gauthier
2 |
3 | Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
4 |
5 | The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
6 |
7 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
8 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Neural network brain decoding
2 |
3 | [License: MIT](https://opensource.org/licenses/MIT)
4 |
5 | This repository contains analysis code for the paper:
6 |
7 | [**Linking human and artificial neural representations of language.**][3]
8 | Jon Gauthier and Roger P. Levy.
9 | [2019 Conference on Empirical Methods in Natural Language Processing][2].
10 |
11 | This repository is open-source under the MIT License. If you would like to
12 | reuse our code or otherwise extend our work, please cite our paper:
13 |
14 |     @inproceedings{gauthier2019linking,
15 |       title={Linking human and artificial neural representations of language},
16 |       author={Gauthier, Jon and Levy, Roger P.},
17 |       booktitle={Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing},
18 |       year={2019}
19 |     }
20 |
21 | ## About the codebase
22 |
23 | We structure our data analysis pipeline, from model fine-tuning to
24 | representation analysis, using [Nextflow][4]. Our entire data analysis pipeline
25 | is specified in the file [`main.nf`](main.nf).
26 |
27 | Visualizations and statistical tests are done in Jupyter notebooks stored in
28 | the [`notebooks`](notebooks) directory.
29 |
30 | ## Running the code
31 |
32 | ### Hardware requirements
33 |
34 | - ~2 TB disk space (for storing brain images, model checkpoints, etc.)
35 | - 8 GB RAM or more
36 | - 1 GPU with > 4 GB RAM (for fine-tuning BERT models)
37 |
38 | We strongly suggest running this pipeline on a distributed computing cluster to
39 | save time. The full pipeline completes in several days on an MIT
40 | high-performance computing cluster.
41 |
42 | If you don't have a GPU or this much disk space to spare but still wish to run
43 | the pipeline, please ping me and we can make special resource-saving
44 | arrangements.
45 |
46 | ### Software requirements
47 |
48 | There are only two software requirements:
49 |
50 | 1. [Nextflow][4] is used to manage the data processing pipeline. Installing
51 |    Nextflow is as simple as running the following command:
52 |
53 |    ```bash
54 |    wget -qO- https://get.nextflow.io | bash
55 |    ```
56 |
57 |    This installation script will put a binary `nextflow` in your working
58 |    directory. The later commands in this README assume that this binary is on
59 |    your `PATH`.
60 | 2. [Singularity][5] retrieves and runs the software containers necessary for
61 |    the pipeline. It is likely already available on your computing cluster. If
62 |    not, please see the installation instructions on the [Singularity website][5].
63 |
64 | The pipeline is otherwise fully automated, so all other dependencies
65 | (data, BERT, etc.) will be automatically retrieved.
66 |
67 | ### Starting the pipeline
68 |
69 | Check out the repository by downloading the [**`emnlp2019-final`**](https://github.com/hans/nn-decoding/tree/emnlp2019-final)
70 | tag and run the following command in the root directory:
71 |
72 | ```bash
73 | nextflow run main.nf
74 | ```
75 |
76 | ### Configuring the pipeline
77 |
78 | For **technical configuration** (e.g. customizing how this pipeline will be
79 | deployed on a cluster), see the file [`nextflow.config`](nextflow.config). The
80 | pipeline is configured by default to run locally, but can be easily farmed out
81 | across a computing cluster.
82 |
83 | A configuration for the [`SLURM`][6] framework is given in
84 | [`nextflow.slurm.config`](nextflow.slurm.config). If your cluster uses a
85 | framework other than SLURM, adapting to it may be as simple as changing a few
86 | settings in that file. See the [Nextflow documentation on cluster computing][7]
87 | for more information.
88 |
89 | For **model configuration** (e.g. customizing hyperparameters), see the header
90 | of the main pipeline in [`main.nf`](main.nf). Each parameter, written as `params.X`,
91 | can be overridden with a command-line flag of the same name. For example, if
92 | we wanted to run the whole pipeline with BERT models trained for 500 steps
93 | rather than 250 steps, we could simply execute
94 |
95 | ```bash
96 | nextflow run main.nf --finetune_steps 500
97 | ```
98 |
99 | ### Analysis and visualization
100 |
101 | The `notebooks` directory contains Jupyter notebooks for producing the
102 | visualizations and statistical analyses in the paper (and much more):
103 |
104 | - [`quantitative_dynamic.ipynb`](notebooks/quantitative_dynamic.ipynb) is used
105 | to produce the majority of the plots in the paper, studying brain decoding
106 | across fine-tuning time in different models.
107 | - [`structural-probes.ipynb`](notebooks/structural-probes.ipynb) visualizes the
108 | structural probe results.
109 | - [`predictions.ipynb`](notebooks/predictions.ipynb) produces, among many other
110 | things, the RSA analysis on model representations.
111 |
112 | After the Nextflow pipeline completes, you can load and run these notebooks by
113 | starting a Jupyter session in the directory from which you launched the
114 | pipeline. The notebooks require TensorFlow and general Python data science
115 | tools to function. I recommend using my `tensorflow` Singularity image as
116 | follows:
117 |
118 | ```bash
119 | singularity run library://jon/default/tensorflow:1.12.0-cpu jupyter lab
120 | ```
121 |
122 |
123 | [1]: https://doi.org/10.1038/s41467-018-03068-4
124 | [2]: https://www.emnlp-ijcnlp2019.org
125 | [3]: https://arxiv.org/abs/1910.01244
126 | [4]: https://www.nextflow.io
127 | [5]: https://sylabs.io/singularity/
128 | [6]: https://slurm.schedmd.com/overview.html
129 | [7]: https://www.nextflow.io/docs/latest/executor.html
130 |
--------------------------------------------------------------------------------
/bin/download_glue_data.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | ''' Script for downloading all GLUE data.
3 |
4 | Note: for legal reasons, we are unable to host MRPC.
5 | You can either use the version hosted by the SentEval team, which is already tokenized,
6 | or you can download the original data from (https://download.microsoft.com/download/D/4/6/D46FF87A-F6B9-4252-AA8B-3604ED519838/MSRParaphraseCorpus.msi) and extract the data from it manually.
7 | For Windows users, you can run the .msi file. For Mac and Linux users, consider an external library such as 'cabextract' (see below for an example).
8 | You should then rename and place specific files in a folder (see below for an example).
9 |
10 | mkdir MRPC
11 | cabextract MSRParaphraseCorpus.msi -d MRPC
12 | cat MRPC/_2DEC3DBE877E4DB192D17C0256E90F1D | tr -d $'\r' > MRPC/msr_paraphrase_train.txt
13 | cat MRPC/_D7B391F9EAFF4B1B8BCE8F21B20B1B61 | tr -d $'\r' > MRPC/msr_paraphrase_test.txt
14 | rm MRPC/_*
15 | rm MSRParaphraseCorpus.msi
16 | '''
17 |
18 | import os
19 | import sys
20 | import shutil
21 | import argparse
22 | import tempfile
23 | import urllib
24 | import io
25 | if sys.version_info >= (3, 0):
26 |     import urllib.request
27 | import zipfile
28 |
29 | URLLIB=urllib
30 | if sys.version_info >= (3, 0):
31 |     URLLIB=urllib.request
32 |
33 | TASKS = ["CoLA", "SST", "MRPC", "QQP", "STS", "MNLI", "SNLI", "QNLI", "RTE", "WNLI", "diagnostic"]
34 | TASK2PATH = {"CoLA":'https://firebasestorage.googleapis.com/v0/b/mtl-sentence-representations.appspot.com/o/data%2FCoLA.zip?alt=media&token=46d5e637-3411-4188-bc44-5809b5bfb5f4',
35 | "SST":'https://firebasestorage.googleapis.com/v0/b/mtl-sentence-representations.appspot.com/o/data%2FSST-2.zip?alt=media&token=aabc5f6b-e466-44a2-b9b4-cf6337f84ac8',
36 | "MRPC":'https://firebasestorage.googleapis.com/v0/b/mtl-sentence-representations.appspot.com/o/data%2Fmrpc_dev_ids.tsv?alt=media&token=ec5c0836-31d5-48f4-b431-7480817f1adc',
37 | "QQP":'https://firebasestorage.googleapis.com/v0/b/mtl-sentence-representations.appspot.com/o/data%2FQQP.zip?alt=media&token=700c6acf-160d-4d89-81d1-de4191d02cb5',
38 | "STS":'https://firebasestorage.googleapis.com/v0/b/mtl-sentence-representations.appspot.com/o/data%2FSTS-B.zip?alt=media&token=bddb94a7-8706-4e0d-a694-1109e12273b5',
39 | "MNLI":'https://firebasestorage.googleapis.com/v0/b/mtl-sentence-representations.appspot.com/o/data%2FMNLI.zip?alt=media&token=50329ea1-e339-40e2-809c-10c40afff3ce',
40 | "SNLI":'https://firebasestorage.googleapis.com/v0/b/mtl-sentence-representations.appspot.com/o/data%2FSNLI.zip?alt=media&token=4afcfbb2-ff0c-4b2d-a09a-dbf07926f4df',
41 | "QNLI":'https://firebasestorage.googleapis.com/v0/b/mtl-sentence-representations.appspot.com/o/data%2FQNLI.zip?alt=media&token=c24cad61-f2df-4f04-9ab6-aa576fa829d0',
42 | "RTE":'https://firebasestorage.googleapis.com/v0/b/mtl-sentence-representations.appspot.com/o/data%2FRTE.zip?alt=media&token=5efa7e85-a0bb-4f19-8ea2-9e1840f077fb',
43 | "WNLI":'https://firebasestorage.googleapis.com/v0/b/mtl-sentence-representations.appspot.com/o/data%2FWNLI.zip?alt=media&token=068ad0a0-ded7-4bd7-99a5-5e00222e0faf',
44 | "diagnostic":'https://storage.googleapis.com/mtl-sentence-representations.appspot.com/tsvsWithoutLabels%2FAX.tsv?GoogleAccessId=firebase-adminsdk-0khhl@mtl-sentence-representations.iam.gserviceaccount.com&Expires=2498860800&Signature=DuQ2CSPt2Yfre0C%2BiISrVYrIFaZH1Lc7hBVZDD4ZyR7fZYOMNOUGpi8QxBmTNOrNPjR3z1cggo7WXFfrgECP6FBJSsURv8Ybrue8Ypt%2FTPxbuJ0Xc2FhDi%2BarnecCBFO77RSbfuz%2Bs95hRrYhTnByqu3U%2FYZPaj3tZt5QdfpH2IUROY8LiBXoXS46LE%2FgOQc%2FKN%2BA9SoscRDYsnxHfG0IjXGwHN%2Bf88q6hOmAxeNPx6moDulUF6XMUAaXCSFU%2BnRO2RDL9CapWxj%2BDl7syNyHhB7987hZ80B%2FwFkQ3MEs8auvt5XW1%2Bd4aCU7ytgM69r8JDCwibfhZxpaa4gd50QXQ%3D%3D'}
45 |
46 | MRPC_TRAIN = 'https://s3.amazonaws.com/senteval/senteval_data/msr_paraphrase_train.txt'
47 | MRPC_TEST = 'https://s3.amazonaws.com/senteval/senteval_data/msr_paraphrase_test.txt'
48 |
49 | def download_and_extract(task, data_dir):
50 |     print("Downloading and extracting %s..." % task)
51 |     data_file = "%s.zip" % task
52 |     URLLIB.urlretrieve(TASK2PATH[task], data_file)
53 |     with zipfile.ZipFile(data_file) as zip_ref:
54 |         zip_ref.extractall(data_dir)
55 |     os.remove(data_file)
56 |     print("\tCompleted!")
57 |
58 | def format_mrpc(data_dir, path_to_data):
59 |     print("Processing MRPC...")
60 |     mrpc_dir = os.path.join(data_dir, "MRPC")
61 |     if not os.path.isdir(mrpc_dir):
62 |         os.mkdir(mrpc_dir)
63 |     if path_to_data:
64 |         mrpc_train_file = os.path.join(path_to_data, "msr_paraphrase_train.txt")
65 |         mrpc_test_file = os.path.join(path_to_data, "msr_paraphrase_test.txt")
66 |     else:
67 |         mrpc_train_file = os.path.join(mrpc_dir, "msr_paraphrase_train.txt")
68 |         mrpc_test_file = os.path.join(mrpc_dir, "msr_paraphrase_test.txt")
69 |         URLLIB.urlretrieve(MRPC_TRAIN, mrpc_train_file)
70 |         URLLIB.urlretrieve(MRPC_TEST, mrpc_test_file)
71 |     assert os.path.isfile(mrpc_train_file), "Train data not found at %s" % mrpc_train_file
72 |     assert os.path.isfile(mrpc_test_file), "Test data not found at %s" % mrpc_test_file
73 |     URLLIB.urlretrieve(TASK2PATH["MRPC"], os.path.join(mrpc_dir, "dev_ids.tsv"))
74 |
75 |     dev_ids = []
76 |     with io.open(os.path.join(mrpc_dir, "dev_ids.tsv"), encoding='utf-8') as ids_fh:
77 |         for row in ids_fh:
78 |             dev_ids.append(row.strip().split('\t'))
79 |
80 |     with io.open(mrpc_train_file, encoding='utf-8') as data_fh, \
81 |          io.open(os.path.join(mrpc_dir, "train.tsv"), 'w', encoding='utf-8') as train_fh, \
82 |          io.open(os.path.join(mrpc_dir, "dev.tsv"), 'w', encoding='utf-8') as dev_fh:
83 |         header = data_fh.readline()
84 |         train_fh.write(header)
85 |         dev_fh.write(header)
86 |         for row in data_fh:
87 |             label, id1, id2, s1, s2 = row.strip().split('\t')
88 |             if [id1, id2] in dev_ids:
89 |                 dev_fh.write("%s\t%s\t%s\t%s\t%s\n" % (label, id1, id2, s1, s2))
90 |             else:
91 |                 train_fh.write("%s\t%s\t%s\t%s\t%s\n" % (label, id1, id2, s1, s2))
92 |
93 |     with io.open(mrpc_test_file, encoding='utf-8') as data_fh, \
94 |          io.open(os.path.join(mrpc_dir, "test.tsv"), 'w', encoding='utf-8') as test_fh:
95 |         header = data_fh.readline()
96 |         test_fh.write("index\t#1 ID\t#2 ID\t#1 String\t#2 String\n")
97 |         for idx, row in enumerate(data_fh):
98 |             label, id1, id2, s1, s2 = row.strip().split('\t')
99 |             test_fh.write("%d\t%s\t%s\t%s\t%s\n" % (idx, id1, id2, s1, s2))
100 |     print("\tCompleted!")
101 |
102 | def download_diagnostic(data_dir):
103 |     print("Downloading and extracting diagnostic...")
104 |     if not os.path.isdir(os.path.join(data_dir, "diagnostic")):
105 |         os.mkdir(os.path.join(data_dir, "diagnostic"))
106 |     data_file = os.path.join(data_dir, "diagnostic", "diagnostic.tsv")
107 |     URLLIB.urlretrieve(TASK2PATH["diagnostic"], data_file)
108 |     print("\tCompleted!")
109 |     return
110 |
111 | def get_tasks(task_names):
112 |     task_names = task_names.split(',')
113 |     if "all" in task_names:
114 |         tasks = TASKS
115 |     else:
116 |         tasks = []
117 |         for task_name in task_names:
118 |             assert task_name in TASKS, "Task %s not found!" % task_name
119 |             tasks.append(task_name)
120 |     return tasks
121 |
122 | def main(arguments):
123 |     parser = argparse.ArgumentParser()
124 |     parser.add_argument('-d', '--data_dir', help='directory to save data to', type=str, default='glue_data')
125 |     parser.add_argument('-t', '--tasks', help='tasks to download data for as a comma separated string',
126 |                         type=str, default='all')
127 |     parser.add_argument('--path_to_mrpc', help='path to directory containing extracted MRPC data, msr_paraphrase_train.txt and msr_paraphrase_test.txt',
128 |                         type=str, default='')
129 |     args = parser.parse_args(arguments)
130 |
131 |     if not os.path.isdir(args.data_dir):
132 |         os.mkdir(args.data_dir)
133 |     tasks = get_tasks(args.tasks)
134 |
135 |     for task in tasks:
136 |         if task == 'MRPC':
137 |             format_mrpc(args.data_dir, args.path_to_mrpc)
138 |         elif task == 'diagnostic':
139 |             download_diagnostic(args.data_dir)
140 |         else:
141 |             download_and_extract(task, args.data_dir)
142 |
143 |
144 | if __name__ == '__main__':
145 |     sys.exit(main(sys.argv[1:]))
146 |
--------------------------------------------------------------------------------
/bin/eval_squad.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | """Official evaluation script for SQuAD version 2.0.
3 |
4 | In addition to basic functionality, we also compute additional statistics and
5 | plot precision-recall curves if an additional na_prob.json file is provided.
6 | This file is expected to map question ID's to the model's predicted probability
7 | that a question is unanswerable.
8 | """
9 | import argparse
10 | import collections
11 | import json
12 | import numpy as np
13 | import os
14 | import re
15 | import string
16 | import sys
17 |
18 | OPTS = None
19 |
20 | def parse_args():
21 |     parser = argparse.ArgumentParser('Official evaluation script for SQuAD version 2.0.')
22 |     parser.add_argument('data_file', metavar='data.json', help='Input data JSON file.')
23 |     parser.add_argument('pred_file', metavar='pred.json', help='Model predictions.')
24 |     parser.add_argument('--out-file', '-o', metavar='eval.json',
25 |                         help='Write accuracy metrics to file (default is stdout).')
26 |     parser.add_argument('--na-prob-file', '-n', metavar='na_prob.json',
27 |                         help='Model estimates of probability of no answer.')
28 |     parser.add_argument('--na-prob-thresh', '-t', type=float, default=1.0,
29 |                         help='Predict "" if no-answer probability exceeds this (default = 1.0).')
30 |     parser.add_argument('--out-image-dir', '-p', metavar='out_images', default=None,
31 |                         help='Save precision-recall curves to directory.')
32 |     parser.add_argument('--verbose', '-v', action='store_true')
33 |     if len(sys.argv) == 1:
34 |         parser.print_help()
35 |         sys.exit(1)
36 |     return parser.parse_args()
37 |
38 | def make_qid_to_has_ans(dataset):
39 |     qid_to_has_ans = {}
40 |     for article in dataset:
41 |         for p in article['paragraphs']:
42 |             for qa in p['qas']:
43 |                 qid_to_has_ans[qa['id']] = bool(qa['answers'])
44 |     return qid_to_has_ans
45 |
46 | def normalize_answer(s):
47 |     """Lower text and remove punctuation, articles and extra whitespace."""
48 |     def remove_articles(text):
49 |         regex = re.compile(r'\b(a|an|the)\b', re.UNICODE)
50 |         return re.sub(regex, ' ', text)
51 |     def white_space_fix(text):
52 |         return ' '.join(text.split())
53 |     def remove_punc(text):
54 |         exclude = set(string.punctuation)
55 |         return ''.join(ch for ch in text if ch not in exclude)
56 |     def lower(text):
57 |         return text.lower()
58 |     return white_space_fix(remove_articles(remove_punc(lower(s))))
59 |
60 | def get_tokens(s):
61 |     if not s: return []
62 |     return normalize_answer(s).split()
63 |
64 | def compute_exact(a_gold, a_pred):
65 |     return int(normalize_answer(a_gold) == normalize_answer(a_pred))
66 |
67 | def compute_f1(a_gold, a_pred):
68 |     gold_toks = get_tokens(a_gold)
69 |     pred_toks = get_tokens(a_pred)
70 |     common = collections.Counter(gold_toks) & collections.Counter(pred_toks)
71 |     num_same = sum(common.values())
72 |     if len(gold_toks) == 0 or len(pred_toks) == 0:
73 |         # If either is no-answer, then F1 is 1 if they agree, 0 otherwise
74 |         return int(gold_toks == pred_toks)
75 |     if num_same == 0:
76 |         return 0
77 |     precision = 1.0 * num_same / len(pred_toks)
78 |     recall = 1.0 * num_same / len(gold_toks)
79 |     f1 = (2 * precision * recall) / (precision + recall)
80 |     return f1
81 |
82 | def get_raw_scores(dataset, preds):
83 |     exact_scores = {}
84 |     f1_scores = {}
85 |     for article in dataset:
86 |         for p in article['paragraphs']:
87 |             for qa in p['qas']:
88 |                 qid = qa['id']
89 |                 gold_answers = [a['text'] for a in qa['answers']
90 |                                 if normalize_answer(a['text'])]
91 |                 if not gold_answers:
92 |                     # For unanswerable questions, only correct answer is empty string
93 |                     gold_answers = ['']
94 |                 if qid not in preds:
95 |                     print('Missing prediction for %s' % qid)
96 |                     continue
97 |                 a_pred = preds[qid]
98 |                 # Take max over all gold answers
99 |                 exact_scores[qid] = max(compute_exact(a, a_pred) for a in gold_answers)
100 |                 f1_scores[qid] = max(compute_f1(a, a_pred) for a in gold_answers)
101 |     return exact_scores, f1_scores
102 |
103 | def apply_no_ans_threshold(scores, na_probs, qid_to_has_ans, na_prob_thresh):
104 |     new_scores = {}
105 |     for qid, s in scores.items():
106 |         pred_na = na_probs.get(qid, 0.0) > na_prob_thresh
107 |         if pred_na:
108 |             new_scores[qid] = float(not qid_to_has_ans[qid])
109 |         else:
110 |             new_scores[qid] = s
111 |     return new_scores
112 |
113 | def make_eval_dict(exact_scores, f1_scores, qid_list=None):
114 |     if not qid_list:
115 |         total = len(exact_scores)
116 |         return collections.OrderedDict([
117 |             ('exact', 100.0 * sum(exact_scores.values()) / total),
118 |             ('f1', 100.0 * sum(f1_scores.values()) / total),
119 |             ('total', total),
120 |         ])
121 |     else:
122 |         total = len(qid_list)
123 |         return collections.OrderedDict([
124 |             ('exact', 100.0 * sum(exact_scores[k] for k in qid_list) / total),
125 |             ('f1', 100.0 * sum(f1_scores[k] for k in qid_list) / total),
126 |             ('total', total),
127 |         ])
128 |
129 | def merge_eval(main_eval, new_eval, prefix):
130 |     for k in new_eval:
131 |         main_eval['%s_%s' % (prefix, k)] = new_eval[k]
132 |
133 | def plot_pr_curve(precisions, recalls, out_image, title):
134 |     plt.step(recalls, precisions, color='b', alpha=0.2, where='post')
135 |     plt.fill_between(recalls, precisions, step='post', alpha=0.2, color='b')
136 |     plt.xlabel('Recall')
137 |     plt.ylabel('Precision')
138 |     plt.xlim([0.0, 1.05])
139 |     plt.ylim([0.0, 1.05])
140 |     plt.title(title)
141 |     plt.savefig(out_image)
142 |     plt.clf()
143 |
144 | def make_precision_recall_eval(scores, na_probs, num_true_pos, qid_to_has_ans,
145 |                                out_image=None, title=None):
146 |     qid_list = sorted(na_probs, key=lambda k: na_probs[k])
147 |     true_pos = 0.0
148 |     cur_p = 1.0
149 |     cur_r = 0.0
150 |     precisions = [1.0]
151 |     recalls = [0.0]
152 |     avg_prec = 0.0
153 |     for i, qid in enumerate(qid_list):
154 |         if qid_to_has_ans[qid]:
155 |             true_pos += scores[qid]
156 |         cur_p = true_pos / float(i+1)
157 |         cur_r = true_pos / float(num_true_pos)
158 |         if i == len(qid_list) - 1 or na_probs[qid] != na_probs[qid_list[i+1]]:
159 |             # i.e., if we can put a threshold after this point
160 |             avg_prec += cur_p * (cur_r - recalls[-1])
161 |             precisions.append(cur_p)
162 |             recalls.append(cur_r)
163 |     if out_image:
164 |         plot_pr_curve(precisions, recalls, out_image, title)
165 |     return {'ap': 100.0 * avg_prec}
166 |
167 | def run_precision_recall_analysis(main_eval, exact_raw, f1_raw, na_probs,
168 |                                   qid_to_has_ans, out_image_dir):
169 |     if out_image_dir and not os.path.exists(out_image_dir):
170 |         os.makedirs(out_image_dir)
171 |     num_true_pos = sum(1 for v in qid_to_has_ans.values() if v)
172 |     if num_true_pos == 0:
173 |         return
174 |     pr_exact = make_precision_recall_eval(
175 |         exact_raw, na_probs, num_true_pos, qid_to_has_ans,
176 |         out_image=os.path.join(out_image_dir, 'pr_exact.png'),
177 |         title='Precision-Recall curve for Exact Match score')
178 |     pr_f1 = make_precision_recall_eval(
179 |         f1_raw, na_probs, num_true_pos, qid_to_has_ans,
180 |         out_image=os.path.join(out_image_dir, 'pr_f1.png'),
181 |         title='Precision-Recall curve for F1 score')
182 |     oracle_scores = {k: float(v) for k, v in qid_to_has_ans.items()}
183 |     pr_oracle = make_precision_recall_eval(
184 |         oracle_scores, na_probs, num_true_pos, qid_to_has_ans,
185 |         out_image=os.path.join(out_image_dir, 'pr_oracle.png'),
186 |         title='Oracle Precision-Recall curve (binary task of HasAns vs. NoAns)')
187 |     merge_eval(main_eval, pr_exact, 'pr_exact')
188 |     merge_eval(main_eval, pr_f1, 'pr_f1')
189 |     merge_eval(main_eval, pr_oracle, 'pr_oracle')
190 |
191 | def histogram_na_prob(na_probs, qid_list, image_dir, name):
192 |     if not qid_list:
193 |         return
194 |     x = [na_probs[k] for k in qid_list]
195 |     weights = np.ones_like(x) / float(len(x))
196 |     plt.hist(x, weights=weights, bins=20, range=(0.0, 1.0))
197 |     plt.xlabel('Model probability of no-answer')
198 |     plt.ylabel('Proportion of dataset')
199 |     plt.title('Histogram of no-answer probability: %s' % name)
200 |     plt.savefig(os.path.join(image_dir, 'na_prob_hist_%s.png' % name))
201 |     plt.clf()
202 |
203 | def find_best_thresh(preds, scores, na_probs, qid_to_has_ans):
204 |     num_no_ans = sum(1 for k in qid_to_has_ans if not qid_to_has_ans[k])
205 |     cur_score = num_no_ans
206 |     best_score = cur_score
207 |     best_thresh = 0.0
208 |     qid_list = sorted(na_probs, key=lambda k: na_probs[k])
209 |     for i, qid in enumerate(qid_list):
210 |         if qid not in scores: continue
211 |         if qid_to_has_ans[qid]:
212 |             diff = scores[qid]
213 |         else:
214 |             if preds[qid]:
215 |                 diff = -1
216 |             else:
217 |                 diff = 0
218 |         cur_score += diff
219 |         if cur_score > best_score:
220 |             best_score = cur_score
221 |             best_thresh = na_probs[qid]
222 |     return 100.0 * best_score / len(scores), best_thresh
223 |
224 | def find_all_best_thresh(main_eval, preds, exact_raw, f1_raw, na_probs, qid_to_has_ans):
225 |     best_exact, exact_thresh = find_best_thresh(preds, exact_raw, na_probs, qid_to_has_ans)
226 |     best_f1, f1_thresh = find_best_thresh(preds, f1_raw, na_probs, qid_to_has_ans)
227 |     main_eval['best_exact'] = best_exact
228 |     main_eval['best_exact_thresh'] = exact_thresh
229 |     main_eval['best_f1'] = best_f1
230 |     main_eval['best_f1_thresh'] = f1_thresh
231 |
232 | def main():
233 |     with open(OPTS.data_file) as f:
234 |         dataset_json = json.load(f)
235 |         dataset = dataset_json['data']
236 |     with open(OPTS.pred_file) as f:
237 |         preds = json.load(f)
238 |     if OPTS.na_prob_file:
239 |         with open(OPTS.na_prob_file) as f:
240 |             na_probs = json.load(f)
241 |     else:
242 |         na_probs = {k: 0.0 for k in preds}
243 |     qid_to_has_ans = make_qid_to_has_ans(dataset) # maps qid to True/False
244 |     has_ans_qids = [k for k, v in qid_to_has_ans.items() if v]
245 |     no_ans_qids = [k for k, v in qid_to_has_ans.items() if not v]
246 |     exact_raw, f1_raw = get_raw_scores(dataset, preds)
247 |     exact_thresh = apply_no_ans_threshold(exact_raw, na_probs, qid_to_has_ans,
248 |                                           OPTS.na_prob_thresh)
249 |     f1_thresh = apply_no_ans_threshold(f1_raw, na_probs, qid_to_has_ans,
250 |                                        OPTS.na_prob_thresh)
251 |     out_eval = make_eval_dict(exact_thresh, f1_thresh)
252 |     if has_ans_qids:
253 |         has_ans_eval = make_eval_dict(exact_thresh, f1_thresh, qid_list=has_ans_qids)
254 |         merge_eval(out_eval, has_ans_eval, 'HasAns')
255 |     if no_ans_qids:
256 |         no_ans_eval = make_eval_dict(exact_thresh, f1_thresh, qid_list=no_ans_qids)
257 |         merge_eval(out_eval, no_ans_eval, 'NoAns')
258 |     if OPTS.na_prob_file:
259 |         find_all_best_thresh(out_eval, preds, exact_raw, f1_raw, na_probs, qid_to_has_ans)
260 |     if OPTS.na_prob_file and OPTS.out_image_dir:
261 |         run_precision_recall_analysis(out_eval, exact_raw, f1_raw, na_probs,
262 |                                       qid_to_has_ans, OPTS.out_image_dir)
263 |         histogram_na_prob(na_probs, has_ans_qids, OPTS.out_image_dir, 'hasAns')
264 |         histogram_na_prob(na_probs, no_ans_qids, OPTS.out_image_dir, 'noAns')
265 |     if OPTS.out_file:
266 |         with open(OPTS.out_file, 'w') as f:
267 |             json.dump(out_eval, f)
268 |     else:
269 |         print(json.dumps(out_eval, indent=2))
270 |
271 | if __name__ == '__main__':
272 |     OPTS = parse_args()
273 |     if OPTS.out_image_dir:
274 |         import matplotlib
275 |         matplotlib.use('Agg')
276 |         import matplotlib.pyplot as plt
277 |     main()
278 |
279 |
--------------------------------------------------------------------------------
/data/brains/.gitkeep:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hans/nn-decoding/2d2cc639f650b6911cb1de7b8ecb7a872f75b36d/data/brains/.gitkeep
--------------------------------------------------------------------------------
/data/sentences/.gitkeep:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hans/nn-decoding/2d2cc639f650b6911cb1de7b8ecb7a872f75b36d/data/sentences/.gitkeep
--------------------------------------------------------------------------------
/main.nf:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env nextflow
2 |
3 | import org.yaml.snakeyaml.Yaml
4 |
5 | // Finetune parameters
6 | params.finetune_runs = 1
7 | params.finetune_steps = 250
8 | params.finetune_checkpoint_steps = 5
9 | params.finetune_learning_rate = "2e-5"
10 | params.finetune_squad_learning_rate = "3e-5"
11 | // CLI params shared across GLUE and SQuAD tasks
12 | finetune_cli_params = """--do_train=true --do_eval=true \
13 | --bert_config_file=\$BERT_MODEL/bert_config.json \
14 | --vocab_file=\$BERT_MODEL/vocab.txt \
15 | --init_checkpoint=\$BERT_MODEL/bert_model.ckpt \
16 | --num_train_steps=${params.finetune_steps} \
17 | --save_checkpoints_steps=${params.finetune_checkpoint_steps} \
18 | --output_dir ."""
19 |
20 | // Encoding extraction parameters.
21 | params.extract_encoding_layers = "-1"
22 | params.extract_encoding_cls = true
23 |
24 | // Decoder learning parameters
25 | params.decoder_projection = 256
26 | params.brain_projection = 256
27 | params.decoder_n_jobs = 5
28 | params.decoder_n_folds = 8
29 |
30 | // Structural probe parameters
31 | params.structural_probe_layers = "11"
32 | structural_probe_layers = params.structural_probe_layers.split(",")
33 | params.structural_probe_spec = "structural-probes/spec.yaml"
34 | structural_probe_spec = new Yaml().load((params.structural_probe_spec as File).text)
35 |
36 | // TODO generalize
37 | params.structural_probe_train_path = "structural-probes/en_ewt-ud/en_ewt-ud-train.txt"
38 | params.structural_probe_dev_path = "structural-probes/en_ewt-ud/en_ewt-ud-dev.txt"
39 | params.structural_probe_train_conll_path = "structural-probes/en_ewt-ud/en_ewt-ud-train.conllu"
40 | params.structural_probe_dev_conll_path = "structural-probes/en_ewt-ud/en_ewt-ud-dev.conllu"
41 |
42 |
43 | /////////
44 |
45 | params.outdir = "output"
46 |
47 | def get_checkpoint_num = { checkpoint_f ->
48 |     (checkpoint_f.name =~ /-step(\d+)/)[0][1] as int
49 | }
50 |
51 | // Given a channel of checkpoints grouped by `(model, run) => fs`, where fs are
52 | // per-step checkpoint files, flatten to a channel of checkpoints grouped by
53 | // `(model, run, step) => f`, where `f` is an individual checkpoint file.
54 | def flatten_checkpoint_channel = { ch ->
55 |     ch.flatMap {
56 |         output ->
57 |             run_id = output[0]
58 |             files = (output[1] instanceof Collection ? output[1] : [output[1]])
59 |             files.collect { f ->
60 |                 step_num = get_checkpoint_num(f)
61 |                 step_id = run_id + [step_num]
62 |                 [step_id, f]
63 |             }
64 |     }.groupTuple()
65 | }
66 |
67 | /////////
68 |
69 | glue_tasks = Channel.from("MNLI", "SST", "QQP")
70 | brain_images = Channel.fromPath([
71 |     // Download images for all subjects participating in experiment 2.
72 |     "https://www.dropbox.com/s/5umg2ktdxvautci/P01.tar?dl=1",
73 |     "https://www.dropbox.com/s/parmzwl327j0xo4/M02.tar?dl=1",
74 |     "https://www.dropbox.com/s/4p9sbd0k9sq4t5o/M04.tar?dl=1",
75 |     "https://www.dropbox.com/s/4gcrrxmg86t5fe2/M07.tar?dl=1",
76 |     "https://www.dropbox.com/s/3q6xhtmj611ibmo/M08.tar?dl=1",
77 |     "https://www.dropbox.com/s/kv1wm2ovvejt9pg/M09.tar?dl=1",
78 |     "https://www.dropbox.com/s/8i0r88n3oafvsv5/M14.tar?dl=1",
79 |     "https://www.dropbox.com/s/swc5tvh1ccx81qo/M15.tar?dl=1",
80 | ])
81 |
82 | /**
83 | * Uncompress brain image data.
84 | */
85 | process extractBrainData {
86 |     label "small"
87 |     publishDir "${params.outdir}/brains"
88 |
89 |     input:
90 |     file("*.tar*") from brain_images.collect()
91 |
92 |     output:
93 |     file("*") into brain_images_uncompressed
94 |
95 |     """
96 |     #!/usr/bin/env bash
97 |     find . -name '*tar*' | while read -r path; do
98 |         newpath="\${path%.*}"
99 |         mv "\$path" "\$newpath"
100 |         tar xf "\$newpath"
101 |         rm "\$newpath"
102 |     done
103 |     """
104 | }
105 |
106 | sentence_data = Channel.fromPath("https://www.dropbox.com/s/jtqnvzg3jz6dctq/stimuli_384sentences.txt?dl=1")
107 | sentence_data.into { sentence_data_for_extraction; sentence_data_for_decoder }
108 |
109 | /**
110 | * Fetch GLUE task data (SQuAD is not part of GLUE and is fetched separately below).
111 | */
112 | process fetchGLUEData {
113 | label "small"
114 |
115 | output:
116 | file("GLUE") into glue_data
117 |
118 | """
119 | #!/usr/bin/env bash
120 | download_glue_data.py -d GLUE -t SST,QQP,MNLI
121 | cd GLUE && ln -s SST-2 SST
122 | """
123 | }
124 |
125 | /**
126 | * Fetch the SQuAD dataset.
127 | */
128 | Channel.fromPath("https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v2.0.json").set { squad_train_ch }
129 | Channel.fromPath("https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v2.0.json").into {
130 | squad_dev_for_train_ch; squad_dev_for_eval_ch
131 | }
132 |
133 | /**
134 | * Fine-tune and evaluate the BERT model on the GLUE datasets (except SQuAD).
135 | */
136 | process finetuneGlue {
137 | label "gpu_large"
138 | container params.bert_container
139 | publishDir "${params.outdir}/bert/${run_id_str}"
140 | tag "${run_id_str}"
141 |
142 | input:
143 | val glue_task from glue_tasks
144 | each file(glue_dir) from glue_data
145 | each run from Channel.from(1..params.finetune_runs)
146 |
147 | output:
148 | set run_id, file("model.ckpt-step*") into model_ckpt_files_glue
149 | set run_id, file("eval_results.txt"), file("eval") into model_eval_glue
150 | set run_id, file("events.out*") into model_events_glue
151 |
152 | script:
153 | run_id = [glue_task, run]
154 | run_id_str = run_id.join("-")
155 | // TODO assert that glue_task exists in glue_dir
156 |
157 | """
158 | #!/usr/bin/env bash
159 | python /opt/bert/run_classifier.py --task_name=$glue_task \
160 | ${finetune_cli_params} \
161 | --data_dir=${glue_dir}/${glue_task} \
162 | --learning_rate ${params.finetune_learning_rate} \
163 | --max_seq_length 128 \
164 | --train_batch_size 32
165 |
166 | # Rename model checkpoints from model.ckpt-<N>* to model.ckpt-step<N>*
167 | for f in model.ckpt*; do
168 | newname=\$(echo "\$f" | sed 's/ckpt-\\([[:digit:]]\\+\\)/ckpt-step\\1/')
169 | mv "\$f" "\$newname"
170 | done
171 | """
172 | }
173 |
174 | /**
175 | * Fine-tune the BERT model on the SQuAD dataset.
176 | */
177 | process finetuneSquad {
178 | label "gpu_large"
179 | container params.bert_container
180 | publishDir "${params.outdir}/bert/${run_id_str}"
181 | tag "${run_id_str}"
182 |
183 | input:
184 | file("train.json") from squad_train_ch
185 | file("dev.json") from squad_dev_for_train_ch
186 | each run from Channel.from(1..params.finetune_runs)
187 |
188 | output:
189 | set val(run_id), file("model.ckpt-step*") into model_ckpt_files_squad
190 | set run_id, file("events.out*") into model_events_squad
191 |
192 | script:
193 | run_id = ["SQuAD", run]
194 | run_id_str = run_id.join("-")
195 |
196 | """
197 | #!/usr/bin/env bash
198 | python /opt/bert/run_squad.py \
199 | ${finetune_cli_params} \
200 | --train_file=train.json \
201 | --predict_file=dev.json \
202 | --max_seq_length 384 \
203 | --train_batch_size 12 \
204 | --doc_stride 128 \
205 | --learning_rate ${params.finetune_squad_learning_rate} \
206 | --version_2_with_negative=True
207 |
208 | # Rename model checkpoints from model.ckpt-<N>* to model.ckpt-step<N>*
209 | for f in model.ckpt*; do
210 | newname=\$(echo "\$f" | sed 's/ckpt-\\([[:digit:]]\\+\\)/ckpt-step\\1/')
211 | mv "\$f" "\$newname"
212 | done
213 | """
214 | }
215 |
216 | model_ckpt_files_squad.into { squad_for_eval; squad_for_extraction }
217 |
218 | /**
219 | * Run evaluation for the SQuAD fine-tuned models.
220 | */
221 | // Group SQuAD checkpoints based on their run-step.
222 | squad_eval_ckpts = flatten_checkpoint_channel(squad_for_eval)
223 |
224 | process evalSquad {
225 | label "gpu_medium"
226 | container params.bert_container
227 | publishDir "${params.outdir}/eval_squad/${ckpt_id_str}"
228 | tag "${ckpt_id_str}"
229 |
230 | input:
231 | set ckpt_id, file(ckpt_files) from squad_eval_ckpts
232 | each file("dev.json") from squad_dev_for_eval_ch
233 |
234 | output:
235 | set ckpt_id, file("predictions.json"), file("null_odds.json"), file("results.json") into squad_eval_results
236 |
237 | script:
238 | ckpt_id_str = ckpt_id.join("-")
239 | ckpt_step = ckpt_id.last()
240 |
241 | """
242 | #!/usr/bin/env bash
243 |
244 | # Output a dummy checkpoint metadata file.
245 | echo "model_checkpoint_path: \"model.ckpt-step${ckpt_step}\"" > checkpoint
246 |
247 | # Run prediction.
248 | python /opt/bert/run_squad.py --do_predict \
249 | --vocab_file=\$BERT_MODEL/vocab.txt \
250 | --bert_config_file=\$BERT_MODEL/bert_config.json \
251 | --init_checkpoint=model.ckpt-step${ckpt_step} \
252 | --predict_file=dev.json \
253 | --doc_stride 128 --version_2_with_negative=True \
254 | --predict_batch_size 32 \
255 | --output_dir .
256 |
257 | # Evaluate using SQuAD tools.
258 | eval_squad.py dev.json \
259 | predictions.json --na-prob-file null_odds.json > results.json
260 | """
261 | }
262 |
263 | /**
264 | * Prepare a dummy model checkpoint from the pretrained BERT model.
265 | */
266 | process prepareBaselineCheckpoint {
267 | label "small"
268 | container params.bert_container
269 | publishDir "${params.outdir}/bert/${run_id_str}"
270 | tag "${run_id_str}"
271 |
272 | output:
273 | set run_id, file("model.ckpt-step*") into model_ckpt_files_baseline
274 |
275 | script:
276 | run_id = ["baseline", 1]
277 | run_id_str = run_id.join("-")
278 |
279 | '''
280 | #!/usr/bin/env bash
281 |
282 | for ckpt_file in $BERT_MODEL/bert_model.ckpt*; do
283 | newname=$(basename "$ckpt_file" | sed 's/\\(.\\+\\).ckpt./model.ckpt-step0./')
284 | cp "$ckpt_file" "$newname"
285 | done
286 | '''
287 | }
288 |
289 |
290 | // Concatenate GLUE results with SQuAD and baseline results.
291 | // Each channel item is grouped by key `(model, run)`.
292 | model_ckpt_files_glue.concat(squad_for_extraction).concat(model_ckpt_files_baseline) \
293 | .into { model_ckpts_for_decoder; model_ckpts_for_sprobe }
294 |
295 | /* // Group model checkpoints by keys `(, )`. */
296 | /* model_ckpt_files.flatMap { output -> */
297 | /* run_id = output[0] */
298 | /* files = output[1] */
299 | /* files.collect { f -> */
300 | /* tuple(tuple(ckpt_id[0], (file.name =~ /^model.ckpt-(\d+)/)[0][1]), */
301 | /* file) } } */
302 | /* .groupTuple() */
303 | /* .into { model_ckpts_for_decoder; model_ckpts_for_sprobe } */
304 |
305 | /**
306 | * Extract .jsonl sentence encodings from each fine-tuned model.
307 | */
308 | process extractEncoding {
309 | label "gpu_medium"
310 | container params.bert_container
311 |
312 | input:
313 | set run_id, file(ckpt_files) from model_ckpts_for_decoder
314 | each file(sentences) from sentence_data_for_extraction
315 |
316 | output:
317 | set run_id, file("encodings*.jsonl") into encodings_jsonl
318 |
319 | tag "${run_id_str}"
320 |
321 | script:
322 | run_id_str = run_id.join("-")
323 |
324 | all_ckpts = ckpt_files.target.collect(get_checkpoint_num).unique()
325 | all_ckpts_str = all_ckpts.join(" ")
326 |
327 | """
328 | #!/usr/bin/env bash
329 |
330 | for ckpt in ${all_ckpts_str}; do
331 | python /opt/bert/extract_features.py \
332 | --input_file=${sentences} \
333 | --output_file=encodings-step\$ckpt.jsonl \
334 | --vocab_file=\$BERT_MODEL/vocab.txt \
335 | --bert_config_file=\$BERT_MODEL/bert_config.json \
336 | --init_checkpoint=model.ckpt-step\$ckpt \
337 | --layers="${params.extract_encoding_layers}" \
338 | --max_seq_length=128 \
339 | --batch_size=64
340 | done
341 | """
342 | }
343 |
344 | // Expand jsonl encodings into individual identifier + jsonl files
345 | // (one item per task-run-step)
346 | encodings_jsonl_flat = flatten_checkpoint_channel(encodings_jsonl)
347 |
348 | /**
349 | * Convert .jsonl encodings to easier-to-use numpy arrays, saved as .npy
350 | */
351 | process convertEncoding {
352 | label "medium"
353 | container params.bert_container
354 | tag "${ckpt_id_str}"
355 | publishDir "${params.outdir}/encodings/${ckpt_id_str}"
356 |
357 | input:
358 | set ckpt_id, file(encoding_jsonl) from encodings_jsonl_flat
359 |
360 | output:
361 | set ckpt_id, file("*.npy") into encodings
362 |
363 | script:
364 | ckpt_id_str = ckpt_id.join("-")
365 |
366 | if (params.extract_encoding_cls) {
367 | modifier_flag = "-c"
368 | } else {
369 | modifier_flag = "-l ${params.extract_encoding_layers}"
370 | }
371 |
372 | """
373 | #!/usr/bin/env bash
374 | python /opt/bert/process_encodings.py \
375 | -i ${encoding_jsonl} \
376 | ${modifier_flag} \
377 | -o ${ckpt_id_str}.npy
378 | """
379 | }
380 |
381 | encodings.combine(brain_images_uncompressed.flatten()).set { encodings_brains }
382 |
383 | /**
384 | * Learn regression models mapping between brain images and model encodings.
385 | */
386 | process learnDecoder {
387 | label "medium"
388 | container params.decoding_container
389 |
390 | publishDir "${params.outdir}/decoders/${tag_str}"
391 | cpus params.decoder_n_jobs
392 |
393 | input:
394 | set ckpt_id, file(encoding), file(brain_dir) from encodings_brains
395 | each file(sentences) from sentence_data_for_decoder
396 |
397 | output:
398 | set file("decoder.csv"), file("decoder.pred.npy")
399 |
400 | tag "${tag_str}"
401 |
402 | script:
403 | ckpt_id_str = ckpt_id.join("-")
404 | tag_str = "${ckpt_id_str}-${brain_dir.name}"
405 | """
406 | #!/usr/bin/env bash
407 | python /opt/nn-decoding/src/learn_decoder.py ${sentences} \
408 | ${brain_dir} ${encoding} \
409 | --n_jobs ${params.decoder_n_jobs} \
410 | --n_folds ${params.decoder_n_folds} \
411 | --out_prefix decoder \
412 | --encoding_project ${params.decoder_projection} \
413 | --image_project ${params.brain_projection}
414 | """
415 | }
416 |
417 | sprobe_train_ch = Channel.fromPath(params.structural_probe_train_path)
418 | sprobe_dev_ch = Channel.fromPath(params.structural_probe_dev_path)
419 |
420 | /**
421 | * Extract encodings for structural probe analysis (expects hdf5 format).
422 | */
423 | process extractEncodingForStructuralProbe {
424 | label "gpu_medium"
425 | container params.bert_container
426 | tag "${run_id_str}"
427 |
428 | input:
429 | set run_id, file(ckpt_files) from model_ckpts_for_sprobe
430 | each file("train.txt") from sprobe_train_ch
431 | each file("dev.txt") from sprobe_dev_ch
432 |
433 | output:
434 | set run_id, file("encodings-*.hdf5") into encodings_sprobe
435 |
436 | script:
437 | run_id_str = run_id.join("-")
438 |
439 | all_ckpts = ckpt_files.collect(get_checkpoint_num).unique()
440 | all_ckpts_str = all_ckpts.join(" ")
441 | sprobe_layers = structural_probe_layers.join(",")
442 |
443 | """
444 | #!/usr/bin/env bash
445 | for ckpt in ${all_ckpts_str}; do
446 | for split in train dev; do
447 | python /opt/bert/extract_features.py \
448 | --input_file=\$split.txt \
449 | --output_file=encodings-step\$ckpt-\$split.hdf5 \
450 | --vocab_file=\$BERT_MODEL/vocab.txt \
451 | --bert_config_file=\$BERT_MODEL/bert_config.json \
452 | --init_checkpoint=model.ckpt-step\$ckpt \
453 | --layers="${sprobe_layers}" \
454 | --max_seq_length=96 \
455 | --batch_size=64 \
456 | --output_format=hdf5
457 | done
458 | done
459 | """
460 | }
461 |
462 | // Expand hdf5 encodings (grouped per model run) into individual hdf5 file pairs
463 | // (train and dev), grouped by model-run-step
464 | encodings_sprobe_flat = flatten_checkpoint_channel(encodings_sprobe)
465 |
466 | // Now within each channel, order hdf5 by train / dev / etc.
467 | encodings_sprobe_flat.map {
468 | el -> [el[0], el[1].groupBy { f -> (f.name =~ /-(\w+).hdf5/)[0][1] }]
469 | }.map {
470 | el ->
471 | [el[0], el[1].train, el[1].dev]
472 | }.set { encodings_sprobe_readable }
473 |
474 | sprobe_train_conll_ch = Channel.fromPath(params.structural_probe_train_conll_path)
475 | sprobe_dev_conll_ch = Channel.fromPath(params.structural_probe_dev_conll_path)
476 | sprobe_test_conll_ch = Channel.fromPath(params.structural_probe_dev_conll_path)
477 |
478 | /**
479 | * Train and evaluate structural probe for each checkpoint and each layer.
480 | */
481 | process runStructuralProbe {
482 | label "medium"
483 | container params.structural_probes_container
484 | tag "${ckpt_id_str}"
485 | publishDir "${params.outdir}/structural-probe/${ckpt_id_str}"
486 |
487 | input:
488 | set ckpt_id, file("encodings-train.hdf5"), file("encodings-dev.hdf5") \
489 | from encodings_sprobe_readable
490 | each file(train_conll) from sprobe_train_conll_ch
491 | each file(dev_conll) from sprobe_dev_conll_ch
492 | each layer from Channel.from(structural_probe_layers)
493 |
494 | output:
495 | set ckpt_id, file("dev.*") into sprobe_results
496 |
497 | script:
498 | ckpt_id_str = ckpt_id.join("-")
499 |
500 | // Copy YAML template
501 | spec = new Yaml().load(new Yaml().dump(structural_probe_spec))
502 |
503 | spec.model.model_layer = layer as int
504 |
505 | spec.dataset.corpus.root = "."
506 | spec.dataset.corpus.train_path = train_conll.getName()
507 | spec.dataset.corpus.dev_path = dev_conll.getName()
508 | spec.dataset.corpus.test_path = dev_conll.getName()
509 |
510 | spec.dataset.embeddings.train_path = "encodings-train.hdf5"
511 | spec.dataset.embeddings.dev_path = "encodings-dev.hdf5"
512 | spec.dataset.embeddings.test_path = "encodings-dev.hdf5"
513 |
514 | // Prepare to save to temporary file.
515 | yaml_spec_text = new Yaml().dump(spec)
516 |
517 | """
518 | #!/usr/bin/env bash
519 | cat <<EOF > spec.yaml
520 | ${yaml_spec_text}
521 | EOF
522 |
523 | /opt/conda/bin/python /opt/structural-probes/structural-probes/run_experiment.py \
524 | --train-probe 1 --results-dir . spec.yaml
525 | """
526 | }
527 |
--------------------------------------------------------------------------------
/nextflow.config:
--------------------------------------------------------------------------------
1 | /**
2 | * This configuration file specifies a default pipeline setup for running brain
3 | * decoding on a local machine with a GPU. (See the project README for minimal
4 | * computing specs to run the pipeline.)
5 | *
6 | * While this pipeline does *work* running on a local machine, we recommend
7 | * deploying on a high-performance cluster. See the file
8 | * `nextflow.slurm.config` for an example deployment configuration for SLURM
9 | * clusters.
10 | */
11 | process {
12 | executor = "local"
13 |
14 | /**
15 | * Pipeline processes are assigned "labels" according to their
16 | * computational requirements. Here we can specify the actual effects of
17 | * each label when the pipeline runs on your system.
18 | *
19 | * This first label, `small`, describes simple tasks which can easily run
20 | * on a single CPU with minimal RAM -- e.g. downloading a file.
21 | */
22 | withLabel: 'small' {
23 | executor = 'local'
24 | memory = '1G'
25 | }
26 |
27 | /**
28 | * `medium` describes tasks which require moderate host memory, e.g.
29 | * loading brain images and learning linear regression models.
30 | */
31 | withLabel: 'medium' {
32 | time = '1d'
33 | memory = '8G'
34 | }
35 |
36 | /**
37 | * `gpu_medium` describes tasks which require a GPU with moderate memory
38 | * and moderate host RAM, for e.g. holding a dataset in memory and running
39 | * neural network feed-forward inference.
40 | */
41 | withLabel: 'gpu_medium' {
42 | containerOptions = "--nv"
43 | }
44 |
45 | /**
46 | * `gpu_large` describes tasks which require a GPU with lots of memory and
47 | * large host RAM, for e.g. holding a training dataset in memory and
48 | * running neural network training.
49 | */
50 | withLabel: 'gpu_large' {
51 | containerOptions = "--nv"
52 | time = '1d'
53 | memory = '16G'
54 | containerOptions = "--nv"
55 | }
56 | }
57 |
58 | /**
59 | * You can limit the maximum number of parallel executing processes using
60 | * the variable below. This may be relevant when running with a single GPU,
61 | * for example.
62 | */
63 | executor {
64 | queueSize = 1
65 | }
66 |
67 | // There should be no need to edit below this line.
68 | //////////////////////////////////////////
69 |
70 | params.bert_container = "library://jon/default/bert:base-gpu"
71 | params.structural_probes_container = "library://jon/default/structural-probes:latest"
72 | params.decoding_container = "library://jon/default/nn-decoding:emnlp2019"
73 |
74 | singularity {
75 | enabled = true
76 | envWhitelist = "CUDA_VISIBLE_DEVICES"
77 | autoMounts = true
78 | }
79 | report.enabled = true
80 |
81 |
--------------------------------------------------------------------------------
/nextflow.slurm.config:
--------------------------------------------------------------------------------
1 | /**
2 | * This configuration file specifies an example deployment of the pipeline for
3 | * a SLURM HPC cluster. This example can be repurposed to fit other HPC setups,
4 | * from PBS to Kubernetes. See the Nextflow docs for more information:
5 | * https://www.nextflow.io/docs/latest/executor.html
6 | */
7 | process {
8 | /* Specify an HPC queue. */
9 | /* queue = "cpl" */
10 |
11 | /**
12 | * Pipeline processes are assigned "labels" according to their
13 | * computational requirements. Here we can specify the actual effects of
14 | * each label when the pipeline runs on your system.
15 | *
16 | * This first label, `small`, describes simple tasks which can easily run
17 | * on a single CPU with minimal RAM -- e.g. downloading a file.
18 | */
19 | withLabel: 'small' {
20 | executor = 'local'
21 | time = '1h'
22 | }
23 |
24 | /**
25 | * `medium` describes tasks which require moderate host memory, e.g.
26 | * loading brain images and learning linear regression models.
27 | */
28 | withLabel: 'medium' {
29 | executor = 'slurm'
30 |
31 | time = '1d'
32 | memory = '8G'
33 | }
34 |
35 | /**
36 | * `gpu_medium` describes tasks which require a GPU with moderate memory
37 | * and moderate host RAM, for e.g. holding a dataset in memory and running
38 | * neural network feed-forward inference.
39 | */
40 | withLabel: 'gpu_medium' {
41 | executor = 'slurm'
42 | containerOptions = "--nv"
43 | clusterOptions = "--gres=gpu:tesla-k80:1"
44 |
45 | time = '1h'
46 | memory = '8G'
47 | }
48 |
49 | /**
50 | * `gpu_large` describes tasks which require a GPU with lots of memory and
51 | * large host RAM, for e.g. holding a training dataset in memory and
52 | * running neural network training.
53 | */
54 | withLabel: 'gpu_large' {
55 | executor = 'slurm'
56 | containerOptions = "--nv"
57 | clusterOptions = '--gres=gpu:GEFORCEGTX1080TI:1'
58 |
59 | time = '1d'
60 | memory = '8G'
61 | }
62 | }
63 |
64 | executor {
65 | $slurm {
66 | // Limit number of parallel SLURM jobs to 16.
67 | queueSize = 16
68 | }
69 | }
70 |
71 | // There should be no need to edit below this line.
72 | //////////////////////////////////////////
73 |
74 | params.bert_container = "library://jon/default/bert:base-gpu"
75 | params.structural_probes_container = "library://jon/default/structural-probes:latest"
76 | params.decoding_container = "library://jon/default/nn-decoding:emnlp2019"
77 |
78 | singularity {
79 | enabled = true
80 | envWhitelist = "CUDA_VISIBLE_DEVICES"
81 | autoMounts = true
82 | }
83 | report.enabled = true
84 |
85 |
--------------------------------------------------------------------------------
/notebooks/pca_check.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": 5,
6 | "metadata": {},
7 | "outputs": [
8 | {
9 | "name": "stdout",
10 | "output_type": "stream",
11 | "text": [
12 | "The autoreload extension is already loaded. To reload it, use:\n",
13 | " %reload_ext autoreload\n"
14 | ]
15 | }
16 | ],
17 | "source": [
18 | "from functools import partial\n",
19 | "import itertools\n",
20 | "import json\n",
21 | "from pathlib import Path\n",
22 | "import re\n",
23 | "import sys\n",
24 | "sys.path.append(\"../src\")\n",
25 | "\n",
26 | "import matplotlib\n",
27 | "import matplotlib.pyplot as plt\n",
28 | "import numpy as np\n",
29 | "import pandas as pd\n",
30 | "import seaborn as sns\n",
31 | "import scipy.io as io\n",
32 | "import scipy.stats as st\n",
33 | "from sklearn.decomposition import PCA\n",
34 | "import statsmodels.formula.api as smf\n",
35 | "from tqdm import tqdm, tqdm_notebook\n",
36 | "\n",
37 | "%matplotlib inline\n",
38 | "sns.set(style=\"whitegrid\", context=\"paper\", font_scale=3, rc={\"lines.linewidth\": 2})\n",
39 | "from IPython.display import set_matplotlib_formats\n",
40 | "set_matplotlib_formats('png')\n",
41 | "#set_matplotlib_formats('svg')\n",
42 | "\n",
43 | "%load_ext autoreload\n",
44 | "%autoreload 2\n",
45 | "import util"
46 | ]
47 | },
48 | {
49 | "cell_type": "code",
50 | "execution_count": 2,
51 | "metadata": {},
52 | "outputs": [],
53 | "source": [
54 | "brains_path = Path(\"../data/brains\")"
55 | ]
56 | },
57 | {
58 | "cell_type": "code",
59 | "execution_count": 3,
60 | "metadata": {},
61 | "outputs": [],
62 | "source": [
63 | "PCA_DIM = 256"
64 | ]
65 | },
66 | {
67 | "cell_type": "code",
68 | "execution_count": 6,
69 | "metadata": {},
70 | "outputs": [
71 | {
72 | "data": {
73 | "application/vnd.jupyter.widget-view+json": {
74 | "model_id": "5e4ac43d358643e6817d00355a6890ac",
75 | "version_major": 2,
76 | "version_minor": 0
77 | },
78 | "text/plain": [
79 | "HBox(children=(IntProgress(value=0, max=10), HTML(value='')))"
80 | ]
81 | },
82 | "metadata": {},
83 | "output_type": "display_data"
84 | },
85 | {
86 | "name": "stdout",
87 | "output_type": "stream",
88 | "text": [
89 | "\n"
90 | ]
91 | }
92 | ],
93 | "source": [
94 | "pca_results = []\n",
95 | "for brain_el in tqdm_notebook(list(brains_path.iterdir())):\n",
96 | "    if not brain_el.is_dir(): continue\n",
97 | "    \n",
98 | "    images = io.loadmat(brain_el / \"examples_384sentences.mat\")[\"examples\"]\n",
99 | "    pca = PCA(PCA_DIM).fit(images)\n",
100 | "    \n",
101 | "    subject_name = brain_el.name\n",
102 | "    pca_results.append((subject_name, sum(pca.explained_variance_ratio_)))"
103 | ]
104 | },
105 | {
106 | "cell_type": "code",
107 | "execution_count": 7,
108 | "metadata": {},
109 | "outputs": [
110 | {
111 | "data": {
112 | "text/html": [
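113 | "<div>\n",
114 | "<style scoped>\n",
115 | "    .dataframe tbody tr th:only-of-type {\n",
116 | "        vertical-align: middle;\n",
117 | "    }\n",
118 | "\n",
119 | "    .dataframe tbody tr th {\n",
120 | "        vertical-align: top;\n",
121 | "    }\n",
122 | "\n",
123 | "    .dataframe thead th {\n",
124 | "        text-align: right;\n",
125 | "    }\n",
126 | "</style>\n",
127 | "<table border=\"1\" class=\"dataframe\">\n",
128 | "  <thead>\n",
129 | "    <tr style=\"text-align: right;\">\n",
130 | "      <th></th>\n",
131 | "      <th>subject</th>\n",
132 | "      <th>explained_variance</th>\n",
133 | "    </tr>\n",
134 | "  </thead>\n",
135 | "  <tbody>\n",
136 | "    <tr>\n",
137 | "      <th>0</th>\n",
138 | "      <td>M02</td>\n",
139 | "      <td>0.963260</td>\n",
140 | "    </tr>\n",
141 | "    <tr>\n",
142 | "      <th>1</th>\n",
143 | "      <td>M04</td>\n",
144 | "      <td>0.970155</td>\n",
145 | "    </tr>\n",
146 | "    <tr>\n",
147 | "      <th>2</th>\n",
148 | "      <td>M07</td>\n",
149 | "      <td>0.965320</td>\n",
150 | "    </tr>\n",
151 | "    <tr>\n",
152 | "      <th>3</th>\n",
153 | "      <td>M08</td>\n",
154 | "      <td>0.978509</td>\n",
155 | "    </tr>\n",
156 | "    <tr>\n",
157 | "      <th>4</th>\n",
158 | "      <td>M09</td>\n",
159 | "      <td>0.974115</td>\n",
160 | "    </tr>\n",
161 | "    <tr>\n",
162 | "      <th>5</th>\n",
163 | "      <td>M14</td>\n",
164 | "      <td>0.982325</td>\n",
165 | "    </tr>\n",
166 | "    <tr>\n",
167 | "      <th>6</th>\n",
168 | "      <td>M15</td>\n",
169 | "      <td>0.975107</td>\n",
170 | "    </tr>\n",
171 | "    <tr>\n",
172 | "      <th>7</th>\n",
173 | "      <td>P01</td>\n",
174 | "      <td>0.950888</td>\n",
175 | "    </tr>\n",
176 | "  </tbody>\n",
177 | "</table>\n",
178 | "</div>"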
179 | ],
180 | "text/plain": [
181 | " subject explained_variance\n",
182 | "0 M02 0.963260\n",
183 | "1 M04 0.970155\n",
184 | "2 M07 0.965320\n",
185 | "3 M08 0.978509\n",
186 | "4 M09 0.974115\n",
187 | "5 M14 0.982325\n",
188 | "6 M15 0.975107\n",
189 | "7 P01 0.950888"
190 | ]
191 | },
192 | "execution_count": 7,
193 | "metadata": {},
194 | "output_type": "execute_result"
195 | }
196 | ],
197 | "source": [
198 | "pca_results = pd.DataFrame(pca_results, columns=[\"subject\", \"explained_variance\"])\n",
199 | "pca_results"
200 | ]
201 | },
202 | {
203 | "cell_type": "code",
204 | "execution_count": 9,
205 | "metadata": {},
206 | "outputs": [
207 | {
208 | "data": {
209 | "text/plain": [
210 | ""
211 | ]
212 | },
213 | "execution_count": 9,
214 | "metadata": {},
215 | "output_type": "execute_result"
216 | },
217 | {
218 | "data": {
219 | "image/png": "iVBORw0KGgoAAAANSUhEUgAAAbUAAAFJCAYAAAAc+rO/AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAIABJREFUeJzt3XlYVGX/P/D3GTZlVURBU5BSUUnN3M3KfckNyC3NJTPF3efxMTW11PpmPk+LpmmKlkuuKWCuuWGCoSguIOAOLpgji4AsygDn9we/Oc3AzDCcGRbp/bourmuYc59zfxgGPnPf514EURRFEBERVQGKig6AiIjIXJjUiIioymBSIyKiKoNJjYiIqgwmNSIiqjKY1IiIqMpgUiMioiqDSY2IiKoMJjUiIqoymNSIiKjKYFIjIqIqw9IcF7l06RJ2796Nixcv4vHjx3j+/Dl+++03NGrUSCpz8eJF3LlzB3Z2dujXr585qiUiItJiUlJTqVRYvHgxAgMDAQDqtZEFQShW9vnz51i4cCEUCgVatGiB+vXrm1I1ERFRMSZ1Py5atAiBgYEQRREuLi7o3bu33rKdOnWCh4cHRFHE8ePHTamWiIhIJ9lJLSIiAsHBwQCAcePGISQkBN9//73Bc3r16gVRFBERESG3WiIiIr1kdz/u3r0bANC+fXvMmzfPqHNatGgBALh9+7bcaomIiPSS3VK7dOkSBEHA8OHDjT7Hzc0NAJCUlCS3WiIiIr1kJ7Xk5GQAQMOGDY0+x9raGgCQm5srt1oiIiK9ZCc1S8vCnsvMzEyjz1EnQgcHB7nVEhER6SU7qdWuXRsAcP/+faPPOX/+PABwOD8REZUJ2UmtXbt2EEVRGgFZktTUVOzevRuCIKB9+/ZyqyUiItJLdlLz8/MDAFy4cAGHDh0yWDY1NRVTpkxBWloaFAoFhgwZIrdaIiIivWQP6W/dujUGDBiAAwcOYM6cOTh37hz69+8vHVcqlUhKSkJoaCj27NmDp0+fQhAEjBgxAp6enmYJnoiISJMgqte2kuH58+eYNGkSzp49q3NpLDV1FW+++SbWrl0rDTIhIiIyJ5OSGgAUFBQgICAAP//8M9LS0nSWsbe3x/jx4+Hv7w+FghsDEBFR2TA5qak9e/YM58+fR1RUFFJTU5GXlwdnZ2d4e3ujU6dOsLOzM0c1REREepktqREREVU09gUSEVGVwaRGRERVhuykdvPmTbRv3x6dOnXCo0ePSiz/6NEjdOzYER06dMC9e/fkVktERKSX7KT2+++/IyMjAy1atJBW3zfEzc0NLVu2REZGBg4fPiy3WiIiIr1kJ7U///wTgiDg7bffNvqcbt26QRRFhIWFya2WiIhIL9lJTd3l6OXlZfQ5jRs3BgA8fPhQbrVERER6yU5q6o0+7e3tjT5HXVa9BQ0REZE5yV6vytbWFhkZGXjy5InR56jLWllZyapTFEXcuXMHUVFR0tf169ehUqkAACdOnDDbtjbXr1/H5s2bER4ejuTkZDg5OcHb2xsjRoxAt27dzFIHERGZl+ykVq9ePWRkZCAyMhKdOnUy6pwLFy4AAOrWrSurzsTERLzzzjuyzi2NoKAgLFq0SEqWQGHL9NSpUzh16hTee+89LF68uMzjICKi0pHd/di+fXuIoojt27cjPT29xPJPnjzB9u3bzbafmqurK3r16oW2bduafC1NkZGRWLhwIVQqFZo0aYKNGzciPDwcgYGB6NmzJwBgx44dCAgIMGu9RERkOtnLZN2+fRsDBw6EKIpo0aIFVq1aBVdXV51llUolpk2bhujoaCgUCgQFBZVqgIlaZmYmzp49i1atWkk7b69atQqrV68GYJ7ux6FDhyIqKgouLi44cOAAatasKR0TRREffvghzpw5A1tbW5w4cQLOzs5GXzsyMtKk2IiI/onatGljdFnZ3Y+vvPIKRo4ciV9++QXR0dHo27cv
+vXrhw4dOqBOnToQBAFKpRLnzp3D4cOH8ezZM2k/NTkJDSgcaKJuLZWF6OhoREVFAQAmTJigldAAQBAEzJ49G2fOnEF2djb27duHDz74oFR1lOaXQ0T0T1faxoBJG5vNnz8fSUlJ+P3335GTk4OgoCAEBQUVK6duDPbt2xcLFiwwpcoyFRISIj3u16+fzjLe3t5wd3fHvXv3cPLkyVInNSIiKjsmrf1oYWGBlStXYunSpXjppZcgiqLOrwYNGuCLL77AihUrYGFhYa7YzS4mJgZA4f06Q6uktGrVSqs8ERFVDmbZgnrYsGEYNmwYrl27hpiYGKSmpgIAnJ2d8eqrr8rubixv8fHxAIAGDRoYLKe+b5eVlQWlUqn3XiIREZUvsyQ1taZNm6Jp06bmvGS5Us+jq1WrlsFymsfT0tKY1IiIKgluPaMhJycHAGBtbW2wXLVq1aTH2dnZZRoTEREZz6wttapCEASDx03ZLDwuLk72uUREZJhZkpooirh16xbu37+PzMxMFBQUlHiOj4+POao2q+rVq0OlUuH58+cGy2ket7W1LVUdzZo1kxUbEdE/UbkO6c/Ly8P69euxfft2pKSkGH2eIAiVMqnVrFkTGRkZJf4smsdr1KhR1mEREZGRZCe1vLw8TJw4EeHh4SZ1x1Umnp6euHv3Lu7fv2+w3IMHDwAAdnZ2HCRCVIFysrOQn6cquWAZsrC0QnVbuwqNgf4mO6nt2LEDf/75J4DCFo6fnx9atmwJJycnKBQv5vgTb29vnDp1Ckql0uBQ/StXrkjliUojNzMTBaqK/ScMAAorK1iXYtuoyio/T4X1y+ZWaAwT5y+v0PpJm+yktn//fgCAh4cHduzYUao1ECurbt264YcffgAAHD58GOPGjStWJjY2Fvfu3QMAdO/evTzDoyqgQKXCyZn/rugw0H3ltwaP52Q/R15eyffGy5qlpQLVbW0qOgyT5T1TQcyv2NdTsFDAspq8bb9eJLKT2u3btyEIAiZPnlwlEhoAtGjRAi1btkRUVBQ2bNgAHx8frXtmoijim2++AVA4QGTw4MEVFWq5ysvJREFeXoXGoLC0hGX1F79l8aLIyyvAquUHKzoMTJ/bv6JDMAsxvwCxm8MrNIbmY43bIuxFJzupqUc4NmnSxGzBGOPWrVvIzMyUvn/06JH0OC4uTmtXbXd3d62EGxgYiPnz5wMAli1bBj8/v2LXnzdvHsaMGYOkpCSMHj0a8+bNQ7NmzaBUKrFmzRqEhYUBAKZMmWJyMs/Mfo5cVb5J1zCVtZUF7Ev4JFyQl4fotf8pp4h0azH5a4PHM59lQVVQsYkXAKwUlrCvxvsrRBXFpE1C79y5U+6Tj5csWYKIiAidx6ZNm6b1vb7EZUibNm3wxRdfYNGiRbhx4wbGjx9frMyIESPw0Ucfleq6uuSq8jHt/4ovAF2eVi/wrdD6zUVVkId/7fq0osPAd8OXVnQIRP9oskd09OjRAwBw/vx5swVTWfj6+mLv3r3w8/NDvXr1YGVlBRcXF7z99tv48ccfsWTJkooOkYiIdJDdUhs3bhwCAwOxefNm+Pr6GlzV3py2bt0q+1w/Pz+jW25eXl5YtmyZ7LqIiF40OTk5yM+v2FsiFhYWqF69uuzzZSc1Z2dnrFmzBpMnT8bIkSOxaNEidOvWTXYgRERUsfLz87Fu3boKjWHSpEkmnS87qY0ZMwYA4ODggISEBEyZMgUODg5o2LCh1oK/ugiCgM2bN8utmoiISCfZSS0iIkJa+FcQBIiiiIyMDERHRxs8TxTFEhcMJiIiksOk0Y9ERESVieykdvLkSXPGQUREZLIXc5FGIiIiHZjUiIioymBSIyKiKoNJjYiIqgyTdr5Wi4yMxNGjRxEbG4u0tDTk5OQY3DhUEAQcP37cHFUTERFJTEpqGRkZmDNnDk6fPg0AOhOZeg5b0eeIiIjMzaStZ6ZOnYoLFy5AFEXUrFkTbm5uiIuLgyAIaNOmDdLT0xEfH4+8
vDwIggBPT0/UqlXLnPETERFJZCe1I0eO4Pz58xAEAVOnTsXUqVNx69YtDBo0CADwyy+/AACysrKwa9cufP/990hPT8eyZcvQqlUr80RPRESkQfZAkUOHDgEo3C16+vTpUCgUOrsV7ezsMH78eGzYsAHp6emYNm0aUlNT5UdMRESkh+ykdvXqVQiCgKFDhxpVvm3bthgyZAiSkpKwbds2udUSERHpJTupPXnyBADQoEGDvy+m+Ptyubm5xc5RbyzKJbaIiKgsyE5q6hGNNWrUkJ6zt7eXHqekpBQ7x9nZGQDw8OFDudUSERHpJTupqUcxZmRkSM85OzvD0rJw7MnNmzeLnaNUKgEA2dnZcqslIiLSS3ZSa9y4MQAgPj5ees7KygqNGjUCABw+fLjYOcHBwQAAV1dXudUSERHpJTupde7cGaIoIiIiQuv53r17QxRFBAcH47vvvsONGzcQFRWFTz/9FEePHoUgCOjcubPJgRMRERUlO6l169YNAHDq1ClkZWVJz48ZMwa1a9cGAKxfvx6DBw/G8OHD8euvvwIAbG1t8dFHH5kSMxERkU6yk5qHhwdWr16NL7/8Es+ePZOet7e3x8aNG+Hh4QFRFLW+ateujbVr12qNmCQiIjIXk9Z+7Nmzp87nmzRpggMHDuDcuXO4ceMG8vLy4OnpiTfffBM2NjamVElERKSXWVbp13lhS0u88cYbeOONN8qqCiIiIi3cT42IiKoMJjUiIqoymNSIiKjKKPGeWrNmzQAUbuwZGxtb7Hk5il6LiIjIHEpMarp2szb0PBERUUUpMan5+vqW6nkiIqKKUmJSW7ZsWameJyIiqigcKEJERFWG7MnX6pZau3bt9K4sQkREVJ5kJ7XNmzdDEAR07NjRnPEQERHJJrv7Ub3jtZubm9mCISIiMoXspObu7g4ASElJMVswREREppCd1Hr16gVRFHHs2DFzxkNERCSb7KT2/vvvw8PDA3v37sWpU6fMGBIREZE8spNa9erVsXHjRjRq1AhTpkzB/PnzER4ejrS0NK42QkREFUL26EfNtR9FUURwcDCCg4ONOpdrPxIRUVmQndSKtsbYOiMiooomO6lx7UciIqpsTF5RhIiIqLLg2o9ERFRlMKkREVGVwaRGRERVhux7arrk5+cjIyMDz549K3E0ZL169cxZNRERkelJLSMjA9u2bcPRo0dx8+ZN5Ofnl3gO56kREVFZMCmpXbt2Df7+/lAqlZynRkREFU52UsvKyoK/vz8ePXoEhUKBHj16wNnZGbt374YgCJg8eTIyMjIQHR2NK1euQBAEtG7dGp07dzZn/ERERBLZSW337t149OgRLCwssGHDBnTq1Ak3b97E7t27AQAzZsyQykZHR2POnDm4cuUKfHx8MGzYMNMjJyIiKkL26MdTp05BEAT07NkTnTp1Mli2RYsW2Lx5MxwdHfH555/j5s2bcqslIiLSS3ZSu3XrFgCgT58+Oo8Xvcfm6uqK0aNHQ6VSYdu2bXKrJSIi0kt2UktPTwegPTTfyspKepyTk1PsnPbt2wMAwsPD5VZLRESkl+ykpk5gmonMzs5Oevz48eNi51hbW+s9RkREZCrZSc3NzQ0AkJqaKj3n4uKC6tWrAwBiYmKKnZOQkCC3OiIiohLJTmrqTUI1B30IgoAWLVpAFEXs2LFDq3xubi42bdoEAHB3d5dbLRERkV6yk9pbb70FURRx5swZred9fHwAAJGRkRg5ciR++eUXBAQEYOjQoYiNjYUgCOjVq5dpURMREekge57aW2+9BUEQcO7cOfz111+oW7cugMLNQ/fs2YOLFy/i0qVLuHTpktZ5Hh4e+OCDD0yLmoiISAfZLTVnZ2ecPXsWoaGhcHFxkZ4XBAHr16+Hn58fLC0tIYoiRFGEIAjo0aMHtm7dqjWghIiIyFxMWvvRyclJ5/P29vb48ssvsWDBAiQkJCA/Px/u7u6oUaOGKdUREREZZNatZ4qys7ODt7d3WVZBREQk
kd39qGtyNRERUUWSndTeeOMNzJ8/H2fPnjVnPERERLLJ7n7Mzs5GcHAwgoODUbduXQwaNAiDBw+Gp6enOeMjIiIymuyWWrt27QAULlz88OFDrFu3Du+88w6GDx+OnTt3IiMjw2xBEhERGUN2Utu6dStOnjyJmTNnwtPTUxq6HxUVhSVLlqBLly6YMWMGTp48ifz8fHPGTEREpJPspAYAdevWxeTJk3H48GHs3r0bI0eOhJOTE0RRRG5uLo4dO4apU6fizTffxJdffonY2FhzxU1ERFSMSUlNU8uWLfHpp58iLCwMq1evRq9evaTJ16mpqdi6dSveffddDBw4EBs3bjRXtURERBKzJTU1S0tL9OzZE6tWrUJYWBgWLVqEli1bSt2TN2/exNdff23uaomIiMyf1DQ5OTlh1KhR2L17N77//ns4OjqWZXVERPQPV6YriqSmpmL//v3Yt28f4uLiyrIqIiIi8ye13NxcHD9+HL/99hvCwsKkkY+iKAIAWrRoIW1PQ0REZE5mS2oXLlzAvn37cOTIEWRmZgL4O5G5ublh0KBB8PHxwcsvv2yuKomIiLSYlNTu37+P4OBg/Pbbb3jw4AGAvxNZ9erV0atXL/j6+qJjx44QBMH0aImIiAyQndTee+89XL58GcDfiUwQBHTo0AE+Pj7o06cPbG1tzRMlERGREWQnNc0drRs2bAgfHx8MHjxY2gGbiIiovMlOao6OjnjnnXfg6+uLVq1amTMmIiIiWWQntbCwMFhbW5stkJycHFy9ehXA34slExERlYbspGbOhAYADx48wOjRo6FQKLhGJBERyVKmK4rIoR50QkREVFqVLqkRERHJxaRGRERVBpMaERFVGUxqRERUZZTpKv1lKSQkBDt37kRMTAzS09Ph4uKCTp06YezYsfDy8pJ93e7duyMxMbHEcitXrkTfvn1l10NEROb3QrbUPvvsM/j7++PUqVNISkpCbm4uHj58iL1792LIkCEIDg6u6BCJiKgCvHAttYCAAOzcuRMA0LNnT0yZMgV169ZFbGwsli9fjhs3bmDBggVo0KAB2rRpI7ueSZMmYdKkSXqPV6tWTfa1iYiobLxQSS01NRVr1qwBAHTp0gWrV6+WVv/v0qULvL29MWDAACQnJ2P58uXYvXu37LqsrKxgZ2dnlriJiKh8vFDdj0FBQcjOzgYA/Pvf/y62nU3NmjUxYcIEAMCVK1cQExNT7jESEVHFeaGSWkhICADA3d0d3t7eOsv069dPenzy5MlyiYuIiCqHFyqpqVtehnYFcHNzg6urq1Z5U6hUKi7dRUT0gnhh7qkplUqp67FBgwYGy9avXx9KpRLx8fGy6wsKCsKOHTuQkpICCwsLuLm5oX379hg1ahRatGgh+7pERFR2Kk1LrXr16mjXrh3atm2r8/iTJ0+kx7Vq1TJ4LfXxtLQ02fEkJiYiJSUFAJCfn4/ExEQEBQVhyJAhWL58OVtvRESVUKVpqdWvXx9bt27Ve1zdSgMAGxsbg9dSH8/Kyip1HE2aNMH777+Ptm3bom7dunByckJSUhJOnz6NtWvXQqlU4qeffoKNjQ1mzZpV6uvHxcVJj2vVeanU55tbXl6eVky6eLi5lFM0+uXlqXDbQJwu9WqXYzT65alUBl/PBi4V/1oCJf/e3Vzrl2M0+pUUZ906Ff97z1MZ8TdUz72cotFPlZeHW3F3DJapW7duOUWjnzH/kwypNEmtJKVpGZnSivrxxx+LPffSSy/hvffeQ+/evfHee+/h7t27CAgIgJ+fH9zdS/dmbdasmfQ4NT3bQMnyYWlpqRWTLrlP5bd4zcXS0spgnE+y08sxGv0srQzH+Uyjx6EilfR7f5qRU47R6FdSnJkZleC9aVXy35Aq63k5RaOflRF/65mZmeUUjX5Ff+eRkZGlO7+kAj169Ch9VCUQBAHHjx8v1Tmac8aePzf8BsnNzS12jjnUqlULCxYswMSJE5GXl4fDhw8bnKBNRETlq8Sk
Zsw6iGqCIBRrJel7rrRq1qwpPVbf69JHfbxGjRqlrqckb7zxBmxsbPD8+XPu0E1EVMmUmNTatWtn8Pjjx49x7949iKIIURTx0ksvoXbt2hBFESkpKVJSFAQBHh4eqF1bXh+4q6srbG1tkZ2djfv37xss++DBAwCAp6enrLoMsbS0hJOTEx4/foynT5+a/fpERCRfiUnN0OCNsLAwzJ49G9WqVcOkSZMwdOjQYiMTU1NT8euvv2LdunV48uQJPv30U3Tu3FlWsN7e3jh//jyioqL0llEqlVAqlVJ5c1OpVNKoSgcHB7Nfn4iI5JM9pP/+/fuYOXMmcnNz8csvv8Df31/nUHtnZ2dMmjQJ27ZtQ25uLmbOnCm1pEqrW7duAIC7d+/q7fo7fPiw9Lh79+6y6jHk9OnT0j27skiaREQkn+yktmnTJmRlZWHcuHFG/XNv1qwZxo0bh6dPn2LTpk2y6vT19YWtrS0A4Ntvvy12ry4tLQ0bNmwAULjqSGmTzqNHjwweVyqV+PLLLwEULnisuSQXERFVPNlJLSwsDIIgoEuXLkafoy4bGhoqq05nZ2dMmTJFusaMGTMQFxeH1NRUnDlzBqNHj0ZSUhIsLS0xd+7cYucHBgbCy8sLXl5eCAwMLHb8888/x4gRI7BlyxZERUUhOTkZGRkZuHXrFn7++Wf4+vpKrcyJEyeWuLIJERGVL9nz1NT3raytrY0+R11Wfa4cH330ER48eICdO3fi6NGjOHr0qNZxKysrfPHFF7L2UhNFEZcuXcKlS5f0lrGwsMCkSZMwY8aMUl+fiIjKluykph7Wfu3aNaPXQlTPEi9pRZCSLFmyBF27dsWOHTsQExOD9PR01K5dGx07dsS4cePg5eUl67qTJk1Cs2bNcPnyZdy7dw9paWnIzs6GnZ0d3N3d0a5dOwwdOhQvv/yySfETEVHZkJ3UmjZtinPnziEgIADvvPNOiROds7KysGHDBgiCgKZNm8qtVtKtWzdp4Iix/Pz84Ofnp/d4q1atDO4AQERElZvse2rvvvsugMJRkKNHjza4Vte1a9cwevRo3Lt3DwAwZMgQudUSERHpJbulNmjQIBw6dAinTp1CXFwc/Pz80LRpU7Rs2RK1atWCIAhITk5GdHS0VsLr2rUrBg4caJbgiYiINJm0oPH333+PTz75BAcOHABQ2CK7du1asXLqofcDBgyQhsQTERGZm0lJzdraGl9//TX8/Pywbds2hIeHa20RAwC2trbo1KkT3n//fXTq1MmkYImIiAwxy9YznTt3RufOnVFQUID79+9Ly0jVqFEDDRo0gEJRafYiJSKiKsys+6kpFAp4eHjAw8PDnJclIiIyCptQRERUZZitpRYREYGLFy8iKSkJOTk5mDVrFurUqaNVpqCgAIIgyNpPjYiIqCQmJ7WIiAgsXrwY8fHxWs+PHz9eK6lt3rwZX331Fezt7REWFmbyqiJERERFmdT9ePToUYwfPx7x8fHSJqFFV85XGz58OKpXr47MzEycPHnSlGqJiIh0kp3UkpKSMHfuXOTl5cHDwwPr1q1DZGSk3vLVqlVDjx49AAB//vmn3GqJiIj0kp3Utm7dipycHNSuXRvbt2/H22+/XeL6j+3atYMoioiJiZFbLRERkV4m76c2duxYODs7G3WOp6cnACAxMVFutURERHrJTmrqzTJff/11o89xcHAAULhiPxERkbnJTmo5OTkACjflNJY6mXHkIxERlQXZSa1mzZoASteVeP36dQCAi4uL3GqJiIj0kp3UmjVrBgC4cOGC0ecEBQVBEAS0bNlSbrVERER6yU5qPXv2hCiK2LNnD5RKZYnl161bh+joaABAnz595FZLRESkl+yk5uPjg/r16+P58+cYN24crly5orPcrVu3MHv2bKxYsQKCIMDLyws9e/aUHTAREZE+spfJsrKywqpVqzBq1CgkJCRgxIgRqFu3rnR8/vz5SEpKklpxoijC0dER3333nelRExER6WDS
MlnNmjXDjh074OnpCVEU8fDhQ2mx4qtXr+LRo0fS0lmenp7Yvn27NFeNiIjI3Exe0NjLywsHDhzA77//jmPHjiEqKgopKSnIz8+Hs7MzvL290atXLwwcOBAWFhbmiJmIiEgns2w9o1Ao0K9fP/Tr188clyMiIpKFm4QSEVGVwaRGRERVBpMaERFVGSbfU3v69Cl+++03nDt3Dg8ePEBmZiby8/MNniMIAo4fP25q1URERFpMSmqnT5/G3LlzkZaWBgB6d70uSj3sn4iIyJxkJ7WbN29i2rRpUKlUEEURlpaW8PT0hJOTE5MWERFVCNlJbf369cjNzYVCocDkyZMxfvx42NvbmzM2IiKiUpGd1CIiIiAIAkaMGIEZM2aYMyYiIiJZZI9+TE1NBQD07dvXbMEQERGZQnZSc3JyAgB2ORIRUaUhO6k1b94cAHD37l2zBUNERGQK2UltxIgREEURwcHB5oyHiIhINtlJrXv37hgyZAhOnz6N1atXmzMmIiIiWWSPfjx//jwGDhyI+Ph4/PDDDwgJCcHAgQPx8ssvo3r16iWe365dO7lVExER6SQ7qY0ePVprknVsbCxiY2ONOlcQBKPLEhERGcukZbKMXRaLiIioPMhOasuWLTNnHERERCaTndR8fX3NGQcREZHJuJ8aERFVGUxqRERUZTCpERFRlVHiPbXz589LjzXnlmk+LwfnqRERkbmVmNTU89GKzi0rOk+tNDhPjYiIyoJRox/1zUfjPDUiIqpMSkxq+uajcZ4aERFVNiUmNX3z0ThPjYiIKhuOfiQioiqDSY2IiKoMJjUiIqoyTFqlv6jExEQ8efIEz549K3FkJOepERGRuZmc1O7fv49169bh2LFjyMjIMOoczlMjIqKyYFJSCw8Px/Tp05GVlcU5a0REVOFkJ7XU1FTMnDkTmZmZqF69OoYNGwYHBwesXr0agiDgiy++QEZGBqKjo3HixAnk5uaibdu28PPzM2f8REREEtlJbfv27cjIyIC1tTV27dqFJk2a4ObNm1i9ejUA4N1335XKKpVKzJo1CxcuXED79u0xffp00yMnIiIqQvbox7CwMAiCgEGDBqFJkyYGy7q6uiIgIABubm5Yu3YtLl26JLdaIiIivWQntYQj6Nw1AAAgAElEQVSEBADAm2++qfN4QUGB1vf29vYYO3YsCgoKsHPnTrnVEhER6SU7qWVmZgIA6tatKz1nbW0tPc7Ozi52zmuvvQYAiIyMlFstERGRXrKTmo2NDQBobT/j4OAgPVYqlXrPTU5OllstERGRXrKTWr169QAAKSkp0nPOzs6wt7cHAFy+fLnYOTdu3AAAWFhYyK2WiIhIL9lJ7dVXXwUAXLt2Tev5Nm3aQBRFbN68Gc+fP5eeT0tLw8aNGyEIAl5++WW51RIREeklO6l17doVoigiNDRU6/khQ4YAAG7evImBAwdi+fLlWLx4MQYNGoS7d+8CAPr3729CyERERLrJnqf2xhtvwN7eHteuXcPdu3fh4eEBAOjVqxd69+6No0eP4v79+9i0aROAv3fJbtWqFd5//33TIyciIipCdlKzt7fHhQsXdB775ptvsH79euzatQuPHz8GADg5OWHw4MGYNWsWLC3Nuo4yERERADOv0q9mZWWFqVOnYurUqUhLS0N+fj6cnZ21RkoSERGZW5k3mWrUqFHWVRAREQHgJqFERFSFMKkREVGVUWL3Y3BwcJlU7OPjUybXJSKif64Sk9q8efPMPsBDEAQmNSIiMjujBopwV2siInoRlJjUtmzZUh5xEBERmazEpNa+ffvyiIOIiMhkHP1IRERVBpMaERFVGWZbUaSgoADXr1/H9evXkZaWBqBwNREvLy94eXlBoWD+JCKismVyUsvNzUVAQAB27NihtWGoplq1amHkyJGYMGECrK2tTa2SiIhIJ5OaT48ePYKvry9Wr16N5ORkiKKo8ys5ORmrVq2Cn58fHj16ZK7YiYiItMhuqT1//hwffPAB
4uPjARR2Nfbr1w+tWrWCi4sLRFFESkoKoqKicPjwYTx58gS3bt3C+PHjERwczBYbERGZneyktnnzZsTHx0MQBPTv3x+LFy+Gvb19sXI+Pj6YPXs2lixZgt9++w3x8fHYsmULJkyYYFLgRERERcnufjxy5AgEQUDbtm3x9ddf60xoanZ2dvjvf/+Ltm3bQhRFHDp0SG61REREeslOagkJCQCAUaNGGX3O+++/r3UuERGROcnuflQP0Xd3dzf6HHVZcwzvDwkJwc6dOxETE4P09HS4uLigU6dOGDt2LLy8vEy+/vXr17F582aEh4cjOTkZTk5O8Pb2xogRI9CtWzeTr09EROYnO7u89NJLAIAnT54YfY66bP369eVWCwD47LPP4O/vj1OnTiEpKQm5ubl4+PAh9u7diyFDhpi8XU5QUBDeffdd7N27Fw8fPkRubi6SkpJw6tQp+Pv7Y/HixSZdn4iIyobspNarVy+IooiDBw8afc7BgwchCAJ69+4tt1oEBARg586dAICePXsiMDAQ4eHh2LhxI5o0aYLc3FwsWLAAkZGRsq4fGRmJhQsXQqVSoUmTJti4cSPCw8MRGBiInj17AgB27NiBgIAA2T8DERGVDdlJbdy4cXB3d0dwcDACAwNLLB8UFISgoCB4eHhg7NixsupMTU3FmjVrAABdunTB6tWr4e3tDWdnZ3Tp0gVbtmyBi4sL8vLysHz5cll1fPXVV8jLy4OLiwu2bNmCLl26wNnZGd7e3li9ejXeeOMNAMCaNWuQmpoqqw4iIiobspOavb09fv75Z3h7e2PBggWYNGkSjh49CqVSCZVKhby8PCiVShw7dgz+/v745JNP0LJlS/z888+ws7OTVWdQUBCys7MBAP/+97+LbV5as2ZNaarAlStXEBMTU6rrR0dHIyoqCgAwYcIE1KxZU+u4IAiYPXs2ACA7Oxv79u2T9XMQEVHZkD1QpFmzZtJjURRx+vRpnD59Wm95URQRFRWF7t276y0jCAJiY2P1Hg8JCQFQOODE29tbZ5l+/frhq6++AgCcPHlSbzlD11dfRxdvb2+4u7vj3r17OHnyJD744AOjr09ERGVLdktNcxmsot/r+jKmTEk7bKtbXq1atdJbxs3NDa6urlrljaUu7+rqCjc3N73l1PWX9vpERFS2ZLfUfH19zRlHiZRKpdT12KBBA4Nl69evD6VSKS3hZSx1eWOuDwBZWVlQKpVSEiUiooolO6ktW7bMnHGUSHPqQK1atQyWVR9Xb4FT2jqMvb66DiY1IqLK4YXZ5EzdSgMAGxsbg2XVx7OyskpVR05ODgCUuNhytWrVdMZFREQVy2ybhJa1ku63yS2rS9FRlea8ftH5c7OHN5V9LXOIvxUHozppe0wu61AMir5xu8Qy01uOKYdIDLsTd6vEMq4zp5VDJIbF3LlTYpk+fp7lEIlhN27qHzim9vaQieUQiX7Xb5b83gQAdHAo20BKEHXtqlHlunbtWraBlOD69esmnS87qUVFRaFly5ayzl2/fj0mTizdG1FzGsDz588Nls3NzS12jjGqV68OlUpV4vU1j9va2hp9/TZt2pQqHiIiKh3Z3Y+jRo3CTz/9VKpzkpKSMG7cOHz33Xelrk9zzpi+HbaLHq9Ro4asOoy9vpw6iIio7MhOaiqVCv/73//w0UcfGbWyxh9//IHBgwfj7NmzsupzdXWVWkX37983WPbBgwcAAE/P0nWfqMsbe307OzsOEiEiqkRkJ7VGjRpBFEWEhYVh0KBBCA8P11lOpVJh2bJl8Pf3l5Lf6NGjZdWpnkitXvVDF6VSCaVSqVW+tNfXvIYuV65ckXV9IiIqW7KT2t69ezFs2DCIoojk5GR8+OGH+O6771BQUCCVSUhIwPDhw7FlyxaIoogaNWpg7dq1+OSTT2TVqd7y5e7du3pXHjl8+LD02NDqJYauX/Q6mmJjY3Hv3j1Z1yciorIlO6nZ2Nhg6dKlWLFiBRwdHVFQUID169dj
1KhRePjwIYKCguDn54e4uDiIoogOHTrgt99+M2kvMl9fX6kL8ttvvy02CjEtLQ0bNmwAULjqR2lbUi1atJAGv2zYsKHYPDdRFPHNN98AKBwgMnjwYFk/BxERlQ2T56n17dsXgYGBeO211yCKIi5fvoy+ffvik08+QXZ2NiwsLDBz5kxs2rQJderUMakuZ2dnTJkyBQAQGhqKGTNmIC4uDqmpqThz5gxGjx6NpKQkWFpaYu7cucXODwwMhJeXF7y8vPTuLDBv3jxYWloiKSkJo0ePxpkzZ5Camoq4uDjMmDEDYWFhAIApU6bA2dnZpJ+HiIjMSxBNndT1/+Xn5+M///mPVredg4MDAgIC8Nprr5mjCslnn30m7alWlJWVFb744gv4+PgUOxYYGIj58+cDKFwRxc/PT+c1goKCsGjRIqhUKp3HR4wYgSVLlhiMUbMuoDD5r1y50uA5ADB8+HBcvnxZ+n7Lli3o0KGDzrL79u3D3r17cePGDWRnZ8PNzQ1vvfUWPvjgA2kTV13++usvnDhxAufOnUNkZKTWaE43NzcsXboUb7/9doXHqano66nJUN1lHee5c+cwZkzp5sedOHFCa6Pc8nwtT5w4gR9++EFr3dLq1aujW7du8PPzw5tvvqn33PKMMzQ0FN99951WnLVq1cLcuXMxaNAgg3NJS4ozPz8fN27cQFRUlPR1+/Zt5OfnAyj9PKmyem+aI85Vq1Zh9erVRv8sCoUCDg4OcHd3R+fOnTFq1CiDg+Hk/p5zcnIQExOj9bMlJiYCANq3b4+tW7caHXNRZpt8vXXrVpw4cQKCIEAURQiCgMzMTGzfvh1NmjQp1XyukixZsgRdu3bFjh07EBMTg/T0dNSuXRsdO3bEuHHj4OXlZdL1fX190bx5c2zatAlnz55FUlISnJyc4O3tjffee09WF2pISAiePn0KBwf9EzDv3r2r9SbXR6VSYfr06Vq7CqjP37p1K4KCgrBy5Up06dKl2LlHjhzBrFmz9E4gf/ToESZOnIiePXvim2++0Vo9pTzjNIfKGGeNGjW0eizKK8bc3Fz861//wvHjx4sdy8nJwaFDh3Do0CEMGDAAX331FaysrIrVUx5x5uXlYeHChQgKCip2LCUlBR9//DH279+PH374QefKQsbEeeHChVJ/EDG3yhpnQUEB0tPTER0djejoaGzbtg3ffPNNsQnZpv6eN2zYUKpkWxomdz8+efIE/v7+WL58OXJzc1GtWjVMnz4d9erVgyiK2L9/P3x9fQ1uKSNHt27dsH79epw5cwZXr15FSEgIli1bZjCh+fn54fr167h+/breVpqal5cXli1bhpCQEFy9ehVnzpzB+vXrZSU0e3t7PH/+HEeOHDFYLjg4WCpvyP/93/9Jb6Zhw4bh0KFD+PPPP7Fq1Sq4ubkhMzMTM2fOxN27d4udm52dDVEU4erqiokTJ0r7zwGFn9jVjh8/rrMLt7ziNIfyiLNt27a4ePEiLl68iMWLF0vPq+tctGgRfv31V+n5d955R2sZtvJ6Lf/73/9KCc3Dw0N6Xr1AgXoR7wMHDuj8Z1OecaoTWqNGjaTn1R+KFQoFQkND9Q42MzZOtQYNGqB///5o2rR8V/Yp7zjr1asnvU+LvldXrVqFs2fP4uLFi4iMjMSBAwcwYcIEKBQKZGZmYtasWdLgODVz/W1bWlqiWbNmGDZsGBwdHWX9bEWZlNTOnTuHwYMH448//oAoimjcuDH27NmDqVOnYt++fejTpw9EUcS9e/cwfPhwbNq0ySxBv2j69OkDAAY3FRVFEb/99huAwq5KfW7evIldu3YBAIYOHYrPP/8cr7zyCmrVqoXevXtj8+bNqFatGjIzM7FixYpi59euXVt6Q86ePRuvvPKKdOydd94BAOnNdeTIkWLTJ8orTkOMafWXV5wWFhaws7ODnZ2dVstB/Ts/cuQIfv/9d+l5zW7x8ooxKysLu3fvBlC4D6LmBxl1vXXq1JE+EO7cuVNr
FHN5xZmQkIBffvkFANC5c2etvQrV+xuqE/KBAwdw4cIFrfONjbNhw4YICAjAuXPncPz4cXz77bda+0OawpzvTXPGKQiC9D4t+l51cnJCzZo1YWdnB3t7ezRu3Bhz5szB1KlTARS25H/++WepvDn+tnv06IHt27cjMjISwcHB+Pzzzw32YpWG7KS2YsUKjB8/Ho8fP4Yoihg2bBj27Nkj/ZO0t7fHypUrsWTJEtjY2EClUmH58uWYNGmSUZO1qxL1PYALFy5IE7eLioyMxIMHD2Bra4tevXrpvdaOHTtQUFAAS0tLzJo1q9jxhg0bYujQoQAK/6EWfa3ffPNNDBkyBBYWFnrjfPr0qfRc0Y1fyyvOotRLnwGAv79/ietzVlScapq/c/Wnck9PT629AMsrxjt37khLu/Xr1w+Wln/fdVDHefHiRbzxxhsACkcRa16jvOI8fPiwdL/oX//6FxSKv/89qeNMSEiQElvR+y7Gxunq6oq33nrLbKsBldV709xxltZHH30kJT/Necjm+Jtp3rw52rRpo/P2hqlkJ7Uff/wR+fn5cHBwwIoVK7B06VKdfdzDhw/Hr7/+isaNG0s7ZOsaxFGV1a9fH23atNH6hFaUuhXXu3dvrW7AotRN/nbt2sHFxUVnGfWnv4KCApw6dUpWnGqPHz+uFHFq3gvq37+/VheaLhX9emq+lsnJyQBQ7H1fXjFqdncW/YerGad64IGFhYVWV1B5xRkXFwegsLXTokULvXGqYzt9+rRWQjE2TnMrq/dmRbOxsYG7uzuAwnvtamX5P8gcTOp+fO211xAUFGSwCQ1A6pZUT9ZOSkoypdoXknpOm64uyNzcXOl+m6G5b6mpqXj48CEAw7t/t2jRQmqJlfZeZtH6Nfv8KyrO2NhYaSqFWuvWrfWWryyvp2bdgiBg0KBBFRKjp6endO/s2LFjUmuoaJznz58HAHTs2FFKhOUZp7qHwMHBQWdrR12/+v5OdnY2bt++Xao4za2s3ptlRRRFvaO6dVG3ltUfdMvjf5CpZCe1Dz/8EL/88ovW0GRD1JO1V65caba+0xdJ3759YW1tjYSEhGL3qU6cOIGMjAzUqVMHHTt21HsNzZ28De3ObWNjg9q1awMo7HoqbZyaXZOvv/56hcaZn5+PhQsXFhutqflJXv2PrSLj1EVzWkTz5s1Rr169ConR2toakyZNAlC4xNyWLVukY5mZmXB1dYUgCMjNzYWjo6PWIIzyjFP9Aerp06c6R+eq/4bS09Ol59TXMDZOcyrL96a5paSkoH///vD29sarr76qNSUpLy9P5zkqlUr6AKEesVse/4NMJTupzZkzR6tv3lh9+vSR7jH8kzg6OkojJ4u21tTfDxw4UOs+QlHlsft3QUGBFIOdnR3eeuutCo1zy5YtiImJQd26dbWe1+y2OXPmjNaxyvJ6at6HKDqyq7xjnDhxIqZNmwYrKyvcuHFDen7w4MGYOHGi1DLq3r271qjD8oxTXW92djauXi2+95fm31DReo2N05zK8r1pbs+ePcOtW7ekVvqzZ8+kY0uXLpVaX5o2b94sbZysnj9XHv+DTGXWV1WpVOLq1au4cOGC1otWlOYn1n8S9T2VgwcPSl0AqampCA0N1Tquj/oNBpTN7t+iKOLjjz+WYhMEQfpnVxFxJiYm4vvvv4dCoYCvr6/ea5w7d65Svp6aH97i4uIqNEZBEODv749Fixbp/DCqHu144sSJCouzd+/e0uMVK1bobK0VTSBZWVmlitNcyvq9aS4ODg4YNWoUNmzYgOPHjyM6OhoRERFaI0sTEhLw4YcfIisrCwUFBfjrr7+wdu1aaeSilZUVxo0bB6Ds/2bMweTJ18+ePcPmzZuxa9cu/PXXX9Lz+/fv1/rEd+jQIZw6dQqOjo5YuHChqdW+kN58803UrFkTT548QWhoKLp3744DBw4gLy8PzZo1Q5MmTQyer/lHXha7c69fvx5//PEHgMI3ZGZmZoXGuWTJEmRn
Z2PEiBEGuzqePn1a6V7P5ORknDt3DkBh919aWlqFxvjXX39h4sSJuHHjBjw8PKT5Q4GBgUhJScHatWtx8eJFPH36FCtWrMCcOXPKPc5mzZphwIABOHDgAMLCwrR2ykhMTMS+ffukIf+adZUmTnMp6/emuaiTkSZra+ti9d+5c0frVoNm2S+//FL6X17W/4PMwaSWmlKpxLBhw7BixQo8fPgQoijq/UGaN2+O/fv3Y9u2bQa3jqnKrKys0L9/fwB/f4pXd0cYc9NYcw6MoZYwIG/3b/Vcpm7duuHdd9+t0DgPHjyIP/74Ay4uLpg9e3aJdVa21/PYsWNS60e99FRFxZibm4tx48bhxo0baN++PT766CPpmJOTE9566y1s3bpVum/y888/Izk5uUJey88//1xaheLmzZvS82PGjMH3338Pa2trraWkHB0dSxWnOZTHe7MiWVpawtPTEyNHjsS+ffswcOBA6VhZ/w8yB9lJLT8/H1OmTJH653v37o1FixbpLd+wYUNpDciiS6v8k6jf0CEhIbh06RKuXr0KCwsLDBgwoMRzy2P37/bt22PlypVSl0pFxJmbm4svv/wSADB37lyjVhqobK+nepi3q6urNEijomI8evQoEhISAAAzZ87UOUfR0tJSmmybn5+PgICACnktbW1tERAQgK+//hqNGzeWnndxccGQIUOwb98+aZEA4O/7b8bGaaryem+WJ82lrMLDwxETE4MjR47gs88+w8svv6xVtjz+B5lKdvfj/v37ERMTAwsLC6xcuRI9e/YEUPhJS59u3brh0qVLuHTpktxqX3gtW7aEp6cn4uPjMWfOHACFqyeoRwoZormTt6HduXNzc6VpE0XflEVptpqbNm2KH3/8ETY2NhUa57Nnz6S5XXPmzJHqV+vRo4fOa7333nvlGqch6iQyaNAgtGrVqkJ/55prDL766qvFljxS8/HxwWeffQYA0tJeFfFaKhQKDBw4ECqVSlooeNeuXdJI6xMnTkhlNVcgMSZOU5XXe7M8aSaqkob7l+XfjLnIbqkdOnQIgiDAx8dHSmglUS/Do/6D/6dSt9bUbwpjuyOcnZ2lQTaGunCjoqKkUU7NmzfXW+6PP/6QuhyBwp0LNLsKKkucxlJ3fVemONUDAirytdScoGzoPofm0ljqm/uV6bVU05zsrF7ZvbJ36ZX2vVmeNKdIlNTyrKi/7dKQndTUE+o0RyyVpKKGeFY2mltn2NnZGf2hAPh7d+7z58/rbf6rJ3cqFIpiq2urnT9/HjNmzNCaiFv0DV1RcdrZ2SE4OFjra/r06dLxdevWaX0/e/bsCn89dfH29pZusFfk71yzZaC5lUtRRYfRV6bXUu3KlSuIjIwEANmvpynK871ZXtTJydPT06gVTsrj92wK2UlNnZjkbPyp+Ynwn+ill17C77//jkOHDmH//v2lWipnxIgRUCgUUKlUOvdnu3fvHvbs2QOgcLKqro1Mr169Cn9/fzx79szgJ7OKitPCwgLNmjXT+tKcBtKoUSOt71u1alWhr6c+mkO9K/J33qlTJ+nxqlWrdP795eXlYdWqVdL3P/zwQ6V6LYHCEaUff/wxRFFEkyZNcPDgQVmvpynK871pqtTUVGnNT0PU/8s1V7wxpKx/z6aSfU/Nzs4OGRkZUv+yMdRdBU5OTnKrrTJKWh9OnyZNmmD48OHYsWOHtFL22LFjUaNGDVy6dAlffPEFcnJyYG9vr3Ox0du3b2PChAnIzMyEk5MT3n//ffzwww8ACuegFJ1T4urqKmvRUVPjLK2Kej31sbS0lEa6VnSMbdu2RceOHXH27FmcPXtWa6h8eno64uPjsXbtWqkF1KdPH1ktCnO8lmvWrMGNGzfQv39/rTj37t2L3bt3Izk5GU5OTvjf//6ntcNEacXGxmp1y2ouult0n7NGjRoZvUWMLnJ/74Bpcaq3mBk4cCA6d+6Ml19+GQ4ODtIGnUXP1TX8Xxdz/J5zc3OLLZ+l/jkzMzOL
/Wyl2WhadlKrX78+YmNjcfv2bYO75WpSL2ypOX+NSm/BggV49OgRQkJCsGvXLumNpabeIUHXH9OhQ4ekVQHS09OlhAZA56gsU3ahNSXO8mSuODWHOLdr186sn1BNjXHFihWYPHkyLl26pLXUUdF9BTt16iSN7quIOJ8/f47Dhw/j8OHDWs+vWbMGAODu7o7vvvvO5P3Ppk2bJn3ILmr48OFa3xvavbqsmRpnUlISfvrpJ/z0009662jcuDE2btxYqo2cTf09P378uFj8arGxscWOlWY3ctlJrVOnToiJicGOHTswZsyYEpd8iY2Nxf79+yEIgrTFBcljZWWFH3/8EcHBwdJW6jk5OdJWFR988IHRa3IyTvPFqXnjvDT3mssjxpo1a2Lbtm04dOgQNm7cKK2Ib2VlhVq1auHVV1/FwIED0adPnxIn1ZZlnP369UNubi4iIiIQHx8v9Ry0atUKgwcPxpAhQ0pcyYIKvf766/j0009x+fJlXLt2DampqUhPT4eVlRWqVasmtfoWLlwIV1fXUl27Mv9tC6LMad8PHz5Enz59kJeXh9GjR2P+/PkQBAFNmzaFIAhaK4qEhYVh3rx5SE5Ohq2tLUJCQtgFSUREZie7pVavXj3861//wn//+19s3boV4eHh0m6/QOGw24MHDyI0NBQxMTEQRRGCIGD+/PlMaEREVCZkt9TUVq1ahTVr1khJSxf1sZkzZ8Lf39+U6oiIiPQyOakBhduT//jjjwgPDy+2N49CoUC7du0wffp0tG3b1tSqiIiI9DJLUlPLzs5GbGwsUlJSkJ+fj5o1a6J58+bsbiQionJh1qRGRERUkcpv61UiIqIyxqRGRERVBpMaERFVGUxqRERUZTCpEb3gvLy84OXlhXnz5pl0nXnz5knXInpRMakREVGVwaRGRC8Uc7VMqWpiUiMiAMBXX32F69evl2qbD6LKhkmNiIiqDCY1IiKqMmRvPUNEhVJTU7F9+3aEhoYiISEBmZmZsLOzQ82aNVG3bl20b98e3bt319qtedWqVVi9ejUA4MSJEwY3VOzevTsSExON3oU8JiYGW7Zswfnz55GUlAQHBwe0bNkSI0eOxFtvvaX3vHnz5iEoKAhAyTsNX7x4EYGBgVIdeXl5cHFxQevWrTFs2DCjdoouKCjAkSNHcPToUURFRSE1NRUKhQKurq7w9PREz5490aNHD2ntWPXroBYUFCTFq4ndp/9sTGpEJrh8+TImTZqEtLQ0refT09ORnp6OhIQEhIeHIyIiAps2bSrzeAIDA/Hpp59CpVJJz6WkpCAkJAQhISEYOXIkPv30U9m7Wz979gwLFy7E/v37ix1LTExEYmIiDhw4AB8fH3z++eewtrbWeZ27d+9ixowZuHbtWrFjd+7cwZ07d3DixAmMGTMGCxYskBUr/TMxqRHJlJubi5kzZyItLQ0WFhbw8/ND165dUbt2bVhYWCAlJQVxcXEIDQ2VnURK49q1azhw4ADs7e3x0Ucf4fXXXwdQuDXUhg0b8OTJE2zfvh3Ozs6YPn16qa+fn58Pf39/hIeHAwDatm0LHx8fNGjQAPb29khISMCuXbsQERGB4OBgKBQKLFu2rNh1Hjx4gOHDh+PJkycAgNatW8PPzw+NGjWCjY0NHj9+jMuXL+PIkSNa523cuBEqlQoDBw4EAPTo0QOzZs0q9c9BVRuTGpFMkZGRePToEQBg7ty5GDt2bLEyb7/9Nvz9/aV/4GUpLi4OderUwa5du1CvXj3p+datW6Nv374YMWIEkpKSsG7dOgwaNAgeHh6luv6GDRsQHh4OhUKB//3vfxgwYIDW8VdffRUDBgzAsmXLsGnTJgQGBmLIkCFo06aNVrn//Oc/0usxffp0TJs2Teu4t7c3unXrhlmzZkGpVErPe3p6apVzdHREkyZNSvUzUNXHgSJEMiUnJ0uPS7qHVLNmzbIOBwAwf/58rYSmVr9+fXz88ccAAJVKhZ07d5bqujk5Ofjpp58A
AEOGDCmW0DT95z//Qe3atQEAu3fv1jp29uxZXLp0CQDQrVu3YglNkyAIcHNzK1WcRExqRDK5urpKj/iHI1wAAAYkSURBVPfu3YuK3prQ0dERvXr10nu8b9++cHBwAACEhYWV6trnz5+X7hv279/fYFkrKyup6/PixYtax06ePCk9Hj9+fKliIDIGux+JZHr99dfh6emJ+Ph4bNmyBaGhoejTpw/atWuHli1bwtHRsVzjad68OaysrPQet7a2RrNmzRAREYFbt24hPz8fFhYWRl07KipKeqyrm1WfpKQkre9jYmIAFCa+1q1bG30dImOxpUYkk6WlJdatWwdvb28AQHx8PH788Ud8+OGHaN++PQYPHoxVq1bh8ePH5RJPrVq1Siyj7hYsKChARkaG0ddOTU2VFVNOTo7O6zg7OxtMwERysaVGZAIPDw/s3bsX4eHhOHHiBCIjI3Hjxg3k5+fj2rVruHbtGjZu3IilS5di0KBBZRqLMSMs5XaR5uXlSY83bNig1fUqR3mMBqV/JiY1IhMJgoDOnTujc+fOAIDMzExcuHAB+/fvx6FDh5CTk4N58+ahefPmaNSoEQBAofi7k6SkRJOdnW1UHJoDV/RJSUmR6i9N96izs7P02MLCQvaoQ2dnZ9y5cwcpKSlQqVRsrZHZsfuRyMzs7e3RtWtXfPPNN5g9ezaAwjlemvOu7OzspMfp6el6r5Wammr0dIC4uDitFlVRKpUKcXFxAIBGjRoZfT8NKByurxYaGmr0efquo1KppFGQRObEpEZUhrp06SI91rwv1aBBA+lxdHS03vP37dtndF3p6ek4duyY3uO///67dB9NMy5jdOzYEfb29gCAX3/91ahWoS49evSQHqunCJSWjY0NgMLJ70RFMakRyXThwgXcvn3bYBnNVo1mInv99delrrdffvkFz549K3ZuTEwMVq1aVaqYvvrqK2lCuKaHDx9i+fLlAApHHo4YMaJU17W3t5eG4D99+hRTp04tcfBIeHg4IiMjtZ5r37492rZtCwAICQmR1r/URRRFnT+L+n5eQkJCaX4E+oewWLx48eKKDoLoRbR3717MmjULoaGhePz4MTIzM/H06VMolUpER0cjICAAP//8M0RRhJOTE5YuXQpbW1sAQPXq1XH//n3ExcUhNTUVZ86cgaOjI3Jzc3Hr1i1s374dS5YsQZ06daBQKJCTk4OXXnoJfn5+xeJQJ4amTZsiMTERwcHBEAQBCoUCjx49wqFDhzB37lypdTVlyhT07t272HWOHz8urcWoaxmtNm3a4MqVK7h37x4ePXqEvXv3IiMjA3l5ecjKykJiYiIuX76MoKAgLF26FJs2bULr1q2l0aFqHTp0wP79+/Hs2TNERETgzJkzEEURKpUKqampiI2Nxb59+7B48WKkpKQUW4Q5Li4OcXFxSEpKQkFBAWxtbZGZmYknT57gyZMnWvf/6J9HECt6xijRC0pzpX1DXFxc8P333xdbLiotLQ1jxozRu6q8u7s7AgICMH78eIOr9Ht5eQEAfH190aFDByxatEhrQWNN7733Hj777DOdow+NWaU/NzcXy5Ytw86dO1FQUKD/h0bhAJpvv/0W77zzTrFjCQkJmDZtGm7evGnwGroWNL516xaGDBlSbLqAGlfp/2fj6EcimSZMmIDXXnsNZ8+exeXLl6FUKqVRfY6OjmjcuDG6du2KoUOHSvejNNWoUQM7duzApk2bcOTIEdy7dw8KhQINGjRAnz59MHbsWJ3nGeLr64smTZpg8+bN0rYw9vb2eO2110rcesYY1tbW+Oyzz/D+++9jz549iIiIwIMHD/D06VPY2NjAxcUFr7zyCjp06ICePXtqdblqatiwIfbt24eDBw/i999/x9WrV5GamgobGxvUqVMHL7/8Mnr16qV1D06tUaNG2Lt3L3766SdcuHABSqVSb4Kjfx621IgIAPDxxx9j3759sLS0lFb+IHrRcKAIEQEoHAACoNStQ6LKhEmN
iFBQUCDNYSu6xQvRi4T31Ij+we7cuQOlUok9e/bgr7/+AlC4JQzRi4r31Ij+wbp3747ExETp+4YNGyIwMFBrxROiFwlbakT/cFZWVqhbty66du2KyZMnM6HRC40tNSIiqjI4UISIiKoMJjUiIqoymNSIiKjKYFIjIqIqg0mNiIiqDCY1IiKqMv4fGph0aIDC4oAAAAAASUVORK5CYII=\n",
220 | "text/plain": [
221 | ""
222 | ]
223 | },
224 | "metadata": {},
225 | "output_type": "display_data"
226 | }
227 | ],
228 | "source": [
229 | "sns.barplot(data=pca_results.sort_values(\"subject\"), x=\"subject\", y=\"explained_variance\")"
230 | ]
231 | }
232 | ],
233 | "metadata": {
234 | "kernelspec": {
235 | "display_name": "Python 3",
236 | "language": "python",
237 | "name": "python3"
238 | },
239 | "language_info": {
240 | "codemirror_mode": {
241 | "name": "ipython",
242 | "version": 3
243 | },
244 | "file_extension": ".py",
245 | "mimetype": "text/x-python",
246 | "name": "python",
247 | "nbconvert_exporter": "python",
248 | "pygments_lexer": "ipython3",
249 | "version": "3.6.8"
250 | }
251 | },
252 | "nbformat": 4,
253 | "nbformat_minor": 2
254 | }
255 |
--------------------------------------------------------------------------------
/notebooks/quantitative_dynamic.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": null,
6 | "metadata": {},
7 | "outputs": [],
8 | "source": [
9 | "from copy import copy\n",
10 | "from functools import partial\n",
11 | "import itertools\n",
12 | "import json\n",
13 | "from pathlib import Path\n",
14 | "import re\n",
15 | "import sys\n",
16 | "sys.path.append(\"../src\")\n",
17 | "\n",
18 | "import matplotlib\n",
19 | "import matplotlib.pyplot as plt\n",
20 | "import numpy as np\n",
21 | "import pandas as pd\n",
22 | "import seaborn as sns\n",
23 | "import scipy.stats as st\n",
24 | "import statsmodels.formula.api as smf\n",
25 | "from tqdm import tqdm, tqdm_notebook\n",
26 | "\n",
27 | "%matplotlib inline\n",
28 | "sns.set(style=\"whitegrid\", context=\"paper\", font_scale=3.5, rc={\"lines.linewidth\": 2.5})\n",
29 | "from IPython.display import set_matplotlib_formats\n",
30 | "set_matplotlib_formats('png')\n",
31 | "#set_matplotlib_formats('svg')\n",
32 | "\n",
33 | "%load_ext autoreload\n",
34 | "%autoreload 2\n",
35 | "import util"
36 | ]
37 | },
38 | {
39 | "cell_type": "markdown",
40 | "metadata": {},
41 | "source": [
42 | "## Data preparation"
43 | ]
44 | },
45 | {
46 | "cell_type": "code",
47 | "execution_count": null,
48 | "metadata": {},
49 | "outputs": [],
50 | "source": [
51 | "output_path = Path(\"../output\")\n",
52 | "decoder_path = output_path / \"decoders\"\n",
53 | "bert_encoding_path = output_path / \"encodings\"\n",
54 | "model_path = output_path / \"bert\""
55 | ]
56 | },
57 | {
58 | "cell_type": "code",
59 | "execution_count": null,
60 | "metadata": {},
61 | "outputs": [],
62 | "source": [
63 | "checkpoints = [util.get_encoding_ckpt_id(dir_entry) for dir_entry in bert_encoding_path.iterdir()]"
64 | ]
65 | },
66 | {
67 | "cell_type": "code",
68 | "execution_count": null,
69 | "metadata": {},
70 | "outputs": [],
71 | "source": [
72 | "models = [model for model, _, _ in checkpoints]\n",
73 | "\n",
74 | "baseline_model = \"baseline\"\n",
75 | "if baseline_model not in models:\n",
76 | "    raise ValueError(\"Missing baseline model. This is necessary to compute performance deltas in the analysis of fine-tuning models. Stop.\")\n",
77 | "\n",
78 | "standard_models = [model for model in models if not model.startswith(\"LM_\") and model != baseline_model]\n",
79 | "custom_models = [model for model in models if model.startswith(\"LM_\") and model != baseline_model]\n",
80 | "\n",
81 | "runs = sorted(set(run for _, run, _ in checkpoints))\n",
82 | "checkpoint_steps = sorted(set(step for _, _, step in checkpoints))\n",
83 | "\n",
84 | "# Models which should appear in the final report figures\n",
85 | "report_models = [\"SQuAD\", \"QQP\", \"MNLI\", \"SST\", \"LM\", \"LM_scrambled\", \"LM_scrambled_para\", \"LM_pos\", \"glove\"]\n",
86 | "\n",
87 | "# Model subsets to render in different report figures\n",
88 | "report_model_sets = [\n",
89 | "    (\"all\", set(report_models)),\n",
90 | "    (\"standard\", set(report_models) & set(standard_models)),\n",
91 | "    (\"custom\", set(report_models) & set(custom_models)),\n",
92 | "]\n",
93 | "report_model_sets = [(name, model_set) for name, model_set in report_model_sets\n",
94 | "                     if len(model_set) > 0]"
95 | ]
96 | },
97 | {
98 | "cell_type": "code",
99 | "execution_count": null,
100 | "metadata": {},
101 | "outputs": [],
102 | "source": [
103 | "RENDER_FINAL = True\n",
104 | "figure_path = Path(\"../reports/figures\")\n",
105 | "figure_path.mkdir(exist_ok=True, parents=True)\n",
106 | "\n",
107 | "report_hues = dict(zip(sorted(report_models), sns.color_palette()))"
108 | ]
109 | },
110 | {
111 | "cell_type": "markdown",
112 | "metadata": {},
113 | "source": [
114 | "### Decoder performance metrics"
115 | ]
116 | },
117 | {
118 | "cell_type": "code",
119 | "execution_count": null,
120 | "metadata": {},
121 | "outputs": [],
122 | "source": [
123 | "# Load decoder performance data.\n",
124 | "decoding_perfs = util.load_decoding_perfs(decoder_path)"
125 | ]
126 | },
127 | {
128 | "cell_type": "code",
129 | "execution_count": null,
130 | "metadata": {},
131 | "outputs": [],
132 | "source": [
133 | "# Save perf data.\n",
134 | "decoding_perfs.to_csv(output_path / \"decoder_perfs.csv\")"
135 | ]
136 | },
137 | {
138 | "cell_type": "code",
139 | "execution_count": null,
140 | "metadata": {},
141 | "outputs": [],
142 | "source": [
143 | "# # Load comparison model data.\n",
144 | "# for other_model in other_models:\n",
145 | "# other_perf_paths = list(Path(\"../models/decoders\").glob(\"encodings.%s-*.csv\" % other_model))\n",
146 | "# for other_perf_path in tqdm_notebook(other_perf_paths, desc=other_model):\n",
147 | "# subject, = re.findall(r\"-([\\w\\d]+)\\.csv$\", other_perf_path.name)\n",
148 | "# perf = pd.read_csv(other_perf_path,\n",
149 | "# usecols=[\"mse\", \"r2\", \"rank_median\", \"rank_mean\", \"rank_min\", \"rank_max\"])\n",
150 | "# decoding_perfs.loc[other_model, 1, 250, subject] = perf.iloc[0]"
151 | ]
152 | },
153 | {
154 | "cell_type": "markdown",
155 | "metadata": {},
156 | "source": [
157 | "### Model performance metrics"
158 | ]
159 | },
160 | {
161 | "cell_type": "code",
162 | "execution_count": null,
163 | "metadata": {},
164 | "outputs": [],
165 | "source": [
166 | "# For each model, load checkpoint data: global step, gradient norm information\n",
167 | "model_metadata = {}\n",
168 | "for model, run, step in tqdm_notebook(checkpoints):\n",
169 | "    run_dir = model_path / (\"%s-%i\" % (model, run))\n",
170 | "\n",
171 | "    # Fetch corresponding fine-tuning metadata.\n",
172 | "    ckpt_path = run_dir / (\"model.ckpt-step%i\" % step)\n",
173 | "\n",
174 | "    try:\n",
175 | "        metadata = util.load_bert_finetune_metadata(run_dir, step)\n",
176 | "    except Exception:\n",
177 | "        pass  # no usable fine-tuning metadata for this checkpoint\n",
178 | "    else:\n",
179 | "        if metadata[\"steps\"]:\n",
180 | "            model_metadata[model, run] = pd.DataFrame.from_dict(metadata[\"steps\"], orient=\"index\")\n",
181 | "\n",
182 | "    # SQuAD eval results need to be loaded separately, since they run offline.\n",
183 | "    if model == \"SQuAD\":\n",
184 | "        pred_dir = output_path / \"eval_squad\" / (\"SQuAD-%i-%i\" % (run, step))\n",
185 | "        try:\n",
186 | "            with (pred_dir / \"results.json\").open(\"r\") as results_f:\n",
187 | "                results = json.load(results_f)\n",
188 | "                # Assign with a single .loc call; chained indexing may write to a temporary copy.\n",
189 | "                model_metadata[model, run].loc[step, \"eval_accuracy\"] = results[\"best_f1\"] / 100.\n",
190 | "        except Exception:\n",
191 | "            print(\"Failed to retrieve eval data for SQuAD-%i-%i\" % (run, step))\n",
191 | "\n",
192 | "model_metadata = pd.concat(model_metadata, names=[\"model\", \"run\", \"step\"], sort=True)"
193 | ]
194 | },
195 | {
196 | "cell_type": "markdown",
197 | "metadata": {},
198 | "source": [
199 | "### Putting it all together"
200 | ]
201 | },
202 | {
203 | "cell_type": "code",
204 | "execution_count": null,
205 | "metadata": {},
206 | "outputs": [],
207 | "source": [
208 | "# Join decoding data, post-hoc rank evaluation data, and model training metadata into a single df.\n",
209 | "old_index = decoding_perfs.index\n",
210 | "df = decoding_perfs.reset_index().join(model_metadata, on=[\"model\", \"run\", \"step\"]).set_index(old_index.names)\n",
211 | "df.head()"
212 | ]
213 | },
214 | {
215 | "cell_type": "markdown",
216 | "metadata": {},
217 | "source": [
218 | "-----------"
219 | ]
220 | },
221 | {
222 | "cell_type": "code",
223 | "execution_count": null,
224 | "metadata": {},
225 | "outputs": [],
226 | "source": [
227 | "all_subjects = df.index.get_level_values(\"subject\").unique()\n",
228 | "all_subjects"
229 | ]
230 | },
231 | {
232 | "cell_type": "code",
233 | "execution_count": null,
234 | "metadata": {},
235 | "outputs": [],
236 | "source": [
237 | "try:\n",
238 | "    subjects_with_baseline = set(decoding_perfs.loc[baseline_model, :, :].index.get_level_values(\"subject\"))\n",
239 | "except Exception:\n",
240 | "    subjects_with_baseline = set()\n",
241 | "\n",
242 | "if subjects_with_baseline != set(all_subjects):\n",
243 | "    raise ValueError(\"Cannot proceed. Missing base decoder evaluation for subjects: \" + str(set(all_subjects) - subjects_with_baseline))"
244 | ]
245 | },
246 | {
247 | "cell_type": "markdown",
248 | "metadata": {},
249 | "source": [
250 | "### Synthetic columns"
251 | ]
252 | },
253 | {
254 | "cell_type": "code",
255 | "execution_count": null,
256 | "metadata": {},
257 | "outputs": [],
258 | "source": [
259 | "df[\"eval_accuracy_delta\"] = df.groupby([\"model\", \"run\"]).eval_accuracy.transform(lambda xs: xs - xs.iloc[0])\n",
260 | "df[\"eval_accuracy_norm\"] = df.groupby([\"model\", \"run\"]).eval_accuracy.transform(lambda accs: (accs - accs.min()) / (accs.max() - accs.min()))"
261 | ]
262 | },
263 | {
264 | "cell_type": "code",
265 | "execution_count": null,
266 | "metadata": {},
267 | "outputs": [],
268 | "source": [
269 | "def decoding_perf_delta(xs, metric=\"mse\"):\n",
270 | "    subject = xs.index[0][3]\n",
271 | "    base_metric = df.loc[baseline_model, 1, 0, subject][metric]\n",
272 | "    return xs - base_metric.item()\n",
273 | "\n",
274 | "df[\"decoding_mse_delta\"] = df.groupby([\"model\", \"run\", \"subject\"]).mse.transform(partial(decoding_perf_delta, metric=\"mse\"))\n",
275 | "df[\"rank_mean_delta\"] = df.groupby([\"model\", \"run\", \"subject\"]).rank_mean.transform(partial(decoding_perf_delta, metric=\"rank_mean\"))\n",
276 | "df[\"rank_median_delta\"] = df.groupby([\"model\", \"run\", \"subject\"]).rank_median.transform(partial(decoding_perf_delta, metric=\"rank_median\"))"
277 | ]
278 | },
279 | {
280 | "cell_type": "code",
281 | "execution_count": null,
282 | "metadata": {},
283 | "outputs": [],
284 | "source": [
285 | "NUM_BINS = 50\n",
286 | "def bin(xs):\n",
287 | "    if xs.isnull().values.any(): return np.nan\n",
288 | "    return pd.cut(xs, np.linspace(xs.min(), xs.max() + 1e-5, NUM_BINS), labels=False)\n",
289 | "df[\"eval_accuracy_bin\"] = df.groupby([\"model\"]).eval_accuracy.transform(bin)\n",
290 | "df[\"decoding_mse_bin\"] = df.groupby([\"subject\"]).decoding_mse_delta.transform(bin)\n",
291 | "df[\"total_global_norms_bin\"] = df.groupby([\"model\"]).total_global_norms.transform(bin)"
292 | ]
293 | },
294 | {
295 | "cell_type": "code",
296 | "execution_count": null,
297 | "metadata": {},
298 | "outputs": [],
299 | "source": [
300 | "ROLLING_WINDOW_SIZE = 5\n",
301 | "grouped = df.groupby([\"model\", \"run\", \"subject\"])\n",
302 | "for col in [\"mse\", \"decoding_mse_delta\", \"eval_accuracy\", \"train_loss\", \"rank_mean\", \"rank_mean_delta\"]:\n",
303 | "    df[\"%s_rolling\" % col] = grouped[col].transform(lambda rows: rows.rolling(ROLLING_WINDOW_SIZE, min_periods=1).mean())"
304 | ]
305 | },
306 | {
307 | "cell_type": "code",
308 | "execution_count": null,
309 | "metadata": {},
310 | "outputs": [],
311 | "source": [
312 | "df.tail()"
313 | ]
314 | },
315 | {
316 | "cell_type": "code",
317 | "execution_count": null,
318 | "metadata": {},
319 | "outputs": [],
320 | "source": [
321 | "df.head()"
322 | ]
323 | },
324 | {
325 | "cell_type": "code",
326 | "execution_count": null,
327 | "metadata": {},
328 | "outputs": [],
329 | "source": [
330 | "dfi = df.reset_index()"
331 | ]
332 | },
333 | {
334 | "cell_type": "markdown",
335 | "metadata": {},
336 | "source": [
337 | "## Model training analysis\n",
338 | "\n",
339 | "Let's verify that no model is overfitting. For any model that is, we restrict the analysis to the training steps before overfitting begins."
340 | ]
341 | },
342 | {
343 | "cell_type": "code",
344 | "execution_count": null,
345 | "metadata": {},
346 | "outputs": [],
347 | "source": [
348 | "# g = sns.FacetGrid(df.reset_index().melt(id_vars=[\"model\", \"run\", \"step\"],\n",
349 | "# value_vars=[\"train_loss_rolling\", \"eval_accuracy_rolling\"]),\n",
350 | "# row=\"variable\", col=\"model\", sharex=True, sharey=False, height=4)\n",
351 | "# g.map(sns.lineplot, \"step\", \"value\", \"run\", ci=None)\n",
352 | "# g.add_legend()"
353 | ]
354 | },
355 | {
356 | "cell_type": "code",
357 | "execution_count": null,
358 | "metadata": {},
359 | "outputs": [],
360 | "source": [
361 | "%matplotlib agg\n",
362 | "\n",
363 | "if RENDER_FINAL:\n",
364 | "    # models which appear on left edge of subfigs in paper\n",
365 | "    LEFT_EDGE_MODELS = [\"QQP\", \"LM\"]\n",
366 | "\n",
367 | "    training_fig_path = figure_path / \"training\"\n",
368 | "    training_fig_path.mkdir(exist_ok=True)\n",
369 | "    shared_kwargs = {\"legend\": False, \"ci\": None}\n",
370 | "\n",
371 | "    for model in tqdm_notebook(report_models):\n",
372 | "        try:\n",
373 | "            local_data = df.loc[model].reset_index()\n",
374 | "        except KeyError:\n",
375 | "            print(f\"Missing training data for {model}\")\n",
376 | "            continue\n",
377 | "\n",
378 | "        # Create the figure only once the data is known to exist, so skipped models leave no open figure.\n",
379 | "        f, (loss_fig, acc_fig) = plt.subplots(2, 1, figsize=(10,15), sharex=True)\n",
380 | "        ax = sns.lineplot(data=local_data, x=\"step\", y=\"train_loss_rolling\", hue=\"run\", ax=loss_fig, **shared_kwargs)\n",
381 | "        ax.set_ylabel(\"Training loss\\n(rolling window)\" if model in LEFT_EDGE_MODELS else \"\")\n",
382 | "        ax.set_xlabel(\"Training step\")\n",
383 | "\n",
384 | "        ax = sns.lineplot(data=local_data, x=\"step\", y=\"eval_accuracy_rolling\", hue=\"run\", ax=acc_fig, **shared_kwargs)\n",
385 | "        ax.set_ylabel(\"Validation set accuracy\\n(rolling window)\" if model in LEFT_EDGE_MODELS else \"\")\n",
386 | "        ax.set_xlabel(\"Training step\")\n",
387 | "\n",
388 | "        sns.despine()\n",
389 | "\n",
390 | "        plt.tight_layout()\n",
391 | "        plt.savefig(training_fig_path / (\"%s.pdf\" % model))\n",
392 | "        plt.close()\n",
393 | "%matplotlib inline"
393 | ]
394 | },
395 | {
396 | "cell_type": "markdown",
397 | "metadata": {},
398 | "source": [
399 | "## Decoding analyses"
400 | ]
401 | },
402 | {
403 | "cell_type": "code",
404 | "execution_count": null,
405 | "metadata": {},
406 | "outputs": [],
407 | "source": [
408 | "MSE_DELTA_LABEL = r\"$\\Delta$(MSE)\"\n",
409 | "MAR_DELTA_LABEL = r\"$\\Delta$(MAR)\""
410 | ]
411 | },
412 | {
413 | "cell_type": "markdown",
414 | "metadata": {},
415 | "source": [
416 | "### Final state analysis"
417 | ]
418 | },
419 | {
420 | "cell_type": "code",
421 | "execution_count": null,
422 | "metadata": {},
423 | "outputs": [],
424 | "source": [
425 | "%matplotlib agg\n",
426 | "\n",
427 | "if RENDER_FINAL:\n",
428 | "    final_state_fig_path = figure_path / \"final_state\"\n",
429 | "    final_state_fig_path.mkdir(exist_ok=True)\n",
430 | "    metrics = [(\"decoding_mse_delta\", MSE_DELTA_LABEL, None, None),\n",
431 | "               (\"rank_mean_delta\", MAR_DELTA_LABEL, None, None),\n",
432 | "               (\"mse\", \"Mean squared error\", 0.00335, 0.00385),\n",
433 | "               (\"rank_mean\", \"Mean average rank\", 20, 95)]\n",
434 | "\n",
435 | "    for model_set_name, model_set in report_model_sets:\n",
436 | "        final_df = dfi[(dfi.step == checkpoint_steps[-1]) & (dfi.model.isin(model_set))]\n",
437 | "        if final_df.empty:\n",
438 | "            continue\n",
439 | "\n",
440 | "        for metric, label, ymin, ymax in tqdm_notebook(metrics, desc=model_set_name):\n",
441 | "            fig, ax = plt.subplots(figsize=(15, 10))\n",
442 | "\n",
443 | "            # Plot BERT baseline performance.\n",
444 | "            if \"delta\" not in metric:\n",
445 | "                # TODO error region instead -- plt.fill_between\n",
446 | "                ax.axhline(dfi[dfi.model == baseline_model][metric].mean(),\n",
447 | "                           linestyle=\"--\", color=\"gray\")\n",
448 | "\n",
449 | "            sns.barplot(data=final_df, x=\"model\", y=metric,\n",
450 | "                        order=final_df.groupby(\"model\")[metric].mean().sort_values().index,\n",
451 | "                        palette=report_hues, ax=ax)\n",
452 | "\n",
453 | "            padding = final_df[metric].var() * 0.005\n",
454 | "            plt.ylim((ymin or (final_df[metric].min() - padding), ymax or (final_df[metric].max() + padding)))\n",
455 | "            plt.xlabel(\"Model\")\n",
456 | "            plt.ylabel(label)\n",
457 | "            plt.xticks(rotation=45, ha=\"right\")\n",
458 | "\n",
459 | "            plt.tight_layout()\n",
460 | "            plt.savefig(final_state_fig_path / (f\"{metric}.{model_set_name}.pdf\"))\n",
461 | "            #plt.close(fig)\n",
462 | "\n",
463 | "%matplotlib inline"
464 | ]
465 | },
466 | {
467 | "cell_type": "code",
468 | "execution_count": null,
469 | "metadata": {},
470 | "outputs": [],
471 | "source": [
472 | "%matplotlib agg\n",
473 | "\n",
474 | "if RENDER_FINAL:\n",
475 | "    final_state_fig_path = figure_path / \"final_state_within_subject\"\n",
476 | "    final_state_fig_path.mkdir(exist_ok=True)\n",
477 | "    metrics = [(\"decoding_mse_delta\", MSE_DELTA_LABEL),\n",
478 | "               (\"rank_mean_delta\", MAR_DELTA_LABEL),\n",
479 | "               (\"mse\", \"Mean squared error\"),\n",
480 | "               (\"rank_mean\", \"Mean average rank\")]\n",
481 | "\n",
482 | "    for model_set_name, model_set in report_model_sets:\n",
483 | "        final_df = dfi[(dfi.step == checkpoint_steps[-1]) & (dfi.model.isin(model_set))]\n",
484 | "\n",
485 | "        for metric, label in tqdm_notebook(metrics, desc=model_set_name):\n",
486 | "            fig = plt.figure(figsize=(25, 10))\n",
487 | "            sns.barplot(data=final_df, x=\"model\", y=metric, hue=\"subject\",\n",
488 | "                        order=final_df.groupby(\"model\")[metric].mean().sort_values().index)\n",
489 | "            plt.ylabel(label)\n",
490 | "            plt.xticks(rotation=30, ha=\"right\")\n",
491 | "            plt.legend(loc=\"center left\", bbox_to_anchor=(1,0.5))\n",
492 | "            plt.tight_layout()\n",
493 | "            plt.savefig(final_state_fig_path / f\"{metric}.{model_set_name}.pdf\")\n",
494 | "            plt.close(fig)\n",
495 | "\n",
496 | "%matplotlib inline"
497 | ]
498 | },
499 | {
500 | "cell_type": "code",
501 | "execution_count": null,
502 | "metadata": {},
503 | "outputs": [],
504 | "source": [
505 | "%matplotlib agg\n",
506 | "\n",
507 | "if RENDER_FINAL:\n",
508 | " final_state_fig_path = figure_path / \"final_state_within_model\"\n",
509 | " final_state_fig_path.mkdir(exist_ok=True)\n",
510 | " metrics = [(\"decoding_mse_delta\", MSE_DELTA_LABEL, None, None),\n",
511 | " (\"rank_mean_delta\", MAR_DELTA_LABEL, None, None),\n",
512 | " (\"mse\", \"Mean squared error\", None, None),\n",
513 | " (\"rank_mean\", \"Mean average rank\", None, None)]\n",
514 | " \n",
515 | " subj_order = dfi[(dfi.step == checkpoint_steps[-1]) & (dfi.model.isin(report_model_sets[0][1]))] \\\n",
516 | " .groupby(\"subject\")[metrics[0][0]].mean().sort_values().index\n",
517 | " \n",
518 | " for model_set_name, model_set in report_model_sets:\n",
519 | " final_df = dfi[(dfi.step == checkpoint_steps[-1]) & (dfi.model.isin(model_set))]\n",
520 | "\n",
521 | " for metric, label, ymin, ymax in tqdm_notebook(metrics, desc=model_set_name):\n",
522 | " fig = plt.figure(figsize=(25, 10))\n",
523 | " sns.barplot(data=final_df, x=\"subject\", y=metric, hue=\"model\",\n",
524 | " order=subj_order)\n",
525 | " \n",
526 | " padding = final_df[metric].var() * 0.005\n",
527 | " plt.ylim((ymin or (final_df[metric].min() - padding), ymax or (final_df[metric].max() + padding)))\n",
528 | " plt.xlabel(\"Subject\")\n",
529 | " plt.ylabel(label)\n",
530 | " \n",
531 | " plt.legend(loc=\"center left\", bbox_to_anchor=(1,0.5))\n",
532 | " plt.tight_layout()\n",
533 | " plt.savefig(final_state_fig_path / f\"{metric}.{model_set_name}.pdf\")\n",
534 | " plt.close(fig)\n",
535 | " \n",
536 | "%matplotlib inline"
537 | ]
538 | },
539 | {
540 | "cell_type": "markdown",
541 | "metadata": {},
542 | "source": [
543 | "### Step analysis"
544 | ]
545 | },
546 | {
547 | "cell_type": "code",
548 | "execution_count": null,
549 | "metadata": {
550 | "slideshow": {
551 | "slide_type": "-"
552 | }
553 | },
554 | "outputs": [],
555 | "source": [
556 | "# g = sns.FacetGrid(dfi, col=\"run\", size=6)\n",
557 | "# g.map(sns.lineplot, \"step\", \"decoding_mse_delta\", \"model\").add_legend()\n",
558 | "\n",
559 | "# plt.xlabel(\"Fine-tuning step\")\n",
560 | "# plt.ylabel(MSE_DELTA_LABEL)"
561 | ]
562 | },
563 | {
564 | "cell_type": "code",
565 | "execution_count": null,
566 | "metadata": {},
567 | "outputs": [],
568 | "source": [
569 | "# g = sns.FacetGrid(dfi, col=\"run\", size=6)\n",
570 | "# g.map(sns.lineplot, \"step\", \"rank_mean_delta\", \"model\").add_legend()\n",
571 | "\n",
572 | "# plt.xlabel(\"Fine-tuning step\")\n",
573 | "# plt.ylabel(MAR_DELTA_LABEL)"
574 | ]
575 | },
576 | {
577 | "cell_type": "code",
578 | "execution_count": null,
579 | "metadata": {},
580 | "outputs": [],
581 | "source": [
582 | "f, ax = plt.subplots(figsize=(15, 10))\n",
583 | "sns.lineplot(data=dfi, x=\"step\", y=\"decoding_mse_delta_rolling\", hue=\"model\", ax=ax)\n",
584 | "\n",
585 | "plt.xlabel(\"Fine-tuning step\")\n",
586 | "plt.ylabel(MSE_DELTA_LABEL)\n",
587 | "plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)"
588 | ]
589 | },
590 | {
591 | "cell_type": "code",
592 | "execution_count": null,
593 | "metadata": {},
594 | "outputs": [],
595 | "source": [
596 | "f, ax = plt.subplots(figsize=(15, 10))\n",
597 | "sns.lineplot(data=dfi, x=\"step\", y=\"rank_mean_delta_rolling\", hue=\"model\", ax=ax)\n",
598 | "\n",
599 | "plt.xlabel(\"Fine-tuning step\")\n",
600 | "plt.ylabel(MAR_DELTA_LABEL)\n",
601 | "plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)"
602 | ]
603 | },
604 | {
605 | "cell_type": "code",
606 | "execution_count": null,
607 | "metadata": {},
608 | "outputs": [],
609 | "source": [
610 | "%matplotlib agg\n",
611 | "\n",
612 | "if RENDER_FINAL:\n",
613 | " trajectory_fig_dir = figure_path / \"trajectories\"\n",
614 | " trajectory_fig_dir.mkdir(exist_ok=True)\n",
615 | " metrics = [(\"decoding_mse_delta\", MSE_DELTA_LABEL),\n",
616 | " (\"rank_mean_delta\", MAR_DELTA_LABEL),\n",
617 | " (\"decoding_mse_delta_rolling\", MSE_DELTA_LABEL),\n",
618 | " (\"rank_mean_delta_rolling\", MAR_DELTA_LABEL)]\n",
619 | "\n",
620 | " for model_set_name, model_set in report_model_sets:\n",
621 | " for metric, label in tqdm_notebook(metrics, desc=model_set_name):\n",
622 | " fig = plt.figure(figsize=(18, 10))\n",
623 | " sns.lineplot(data=dfi[dfi.model.isin(model_set)],\n",
624 | " x=\"step\", y=metric, hue=\"model\", palette=report_hues)\n",
625 | " plt.xlim((0, checkpoint_steps[-1]))\n",
626 | " plt.xlabel(\"Fine-tuning step\")\n",
627 | " plt.ylabel(label)\n",
628 | " plt.legend(loc=\"center left\", bbox_to_anchor=(1, 0.5))\n",
629 | " plt.tight_layout()\n",
630 | " plt.savefig(trajectory_fig_dir / f\"{metric}.{model_set_name}.pdf\")\n",
631 | " plt.close(fig)\n",
632 | " \n",
633 | "%matplotlib inline"
634 | ]
635 | },
636 | {
637 | "cell_type": "code",
638 | "execution_count": null,
639 | "metadata": {},
640 | "outputs": [],
641 | "source": [
642 | "# g = sns.FacetGrid(dfi[dfi.model != baseline_model], col=\"model\", row=\"run\", size=6)\n",
643 | "# g.map(sns.lineplot, \"step\", \"decoding_mse_delta\", \"subject\", ci=None).add_legend()"
644 | ]
645 | },
646 | {
647 | "cell_type": "code",
648 | "execution_count": null,
649 | "metadata": {},
650 | "outputs": [],
651 | "source": [
652 | "# g = sns.FacetGrid(dfi, col=\"model\", row=\"run\", size=6)\n",
653 | "# g.map(sns.lineplot, \"step\", \"rank_median_delta\", \"subject\", ci=None).add_legend()"
654 | ]
655 | },
656 | {
657 | "cell_type": "markdown",
658 | "metadata": {},
659 | "source": [
660 | "### Gradient norm analysis"
661 | ]
662 | },
663 | {
664 | "cell_type": "code",
665 | "execution_count": null,
666 | "metadata": {},
667 | "outputs": [],
668 | "source": [
669 | "# f, ax = plt.subplots(figsize=(10, 8))\n",
670 | "# sns.lineplot(data=dfi, y=\"decoding_mse_delta\", x=\"total_global_norms_bin\", hue=\"model\", ax=ax)\n",
671 | "# ax.set_title(\"Decoding performance delta vs. binned total global gradient norm\")\n",
672 | "# ax.set_xlabel(\"Cumulative global gradient norm bin\")\n",
673 | "# ax.set_ylabel(MSE_DELTA_LABEL)"
674 | ]
675 | },
676 | {
677 | "cell_type": "code",
678 | "execution_count": null,
679 | "metadata": {},
680 | "outputs": [],
681 | "source": [
682 | "#g = sns.FacetGrid(dfi, col=\"model\", row=\"run\", size=6, sharex=False, sharey=True)\n",
683 | "#g.map(sns.lineplot, \"total_global_norms\", \"decoding_mse_delta\", \"subject\", ci=None).add_legend()"
684 | ]
685 | },
686 | {
687 | "cell_type": "markdown",
688 | "metadata": {},
689 | "source": [
690 | "### Eval accuracy analysis"
691 | ]
692 | },
693 | {
694 | "cell_type": "code",
695 | "execution_count": null,
696 | "metadata": {},
697 | "outputs": [],
698 | "source": [
699 | "#g = sns.FacetGrid(dfi, col=\"model\", row=\"run\", sharex=False, sharey=True, size=7)\n",
700 | "#g.map(sns.lineplot, \"eval_accuracy\", \"decoding_mse_delta\", \"subject\", ci=None).add_legend()"
701 | ]
702 | },
703 | {
704 | "cell_type": "markdown",
705 | "metadata": {},
706 | "source": [
707 | "## Per-subject analysis"
708 | ]
709 | },
710 | {
711 | "cell_type": "code",
712 | "execution_count": null,
713 | "metadata": {},
714 | "outputs": [],
715 | "source": [
716 | "f, ax = plt.subplots(figsize=(14, 9))\n",
717 | "dff = pd.DataFrame(dfi[dfi.step == checkpoint_steps[-1]].groupby([\"model\", \"run\"]).apply(lambda xs: xs.groupby(\"subject\").decoding_mse_delta.mean()).stack()).reset_index()\n",
718 | "sns.barplot(data=dff, x=\"model\", hue=\"subject\", y=0, ax=ax)\n",
719 | "plt.title(\"subject final decoding mse delta, averaging across runs\")"
720 | ]
721 | },
722 | {
723 | "cell_type": "code",
724 | "execution_count": null,
725 | "metadata": {},
726 | "outputs": [],
727 | "source": [
728 | "f, ax = plt.subplots(figsize=(14, 9))\n",
729 | "dff = pd.DataFrame(dfi[dfi.step == checkpoint_steps[-1]].groupby([\"model\", \"run\"]).apply(lambda xs: xs.groupby(\"subject\").rank_mean_delta.mean()).stack()).reset_index()\n",
730 | "sns.barplot(data=dff, x=\"model\", hue=\"subject\", y=0, ax=ax)\n",
731 | "plt.title(\"subject final rank mean delta, averaging across runs\")"
732 | ]
733 | },
734 | {
735 | "cell_type": "code",
736 | "execution_count": null,
737 | "metadata": {},
738 | "outputs": [],
739 | "source": [
740 | "f, ax = plt.subplots(figsize=(14, 9))\n",
741 | "dff = pd.DataFrame(dfi.groupby([\"model\", \"run\"]).apply(lambda xs: xs.groupby(\"subject\").decoding_mse_delta.max()).stack()).reset_index()\n",
742 | "sns.violinplot(data=dff, x=\"subject\", y=0)\n",
743 | "sns.stripplot(data=dff, x=\"subject\", y=0, edgecolor=\"white\", linewidth=1, alpha=0.7, ax=ax)\n",
744 | "plt.title(\"subject max decoding mse delta, averaging across models and runs\")"
745 | ]
746 | },
747 | {
748 | "cell_type": "code",
749 | "execution_count": null,
750 | "metadata": {},
751 | "outputs": [],
752 | "source": [
753 | "f, ax = plt.subplots(figsize=(14, 9))\n",
754 | "dff = pd.DataFrame(dfi.groupby([\"model\", \"run\"]).apply(lambda xs: xs.groupby(\"subject\").decoding_mse_delta.min()).stack()).reset_index()\n",
755 | "sns.violinplot(data=dff, x=\"subject\", y=0)\n",
756 | "sns.stripplot(data=dff, x=\"subject\", y=0, edgecolor=\"white\", linewidth=1, alpha=0.7, ax=ax)\n",
757 | "plt.title(\"subject min decoding mse delta, averaging across models and runs\")"
758 | ]
759 | },
760 | {
761 | "cell_type": "markdown",
762 | "metadata": {},
763 | "source": [
764 | "## Statistical analyses\n",
765 | "\n",
766 | "First, some data prep for comparing final vs. start states:"
767 | ]
768 | },
769 | {
770 | "cell_type": "code",
771 | "execution_count": null,
772 | "metadata": {},
773 | "outputs": [],
774 | "source": [
775 | "perf_comp = df.query(\"step == %i\" % checkpoint_steps[-1]).reset_index(level=\"step\", drop=True).sort_index()\n",
776 | "# Join data from baseline\n",
777 | "perf_comp = perf_comp.join(df.loc[baseline_model, 1, 0].rename(columns=lambda c: \"start_%s\" % c))\n",
778 | "if \"glove\" in perf_comp.index.levels[0]:\n",
779 | " perf_comp = perf_comp.join(df.loc[\"glove\", 1, 250].rename(columns=lambda c: \"glove_%s\" % c))\n",
780 | "perf_comp.head()"
781 | ]
782 | },
783 | {
784 | "cell_type": "code",
785 | "execution_count": null,
786 | "metadata": {},
787 | "outputs": [],
788 | "source": [
789 | "(perf_comp.mse - perf_comp.start_mse).plot.hist()"
790 | ]
791 | },
792 | {
793 | "cell_type": "code",
794 | "execution_count": null,
795 | "metadata": {},
796 | "outputs": [],
797 | "source": [
798 | "perf_compi = perf_comp.reset_index()"
799 | ]
800 | },
801 | {
802 | "cell_type": "markdown",
803 | "metadata": {},
804 | "source": [
805 | "Quantitative tests:\n",
806 | " \n",
807 | "1. for any GLUE task g, MSE(g after 250) > MSE(LM)\n",
808 | "2. for any LM_scrambled_para task t, MSE(t after 250) < MSE(LM)\n",
809 | "3. for any GLUE task g, MAR(g after 250) > MAR(LM)\n",
810 | "4. for any LM_scrambled_para task t, MAR(t after 250) < MAR(LM)\n",
811 | "5. MSE(LM after 250) =~ MSE(LM)\n",
812 | "6. MAR(LM after 250) =~ MAR(LM)\n",
813 | "7. for any LM_scrambled_para task t, MSE(t after 250) < MSE(glove)\n",
814 | "8. for any LM_scrambled_para task t, MAR(t after 250) < MAR(glove)\n",
815 | "9. for any LM_pos task t, MSE(t after 250) > MSE(LM)\n",
816 | "10. for any LM_pos task t, MAR(t after 250) > MAR(LM)"
817 | ]
818 | },
819 | {
820 | "cell_type": "markdown",
821 | "metadata": {},
822 | "source": [
823 | "### test 1"
824 | ]
825 | },
826 | {
827 | "cell_type": "code",
828 | "execution_count": null,
829 | "metadata": {},
830 | "outputs": [],
831 | "source": [
832 | "sample = perf_compi[~perf_compi.model.str.startswith((baseline_model, \"LM\", \"glove\"))]"
833 | ]
834 | },
835 | {
836 | "cell_type": "code",
837 | "execution_count": null,
838 | "metadata": {},
839 | "outputs": [],
840 | "source": [
841 | "sample.mse.hist()"
842 | ]
843 | },
844 | {
845 | "cell_type": "code",
846 | "execution_count": null,
847 | "metadata": {},
848 | "outputs": [],
849 | "source": [
850 | "sample.start_mse.hist()"
851 | ]
852 | },
853 | {
854 | "cell_type": "code",
855 | "execution_count": null,
856 | "metadata": {},
857 | "outputs": [],
858 | "source": [
859 | "st.ttest_rel(sample.mse, sample.start_mse)"
860 | ]
861 | },
862 | {
863 | "cell_type": "markdown",
864 | "metadata": {},
865 | "source": [
866 | "### test 1 (split across models)"
867 | ]
868 | },
869 | {
870 | "cell_type": "code",
871 | "execution_count": null,
872 | "metadata": {},
873 | "outputs": [],
874 | "source": [
875 | "results = []\n",
876 | "for model in standard_models:\n",
877 | " if model in [\"LM\", \"glove\"]: continue\n",
878 | " sample = perf_compi[perf_compi.model == model]\n",
879 | " results.append((model,) + st.ttest_rel(sample.mse, sample.start_mse))\n",
880 | " \n",
881 | "pd.DataFrame(results, columns=[\"model\", \"tval\", \"pval\"])"
882 | ]
883 | },
884 | {
885 | "cell_type": "markdown",
886 | "metadata": {},
887 | "source": [
888 | "### test 2"
889 | ]
890 | },
891 | {
892 | "cell_type": "code",
893 | "execution_count": null,
894 | "metadata": {},
895 | "outputs": [],
896 | "source": [
897 | "sample = perf_compi[perf_compi.model == \"LM_scrambled_para\"]"
898 | ]
899 | },
900 | {
901 | "cell_type": "code",
902 | "execution_count": null,
903 | "metadata": {},
904 | "outputs": [],
905 | "source": [
906 | "sample.mse.hist()"
907 | ]
908 | },
909 | {
910 | "cell_type": "code",
911 | "execution_count": null,
912 | "metadata": {},
913 | "outputs": [],
914 | "source": [
915 | "sample.start_mse.hist()"
916 | ]
917 | },
918 | {
919 | "cell_type": "code",
920 | "execution_count": null,
921 | "metadata": {},
922 | "outputs": [],
923 | "source": [
924 | "st.ttest_rel(sample.mse, sample.start_mse)"
925 | ]
926 | },
927 | {
928 | "cell_type": "markdown",
929 | "metadata": {},
930 | "source": [
931 | "### test 3"
932 | ]
933 | },
934 | {
935 | "cell_type": "code",
936 | "execution_count": null,
937 | "metadata": {},
938 | "outputs": [],
939 | "source": [
940 | "sample = perf_compi[~perf_compi.model.str.startswith((baseline_model, \"LM\", \"glove\"))]"
941 | ]
942 | },
943 | {
944 | "cell_type": "code",
945 | "execution_count": null,
946 | "metadata": {},
947 | "outputs": [],
948 | "source": [
949 | "sample.rank_mean.hist()"
950 | ]
951 | },
952 | {
953 | "cell_type": "code",
954 | "execution_count": null,
955 | "metadata": {},
956 | "outputs": [],
957 | "source": [
958 | "sample.start_rank_mean.hist()"
959 | ]
960 | },
961 | {
962 | "cell_type": "code",
963 | "execution_count": null,
964 | "metadata": {},
965 | "outputs": [],
966 | "source": [
967 | "st.ttest_rel(sample.rank_mean, sample.start_rank_mean)"
968 | ]
969 | },
970 | {
971 | "cell_type": "markdown",
972 | "metadata": {},
973 | "source": [
974 | "### test 3 (split across models)"
975 | ]
976 | },
977 | {
978 | "cell_type": "code",
979 | "execution_count": null,
980 | "metadata": {},
981 | "outputs": [],
982 | "source": [
983 | "results = []\n",
984 | "for model in standard_models:\n",
985 | " if model in [\"LM\", \"glove\"]: continue\n",
986 | " sample = perf_compi[perf_compi.model == model]\n",
987 | " results.append((model,) + st.ttest_rel(sample.rank_mean, sample.start_rank_mean))\n",
988 | " \n",
989 | "pd.DataFrame(results, columns=[\"model\", \"tval\", \"pval\"])"
990 | ]
991 | },
992 | {
993 | "cell_type": "markdown",
994 | "metadata": {},
995 | "source": [
996 | "### test 4"
997 | ]
998 | },
999 | {
1000 | "cell_type": "code",
1001 | "execution_count": null,
1002 | "metadata": {},
1003 | "outputs": [],
1004 | "source": [
1005 | "sample = perf_compi[perf_compi.model == \"LM_scrambled_para\"]"
1006 | ]
1007 | },
1008 | {
1009 | "cell_type": "code",
1010 | "execution_count": null,
1011 | "metadata": {},
1012 | "outputs": [],
1013 | "source": [
1014 | "sample.rank_mean.hist()"
1015 | ]
1016 | },
1017 | {
1018 | "cell_type": "code",
1019 | "execution_count": null,
1020 | "metadata": {},
1021 | "outputs": [],
1022 | "source": [
1023 | "sample.start_rank_mean.hist()"
1024 | ]
1025 | },
1026 | {
1027 | "cell_type": "code",
1028 | "execution_count": null,
1029 | "metadata": {},
1030 | "outputs": [],
1031 | "source": [
1032 | "st.ttest_rel(sample.rank_mean, sample.start_rank_mean)"
1033 | ]
1034 | },
1035 | {
1036 | "cell_type": "markdown",
1037 | "metadata": {},
1038 | "source": [
1039 | "### test 5"
1040 | ]
1041 | },
1042 | {
1043 | "cell_type": "code",
1044 | "execution_count": null,
1045 | "metadata": {},
1046 | "outputs": [],
1047 | "source": [
1048 | "sample = perf_compi[perf_compi.model == \"LM\"]"
1049 | ]
1050 | },
1051 | {
1052 | "cell_type": "code",
1053 | "execution_count": null,
1054 | "metadata": {},
1055 | "outputs": [],
1056 | "source": [
1057 | "sample.mse.hist()"
1058 | ]
1059 | },
1060 | {
1061 | "cell_type": "code",
1062 | "execution_count": null,
1063 | "metadata": {},
1064 | "outputs": [],
1065 | "source": [
1066 | "sample.start_mse.hist()"
1067 | ]
1068 | },
1069 | {
1070 | "cell_type": "code",
1071 | "execution_count": null,
1072 | "metadata": {},
1073 | "outputs": [],
1074 | "source": [
1075 | "st.ttest_rel(sample.mse, sample.start_mse)"
1076 | ]
1077 | },
1078 | {
1079 | "cell_type": "markdown",
1080 | "metadata": {},
1081 | "source": [
1082 | "### test 6"
1083 | ]
1084 | },
1085 | {
1086 | "cell_type": "code",
1087 | "execution_count": null,
1088 | "metadata": {},
1089 | "outputs": [],
1090 | "source": [
1091 | "sample = perf_compi[perf_compi.model == \"LM\"]"
1092 | ]
1093 | },
1094 | {
1095 | "cell_type": "code",
1096 | "execution_count": null,
1097 | "metadata": {},
1098 | "outputs": [],
1099 | "source": [
1100 | "sample.rank_mean.hist()"
1101 | ]
1102 | },
1103 | {
1104 | "cell_type": "code",
1105 | "execution_count": null,
1106 | "metadata": {},
1107 | "outputs": [],
1108 | "source": [
1109 | "sample.start_rank_mean.hist()"
1110 | ]
1111 | },
1112 | {
1113 | "cell_type": "code",
1114 | "execution_count": null,
1115 | "metadata": {},
1116 | "outputs": [],
1117 | "source": [
1118 | "st.ttest_rel(sample.rank_mean, sample.start_rank_mean)"
1119 | ]
1120 | },
1121 | {
1122 | "cell_type": "markdown",
1123 | "metadata": {},
1124 | "source": [
1125 | "### test 7"
1126 | ]
1127 | },
1128 | {
1129 | "cell_type": "code",
1130 | "execution_count": null,
1131 | "metadata": {},
1132 | "outputs": [],
1133 | "source": [
1134 | "sample = perf_compi[perf_compi.model == \"LM_scrambled_para\"]"
1135 | ]
1136 | },
1137 | {
1138 | "cell_type": "code",
1139 | "execution_count": null,
1140 | "metadata": {},
1141 | "outputs": [],
1142 | "source": [
1143 | "sample.mse.hist()"
1144 | ]
1145 | },
1146 | {
1147 | "cell_type": "code",
1148 | "execution_count": null,
1149 | "metadata": {},
1150 | "outputs": [],
1151 | "source": [
1152 | "sample.glove_mse.hist()"
1153 | ]
1154 | },
1155 | {
1156 | "cell_type": "code",
1157 | "execution_count": null,
1158 | "metadata": {},
1159 | "outputs": [],
1160 | "source": [
1161 | "st.ttest_rel(sample.mse, sample.glove_mse)"
1162 | ]
1163 | },
1164 | {
1165 | "cell_type": "markdown",
1166 | "metadata": {},
1167 | "source": [
1168 | "### test 8"
1169 | ]
1170 | },
1171 | {
1172 | "cell_type": "code",
1173 | "execution_count": null,
1174 | "metadata": {},
1175 | "outputs": [],
1176 | "source": [
1177 | "sample = perf_compi[perf_compi.model == \"LM_scrambled_para\"]"
1178 | ]
1179 | },
1180 | {
1181 | "cell_type": "code",
1182 | "execution_count": null,
1183 | "metadata": {},
1184 | "outputs": [],
1185 | "source": [
1186 | "sample.rank_mean.hist()"
1187 | ]
1188 | },
1189 | {
1190 | "cell_type": "code",
1191 | "execution_count": null,
1192 | "metadata": {},
1193 | "outputs": [],
1194 | "source": [
1195 | "sample.glove_rank_mean.hist()"
1196 | ]
1197 | },
1198 | {
1199 | "cell_type": "code",
1200 | "execution_count": null,
1201 | "metadata": {},
1202 | "outputs": [],
1203 | "source": [
1204 | "st.ttest_rel(sample.rank_mean, sample.glove_rank_mean)"
1205 | ]
1206 | },
1207 | {
1208 | "cell_type": "markdown",
1209 | "metadata": {},
1210 | "source": [
1211 | "### test 9"
1212 | ]
1213 | },
1214 | {
1215 | "cell_type": "code",
1216 | "execution_count": null,
1217 | "metadata": {},
1218 | "outputs": [],
1219 | "source": [
1220 | "sample = perf_compi[perf_compi.model == \"LM_pos\"]"
1221 | ]
1222 | },
1223 | {
1224 | "cell_type": "code",
1225 | "execution_count": null,
1226 | "metadata": {},
1227 | "outputs": [],
1228 | "source": [
1229 | "sample.mse.hist()"
1230 | ]
1231 | },
1232 | {
1233 | "cell_type": "code",
1234 | "execution_count": null,
1235 | "metadata": {},
1236 | "outputs": [],
1237 | "source": [
1238 | "sample.start_mse.hist()"
1239 | ]
1240 | },
1241 | {
1242 | "cell_type": "code",
1243 | "execution_count": null,
1244 | "metadata": {},
1245 | "outputs": [],
1246 | "source": [
1247 | "st.ttest_rel(sample.mse, sample.start_mse)"
1248 | ]
1249 | },
1250 | {
1251 | "cell_type": "code",
1252 | "execution_count": null,
1253 | "metadata": {},
1254 | "outputs": [],
1255 | "source": [
1256 | "f = plt.figure(figsize=(20,20))\n",
1257 | "sns.barplot(data=pd.melt(sample, id_vars=[\"subject\"], value_vars=[\"mse\", \"start_mse\"]),\n",
1258 | " x=\"subject\", y=\"value\", hue=\"variable\")\n",
1259 | "plt.ylim((0.0033, 0.0038))"
1260 | ]
1261 | },
1262 | {
1263 | "cell_type": "markdown",
1264 | "metadata": {},
1265 | "source": [
1266 | "### test 10"
1267 | ]
1268 | },
1269 | {
1270 | "cell_type": "code",
1271 | "execution_count": null,
1272 | "metadata": {},
1273 | "outputs": [],
1274 | "source": [
1275 | "sample = perf_compi[perf_compi.model == \"LM_pos\"]"
1276 | ]
1277 | },
1278 | {
1279 | "cell_type": "code",
1280 | "execution_count": null,
1281 | "metadata": {},
1282 | "outputs": [],
1283 | "source": [
1284 | "sample.rank_mean.hist()"
1285 | ]
1286 | },
1287 | {
1288 | "cell_type": "code",
1289 | "execution_count": null,
1290 | "metadata": {},
1291 | "outputs": [],
1292 | "source": [
1293 | "sample.start_rank_mean.hist()"
1294 | ]
1295 | },
1296 | {
1297 | "cell_type": "code",
1298 | "execution_count": null,
1299 | "metadata": {},
1300 | "outputs": [],
1301 | "source": [
1302 | "st.ttest_rel(sample.rank_mean, sample.start_rank_mean)"
1303 | ]
1304 | }
1305 | ],
1306 | "metadata": {
1307 | "kernelspec": {
1308 | "display_name": "Python 3",
1309 | "language": "python",
1310 | "name": "python3"
1311 | },
1312 | "language_info": {
1313 | "codemirror_mode": {
1314 | "name": "ipython",
1315 | "version": 3
1316 | },
1317 | "file_extension": ".py",
1318 | "mimetype": "text/x-python",
1319 | "name": "python",
1320 | "nbconvert_exporter": "python",
1321 | "pygments_lexer": "ipython3",
1322 | "version": "3.6.8"
1323 | }
1324 | },
1325 | "nbformat": 4,
1326 | "nbformat_minor": 2
1327 | }
1328 |
--------------------------------------------------------------------------------
/notebooks/rsa.py:
--------------------------------------------------------------------------------
1 | import itertools
2 |
3 | from scipy import stats as st
4 | from scipy.spatial.distance import pdist
5 | from tqdm import tqdm
6 |
7 | import pandas as pd
8 |
9 |
10 | def rsa_encodings(encodings_dict, pairs=None, collapse_fn=None):
11 |     """
12 |     Compute representational similarity metrics on the given encodings.
13 |
14 |     Arguments:
15 |         encodings_dict: dict mapping each model key to a 2D array of encodings (one row per item).
16 |         pairs: encoding pairs (keys of `encodings_dict`) to compare. If `None`, all possible pairs are evaluated.
17 |         collapse_fn: if not `None`, store the results of each pairwise analysis not under the key `(model1, model2)` (where `model1`, `model2` are the two elements of a pair in `pairs`), but rather `(collapse_fn(model1), collapse_fn(model2))`.
18 |     """
19 |
20 |     if pairs is None:
21 |         pairs = list(itertools.combinations(encodings_dict.keys(), 2))
22 |
23 |     # Cache condensed pairwise-distance vectors so each model is processed once.
24 |     dist_matrices = {}
25 |
26 |     rsa_sims = []
27 |     for m1_key, m2_key in tqdm(pairs):
28 |         dists1 = dist_matrices.get(m1_key)
29 |         if dists1 is None:
30 |             dists1 = pdist(encodings_dict[m1_key])
31 |             dist_matrices[m1_key] = dists1
32 |
33 |         dists2 = dist_matrices.get(m2_key)
34 |         if dists2 is None:
35 |             dists2 = pdist(encodings_dict[m2_key])
36 |             dist_matrices[m2_key] = dists2
37 |
38 |         # NB: this is a Spearman rank correlation, not a Pearson correlation.
39 |         spearman_coef, _ = st.spearmanr(dists1, dists2)
40 |
41 |         if collapse_fn is not None:
42 |             m1_key = collapse_fn(m1_key)
43 |             m2_key = collapse_fn(m2_key)
44 |
45 |         rsa_sims.append((m1_key, m2_key, spearman_coef))
46 |
47 |     # The "pearsonr" column name is left unchanged in case callers rely on it,
48 |     # but the values it holds are Spearman coefficients.
49 |     rsa_sims = pd.DataFrame(rsa_sims, columns=["model1", "model2", "pearsonr"])
50 |     return rsa_sims
--------------------------------------------------------------------------------
/notebooks/structural-probes.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": null,
6 | "metadata": {},
7 | "outputs": [],
8 | "source": [
9 | "from functools import partial\n",
10 | "import itertools\n",
11 | "import json\n",
12 | "from pathlib import Path\n",
13 | "import re\n",
14 | "import sys\n",
15 | "sys.path.append(\"../src\")\n",
16 | "\n",
17 | "import matplotlib\n",
18 | "import matplotlib.pyplot as plt\n",
19 | "import numpy as np\n",
20 | "import pandas as pd\n",
21 | "import seaborn as sns\n",
22 | "import statsmodels.formula.api as smf\n",
23 | "from tqdm import tqdm, tqdm_notebook\n",
24 | "\n",
25 | "%matplotlib inline\n",
26 | "sns.set(style=\"whitegrid\", context=\"paper\", font_scale=3.5, rc={\"lines.linewidth\": 2.5})\n",
27 | "from IPython.display import set_matplotlib_formats\n",
28 | "set_matplotlib_formats('png')\n",
29 | "#set_matplotlib_formats('svg')\n",
30 | "\n",
31 | "%load_ext autoreload\n",
32 | "%autoreload 2\n",
33 | "import util"
34 | ]
35 | },
36 | {
37 | "cell_type": "markdown",
38 | "metadata": {},
39 | "source": [
40 | "## Data preparation"
41 | ]
42 | },
43 | {
44 | "cell_type": "code",
45 | "execution_count": null,
46 | "metadata": {},
47 | "outputs": [],
48 | "source": [
49 | "output_path = Path(\"../output\")\n",
50 | "bert_encoding_path = output_path / \"encodings\"\n",
51 | "sprobe_results_path = output_path / \"structural-probe\""
52 | ]
53 | },
54 | {
55 | "cell_type": "code",
56 | "execution_count": null,
57 | "metadata": {},
58 | "outputs": [],
59 | "source": [
60 | "checkpoints = [util.get_encoding_ckpt_id(dir_entry) for dir_entry in bert_encoding_path.iterdir()]"
61 | ]
62 | },
63 | {
64 | "cell_type": "code",
65 | "execution_count": null,
66 | "metadata": {},
67 | "outputs": [],
68 | "source": [
69 | "models = [model for model, _, _ in checkpoints]\n",
70 | "\n",
71 | "baseline_model = \"baseline\"\n",
72 | "if baseline_model not in models:\n",
73 | " raise ValueError(\"Missing baseline model. This is necessary to compute performance deltas in the analysis of fine-tuning models. Stop.\")\n",
74 | "\n",
75 | "standard_models = [model for model in models if not model.startswith(\"LM_\") and not model == baseline_model]\n",
76 | "custom_models = [model for model in models if model.startswith(\"LM_\") and not model == baseline_model]\n",
77 | "\n",
78 | "runs = sorted(set(run for _, run, _ in checkpoints))\n",
79 | "checkpoint_steps = sorted(set(step for _, _, step in checkpoints))\n",
80 | "\n",
81 | "# Models which should appear in the final report figures\n",
82 | "report_models = [\"SQuAD\", \"QQP\", \"MNLI\", \"SST\", \"LM\", \"LM_scrambled\", \"LM_scrambled_para\", \"LM_pos\", \"glove\"]\n",
83 | "\n",
84 | "# Model subsets to render in different report figures\n",
85 | "report_model_sets = [\n",
86 | " (\"all\", set(report_models)),\n",
87 | " (\"standard\", set(report_models) & set(standard_models)),\n",
88 | " (\"custom\", set(report_models) & set(custom_models)),\n",
89 | "]\n",
90 | "report_model_sets = [(name, model_set) for name, model_set in report_model_sets\n",
91 | " if len(model_set) > 0]"
92 | ]
93 | },
94 | {
95 | "cell_type": "code",
96 | "execution_count": null,
97 | "metadata": {},
98 | "outputs": [],
99 | "source": [
100 | "RENDER_FINAL = True\n",
101 | "figure_path = Path(\"../reports/figures\")\n",
102 | "figure_path.mkdir(exist_ok=True)\n",
103 | "\n",
104 | "report_hues = dict(zip(sorted(report_models), sns.color_palette()))"
105 | ]
106 | },
107 | {
108 | "cell_type": "markdown",
109 | "metadata": {},
110 | "source": [
111 | "## Collect results"
112 | ]
113 | },
114 | {
115 | "cell_type": "code",
116 | "execution_count": null,
117 | "metadata": {},
118 | "outputs": [],
119 | "source": [
120 | "eval_results = {}\n",
121 | "for eval_dir in tqdm_notebook(list(sprobe_results_path.iterdir())):\n",
122 | " if not eval_dir.is_dir(): continue\n",
123 | " model, run, step = util.get_encoding_ckpt_id(eval_dir)\n",
124 | " \n",
125 | " try:\n",
126 | " uuas_file = list(eval_dir.glob(\"**/dev.uuas\"))[0]\n",
127 | " with uuas_file.open(\"r\") as f:\n",
128 | " uuas = float(f.read().strip())\n",
129 | "    except (IndexError, OSError, ValueError): continue\n",
130 | " \n",
131 | " try:\n",
132 | " spearman_file = list(eval_dir.glob(\"**/dev.spearmanr-*-mean\"))[0]\n",
133 | " with spearman_file.open(\"r\") as f:\n",
134 | " spearman = float(f.read().strip())\n",
135 | "    except (IndexError, OSError, ValueError): continue\n",
136 | " \n",
137 | " eval_results[model, run, step] = pd.Series({\"uuas\": uuas, \"spearman\": spearman})"
138 | ]
139 | },
140 | {
141 | "cell_type": "markdown",
142 | "metadata": {},
143 | "source": [
144 | "### Add non-BERT results"
145 | ]
146 | },
147 | {
148 | "cell_type": "code",
149 | "execution_count": null,
150 | "metadata": {},
151 | "outputs": [],
152 | "source": [
153 | "nonbert_models = []"
154 | ]
155 | },
156 | {
157 | "cell_type": "code",
158 | "execution_count": null,
159 | "metadata": {},
160 | "outputs": [],
161 | "source": [
162 | "# GloVe\n",
163 | "# for glove_dir in tqdm_notebook(list(sprobe_glove_path.glob(\"*\"))):\n",
164 | "# if not glove_dir.is_dir(): continue\n",
165 | "# model = glove_dir.name\n",
166 | " \n",
167 | "# try:\n",
168 | "# uuas_file = list(glove_dir.glob(\"**/dev.uuas\"))[0]\n",
169 | "# with uuas_file.open(\"r\") as f:\n",
170 | "# uuas = float(f.read().strip())\n",
171 | "# except: continue\n",
172 | " \n",
173 | "# try:\n",
174 | "# spearman_file = list(glove_dir.glob(\"**/dev.spearmanr-*-mean\"))[0]\n",
175 | "# with spearman_file.open(\"r\") as f:\n",
176 | "# spearman = float(f.read().strip())\n",
177 | "# except: continue\n",
178 | " \n",
179 | "# nonbert_models.append(model)\n",
180 | "# eval_results[model, 1, 250, 0] = pd.Series({\"uuas\": uuas, \"spearman\": spearman})"
181 | ]
182 | },
183 | {
184 | "cell_type": "markdown",
185 | "metadata": {},
186 | "source": [
187 | "### Aggregate"
188 | ]
189 | },
190 | {
191 | "cell_type": "code",
192 | "execution_count": null,
193 | "metadata": {},
194 | "outputs": [],
195 | "source": [
196 | "eval_results = pd.DataFrame(pd.concat(eval_results, names=[\"model\", \"run\", \"step\", \"metric\"]))"
197 | ]
198 | },
199 | {
200 | "cell_type": "code",
201 | "execution_count": null,
202 | "metadata": {},
203 | "outputs": [],
204 | "source": [
205 | "eval_results.tail(20)"
206 | ]
207 | },
208 | {
209 | "cell_type": "code",
210 | "execution_count": null,
211 | "metadata": {},
212 | "outputs": [],
213 | "source": [
214 | "# Only use spaCy results\n",
215 | "nonbert_models_to_graph = [(\"spaCy-en_vectors_web_lg\", \"GloVe\")]\n",
216 | "nonbert_models_to_graph = [(name, label) for name, label in nonbert_models_to_graph if name in nonbert_models]"
217 | ]
218 | },
219 | {
220 | "cell_type": "markdown",
221 | "metadata": {},
222 | "source": [
223 | "## Graph"
224 | ]
225 | },
226 | {
227 | "cell_type": "code",
228 | "execution_count": null,
229 | "metadata": {},
230 | "outputs": [],
231 | "source": [
232 | "graph_data = eval_results.reset_index()\n",
233 | "graph_data = graph_data[~graph_data.model.isin(nonbert_models + [baseline_model])]"
234 | ]
235 | },
236 | {
237 | "cell_type": "code",
238 | "execution_count": null,
239 | "metadata": {},
240 | "outputs": [],
241 | "source": [
242 | "g = sns.FacetGrid(data=graph_data, col=\"metric\", height=7, sharex=True, sharey=True)\n",
243 | "g.map(sns.lineplot, \"step\", 0, \"model\")\n",
244 | "\n",
245 | "for uuas_ax in g.axes[:, 0]:\n",
246 | "    for nonbert_model, label in nonbert_models_to_graph:\n",
247 | "        uuas_ax.axhline(eval_results.loc[nonbert_model, 1, 250, 0, \"uuas\"][0], linestyle='--', label=label)\n",
248 | "for spearman_ax in g.axes[:, 1]:\n",
249 | "    for nonbert_model, label in nonbert_models_to_graph:\n",
250 | "        spearman_ax.axhline(eval_results.loc[nonbert_model, 1, 250, 0, \"spearman\"][0], linestyle='--', label=label)\n",
251 | "    \n",
252 | "g.add_legend()\n",
253 | "g"
254 | ]
255 | },
256 | {
257 | "cell_type": "code",
258 | "execution_count": null,
259 | "metadata": {},
260 | "outputs": [],
261 | "source": [
262 | "g = sns.FacetGrid(data=graph_data, col=\"metric\", row=\"model\", height=7, sharex=True, sharey=True)\n",
263 | "g.map(sns.lineplot, \"step\", 0).add_legend()"
264 | ]
265 | },
266 | {
267 | "cell_type": "code",
268 | "execution_count": null,
269 | "metadata": {},
270 | "outputs": [],
271 | "source": [
272 | "%matplotlib agg\n",
273 | "\n",
274 | "if RENDER_FINAL:\n",
275 | "    dir = figure_path / \"structural_probe\"\n",
276 | "    dir.mkdir(exist_ok=True)\n",
277 | "    \n",
278 | "    for metric, label in [(\"uuas\", \"UUAS\"), (\"spearman\", \"Spearman correlation\")]:\n",
279 | "        fig = plt.figure(figsize=(15, 9))\n",
280 | "        ax = sns.lineplot(data=graph_data[(graph_data.metric == metric)], x=\"step\", y=0,\n",
281 | "                          hue=\"model\", palette=report_hues)\n",
282 | "        for nonbert_model, nonbert_label in nonbert_models_to_graph:\n",
283 | "            ax.axhline(eval_results.loc[nonbert_model, 1, 0, metric][0],\n",
284 | "                       linestyle='--', label=nonbert_label, linewidth=3)\n",
285 | "        \n",
286 | "        plt.legend(loc=\"center left\", bbox_to_anchor=(1, 0.5))\n",
287 | "        plt.xlim((0, checkpoint_steps[-1]))\n",
288 | "        plt.ylabel(label)\n",
289 | "        plt.xlabel(\"Training step\")\n",
290 | "        plt.tight_layout()\n",
291 | "        plt.savefig(dir / (\"%s.pdf\" % metric))\n",
292 | "        plt.close()\n",
293 | "    \n",
294 | "%matplotlib inline"
295 | ]
296 | }
297 | ],
298 | "metadata": {
299 | "kernelspec": {
300 | "display_name": "Python 3",
301 | "language": "python",
302 | "name": "python3"
303 | },
304 | "language_info": {
305 | "codemirror_mode": {
306 | "name": "ipython",
307 | "version": 3
308 | },
309 | "file_extension": ".py",
310 | "mimetype": "text/x-python",
311 | "name": "python",
312 | "nbconvert_exporter": "python",
313 | "pygments_lexer": "ipython3",
314 | "version": "3.6.8"
315 | }
316 | },
317 | "nbformat": 4,
318 | "nbformat_minor": 2
319 | }
320 |
--------------------------------------------------------------------------------
/notebooks/within-subject.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hans/nn-decoding/2d2cc639f650b6911cb1de7b8ecb7a872f75b36d/notebooks/within-subject.png
--------------------------------------------------------------------------------
/src/dependency_graph.py:
--------------------------------------------------------------------------------
1 | from argparse import ArgumentParser
2 | import itertools
3 |
4 | import matplotlib
5 | matplotlib.use("Agg")
6 | import matplotlib.pyplot as plt
7 | import numpy as np
8 | import networkx as nx
9 | import pandas as pd
10 |
11 |
12 | p = ArgumentParser()
13 | p.add_argument("heatmap_path")
14 |
15 | args = p.parse_args()
16 |
17 | heatmap = pd.read_csv(args.heatmap_path, index_col=0)
18 | assert heatmap.index.equals(heatmap.columns)
19 | encodings = heatmap.columns
20 |
21 | G = nx.DiGraph()
22 | edges = []
23 | for enc1, enc2 in itertools.product(list(range(len(encodings))), repeat=2):
24 |     if enc1 == enc2:
25 |         continue
26 |     enc1_name = encodings[enc1]
27 |     enc2_name = encodings[enc2]
28 | 
29 |     score_forward = heatmap.loc[enc1_name, enc2_name]
30 |     score_reverse = heatmap.loc[enc2_name, enc1_name]
31 | 
32 |     # Extrema along rows, ignoring diagonal element
33 |     enc1_min = min(heatmap.iloc[enc1, 0:enc1].min() if enc1 > 0 else np.inf, heatmap.iloc[enc1, enc1+1:].min())
34 |     enc2_max = max(heatmap.iloc[enc2, 0:enc2].max() if enc2 > 0 else -np.inf, heatmap.iloc[enc2, enc2+1:].max())
35 |     if score_forward > score_reverse and enc1_min >= enc2_max:
36 |         edges.append((enc1_name, enc2_name))
37 |
38 | print(edges)
39 | G.add_edges_from(edges)
40 | pos = nx.spring_layout(G)
41 | nx.draw_networkx_nodes(G, pos)
42 | nx.draw_networkx_labels(G, pos)
43 | nx.draw_networkx_edges(G, pos, arrowstyle="->", arrowsize=10, arrows=True)
44 | plt.savefig("graph.png")
45 |
--------------------------------------------------------------------------------
/src/heatmap.py:
--------------------------------------------------------------------------------
1 | """
2 | Render a heat-map describing the relationship between different encodings.
3 | """
4 |
5 | from argparse import ArgumentParser
6 | import itertools
7 | import logging
8 | import multiprocessing
9 | import os.path
10 | logging.basicConfig(level=logging.DEBUG)
11 | logger = logging.getLogger(__name__)
12 |
13 | import matplotlib
14 | matplotlib.use("Agg")
15 | import matplotlib.pyplot as plt
16 | import numpy as np
17 | import pandas as pd
18 | from scipy.linalg import sqrtm
19 | from scipy.spatial.distance import pdist, squareform
20 | from scipy.stats import spearmanr
21 | from sklearn.decomposition import PCA
22 | from sklearn.linear_model import RidgeCV
23 | from sklearn.model_selection import KFold
24 | import seaborn as sns
25 | from tqdm import tqdm, trange
26 |
27 |
28 | def eval_encodings_cca(enc1, enc2):
29 |     cv = KFold(n_splits=4)
30 |     corr_results = []
31 | 
32 |     for train_idxs, test_idxs in tqdm(cv.split(enc1), total=cv.get_n_splits(enc1),
33 |                                       desc="CV splits"):
34 |         from rcca import CCA
35 |         # TODO sanity check regularization constant s.t. CCA on self yields reasonable numbers
36 |         cca = CCA(kernelcca=False, reg=1e-6, numCC=128, verbose=False)
37 |         cca.train([enc1[train_idxs], enc2[train_idxs]])
38 | 
39 |         print(np.mean(cca.validate([enc1[train_idxs], enc2[train_idxs]])[0]))
40 |         enc1_pred_corrs, enc2_pred_corrs = cca.validate([enc1[test_idxs], enc2[test_idxs]])
41 |         # TODO projection weighting
42 |         corr_results.append(np.mean(enc2_pred_corrs))
43 | 
44 |     return np.mean(corr_results)
45 |
46 |
47 | def eval_encodings_rdm(encodings, enc1_key, enc2_key,
48 |                        n_bootstrap_samples=100, sentences=None):
49 |     """
50 |     Evaluate the similarity between two encodings `enc1` and `enc2` as follows:
51 | 
52 |     1. Align the paired representations of `enc1` and `enc2` to have maximal
53 |        correlation via regularized CCA. (The regularization strength and the
54 |        number of components are chosen by cross-validation.)
55 |     2. Compute a Spearman correlation coefficient relating the pairwise
56 |        distances predicted by the two aligned encodings. (In other words,
57 |        run a representational similarity analysis comparing the
58 |        representational dissimilarity matrices (RDMs) of the two aligned
59 |        encodings.) The bootstrap estimate of this coefficient is currently
60 |        disabled, so a single coefficient is returned.
61 | 
62 |     Args:
63 |         encodings: Dictionary mapping from encoding name to `n_examples * d` matrices
64 |         enc1_key: string key into `encodings`
65 |         enc2_key: string key into `encodings`
66 |         n_bootstrap_samples: Number of bootstrap samples (currently unused).
67 |         sentences: Optional `n_examples` array of sentences, for debugging
68 | 
69 |     Returns:
70 |         spearman_coefs: Single-element list holding the Spearman coefficient
71 |             relating the pairwise distance rankings predicted by CCA-aligned
72 |             forms of `enc1` and `enc2`
73 |     """
74 |     enc1 = encodings[enc1_key]
75 |     enc2 = encodings[enc2_key]
76 |     assert enc1.shape[0] == enc2.shape[0]
77 | 
78 |     # First align with CCA.
79 |     from rcca import CCA, CCACrossValidate
80 |     cca = CCACrossValidate(kernelcca=False, regs=[1e-2,1e-1,1.0,10.0,20.0], numCCs=[32,64,128,256])
81 |     cca.train([enc1, enc2])
82 |     print("Best reg: %f; best CC: %i" % (cca.best_reg, cca.best_numCC))
83 | 
84 |     enc1_aligned, enc2_aligned = cca.comps
85 |     # Calculate pairwise distances.
86 |     dists_X = pdist(enc1_aligned, "correlation")
87 |     dists_Y = pdist(enc2_aligned, "correlation")
88 | 
89 |     dists_X_square = squareform(dists_X)
90 |     dists_Y_square = squareform(dists_Y)
91 | 
92 |     if sentences is not None:
93 |         # DEBUG: List some of the most similar inputs
94 |         sent_combinations = list(itertools.combinations(range(len(enc1)), 2))
95 |         high_sim_X = np.argsort(dists_X)
96 |         high_sim_Y = np.argsort(dists_Y)
97 | 
98 |         out_path = "sim_%s_%s.csv" % (enc1_key, enc2_key)
99 |         with open(out_path, "w") as out_f:
100 |             for i, high_sim_X_idx in enumerate(high_sim_X):
101 |                 sent1, sent2 = sent_combinations[high_sim_X_idx]
102 |                 out_f.write("%s,%d,%f,\"%s\",\"%s\"\n" % (enc1_key, i, dists_X_square[sent1, sent2],
103 |                                                           sentences[sent1], sentences[sent2]))
104 |             for i, high_sim_Y_idx in enumerate(high_sim_Y):
105 |                 sent1, sent2 = sent_combinations[high_sim_Y_idx]
106 |                 out_f.write("%s,%d,%f,\"%s\",\"%s\"\n" % (enc2_key, i, dists_Y_square[sent1, sent2],
107 |                                                           sentences[sent1], sentences[sent2]))
108 | 
109 |     # # Bootstrap estimate the Spearman coefficient.
110 |     # spearman_coefs = []
111 |     # for _ in trange(n_bootstrap_samples):
112 |     #     idxs = np.random.choice(len(enc1), size=len(enc1), replace=True)
113 |     #     dists_X_sample = dists_X_square[np.ix_(idxs, idxs)]
114 |     #     dists_Y_sample = dists_Y_square[np.ix_(idxs, idxs)]
115 | 
116 |     #     # Compute Spearman coefficient on condensed / non-redundant form.
117 |     #     sample_coef, _ = spearmanr(squareform(dists_X_sample), squareform(dists_Y_sample))
118 |     #     spearman_coefs.append(sample_coef)
119 | 
120 |     spearman_coef, _ = spearmanr(dists_X, dists_Y)
121 |     print("\t", enc1_key, enc2_key, spearman_coef)
122 |     return [spearman_coef]
123 |
124 |
125 | def eval_pair(inputs):
126 |     enc1, enc2, encodings, sentences = inputs
127 |     # Multiprocessing task function.
128 |     if enc1 == enc2:
129 |         return enc1, enc2, (1.0, 1.0)
130 |     else:
131 |         coefs = eval_encodings_rdm(encodings, enc1, enc2, sentences=sentences)
132 |         # Calculate 95% CI bounds (np.percentile expects values in [0, 100])
133 |         lower_bound, upper_bound = np.percentile(coefs, (2.5, 97.5))
134 |         return enc1, enc2, (lower_bound, upper_bound)
135 |
136 |
137 | def main(args):
138 |     encodings, encoding_keys = {}, []
139 |     for encoding_path in args.encodings:
140 |         encodings_i = np.load(encoding_path)
141 |         encoding_key = os.path.basename(encoding_path)
142 |         encoding_key = encoding_key[:encoding_key.rindex(".")]
143 | 
144 |         if args.encoding_project is not None and args.encoding_project < encodings_i.shape[1]:
145 |             logger.info("Projecting %s to dimension %i with PCA", encoding_path, args.encoding_project)
146 |             pca = PCA(args.encoding_project).fit(encodings_i)
147 |             logger.info("PCA explained variance: %f", sum(pca.explained_variance_ratio_) * 100)
148 |             encodings_i = pca.transform(encodings_i)
149 | 
150 |         encodings[encoding_key] = encodings_i
151 |         encoding_keys.append(encoding_key)
152 | 
153 |     sentences = None
154 |     if args.sentences_path is not None:
155 |         with open(args.sentences_path, "r") as sentences_f:
156 |             sentences = [line.strip() for line in sentences_f]
157 | 
158 |     # Prepare output structures
159 |     assert len(set(enc.shape[0] for enc in encodings.values())) == 1
160 |     # Make sure to maintain ordering of the encodings given in the CLI arguments.
161 |     heatmap_mat_lower_bound = np.zeros((len(encodings), len(encodings)))
162 |     heatmap_mat_upper_bound = np.zeros_like(heatmap_mat_lower_bound)
163 | 
164 |     # Prepare multiprocessing jobs
165 |     pool = multiprocessing.Pool(processes=args.num_processes)
166 |     job_inputs = [(enc1, enc2, encodings, sentences) for enc1, enc2
167 |                   in itertools.combinations(encoding_keys, 2)]
168 |     jobs = pool.imap_unordered(eval_pair, job_inputs)
169 | 
170 |     # Join jobs and update matrices
171 |     with tqdm(total=len(job_inputs)) as pbar:
172 |         for enc1, enc2, val in jobs:
173 |             pbar.update()
174 |             lower_bound, upper_bound = val
175 |             if lower_bound is None:
176 |                 continue
177 | 
178 |             enc1_idx = encoding_keys.index(enc1)
179 |             enc2_idx = encoding_keys.index(enc2)
180 |             heatmap_mat_lower_bound[enc1_idx, enc2_idx] = lower_bound
181 |             heatmap_mat_upper_bound[enc1_idx, enc2_idx] = upper_bound
182 | 
183 |     print(heatmap_mat_lower_bound)
184 |     if args.names is not None:
185 |         names = args.names.strip().split(",")
186 |         assert len(names) == len(args.encodings)
187 |     else:
188 |         names = list(map(str, range(1, len(args.encodings) + 1)))
189 | 
190 |     # Calculate heatmap statistics / render figures.
191 |     for heatmap_mat, heatmap_name in zip([heatmap_mat_lower_bound, heatmap_mat_upper_bound],
192 |                                          ["lower_bound", "upper_bound"]):
193 |         # Copy upper triangle of matrix to lower triangle.
194 |         heatmap_mat[np.tril_indices(len(heatmap_mat), -1)] = \
195 |             heatmap_mat.T[np.tril_indices(len(heatmap_mat), -1)]
196 | 
197 |         np.fill_diagonal(heatmap_mat, 1.0)
198 | 
199 |         df = pd.DataFrame(heatmap_mat, index=names, columns=names)
200 |         df.mean(axis=1).to_csv("averages_%s.csv" % heatmap_name)
201 |         df.to_csv("heatmap_%s.csv" % heatmap_name)
202 | 
203 |         # Only plot lower triangle.
204 |         mask = np.zeros_like(heatmap_mat, dtype=bool)
205 |         mask[np.triu_indices_from(mask, 1)] = True
206 |         fig = plt.figure(figsize=(6, 5))
207 |         sns.heatmap(data=df, annot=True, square=True, mask=mask)
208 |         plt.xticks(weight="bold")
209 |         plt.yticks(rotation=0, weight="bold")
210 |         plt.tight_layout()
211 |         fig.savefig("heatmap_%s.svg" % heatmap_name)
212 |
213 |
214 | if __name__ == '__main__':
215 |     p = ArgumentParser()
216 |     p.add_argument("encodings", nargs="+")
217 |     p.add_argument("--encoding_project", type=int)
218 |     p.add_argument("--names")
219 |     p.add_argument("--sentences_path")
220 |     p.add_argument("-p", "--num_processes", default=1, type=int)
221 |     main(p.parse_args())
222 |
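The 95% interval computed in `eval_pair` above only becomes informative once `eval_encodings_rdm` returns many bootstrap estimates rather than a single coefficient. As a rough illustration, here is a minimal percentile-bootstrap sketch using only the standard library. The helper name `bootstrap_percentile_ci` and the toy coefficients are hypothetical and not part of this repository, and the statistic resampled here is a plain mean rather than a Spearman coefficient over RDMs.

```python
import random
import statistics


def bootstrap_percentile_ci(values, n_samples=1000, level=0.95, seed=0):
    """Percentile-bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the statistic of each resample.
    estimates = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_samples)
    )
    # Cut off (1 - level) / 2 of the sorted estimates on each side.
    tail = (1.0 - level) / 2.0
    lower_idx = int(tail * (n_samples - 1))
    upper_idx = int((1.0 - tail) * (n_samples - 1))
    return estimates[lower_idx], estimates[upper_idx]


# Toy data standing in for per-sample similarity coefficients (made up).
coefs = [0.41, 0.44, 0.39, 0.47, 0.43, 0.40, 0.45, 0.42]
lower, upper = bootstrap_percentile_ci(coefs)
print("95%% CI for the mean: [%.3f, %.3f]" % (lower, upper))
```

With NumPy, the two cut points correspond to `np.percentile(estimates, (2.5, 97.5))`, since that function takes percentages in the range 0 to 100.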
--------------------------------------------------------------------------------
/src/learn_decoder.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | """
3 | Learn a decoder mapping from functional imaging data to target model
4 | representations.
5 | """
6 | from argparse import ArgumentParser
7 | from collections import defaultdict
8 | import itertools
9 | import logging
10 | from pathlib import Path
11 | import time
12 |
13 | import numpy as np
14 | import pandas as pd
15 | from sklearn.decomposition import PCA
16 | from sklearn.linear_model import Ridge
17 | from sklearn.metrics import mean_squared_error, r2_score
18 | from sklearn.model_selection import KFold, cross_val_predict, GridSearchCV
19 | import scipy.io
20 | from scipy.spatial import distance
21 | from tqdm import tqdm
22 |
23 | import util
24 |
25 | logging.basicConfig(level=logging.INFO)
26 | L = logging.getLogger(__name__)
27 |
28 | # Candidate ridge regression regularization parameters.
29 | ALPHAS = [1, 1e-1, 1e-2, 1e-3, 1e-4, 1e-5, 1e-6, 1e1]
30 |
31 |
32 | def main(args):
33 |     print(args)
34 | 
35 |     sentences = util.load_sentences(args.sentences_path)
36 |     encodings = util.load_encodings(args.encoding_paths, project=args.encoding_project)
37 |     encodings_normed = encodings / np.linalg.norm(encodings, axis=1, keepdims=True)
38 | 
39 |     assert len(encodings) == len(sentences)
40 | 
41 |     ######### Prepare to process subject.
42 | 
43 |     # Load subject data.
44 |     subject = args.subject_name or args.brain_path.name
45 |     L.info("Loading subject %s data.", subject)
46 |     subject_images = util.load_brain_data(str(args.brain_path / args.mat_name),
47 |                                           project=args.image_project)
48 |     assert len(subject_images) == len(sentences)
49 | 
50 |     ######### Prepare learning setup.
51 | 
52 |     # Track within-subject performance.
53 |     metrics = pd.DataFrame(columns=["mse", "r2"])
54 | 
55 |     # Prepare nested CV.
56 |     # Inner CV is responsible for hyperparameter optimization;
57 |     # outer CV is responsible for prediction.
58 |     state = int(time.time())
59 |     inner_cv = KFold(n_splits=args.n_folds, shuffle=True, random_state=state)
60 |     outer_cv = KFold(n_splits=args.n_folds, shuffle=True, random_state=state)
61 | 
62 |     # Final data prep: normalize.
63 |     X = subject_images - subject_images.mean(axis=0)
64 |     X = X / np.linalg.norm(X, axis=1, keepdims=True)
65 |     Y = encodings - encodings.mean(axis=0)
66 |     Y = Y / np.linalg.norm(Y, axis=1, keepdims=True)
67 | 
68 |     ######## Run learning.
69 | 
70 |     # Run inner CV. (`normalize` was removed from Ridge in scikit-learn 1.2; its default was False.)
71 |     gs = GridSearchCV(Ridge(fit_intercept=False),
72 |                       {"alpha": ALPHAS}, cv=inner_cv, n_jobs=args.n_jobs, verbose=10)
73 |     # Run outer CV.
74 |     decoder_predictions = cross_val_predict(gs, X, Y, cv=outer_cv)
75 | 
76 |     ######### Evaluate.
77 | 
78 |     metrics.loc[subject, "mse"] = mean_squared_error(Y, decoder_predictions)
79 |     metrics.loc[subject, "r2"] = r2_score(Y, decoder_predictions)
80 | 
81 |     # Rank evaluation.
82 |     _, rank_of_correct = util.eval_ranks(decoder_predictions, np.arange(len(decoder_predictions)), Y)
83 |     rank_stats = pd.Series(rank_of_correct).agg(["mean", "median", "min", "max"])
84 |     metrics = metrics.join(pd.concat([rank_stats], keys=[subject]).unstack().rename(columns=lambda c: "rank_%s" % c))
85 | 
86 |     ######### Save results.
87 | 
88 |     csv_path = "%s.csv" % args.out_prefix
89 |     metrics.to_csv(csv_path)
90 |     L.info("Wrote decoding results to %s" % csv_path)
91 | 
92 |     # Save per-sentence outputs.
93 |     npy_path = "%s.pred.npy" % args.out_prefix
94 |     np.save(npy_path, decoder_predictions)
95 |     L.info("Wrote decoder predictions to %s" % npy_path)
96 |
97 |
98 | if __name__ == '__main__':
99 |     p = ArgumentParser()
100 | 
101 |     p.add_argument("sentences_path", type=Path)
102 |     p.add_argument("brain_path", type=Path)
103 |     p.add_argument("encoding_paths", type=Path, nargs="+")
104 |     p.add_argument("--encoding_project", type=int)
105 |     p.add_argument("--image_project", type=int)
106 |     p.add_argument("--n_folds", type=int, default=12)
107 |     p.add_argument("--mat_name", default="examples_384sentences.mat")
108 |     p.add_argument("--out_prefix", default="decoder_perf")
109 |     p.add_argument("--subject_name", help="By default, basename of brain_path")
110 |     p.add_argument("--n_jobs", type=int, default=1)
111 | 
112 |     main(p.parse_args())
113 |
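The nested cross-validation used in `main` above (an inner loop that selects the ridge penalty, an outer loop that produces held-out predictions) can be sketched without scikit-learn. The following is a simplified, hypothetical illustration, not the pipeline's implementation: it uses a one-dimensional ridge regression with a closed-form solution, interleaved rather than shuffled folds, and made-up data. All function names here are assumptions introduced for the example.

```python
import random


def kfold(indices, n_splits):
    """Yield (train, test) index lists for each of `n_splits` interleaved folds."""
    folds = [indices[i::n_splits] for i in range(n_splits)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def fit_ridge(xs, ys, alpha):
    """Closed-form 1-D ridge without intercept: w = sum(xy) / (sum(x^2) + alpha)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)


def mse(w, xs, ys):
    """Mean squared error of the predictions w * x against ys."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def nested_cv_predict(xs, ys, alphas, n_splits=4):
    predictions = [None] * len(xs)
    for train, test in kfold(list(range(len(xs))), n_splits):
        # Inner CV: choose alpha using the outer training fold only.
        def inner_error(alpha):
            errors = []
            for inner_train, inner_val in kfold(train, n_splits):
                w = fit_ridge([xs[i] for i in inner_train],
                              [ys[i] for i in inner_train], alpha)
                errors.append(mse(w, [xs[i] for i in inner_val],
                                  [ys[i] for i in inner_val]))
            return sum(errors) / len(errors)

        best_alpha = min(alphas, key=inner_error)
        # Refit on the whole outer training fold, then predict the held-out fold.
        w = fit_ridge([xs[i] for i in train], [ys[i] for i in train], best_alpha)
        for i in test:
            predictions[i] = w * xs[i]
    return predictions


# Synthetic data: y is roughly 2 * x plus a little noise.
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(40)]
ys = [2.0 * x + rng.gauss(0, 0.05) for x in xs]
preds = nested_cv_predict(xs, ys, alphas=[1e-3, 1e-1, 1e1])
held_out_mse = sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)
print("held-out MSE: %.4f" % held_out_mse)
```

The point of the nesting is that each prediction comes from a model whose penalty was chosen without seeing the example being predicted.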
--------------------------------------------------------------------------------
/src/nearest_neighbors.py:
--------------------------------------------------------------------------------
1 | from argparse import ArgumentParser
2 | from pathlib import Path
3 | 
4 | import numpy as np
5 | from scipy.spatial import distance
6 | 
7 | import util
8 | 
9 | 
10 | def eval_quant(encoding, metric="cosine"):
11 |     # Compute pairwise similarities (1 - distance; cosine by default).
12 |     similarities = 1 - distance.pdist(encoding, metric=metric)
13 | 
14 |     return similarities
15 | 
16 | 
17 | def main(args):
18 |     sentences = util.load_sentences(args.sentences_path)
19 |     encoding = np.load(args.encoding_path)
20 | 
21 |     if args.mode == "quant":
22 |         eval_quant(encoding)
23 |     elif args.mode == "qual":
24 |         pass
25 | 
26 | 
27 | if __name__ == '__main__':
28 |     p = ArgumentParser()
29 |     p.add_argument("sentences_path", type=Path)
30 |     p.add_argument("encoding_path")
31 |     p.add_argument("--mode", choices=["quant", "qual"], default="quant")
32 | 
33 |     main(p.parse_args())
34 | 
--------------------------------------------------------------------------------
/src/util.py:
--------------------------------------------------------------------------------
1 | """
2 | Data analysis tools shared across scripts and notebooks.
3 | """
4 |
5 | from collections import defaultdict
6 | import itertools
7 | import logging
8 | from pathlib import Path
9 | import re
10 |
11 | import matplotlib
12 | matplotlib.use("Agg")
13 | import numpy as np
14 | import pandas as pd
15 | import seaborn as sns
16 | import scipy.io as io
17 | import scipy.stats as st
18 | from sklearn.decomposition import PCA
19 | from tqdm import tqdm
20 |
21 | L = logging.getLogger(__name__)
22 |
23 |
24 | def load_sentences(sentence_path="data/sentences/stimuli_384sentences.txt"):
25 |     with open(sentence_path, "r") as f:
26 |         sentences = [line.strip() for line in f]
27 |     return sentences
28 | 
29 | 
30 | def load_encodings(paths, project=None):
31 |     encodings = []
32 |     for encoding_path in paths:
33 |         encodings_i = np.load(encoding_path)
34 |         L.info("%s: Loaded encodings of size %s.", encoding_path, encodings_i.shape)
35 | 
36 |         if project is not None:
37 |             L.info("Projecting encodings to dimension %i with PCA", project)
38 | 
39 |             if encodings_i.shape[1] < project:
40 |                 L.warning("Encodings are already below requested dimensionality: %i < %i"
41 |                           % (encodings_i.shape[1], project))
42 |             else:
43 |                 pca = PCA(project).fit(encodings_i)
44 |                 L.info("PCA explained variance: %f", sum(pca.explained_variance_ratio_) * 100)
45 |                 encodings_i = pca.transform(encodings_i)
46 | 
47 |         encodings.append(encodings_i)
48 | 
49 |     encodings = np.concatenate(encodings, axis=1)
50 |     return encodings
51 | 
52 | 
53 | def load_brain_data(path, project=None):
54 |     subject_data = io.loadmat(path)
55 |     subject_images = subject_data["examples"]
56 |     if project is not None:
57 |         L.info("Projecting brain images to dimension %i with PCA", project)
58 |         if subject_images.shape[1] < project:
59 |             L.warning("Images are already below requested dimensionality: %i < %i"
60 |                       % (subject_images.shape[1], project))
61 |         else:
62 |             pca = PCA(project).fit(subject_images)
63 |             L.info("PCA explained variance: %f", sum(pca.explained_variance_ratio_) * 100)
64 |             subject_images = pca.transform(subject_images)
65 | 
66 |     return subject_images
67 |
68 |
69 | def load_decoding_perfs(results_dir):
70 |     """
71 |     Load and render a DataFrame describing decoding performance across models,
72 |     model runs, and subjects.
73 | 
74 |     Args:
75 |         results_dir: path to pipeline decoder output directory
76 |     """
77 | 
78 |     results = {}
79 |     result_keys = ["model", "run", "step", "subject"]
80 |     for csv in tqdm(list(Path(results_dir).glob("**/*.csv")),
81 |                     desc="Loading perf files"):
82 |         key = get_decoder_id(csv.parent.name)
83 |         try:
84 |             df = pd.read_csv(csv, usecols=["mse", "r2",
85 |                                            "rank_median", "rank_mean",
86 |                                            "rank_min", "rank_max"])
87 |         except ValueError:
88 |             continue
89 | 
90 |         results[key] = df
91 | 
92 |     if len(results) == 0:
93 |         raise ValueError("No valid csv outputs found.")
94 | 
95 |     ret = pd.concat(results, names=result_keys)
96 |     # drop irrelevant CSV row ID level
97 |     ret.index = ret.index.droplevel(-1)
98 |     return ret
99 | 
100 | 
101 | def load_decoding_preds(results_dir, glob_prefix=None):
102 |     """
103 |     Load decoder predictions into a dictionary organized by decoder properties:
104 |     decoder target model, target model run, target model run training step,
105 |     and source subject image.
106 |     """
107 |     decoder_re = re.compile(r"\.(\w+)-run(\d+)-(\d+)-([\w\d]+)\.pred\.npy$")
108 | 
109 |     results = {}
110 |     for npy in tqdm(list(Path(results_dir).glob("%s*.pred.npy" % (glob_prefix or ""))),
111 |                     desc="Loading prediction files"):
112 |         model, run, step, subject = decoder_re.findall(npy.name)[0]
113 |         results[model, int(run), int(step), subject] = np.load(npy)
114 | 
115 |     if len(results) == 0:
116 |         raise ValueError("No valid npy pred files found.")
117 | 
118 |     return results
119 | 
120 | 
121 | def get_encoding_ckpt_id(encoding_dir):
122 |     """
123 |     Get information about a model encoding from its output directory name.
124 |     """
125 |     encoding_dir = encoding_dir.name if isinstance(encoding_dir, Path) else encoding_dir
126 |     try:
127 |         model, run, step = re.findall(r"^([\w_]+)-(\d+)-(\d+)$", encoding_dir)[0]
128 |     except IndexError:
129 |         raise ValueError("Failed to extract checkpoint information from encoding directory %s" % encoding_dir)
130 | 
131 |     return model, int(run), int(step)
132 | 
133 | 
134 | def get_decoder_id(decoder_dir):
135 |     """
136 |     Get information about a learned decoder from its output directory name.
137 |     """
138 |     decoder_dir = decoder_dir.name if isinstance(decoder_dir, Path) else decoder_dir
139 |     model, run, step, subject = re.findall(r"^([\w_]+)-(\d+)-(\d+)-([\w\d]+)$", decoder_dir)[0]
140 |     return model, int(run), int(step), subject
141 |
142 |
143 | def eval_ranks(Y_pred, idxs, encodings, encodings_normed=True):
144 |     """
145 |     Run a rank evaluation on predicted encodings `Y_pred` with dataset indices
146 |     `idxs`.
147 | 
148 |     Args:
149 |         Y_pred: `N_test * n_dim`-matrix of predicted encodings for some
150 |             `N_test`-subset of sentences
151 |         idxs: `N_test`-length array of dataset indices generating each of `Y_pred`
152 |         encodings: `M * n_dim`-matrix of dataset encodings. The perfect decoder
153 |             would predict `Y_pred == encodings[idxs]`.
154 | 
155 |     Returns:
156 |         ranks: `N_test * M` integer matrix. Each row specifies a
157 |             ranking over sentences computed using the decoding model, given the
158 |             brain image corresponding to each element of `idxs`.
159 |         rank_of_correct: `N_test` array indicating the rank of the target
160 |             concept for each test input.
161 |     """
162 |     N_test = len(Y_pred)
163 |     assert N_test == len(idxs)
164 | 
165 |     # TODO implicitly coupled to decoder normalization -- best to factor this out!
166 |     # NB: this centers and normalizes `Y_pred` in place.
167 |     if encodings_normed:
168 |         Y_pred -= Y_pred.mean(axis=0)
169 |         Y_pred /= np.linalg.norm(Y_pred, axis=1, keepdims=True)
170 | 
171 |     # For each Y_pred, evaluate rank of corresponding Y_test example among the
172 |     # entire collection of Ys (not just Y_test), where rank is established by
173 |     # cosine distance.
174 |     # n_Y_test * n_sentences
175 |     similarities = np.dot(Y_pred, encodings.T)
176 | 
177 |     # Calculate distance ranks across rows.
178 |     orders = (-similarities).argsort(axis=1)
179 |     ranks = orders.argsort(axis=1)
180 |     # Find the rank of the desired vectors.
181 |     ranks_test = ranks[np.arange(len(idxs)), idxs]
182 | 
183 |     return ranks, ranks_test
184 |
185 |
186 | def wilcoxon_rank_preds(models, correct_bonferroni=True, pairs=None):
187 |     """
188 |     Run Wilcoxon rank tests comparing the ranks of correct sentence representations in predictions
189 |     from two or more models.
190 |     """
191 |     if pairs is None:
192 |         pairs = list(itertools.combinations(models.keys(), 2))
193 | 
194 |     model_preds = {model: pd.read_csv("perf.384sentences.%s.pred.csv" % path).sort_index()
195 |                    for model, path in models.items()}
196 | 
197 |     subjects = next(iter(model_preds.values())).subject.unique()
198 | 
199 |     results = {}
200 |     for model1, model2 in pairs:
201 |         m1_preds, m2_preds = model_preds[model1], model_preds[model2]
202 |         m_preds = m1_preds.join(m2_preds["rank"], rsuffix="_m2")
203 |         pair_results = m_preds.groupby("subject").apply(lambda xs: st.wilcoxon(xs["rank"], xs["rank_m2"])) \
204 |             .apply(lambda ys: pd.Series(ys, index=("w_stat", "p_val")))
205 | 
206 |         results[model1, model2] = pair_results
207 | 
208 |     results = pd.concat(results, names=["model1", "model2"]).sort_index()
209 | 
210 |     if correct_bonferroni:
211 |         correction = len(results)
212 |         print(0.01 / correction, len(results))
213 |         results["p_val_corrected"] = results.p_val * correction
214 | 
215 |     return results
216 |
217 |
218 | def load_bert_finetune_metadata(savedir, checkpoint_step=None):
219 |     """
220 |     Load metadata for an instance of a finetuned BERT model.
221 |     """
222 |     savedir = Path(savedir)
223 | 
224 |     import tensorflow as tf
225 |     from tensorflow.python.pywrap_tensorflow import NewCheckpointReader
226 |     try:
227 |         ckpt = NewCheckpointReader(str(savedir / "model.ckpt"))
228 |     except tf.errors.NotFoundError:
229 |         if checkpoint_step is None:
230 |             raise
231 |         ckpt = NewCheckpointReader(str(savedir / ("model.ckpt-step%i" % checkpoint_step)))
232 | 
233 |     ret = {}
234 |     try:
235 |         ret["global_steps"] = ckpt.get_tensor("global_step")
236 |         ret["output_dims"] = ckpt.get_tensor("output_bias").shape[0]
237 |     except tf.errors.NotFoundError:
238 |         ret.setdefault("global_steps", np.nan)
239 |         ret.setdefault("output_dims", np.nan)
240 | 
241 |     ret["steps"] = defaultdict(dict)
242 | 
243 |     # Load training events data.
244 |     try:
245 |         events_file = next(savedir.glob("events.*"))
246 |     except StopIteration:
247 |         # no events data -- skip
248 |         print("Missing training events file in savedir:", savedir)
249 |         pass
250 |     else:
251 |         total_global_norm = 0.
252 |         first_loss, cur_loss = None, None
253 |         tags = set()
254 |         for e in tf.train.summary_iterator(str(events_file)):
255 |             for v in e.summary.value:
256 |                 tags.add(v.tag)
257 |                 if v.tag == "grads/global_norm":
258 |                     total_global_norm += v.simple_value
259 |                 elif v.tag in ["loss_1", "loss"]:
260 |                     # SQuAD output stores loss in `loss` key;
261 |                     # classifier stores in `loss_1` key.
262 | 
263 |                     if e.step == 1:
264 |                         first_loss = v.simple_value
265 |                     cur_loss = v.simple_value
266 | 
267 |             if checkpoint_step is None or e.step == checkpoint_step:
268 |                 ret["steps"][e.step].update({
269 |                     "total_global_norms": total_global_norm,
270 |                     "train_loss": cur_loss,
271 |                     "train_loss_norm": cur_loss / ret["output_dims"]
272 |                 })
273 | 
274 |         ret["first_train_loss"] = first_loss
275 |         ret["first_train_loss_norm"] = first_loss / ret["output_dims"]
276 | 
277 |     # Load eval events data.
278 |     try:
279 |         eval_events_file = next(savedir.glob("eval/events.*"))
280 |     except StopIteration:
281 |         # no eval events data -- skip
282 |         print("Missing eval events data in savedir:", savedir)
283 |         pass
284 |     else:
285 |         tags = set()
286 |         eval_loss, eval_accuracy = None, None
287 |         for e in tf.train.summary_iterator(str(eval_events_file)):
288 |             for v in e.summary.value:
289 |                 tags.add(v.tag)
290 |                 if v.tag == "eval_loss":
291 |                     eval_loss = v.simple_value
292 |                 elif v.tag == "eval_accuracy":
293 |                     eval_accuracy = v.simple_value
294 |                 elif v.tag == "masked_lm_accuracy":
295 |                     eval_accuracy = v.simple_value
296 | 
297 |             if checkpoint_step is None or e.step == checkpoint_step:
298 |                 ret["steps"][e.step].update({
299 |                     "eval_accuracy": eval_accuracy,
300 |                     "eval_loss": eval_loss,
301 |                 })
302 | 
303 |     return ret
304 |
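The rank evaluation described in the `eval_ranks` docstring above can be illustrated on a toy example. This is a minimal pure-Python sketch under simplifying assumptions: the helper name `rank_of_correct` and the vectors are hypothetical, and unlike `eval_ranks` it neither re-centers nor normalizes the predictions before scoring them by dot product.

```python
def rank_of_correct(predictions, targets, candidates):
    """
    For each predicted vector, rank all candidate vectors by dot-product
    similarity (rank 0 = most similar) and return the rank of its target.
    """
    ranks = []
    for pred, target_idx in zip(predictions, targets):
        sims = [sum(p * c for p, c in zip(pred, cand)) for cand in candidates]
        # Candidate indices sorted from most to least similar.
        order = sorted(range(len(candidates)), key=lambda i: -sims[i])
        ranks.append(order.index(target_idx))
    return ranks


# Three unit-length candidate encodings in 2D.
candidates = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
# Predictions meant for candidates 0 and 1; the second one is poor.
predictions = [[0.9, 0.1], [0.8, 0.2]]
result = rank_of_correct(predictions, [0, 1], candidates)
print(result)  # [0, 1]
```

A rank of 0 means the decoder's prediction is closer to its target than to any other candidate; the second prediction lands nearer candidate 0 than its intended candidate 1, so its target is ranked 1.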
--------------------------------------------------------------------------------
/structural-probes/en_ewt-ud/en_ewt-ud-test.txt:
--------------------------------------------------------------------------------
1 | AMAZING
2 | By the way we now have a " forum " in the post link
3 | Also , I have an extra ticket for the Comets game on Sat. you said you wanted to go ?
4 | Obviously , he should have been arrested and jailed - imagine making a statue or painting on the same subject as another artist - clearly insulting and disrespecting - wasted time that could have been used making a statue with clothes on .
5 | Possibly a freshwater tank with a ton of different species in there .
6 | Any particular shop that you know of AND their number .
7 | KK
8 | image_jpg_part
9 | " NASA then plans to develop a new 100 - metric - ton - class launch vehicle derived from existing capabilities with the space shuttle external tanks and solid rocket boosters for future missions to the moon , " the letter said .
10 | Thank you though .
11 | There is no lower rating for Noonan 's Liquor , owners and employees .
12 | I shoot a t1i and have n't had such an issue .
13 | What should I do ?
14 | This place is great !
15 | Sanctuary is amazing !
16 | Please note that neither the e-mail address nor name of the sender have been verified .
17 | email them at address below
18 | They also had a special connection to some extremists in Jordan and Germany .
19 | It is a place in Argentina lol
20 | good outside , bad inside
21 | Ideally , we would like a fast turnaround .
22 | Worst experience ever like a sardine can and the bartender downstairs is the rudest person I have ever met .
23 | " ... to gaze at Wei 's art is like entering a floating world of dreams . "
24 | Al Arfsten 713 965 2158
25 | - Ram Tackett ( E-mail ) .vcf 4222
26 | So does anyone know what the difference is ?
27 | We 're at war with Islamic fascists .
28 | Umar Islam , 28 , ( born Brian Young ) High Wycombe
29 | yeah
30 | Rich was here before the scheduled time .
31 | I have no complaints about the service I received .
32 | Vince :
33 | ( Space.com )
34 | Hi David :
35 | WHAT A GREAT DEAL THANK YOU
36 | i am sure i could have persuaded you to give me some action .
37 | You company and services will be recommended by us to everyone . "
38 | Airfare alone will be incredibly expensive so make sure you have the money and of course free time to take your time and have a great time .
39 | =)
40 | Looking for something on the casual side and we want it to be fun .
41 | I think this location is no longer in business .
42 | The sushi is great , and they have a great selection .
43 | The correction to the working gas includes TWO corrections .
44 | i know you remember the bet .
45 | Has that gone anywhere or are the other possibilities you had better ?
46 | By using the word " Islamic " as an adjective Bush was purposely not associating Muslims with fascism , hence the qualifier .
47 | Email : n3td3v < xploita...@gmail.com >
48 | - Joe Namath
49 | they are great dogs .
50 | They were playing Captain Ahab to Saddam 's great white whale .
51 | We have this report ?
52 | I 'm not driving tonite , but I bet that we could hitch a ride back with Anil .
53 | exelent Job
54 | After this weekend , we will no longer have access to the estate files , these people will be able to help you with any of your questions .
55 | Susan :
56 | Deep tissue massage helps with pain in neck and shoulders
57 | Fast and friendly service , they know my order when I walk in the door !
58 | You will always find fascinating links at : Extreme Web Surfs http://extremewebsurfs.blogspot.com/ ( nice urban wildlife post today ) & Me and the Web http://maartenvt.blogspot.com/ Arts , History , Animals , Music , Games , Politics , Technology , Fun and more !
59 | A big rally then took the Dow ( unconfirmed ) to a record high of 1051 in January of '73 , turning everyone bullish .
60 | weather in december in Tremblant ?
61 | AEP $ 19,250,000 $ 38,750,000
62 | Well , he launched today .
63 | that deal is making like it wants to close and the traders scheduled a 430 call to wrap it up
64 | Thank you (:
65 | Food is often expired so check the dates every time !
66 | Sent by : Janette Elbertson
67 | i had a blast that night .
68 | Thanks ,
69 | The employees do n't really seem to enjoy what they are doing and it shows .
70 | - Winston Churchill
71 | ( Most Salafis are not militant or violent , though they tend to be rather narrow - minded in my experience , on the order of Protestant Pietists ) .
72 | Mine does .
73 | This place had the worst tasting pizza I have ever tasted it was possible the worst food I 've ever eaten .
74 | a staple !
75 | Yes .
76 | I had a rose named after me and I was very flattered .
77 | - Socrates
78 | Got to love this place .
79 | I have never hated a man enough to give his diamonds back .
80 | you guys have any job opening for ex natural gas traders that made their now almost defunctc ompany over 40 million in the last two years ?
81 | The owner was very friendly and helpful .
82 | Geoff ,
83 | ( ZD Net )
84 | Not enough seating .
85 | It 's impossible to understand how this place has survived .
86 | They specialize in financial institutions , medical , and retail projects .
87 | Do You Yahoo! ?
88 | Thank you .
89 | Expect either undercooked or mushy food and lackluster service .
90 | Michael helped shoot the majority of my firm 's website and we could not have been happier .
91 | Of THESE three , it 's a toss - up between Royal and Carnival .
92 | We got upgraded to a corner suite !
93 | I will not be there at 7:30 , but will see you arond 9:30 on Tuesday .
94 | My wife and I would love for you to come and visit our page
95 | DO Nt Go here
96 | Job Title : Attorney
97 | Poor Taste
98 | Martin ,
99 | Any information would help .
100 | Did you have a chance to take a look at the resume I sent you ?
101 | Money ca n't buy you happiness ... but it does bring you a more pleasant form of misery .
102 | " Inhibitory systems are essential for controlling the pattern of activity in the cortex , which has important implications for the mechanisms of cortical operation , according to a Yale School of Medicine study in Neuron ....
103 | extend , and if you want to end it , you just say bye ,
104 | Twinkle Twinkle lazy star Kitna soyega uthja yaar , up above the world so high , sun has risen in the sky , uthke jaldi pee le chai , then call me up and say " HI "
105 | SCALIA filed a dissenting opinion , in which THOMAS and ALITO joined .
106 | An Hour Of Prego Bliss !
107 | In any event , my presentation should give you a starting point .
108 | One can suspect the Iranian Government .
109 | Her flexibility and accessibility made for an easy closing .
110 | He is now lecturing in USA .
111 | Totals $ 22,750,000 $ 40,000,000
112 | Good Morning * *
113 | Our client is a small law firm that is looking for an individual to join their team handling toxic tort with some minor PI defense .
114 | Very professional and great results .
115 | I wish I had the capital to open my own shop ?
116 | Zakaria Amara , 20 , Mississauga , Ont. ;
117 | Sent by : Janette Elbertson
118 | If you enjoy amazing things , you must go to World 's Finest Donair .
119 | it s cheap and it s good !
120 | you must be thinking of someone else .
121 | At that time , the gurus and geniuses of Wall Street were predicting a 250 Dow and many were talking openly about the end of capitalism .
122 | If you do not wish to receive such e-mails in the future or want to know more about the BBC 's Email a Friend service , please read our frequently asked questions .
123 | By the time a man is wise enough to watch his step , he 's too old to go anywhere .
124 | Give me a few days , and I 'll be in touch .
125 | Thank you for your help in tracking these invoices .
126 | Stephanie
127 | Look on the debenhams website
128 | In the world of " Wei 's magic cubes , " all seem to be ingeniously planned and tricky .
129 | Linda
130 | I 'll search him out before class or after that break and see if I can set it up .
131 | I had no problem with my delivery .
132 | And a portion of each package or memorial purchased goes to a charity on their database .
133 | I am not sure how reliable that is , though .
134 | A real pleasure training with Natasha .
135 | prime ribs have very good food but it s super expensive
136 | I heard that more may be going up for sale in the next month or do .
137 | Great job on my roof and the pricing was fair .
138 | in n out of the chicago area ?
139 | I own a property management firm and need a contractor with the credentials that Farrell Electric has .
140 | Friendly service .
141 | EY4096.1 PERFORMANCE 01-Feb-02 P 6,363,217 - $ 250,393
142 | Thanks .
143 | < http://www.bbc.co.uk/dailyemail/ >
144 | Did a great job of removing my tree in Conyers .
145 | why are there two statues of David ?
146 | Horrible !
147 | It was all sorted with no hassle at all and I 'm really grateful - they were fab .
148 | Thanks
149 | Women 's rain coat ... where can I find one ?
150 | I want to go to the cafeteria for vegetables .
151 | The President has also said he would like to see Israel wiped off the map which he could n't even begin to try without nuclear weapons .
152 | 732-657-3416
153 | Why not put together a bottle of champagne , a picnic and have a date on Treasure Island .
154 | Thanks !
155 | Vince :
156 | There are deals in the Aruba book so I 'm not sure why you are n't picking those up .
157 | yep
158 | So delightful .
159 | Love Hop City
160 | Vince
161 | Not me sorry .
162 | The answer is , " Yes ! "
163 | These have been sold .
164 | I ca nt find any information about it
165 | But will diplomacy work ?
166 | really , i have no idea what you 're talking about .
167 | Google is probably making this move to counter Microsoft Search using Encarta ( it's online dictionary ) .
168 | I am bringing two of my girlfriends from LJ .
169 | Cast .
170 | It sounds like a firmware issue and the camera requires a re-boot just like what happens in a computer - needs a re-start from time to time but it should n't be happening in a camera .
171 | the bast cab in minneapolis
172 | Thanks
173 | Overpriced and the doctor acted arrogant and rushed at a time when there was very few clients in the facility .
174 | We would like to thank our emergency plumbers who visted our shop in Morningside Road today .
175 | hahaha
176 | And international donors have given only half of the relief aid that Darfur needs , according to the local UN officials .
177 | Nacho Libre is suppose to be inspired in Mexicans , not in Argentineans .
178 | And there 's nothing distinctly Irish about them .
179 | why is enron blowing up ?
180 | do n't forget to use a calcium supplement twice a week ; captive reptiles are prone to calcium deficiency .
181 | Highly recommended landscaper !!!
182 | These guys really know their stuff .. they have almost anything you could want in terms of spy and surviellance equipment .
183 | CLH
184 | It taste better than In and Out ....
185 | Name of specific Hibachi restaurant in Chicago ?
186 | Syria has agreed to withdraw under the conditions set forth in UNSC Resolution 1559 , which has already begun .
187 | Good selection .
188 | Or how about visiting the Chicago Botanical Gardens and see the change of colors and enjoy the air , They also have many inside exhibits you might enjoy , food is pretty good to .
189 | I know saltwater is a possibility , can you give me a possible stocking option for that too ?
190 | Best Limo Limousine service in all of Dallas
191 | They were from all accounts marginalized and not listened to .
192 | I highly recommend Garage Pros to my friends .
193 | ------
194 | M
195 | the attitude of some staff is terrible , did not solve anything only say i can do nothing .
196 | Or more if you have drinks .
197 | Rodgers
198 | never response the phone call
199 | Sara ,
200 | I would n't want any other company in my time of need .
201 | that is how i want you to refer to me as " the king "
202 | Shamin Mohammed Uddin , 36 , Stoke Newington
203 | Compact 's Corona dryers remove at least twice as much water as the previous dryers , allowing a production increase of over 10 % and a significant energy saving .
204 | " Thank you so much for the superior job well done .
205 | spanish
206 | Fast Service Called them one hour ago and they just left my house five minutes ago .
207 | hopefully she does n't hose you .
208 | ------------------------------
209 | I have a friend out in Chicago this week , and I am trying to remember the name of an awesome hibachi style restaurant i visited while out there a couple years ago .
210 | Please let us know if you need any additional information .
211 | The waiting staff is really friendly , it s like every one knows each other , the manager is really sweet and the food .. well no complaints from me .
212 | I met you at the Risk conference last week in Houston .
213 | did anyone have this issue ?
214 | I 'm not sure how the market will react .
215 | That 's because of the buffer that holds the data until it 's ready to be recorded to the memory card .
216 | someplace that is like $ 30 an entree .
217 | He is at his best when he is doing his Nerd impression ...
218 | Absoul is the greatest donair man on the planet .
219 | how fare of kolkatta ?
220 | I have a Kodak Camera ( 10.2 Megapixels ) ... Kodak AF 5x OPTICAL LENS ... how do I pause it while recording ?
221 | Let aggressive ( American ) leaders and soldiers know that we are capable of protecting the city 's security and safety , and ask them to lift their hands from the city . "
222 | EY4108.I PERFORMANCE 01-Feb-02 P - 10,274,494 - $ 166,960
223 | sorry again
224 | No problem .
225 | I had to go to the BBC for this report .
226 | My canon t2i stops working at times as in the power bottom is switched to " on " but the camera does not respond to any function .
227 | I enjoyed working with you and wish you the best of everything .
228 | I hope that this would mean that you would remain involved at some level .
229 | Cafeteria is fine .
230 | Friendly service .
231 | it is a cute little nice and quiet library
232 | Fresh and unic !
233 | Chris Abel Manager , Risk Controls Global Risk Operations chris.abel@enron.com < mailto:chris.abel@enron.com > 713.853.3102
234 | Bike ride in the park , followed by coffee .
235 | We still have the traders and books that you provided last week , but need to know if there are any changes to this .
236 | ------
237 | UN Secretary - General Kofi Annan has indicated it is time to " recognize Hezbollah " after easily being duped by " the message on the placards they are using " .
238 | - Bob Hope
239 | Michael Olsen@ENRON
240 | Greg Couch will be taking over the responsibility for the estate risk group and will be able to assist you with your requests going forward .
241 | Tracy , Do we have concerns here .
242 | Can you use the ' find my phone ' feature to track someone else 's phone ?
243 | What do you think of Air France ?
244 | ALL OF THE TEACHERS THERE ARE SO MEAN THEY GET MAD AT YOU FOR NOTHING !!!!!!!!!!!!!!!!!!!
245 | When the French returned to Indochina at the end of WW II the Viet Minh were in control of the Red River Delta .
246 | They sell these kits in most hobby and craft stores .
247 | - Jimmy Durante
248 | The gods were n't with us on that one .
249 | I 'm more than happy to help people with the site or answer any questions about Action Network - just drop me a message .
250 | While Tanya is reviewing credit , can you please send a " blank form Paragraph 13 " for this master .
251 | Will use again in the future .
252 | Where can I go on a first date ( adults ) ?
253 | While our established schedules of Tuesday and Friday DPR's would have us reporting tomorrow 's activity on Monday , we will change that for month end .
254 | ???
255 | We honestly can not think of even 1 thing we did n't like !
256 | Say after I finished those 2 years and I found a job .
257 | http://www.google.com/search?hl=en&rlz=1G1GGLQ_ENUS359&q=gunther+uecker+biography&gs_sm=c&gs_upl=484l484l0l5093l1l1l0l0l0l0l328l328l3-1l1l0&um=1&ie=UTF-8&tbm=isch&source=og&sa=N&tab=wi&biw=1221&bih=756&sei=XVG5TqrrGoXK2AXG0ry-Bw#um=1&hl=en&rlz=1G1GGLQ_ENUS359&tbm=isch&sa=1&q=gunther+uecker+artist&pbx=1&oq=gunther+uecker+&aq=1S&aqi=g1g-S3&aql=&gs_sm=c&gs_upl=10219l10219l0l13797l1l1l0l0l0l0l125l125l0.1l1l0&bav=on.2,or.r_gc.r_pw.,cf.osb&fp=13ddcdc64cbf5fd&biw=1221&bih=756
258 | I just wanted to check with you regarding the consulting arrangement we discussed a couple of weeks ago .
259 | Should I be concerned ?
260 | ( Space.com )
261 | Onion Rings are great and the fries are endless .
262 | I think it was in the Lincoln Square area but do n't quote me on that .
263 | I just wanted to follow up on whether you will have a chance to send a draft Credit Support Annex ( similar in form to the one previously executed with ENA ) .
264 | like what ?
265 | I need something reliable and good looking .
266 | Do you prefer ham , bacon or sausages with your breakfast ?
267 | EY4106.6 PERFORMANCE 01-Feb-02 P - 1,993,045 - $ 32,387
268 | Pam the Pom
269 | The credit guys are currently assuming that there is no correlation and may consequently be double dipping the credit reserve on this basis too .
270 | I 'm considering taking a job with Steiner and noticed I have to pay for all my travel .
271 | Elizabeth 36349
272 | But there is no proof .
273 | Amazing customer service
274 | This is the very best in the Gables .
275 | Bay of Plenty - Are you even old enough to vote ?
276 | Michael L. Beatty & Associates , PC ; # 10461 , # 10469 & # 10468 dated 5/28/00 .
277 | Martin
278 | - Groucho Marx
279 | - Herbert Henry Asquith
280 | Yasim Abdi Mohamed , 24 , Kingston ;
281 | No minimum order amount .
282 | U.S. officials have said the plot , thwarted by Britain , to blow up several aircraft over the Atlantic bore many of the hallmarks of al Qaeda .
283 | Which is why he did n't say we 're at war with Islamic people .
284 | Hi Kevin ,
285 | Many thanks from myself and all of our wedding guests !
286 | - Phyllis Diller
287 | We just did a deal for the rest of the month for 10,000 / d at meter # 1552 QE - 1 @ $ 4.355 .... can you let me and Robert Lloyd know what the sitara # is ?
288 | Clewlow / Strickland book is out .
289 | She makes every item fit you perfectly .
290 | Vince
291 | Has another store in the st. charles mall .
292 | As we discussed , here is a copy of the draft memo .
293 | Thanks a lot .
294 | Vincent ,
295 | Further to your call attached is a presentation I gave at the Canadian Risk Managers Conference in Edmonton in the fall of 2000 .
296 | houston wo n't be too affected b/c most of the layoffs affect satelite offices .
297 | The people at Gulf Coast Siding were very easy and clear to work with .
298 | ** and i can upload my pictures and videos on the computer ( facebook )
299 | No , but I do believe some Koreans reside in the country of HA - ha !
300 | i always thought there s no custom charges for gifts .
301 | Swetha
302 | A Look at Slogans - http://www.small-business-software.net/look-at-slogans.htm
303 | Excellent energy efficiency
304 | wow wow wow .
305 | Pretty spendy for really not great quality
306 | The Lunar Transporatation Systems ( LTS ) is actually being funded by two space businessmen , Walter Kistler and Bob Citron .
307 | Nearby what ?
308 | i am in portland .
309 | Click here To view it .
310 | I 'll ask around ?
311 | Jeff
312 | if i preorder a game at gamestop can someone else pick it up for me ?
313 | mazzoni 's deli best italian food in phila pa ?
314 | Regards
315 | regards ,
316 | Google is a nice search engine .
317 | I 've been a regular customer at this store since it opened , and love the fact that all of the employees are friendly locals .
318 | Frank
319 | Bush is in Santiago for the annual Asia - Pacific Economic Cooperation ( APEC ) leaders meeting .
320 | The IIP had also been the main force urging Sunni Arabs to participate in the elections scheduled for January , and had been opposed in this stance by the Association of Muslim Scholars .
321 | Let me join the chorus of annoyance over Google 's new toolbar , which , as noted in the linked article , commits just about every sin an online marketer could commit , and makes up a few new ones besides .
322 | Here is a revised draft of the CDWR risk memo .
323 | He listens and is excellent in diagnosing , addressing and explaining the specific issues and suggesting exercises to use .
324 | Compensation : $ 60000 - 70000
325 | Does 5 make a chain ?
326 | Expensive for the level of food and the quality of service .
327 | Farrell Electric is a very good electrical contractor .
328 | What 's going on with the UBS weather position ?
329 | You also need Pakistani air space .
330 | Not so good
331 | The clerics demanded talks with local US commanders .
332 | Well there s Mc. Donald s , Taco Bell , Burger King .....
333 | i.e .
334 | very reasonable prices .
335 | That is Flat Top Grill
336 | they recovered the pics geeksquad deleted .
337 | Listened to my problem and took care of it .
338 | Has anyone ever worked for steiner leisure cruises ?
339 | It could notionally be expanded to encompass the 5,000 - strong " 55th Brigade " of the Taliban regime , though this is not the technical definition .
340 | 01/24/2001 03:51 PM
341 | Thank you
342 | I 'll be back on Monday .
343 | ------
344 | FYI ,
345 | The finest German bedding and linens store .
346 | Please start using the ENA DPR 0102 file rather than the EWS DPR 2002 file to send to Chris .
347 | Best fried shrimp in the state !
348 | When I tried to return it they refused , so I had to leave without a refund and still hungry .
349 | 732-657-3416
350 | It looks like The Lunar Transportation Systems , Inc. is visualizing a " space highway " going from the moon to Earth ( and back again ) .
351 | They basically buy daily deals from Groupon , Living Social , and all sorts of other places .
352 | There must be a better mexican place in Rockland .
353 | Susan
354 | Average food and deathly slow service
355 | Hope you will be sorted .
356 | **
357 | sounds exciting .
358 | Al ,
359 | Ifunny.com
360 | I was wondering if you could give me some references regarding the calculation of correlation coefficients from a GARCH model .
361 | Get great service , fantastic menu , and relax .
362 | My nails looked great for the better part of 2 weeks !
363 | I 've looked and looked , but can not find one anywhere !
364 | As in the old days , varnish is often used as a protective film against years of dirt , grease , smoke , etc .
365 | Delhi police chief K K Paul named the man as Tariq Ahmed Dar , and said police were hunting for four accomplices .
366 | Thanks -
367 | Walgreens on University
368 | looking for a surprise spot to take my bf .
369 | * ... *
370 | However , the request below is to " replenish " the CASH that was drawn down ... please advise .
371 | Prosperity POS makes the best pos systems .
372 | have a look at sony wx10
373 | Microsoft is 4 - 0 ( they took down Netscape , Suns Systems , MAC and IBM ) and Google may be their next target .
374 | Magali Van Belle Consultant PHB Hagler Bailly MANAGEMENT AND ECONOMIC CONSULTANTS PHB Hagler Bailly , Inc. ( 202 ) 828-3933 direct dial 1776 Eye Street , N.W. ( 202 ) 296-3858 facsimile Washington , D.C. 20006-3700 mvanbell@haglerbailly.com e-mail
375 | The deals are listed below .
376 | http://reflectioncafe.blogspot.com/2005/09/unnatural-disasterthe-less...
377 | Kristen ,
378 | Was wondering if anyone knew a rough estimate of how much it costs with travel and training
379 | I 'll post highlights from the opinion and dissents when I 'm finished .
380 | Tanvir Hussain , 24 , London E10
381 | i got her number though .
382 | - REDLINE GPSA Guaranty.doc
383 | PS - we also have more cats coming in for re-homing see our ' Homes Wanted ' page
384 | You have to see these slides .... they are amazing .
385 | Very professional , talented , unic and fresh work .
386 | hard to forgive such an awful margarita and steep prices but the food can be good
387 | Richard Harper and Mary Nell Browning may have some ideas here but I am sure you 've already gone through it with them .
388 | I 've tried bland white rice but he wo nt eat anything .
389 | Kam
390 | Your suggestion to introduce the concept discussed with one of the Lays is welcomed .
391 | The decision to sidestep the obvious to satisfy the need to avoid confrontation does not bring peace , but only delays the eventual conflict as the predators of Hamas and Hezbollah exploit the inherent weakness of the internationals and the media .
392 | The best pilates on the Gold Coast !
393 | I 've never kept cichlids though .
394 | I do see myself as a conservative
395 | I 'm working hard for you !
396 | THE TEACHING THERE SUCKS !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
397 | But my resturant is way better than all of them ... and it 's quite close .
398 | Great job !
399 | I need creative art ideas ?
400 | Wei 's Magic Cubes
401 | I have used Bright Futures for the last 7 years .
402 | I do n't know how much it will help however .
403 | Yes , you must pay customs duties .
404 | Zarqawi is a Jordanian , and his Monotheism and Holy War group in Afghanistan probably had a distinctive coloration as mainly Jordanian , Palestinian and Syrian .
405 | I love Air France !
406 | Did you get in as much trouble as I did this weekend - Lori seems to think I need to get help - I told her it 's normal to drink all day at a bar , then go to dinner where you do n't know half the people there and proceed to get extremely fucked up
407 | I love them !!!
408 | Great work !
409 | Unfortunately , I will be in Plano that weekend .
410 | All - you can - eat style deal .
411 | Any of the tip - top places have great ice - cream , get them to mix it up .
412 | I felt as if I was in an over priced Olive Garden .
413 | The company gets busy but you never have to wait long because they ARE orginizied , so you are in , out , and paid well for your scrap
414 | Besides eating good foods , what else do people do in Miramar ?
415 | Staten Island Computers
416 | Musharraf told Clinton he could n't use Pakistani soil or air space to send the team in against Bin Laden .
417 | I wo n't return .
418 | Daren
419 | Well , they have a variety of sports that they play like basketball , soccer , etc .
420 | I should have asked for a jury .
421 | " We believe this is an ill - advised term and we believe that it is counterproductive to associate Islam or Muslims with fascism , " said Nihad Awad , executive director of the Council on American - Islamic Relations advocacy group .
422 | I have been using Steele Electric for years .
423 | Melissa ,
424 | Along with an area on the page for those to join the mailing list .
425 | -----== Over 100,000 Newsgroups - 19 Different Servers ! =-----
426 | Thanks
427 | Bush came out today and said that if he had known what was coming , he would have expended every effort to stop it , and that so would have Clinton .
428 | Highly recomended .
429 | I doubt you will get a sensible answer in the " TRAVEL " section .
430 | thumbs down ???
431 | Also very friendly and the stylists are not in the " been there / done that " mood !
432 | Call me if you have time .
433 | I prefer Royal Caribbean out of all these .
434 | However , I did not find her very helpful and her receptionist was rude .
435 | yeah , i was thinking somewhere like mcdonald 's .
436 | 01/24/2001 11:21 AM
437 | It was pretty epic as I remember and would love to send my friend there .
438 | 42299
439 | Wolfowitz contradicted counter-terrorism czar Richard Clarke when the latter spoke of the al - Qaeda threat , insisting that the preeminent threat of terrorism against the US came from Iraq , and indicating he accepted Laurie Mylroie 's crackpot conspiracy theory that Saddam was behind the 1993 World Trade Towers bombing .
440 | I would highly recommend Landscape by Hiro .
441 | WASHINGTON ( Reuters ) -
442 | I 'm free any day but Tuesday .
443 | Green
444 | Please return an executed copy of confirm to me .
445 | Hooray for Craggy .
446 | Hospitality .!
447 | Green Tea .
448 | This group does sound pretty interesting though .
449 | kolkatta is an Indian State .
450 | many PCs have sleep & charge now , that allow the PC to go into sleep mode , and still allow the USB ports to charge things like phones .
451 | Will definitely go back when I need medical care .
452 | I have use them four times for fixing items from pushing out a dent in a bumper to fixing the fender on my beloved Miata .
453 | We 've got a Steven , the one word that did n't crash my spell - check , despite it being followed by a Vikash Chand Abdul Shakur .
454 | I have a Nacho Libre question .?
455 | Natasha
456 | crab
457 | Everyone is relaxed and having fun !!!
458 | i was very pleased with the service .
459 | yeah i got yelled at also .
460 | house of pies here i come .
461 | Installed Biometrics and Got Excellent Service .
462 | The guaranty for the PPA is blacklined against the Enron EPC Contract guaranty in favor of the banks which was granted in connection with our Cabazon Wind Power Project ( the project most recently financed by EWC ) and the GPSA guaranty is blacklined against the PPA guaranty .
463 | Great Service , Thanks Don .
464 | Email : " Ian " < ian.gilb...@btinternet.com >
465 | It 's well cool . :)
466 | This is a beautiful site and a wonderful idea .
467 | Does anyone know any good restaurants in cordoba ?
468 | What is this Miramar ?
469 | We 've got a page dedicated to issues around hospitals ( http://www.bbc.co.uk/dna/actionnetwork/C55153 ) and a group called NHS SOS have already put up a campaign about ward closures in Cumbria .
470 | This consolidation is obviously a result of Bush 's aggressive invasion of Iraq and of the botching of the aftermath .
471 | * Ireland
472 | I 've been fuming over this fact for a few weeks now , ever since some organizations and governments suggested we need to accept the fact that Hezbollah will get involved in running Lebanon .
473 | I like music very loud and with a lot of bass .
474 | ?
475 | well , I do n't ask questions here because I have no clue what " Iguazu " is ...
476 | I do n't feel anything until noon .
477 | Best count on $ 50 per person no matter what .
478 | it was a little to high dollar for me
479 | this is the worst Sam s club I 've ever been to
480 | My pharmacy order is always correct and promptly delivered but the pharmacy staff are always very short with me and do n't seem to like answering questions .
481 | Even the least discriminating diner would know not to eat at Sprecher 's .
482 | Dick could never thereafter get any real cooperation from the cabinet officers , who outranked him , and he could not convince them to go to battle stations in the summer of 2001 when George Tenet 's hair was " on fire " about the excited chatter the CIA was picking up from radical Islamist terrorists .
483 | I usually use ZebraKlub .
484 | ---- cgy
485 | i would like to have one of those super random surprisingly nice nights out ... suggestions ?
486 | Arriving in Auckland on a direct flight from Canada .?
487 | --
488 | well since i do nt know your budget , i recommend Hakka Restaurant for chinese food
489 | We discussed a few days ago a consulting arrangement with Prof. Sheridan Titman from UT .
490 | Vince
491 | Now at 83.5 .
492 | All you can do is take each section ( individual video ) and edit them together on software .
493 | Drove all the way over from the highway ... closed at 7 .
494 | Highly recommended
495 | All of those ;D
496 | If you get a good wife , you 'll become happy ; if you get a bad one , you 'll become a philosopher .
497 | Grateful for any help or suggestions you could provide .
498 | Because he liked making statues of David ! :D
499 | He mentions his wife 's death having an effect on him .
500 | obviously take her to the vet
501 | Recommend you call in for a look .
502 | and hopefully you do not know the same people because he tells others about you payment status .
503 | The African Union is clearly not up to the task of keeping the peace , pledging 300 troops to an area that will need 15,000 , according to analysts .
504 | ..
505 | What if Google Morphed Into GoogleOS ?
506 | it is to late for me to add changes .
507 | We do n't have to believe him .
508 | I do nt go there anymore
509 | These guys took Customer Service 101 from a Neanderthal .
510 | complete with original Magnavox tubes - all tubes have been tested they are all good - stereo amp
511 | kudos to Allentown Post Office staff
512 | And from a place that specializes in high quality meat , too .
513 | Okay , FIRST , you have posted a question about an American movie , set in Mexico , in the Dining Out in Argentina category .
514 | Very Mediocre donuts !
515 | David ,
516 | 02/28/2001 03:16 PM
517 | Enron could be an ideal environment from which to the concept enhancement through to commercialization could be successfully accomplished .
518 | Ruth Penycate www.bbc.co.uk/actionnetwork
519 | Animal News Center Webmaster
520 | What 's the difference between Indian and African ringnecks and alexandrine parrots ? ?
521 | Who does that ?!
522 | I think it will help me very much in my role .
523 | Please be brief !
524 | Key Delhi blast suspect arrested
525 | there might be bigger and more well known bagel places in the area but Family Bagels are nice people , small shop and incredibly friendly .
526 | Darrell Duffie mail GSB Stanford CA 94305-5015 USA phone 650 723 1976 fax 650 725 7979 email duffie@stanford.edu web http://www.stanford.edu/~duffie/
527 | Hi Sara ,
528 | Otherwise , hope to see you there !
529 | Thanks
530 | i did n't want you to go .
531 | If you believe crackpot theories instead of focusing on the reality -- that was an al - Qaeda operation mainly carried out by al - Gamaa al - Islamiyyah , an Egyptian terrorist component allied with Bin Laden -- then you will concentrate on the wrong threat .
532 | Thank you .
533 | Neither did Cheney , Rumsfeld , or Wolfowitz .
534 | not sure , but i assume that the bluegrass songbook is mine .
535 | Great place
536 | Best Electrician in Florence
537 | Sheridan
538 | 02/28/2001 04:41 PM
539 | Michael
540 | It looks like the war between Microsoft and Google is quickly brewing on the horizon .
541 | Abu Musab al - Zarqawi and his group are said to have been bitter rivals of al - Qaeda during the Afghan resistance days .
542 | The findings demonstrate the inhibitory network is central to controlling not only the amplitude , extent and duration of activation of recurrent excitatory cortical networks , but also the precise timing of action potentials , and , thus , network synchronization ... "
543 | I really want to go to andiamo s for my birthday and i was just wondering how much it would cost for the four of us to eat there
544 | This BuzzMachine post argues that Google 's rush toward ubiquity might backfire -- which we 've all heard before , but it 's particularly well - put in this post .
545 | Rumsfeld initially rejected an attack on al - Qaeda bases in Afghanistan , saying there were " no good targets " in Afghanistan .
546 | Extremely greasy .
547 | i love The Script and know the re from iraland .
548 | 03/26/2001 08:58 PM
549 | http://www.theadvocate.com/sports/story.asp?StoryID=16475
550 | Really great service and kind staff .
551 | Deb Price
552 | Chicago 's a big area .
553 | Amazing service !
554 | Thanks
555 | Are you free for lunch today .
556 | See what DD & D showed at the original place on Harms Rd in Glenview :
557 | I live in the neighborhood and this place is one of my favorites for a tasty , quick and inexpensive meal .
558 | No ... that 's all .
559 | Depends of what .
560 | best burger chain in the Chicago area ?
561 | Iran says it is creating nuclear energy without wanting nuclear weapons .
562 | I 'm looking for websites like flickr.com tumblr.com and autocorrects.com but ones that are n't very common and have a variety of funny pictures !
563 | Teco Tap 85.000 / HPL IFERC ; 20.000 / Enron
564 | He 's not giving 85 % away , he 's giving a number of shares each year that decrease in number at the rate of 5 % a year ( until gone ? ) .
565 | Becky A. Stephens Litigation Unit , Enron Corp. 713/853-5025 EB 4809
566 | image001.jpg
567 | Please verify receipt at your earliest convenience .
568 | could not find any info online
569 | will i have to pay customs in NZ .
570 | Although these new rockets are probably more expensive , they will be able to go at a much greater range than it's shuttle cousins , as they can not only break free from the atmosphere but reach the moon as well .
571 | The Sunni AMS told Iraqis , " You sinned when you participated with occupation forces in the assault on Najaf , and beware lest you repeat this same sin in Fallujah .
572 | Would love for you to join us .
573 | Best ceviche that I 'd had so far ! :)
574 | Jeff
575 | out of carnival , royal caribbean , and norweigan ( cruises ) which is the best and why ?
576 | 09/08/2000 09:36 AM
577 | Description :
578 | --
579 | Just wanted you to know that Eric came by as scheduled today and sprayed our house for scorpions .
580 | see you there on court 10
581 | thanks
582 | thick cut bacon or really good sausages
583 | They are like family .
584 | Tired of spam ?
585 | ALITO filed a dissenting opinion , in which SCALIA and THOMAS joined as to Parts I through III .
586 | I appreciate the quick , good service and the reasonable prices and will definitely use American Pride Irrigation & Landscaping again .
587 | I started collecting animations & jokes just to help with my boredom and depression .
588 | no they wo nt be able to some gamestops do nt check ID for the pre-order so some ppl can get away with doing this but a lot of store want to see your id and / or your reciept of the pre-order
589 | you live in NZ and you eat McDonald s ice cream ?
590 | [ http://www.space.com/missionlaunches/ft_050829_ksc_spacefuture.html ]
591 | Ladies room , Open Sundays
592 | I 'm free any day but Tuesday .
593 | So hear we are , two weeks later , after that dazzling PR display two weeks ago by Powell and Annan , and the situation on the ground in Darfur appears basically unchanged .
594 | Vince ,
595 | or has acquired some type of disease and that too needs to be attended to ...
596 | The Supreme Court announced its ruling today in Hamdan v. Rumsfeld divided along idelogical lines with John Roberts abstaining due to his involvement at the D.C. Circuit level and Anthony Kennedy joining the liberals in a 5 - 3 decision that is 185 pages long .
597 | 11/15/2000 11:58 AM
598 | A reminder .
599 | colorado beat texas a&m .
600 | This was a risk that we had but we did have assurances from Phillips regarding performance .
601 | has life like animal wholesale figurines made from rabbit and goat fur , feathers and sheep s wool .
602 | Angry crowds chanted anti-American slogans in the western city of Falluja ( pop. 256,000 ) as the security police killed in a friendly fire incident by US troops were buried on Saturday .
603 | so i m a little confused , why is there two statues of David ?
604 | --- The Art of Calligraphy in Modern China ( British Museum Press , 2002 )
605 | Out of business ?
606 | EY4096.7 PERFORMANCE 01-Feb-02 P - 6,363,217 - $ 55,678
607 | It will be interesting to see whether or not Google will finally slay the Microsoft Goliath , who has known no major defeat and seeks to vanquish all competition .
608 | The latest spot for a real Hackney 's is Printers ' Row :
609 | Email : franz371...@gmail.com
610 | What if Google expanded on its search - engine ( and now e-mail ) wares into a full - fledged operating system ?
611 | I shall send you a copy today .
612 | now i will have really straight teeth .
613 | D
614 | Wolfowitz lied to him and said that there was a 10 to 50 % chance that Iraq was behind them .
615 | The actual vote is a little confusing .
616 | Meat Kabob
617 | People love to buy these cute cuddly little animals for gifts and collectables .
618 | These guys do great work at VERY reasonable prices .
619 | Ahmad Mustafa Ghany , 21 , Mississauga ;
620 | Love this place !!
621 | Are you free for lunch some day this week ?
622 | Well , would n't you know it .
623 | 3 thumbs up .
624 | are you lying ?
625 | A very nice park .
626 | YUM
627 | Does anyone know of any good food in iguazu ?
628 | ca n't believe you left last night .
629 | Groups : alt.animals , alt.animals.cat , alt.animals.ethics.vegetarian , talk.politics.animals
630 | I 'm looking for a camera that has really good zoom during a video and pictures ; and good quality pictures / videos
631 | Absolutely rude .
632 | The convergence of views among the more militant Sunni Muslim clerics of AMS and the radical Shiites of the Sadr movement has been seen before , last spring during the initial US assault on Fallujah and during the US attack on Mahdi Army militiamen in Najaf .
633 | EY4106.7 PERFORMANCE 01-Feb-02 P 1,993,045 - $ 43,548
634 | They are very well made and realistic .
635 | If you own a Retail Store or are a Professional Vendor who exhibits at Sport , Hunting , or Craft Shows and are interested in selling our products , please give us a call !
636 | I just got your email and I certainly concur with Jeff making the call .
637 | To summarize : Enron 's pad gas will now be 3,993,310 MMBTU , instead of 4,223,000 MMBTU .
638 | Winning Attorney !
639 | Please give me lots of links and places to look !
640 | Umir Hussain , 24 , London E14
641 | But not so .
642 | Company :
643 | Is that a money maker ?
644 | Media , Software , Fun and Games , Website design , Web Promotion , B2B , Business Promotion , Search Engine Optimization .
645 | Does anybody use it for anything else ?
646 | Rajendra
647 | Channel Guide
648 | I was married by a judge .
649 | Choose the news and sport headlines you want - when you want them , all in one daily e-mail
650 | KENNEDY filed an opinion concurring in part , in which SOUTER , GINSBURG , and BREYER joined as to Parts I and II .
651 | 09/20/2000 03:22 PM
652 | Do n't give these guys a penny .
653 | Seth provides deep tissue massage which has significantly reduced the pain in my neck and shoulders and added flexibility and movement back to the area .
654 | My fries were n't fully cooked last time I went there .
655 | I 've thought about you a few times in the last few months , did n't want to intrude upon an already bad situation with my bullshit questions .
656 | Even after the attacks on September 11 , Bush was obsessing about Iraq .
657 | great , we look forward to seeing you .
658 | We at R&L Plumbing Services are pleased with your professionalism and the extra mile you went to get out computers working correctly , you will be our first call if anything happens again and we will refer you to other people with computer issues .
659 | Strip mall asian it is not !
660 | NASA is planning on using these new shuttles to replace the current models , with industry forecasters predicting a launch as early as 2014 .
661 | Hackney 's has a great burger formula that started about 80 years ago .
662 | Like I 'm legitimately concerned at this point ... lol
663 | Noticed a few of these Cookie cutter places opening in Summit and New Providence .
664 | The thing about The Script is they do not sound that Irish , I was surprised to hear they were from Dublin .
665 | i was thinking somewhere that requires a jacket , like tony 's .
666 | Lest you be lame !!!
667 | Thanks for the great care !!!!
668 | Why certain slogans work and why some do n't .
669 | image_gif_part
670 | Louise ,
671 | Of course , you could just go in by main force .
672 | Are you going to be able to make the power VAR meeting on Thursday ?
673 | 20 fluid ounces in a Pint in Ireland
674 | Sara
675 | Also more often than not you end up with a healthy dose of nasty rude attitude from the employees !
676 | A Top Quality Sandwich made to artistic standards .
677 | I would not hesitate to use him again or refer him to my family or friends .
678 | Steven Vikash Chand alias Abdul Shakur , 25 , Toronto ;
679 | Finally a convenient place close to home .
680 | Clean store , friendly check - out staff up front .
681 | I have Chronic Lyme disease , so I 'm stuck at home .
682 | Abdul Muneem Patel , 17 , London E5
683 | The food is amazing , and the prices can not be beat .
684 | I have two upcoming events one is for 200 and another is for 21 .
685 | i need to now get a job at house of pies b/c that is the only way to pay the bills .
686 | ' Everything is for the best in the best of all possible worlds if only no artificial hindrances are put in the way of free exchange , for demand and supply will regulate everything better than any Government would be able to . '
687 | yuk .
688 | they are the best orthodontics in the world .
689 | Cheapest airline ticket from Raleigh to Philippines ?
690 | Universities will take you whatever age you are .
691 | Definetely going back
692 | Do n't worry about avoiding temptation ... as you grow older , it will avoid you .
693 | http://news.bbc.co.uk/1/hi/help/4162471.stm
694 | Elizabeth 36349
695 | Feel good
696 | Starting in February , you will be able to export the data , as opposed to using the spreadsheets .
697 | Privileged / Confidential Information may be contained in this message .
698 | Green Tea Or White Tea ?
699 | They own blogger , of course .
700 | This place is awesome
701 | Any good suggestions would really be appreciated .
702 | ** Disclaimer **
703 | Any help ?
704 | As we discussed , here is a first effort at a revised TVA offer letter .
705 | Just our standard .
706 | " ... there is no companion quite so devoted , so communicative , so loving and so mesmerizing as a rat . "
707 | __________________________________________________
708 | The actual word " MAD " has to be on the cover and incorporated into the image .
709 | buy them in any good photography supplies shop .
710 | I have had several dentists in my life , but Dr. Deters is by far my favorite .
711 | These people were so helpful this week and did everything to sort out my windscreen and insurance .
712 | tttthhhhh Madonna !
713 | home team - thanks 4 playin !!!
714 | On the other hand , it looks pretty cool .
715 | can ever & never forget the training undergone here which made my life step onto the successful job without any hurdles .
716 | I would prefer a simple , fitted black one .
717 | We look forward to your active participation to make this forum an exciting meeting place for like minded individuals .
718 | Today is good 12:30 ?
719 | Please advise immediately if you or your employer do not consent to Internet email for messages of this kind .
720 | Today is good 12:30 ?
721 | I thought that since Chonawee has an optimization background , he would be good to have him go to dinner with Dr. Lasdon on Thrusday as well .
722 | By April of '71 the Dow had climbed back to 950 , only to fall to 869 in February of '72 .
723 | In such case , you should destroy this message and kindly notify the sender by reply email .
724 | I decided to get a 150 gal aquarium , what can I fill it with ?
725 | --
726 | I need a new lawnmower , so I 'll try to bump it up a little more .
727 | REUTERS / Jason Reed
728 | Know this well because I remember an ' irish ' pub in the town in canada i grew up in used to advertise the cheapest pints of guinness in town , but they served them in american sized pints .
729 | Looks like the kids had a great time !
730 | In addition , there is a reduction of 22,101 MMBTU which is the difference between the SCADA values ( Best Available ) that Anita showed on the February 29th Storage Sheet and the " official " February 29th values that Gary Wilson received from MIPS .
731 | any format url ?
732 | enron is blowing up .
733 | U.S. President George W. Bush shakes hands with Chinese President Hu Jintao in a bilateral meeting in Santiago .
734 | I 'm hearing some pretty depressing stuff from the people I know at ENE .
735 | Thanks and Regard
736 | FYI .
737 | Kind regards
738 | We accecpt : Visa , MasterCard , Amex , Dinner s Club / Carte Blanche , & Personal Checks / Money Orders .
739 | The food was incredibly bland .
740 | Edward Terry
741 | After tomorrow , I will no longer have access to the estate files .
742 | We are still trying to work the PSE swap transaction , now that the forex desk has been able to find a fix for CPI in the market .
743 | Debra Perlingiere
744 | Monkey Brain .
745 | I need suggestions for San Francisco restaurants with good food and good catering service .?
746 | Someone had to be first .
747 | On the other hand , this is essentially a statement that the company is overpriced from the guy who knows it best -- and happens to be the best investor of the last century .
748 | Kyle with Bullwark
749 | I just had the best experience at this Kal Tire location .
750 | Bland and over cooked .
751 | Highly recommended .
752 | I 'll probably start looking next weekend .
753 | Dinner and dancing in Chicago ?
754 | http://www.newsfeeds.com - The # 1 Newsgroup Service in the World !
755 | EY4106.9 PERFORMANCE 01-Feb-02 P 27,886 $ 27,361
756 | so i live in Invercargill New Zealand and i want to know if there are any good places to buy an ice - cream sundae from other than mc donald s lol
757 | Police in the Indian capital Delhi say they have arrested the suspected co-ordinator and financier of last month 's deadly bomb blasts in the city .
758 | I have ordered Bose Headfones worth 300 USD .
759 | Bryan ,
760 | Osman Adam Khatib , 20 , London E17
761 | Any suggestions would be really helpful , thanks !
762 | it s a gift from my brother .
763 | Ian - Webmaster www.southbhamcats.org.uk
764 | Would do business with them again .
765 | i flew here last night .
766 | Thanks
767 | In a timid voice , he says : " If an airplane carrying Winston Peters was blown up by a bomb , THAT would be a tragedy " .
768 | Hope you 're doing good .
769 | Ram Tackett , ( mailto:rtackett@abacustech.net ) Owner , Abacus Technologies 17611 Loring Lane , Spring , TX 77388-5746 ( 281 ) 651-7106 ; Fax ( 281 ) 528-8636 Web : http://www.abacustech.net
770 | The door is easy to use and it keeps the cold out during the winter .
771 | different generations , the donatello is of a boy david as a young sheep Herder , the Michelangelo is the grown up man david as slayer and king
772 | ( 713 ) 853-7408
773 | Yes , they all have secret locator chips , just like gps
774 | His work
775 | I have never been disappointed .
776 | Great meats that are already cooked , easy to take home for dinner .
777 | ------
778 | I would highly recommend her services .
779 | ------
780 | " Our new lunar transportation system utilizes a unique architecture that will establish the equivalent of a two - way highway between the Earth and the Moon , " Kistler told SPACE.com .
781 | Following up on your and Ken Lay 's conversation with Gary Cohn , I would like to forward the following proposal , acting for each of Goldman Sachs Capital Markets and J. Aron .
782 | Friendliest place I have ever stayed !
783 | If the PX comes back again , I will call their in - house attys .
784 | Very Informative website with a lot of good work
785 | Stayed here for 2 nights .
786 | Portia
787 | Mohammed Dirie , 22 , Kingston , Ont. ;
788 | Daren ,
789 | Rubbish
790 | You 'd need their Apple ID and password , if you had that then yes you can track any iPhone .
791 | " Well , " says the boy , " because it would n't be an accident , and it certainly would n't be a great loss ! "
792 | I 've been looking at the bose sound dock 10 i ve currently got a jvc mini hifi system , i was wondering what would be a good set of speakers .
793 | Al - Qaeda in Afghanistan was a group of only a few hundred " Afghan Arabs " who pledged personal loyalty to Usamah Bin Laden .
794 | I have 3 children there and they are the Best .
795 | I am going to be the Senior Regulatory Counsel at ISO New England starting on April 9 , 2001 .
796 | i can think of a few things
797 | A girl raises her hand .
798 | SS
799 | If I went into the " pre-university " direction with business administration in mind .
800 | here s the link :
801 | Or background stands
802 | Studying in Quebec , Canada ?
803 | fyi
804 | my bad .
805 | My results were just AWFUL .
806 | They know that the American advent implies for them a demotion , and an elevation of the Shiites and Kurds , and they refuse to go quietly .
807 | the camera only begins to work again when i take out the battery and put it back in .
808 | A thoroughly comprehensive service ; excellent communication and best of they are transparent with their fee ( ie nothing is simply implied or assumed ) .
809 | More below .
810 | Mike
811 | The best climbing club around .
812 | Opinions , conclusions and other information in this message that do not relate to the official business of my firm shall be understood as neither given nor endorsed by it .
813 | Marvel Consultants , Inc. 28601 Chagrin Blvd. Cleveland , Ohio 44122 USA Email : recruiters@marvelconsultants.com < mailto:recruiters@marvelconsultants.com > Phone : 216-292-2855 Fax : 216-292-7207
814 | No , technically they do not need a UVB light ; they are nocturnal .
815 | I enjoyed your presentations very much .
816 | Launching this way will hopefully avoid future disasters , giving more support towards NASA revisiting the stars .
817 | http://www.google.co.uk/search?q=backdrop+frame&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a&safe=active&sout=1
818 | his clinic is very very dirty he is a real disaster to go totally not organized for every step he take .
819 | you know , whatever .
820 | He was very clean , very nice to work with and gave a very reasonable price .
821 | Currently we have a blank " sample " for our Paragraph 13s which are attached to our sample ISDAs for ( a ) US Corporate , ( b ) Hedge Funds , ( c ) Municipal .
822 | OK Food , Slow service
823 | Old time grocery , best steaks I have ever had !
824 | He 's worth every penny .
825 | The workers sped up and down the street with no mind to the small children playing .
826 | Rest was too oily .
827 | I refer to VNHH often and love you guys .
828 | The results of the February 26th test reduces the working gas by 398,487 MMBTU .
829 | Shareef Abdelhaleen , 30 , Mississauga ;
830 | The September 11 Panel will issue its findings on Thursday .
831 | The overwhelming human and financial impacts of Hurricane Katrina are powerful evidence that political and economic decisions made in the United States and other countries have failed to account for our dependence on a healthy resource base , according to an assessment released today by the Worldwatch Institute .
832 | [ via Microsoft Watch from Mary Jo Foley ]
833 | http://www.nola.com/lsu/t-p/football/index.ssf?/lsustory/lsunotes08.html
834 | None of the above .
835 | You should really ask this in the art section .
836 | this kebab shop is one of the best around the meat is good and fresh and the chilly sauce is the best , keep them lovely kebabs coming and a happy new year to all the staff
837 | Wonderful Wonderful People !
838 | Holland & Hart , LLP ; # 432785 dated 4/14/00
839 | 06/02/2001 10:53 AM
840 | http://www.theadvocate.com/sports/story.asp?StoryID=16473
841 | But we ca n't prove it .
842 | I give this place 11 / 10 .
843 | Travelled 40 mins after calling to see if a product was in stock .
844 | Which wonderful contact of mine is thumbs upping all my best answers ^^ ?
845 | Hidden Treasure .
846 | Hi I ´m from Brazil and I want to know of book 06 .
847 | I never wait in the waiting room more than two minutes and the cleanings are quick and painless .
848 | Thank you though .
849 | We have updated our site to include a LOST and FOUND page and you can now join our branch and make a secure on - line donation to the charity .
850 | roflmao
851 | My assistant Joanne Rozycki has cell , car numbers to reach me .
852 | Yeah you got ta burn it .. that s the only way
853 | They picked my car up in Yarmouth and towed to Bath for a great price .
854 | I have just checked with RAC ( David Gorte ) and we have a green light to go ahead with the project .
855 | Please update daily
856 | I plan on going again .
857 | The haircut was inexpensive and so were the salon services ( eyebrows were cheap ! ) .
858 | Wonderful Atmosphere
859 | " What ? " asks Winston , " is n't there any one here who can give me an example of a tragedy ? "
860 | There may or may not be snow , depending on local weather conditions .
861 | economy should be good here .
862 | I have spoken with Mark Lay and he is interested .
863 | Further to my voicemail , our colleagues in credit are calculating the reserve on the PSE swap .
864 | By Deb Price / The Detroit News
865 | Is Hank Green Awesome ?
866 | The rooms were very clean and the breakfast was excellent .
867 | Bush did not have his eye on the ball .
868 | Posted by Anthony Beavers to Cognitive Science News at 8/28/2005 07:18:20 AM
869 | i used to have one .
870 | here is to getting back on track after thanksgiving
871 | Name something you find at a carnival that comes on a stick ?
872 | - Alex Levine
873 | i must have had you messed up with some other girl i made the bet with .
874 | I use their limo services for all of my airport car services and airport transportation needs
875 | em ... no ... the Gates foundation mainly invests in medical research and education , that means donating now adds a tremendous value compared to donating in ten years .
876 | green curry and red curry is awesome !
877 | Mike Curry
878 | The Donuts were very over proofed , making them stale and bready .
879 | Very fast and efficient service .
880 | Inford Media
881 | A fast service , saved a bad situation getting a lot worse .
882 | Do nt go to the one by pepco , I got confused !!!
883 | Someone told me that Chase is planning a shitload of layoffs .
884 | Most troubling , however , is the fact that the political will to end the crisis expressed a few short weeks ago seems to have ebbed .
885 | In one class , he asks the students if anyone can give him an example of a " tragedy " .
886 | i do n't even like a&m , i would n't bet on them .
887 | thank you
888 | I better pass on the Comets game .
889 | Atal Pharmacy , Karol Bag .
890 | you can buy me dinner when we get back .
891 | Thanks ,
892 | Sure Google , although this would put it on coarse for global domination of the internet by 2014 .
893 | Debra Perlingiere
894 | Red Robin .
895 | My weekends seem to be taken up with condo matters , house hunting .
896 | It does n't change the company 's intrinsic worth , and as the article notes , the company might be added to a major index once the shares get more liquid .
897 | Do n't waste your money on the jukebox
898 | which is the best burger chain in the chicago metro area like for example burger king portillo s white castle which one do like the best ?
899 | Do n't judge a book by its cover
900 | - Victor Borge
901 | ca n't go to any more lsu games unless i get a free ticket .
902 | to make convo , it 's short , so if things go great , you can
903 | then i told her i felt i should be able to screw missy just once .
904 | there will be talent and opportunity a plenty on the market soon .
905 | But I was not pleased to read the description in the catalog : " No good in a bed , but fine against a wall . "
906 | Daren
907 | Events change everyday .
908 | " If a school bus carrying fifty children drove off a cliff , killing everyone involved ... that would be a tragedy " .
909 | Groups : alt.animals.cat
910 | Thanks
911 | I will go there again .
912 | This is by far the best run dealership in Miami .
913 | I could go on and on !
914 | On my last 6 trips here from the states I used " Fly genesis " based in Seattle , Wa .
915 | not sure how much longer ene is going to be around and i 'm checking out my options !
916 | **
917 | They have always done a great job at a reasonable price .
918 | August 12 , 2000
919 | I was very upset when I went to Mother Plucker , they had NO FEATHERS and the quality is TERRIBLE .
920 | slow service
921 | Vijay K. Suchdev Vice President Equity Derivatives First Union Securities , Inc. Telephone : ( 212 ) 909-0951 Facsimile : ( 212 ) 891-5042 email : vijay.suchdev@funb.com < mailto:vijay.suchdev@funb.com >
922 | B & w .
923 | I hope you do n't mind , but I 've taken liberty to turn them into a web photo album at http://24.27.98.30/pictures/08-05_Garrett_Gayle_Bday .
924 | The Pentagon did not even have a plan for dealing with Afghanistan or al - Qaeda that it could pull off the shelf , according to Bob Woodward .
925 | Sheridan Titman < titman@mail.utexas.edu > on 01/24/2001 02:45:50 PM
926 | Over three years after 9-11 , the United Nations , despite their attempts to project strength in fighting terrorism , still can not properly define the word " terrorist " , waffling over the issue of whether the murder of innocent civilians are terrorist acts .
927 | If you took out the clown loach it would make a nice 150 gallon tank .
928 | Stick to Hop Hing , 20 year + resident .
929 | Send the revised report by e-mail .
930 | --
931 | After friday , I will no longer have access to the estate , so if you could shoot this off over night so I could have something in the morning to work with I would appreciate it .
932 | Cheap , great view , time together .
933 | See http://www.gulf-news.com/Articles/news.asp?ArticleID=97508
934 | Posted by Hidden Nook to Hidden Nook at 2/14/2005 07:03:00 PM
935 | i have stronger will than you think .
936 | Thanks again , Directv .
937 | not sure yet
938 | Well , I 'm about to graduate in less then a year , and I 'm planning to study medical school .
939 | Please confidentially share matters as you think best and advise me of the interest generated .
940 | Restaurant on top was renovated , food was decent , price was way to high for Duluth for quality , new decor seems tacky
941 | I 'm planning on buying a compact system camera at best buy ; so please list the one ( s ) I should purchase .
942 | canon t2i stops working ?
943 | Compare compare compare - that 's the key to getting the best deal .
944 | Use Travelocity or Expedia and see what you come up with .
945 | By September of that year the Dow had tumbled to 744 .
946 | " No , " Winston says , " That would be an ACCIDENT . "
947 | I like my Monkey Brain on a stick for sure .
948 | http://www.hackneys.net/
949 | I gave Dr. Rohatgi 2 stars because her assistant was very pleasant .
950 | Hilary E. Ackermann Goldman Sachs Credit Risk Management & Advisory Phone : 212-902-3724 Fax : 212-428-1181 E-Mail : hilary.ackermann@gs.com
951 | Of course , that was the bottom
952 | Relish
953 | I started this page to help with my boredom .
954 | Just ask American Express
955 | We 'll see you on ' Border Patrol '
956 | On the same day Palestinians protest in support of Hezbollah and Syria , the terrorist group Hamas has indicated it will participate in the scheduled upcoming Parliamentary elections .
957 | The best customer service I 've come across for long time .
958 | EY4108.H PERFORMANCE 01-Feb-02 P - 2,239,879 - $ 36,398
959 | this dentist want to pull the tooth out always .. always wants to do the cheapest for his benefit .. not unless he knows you .
960 | It 's a little hard to parse , but at this point his ostensible view is that the Gateses are very good money - redistributors , and he wants them to have the money as soon as possible .
961 | And can you tell me WHY that would be a tragedy ? "
962 | Wendi has worked for them have a look at her blog .
963 | Original Margin Call Margin Due Today
964 | I am amazed how the details get fuzzy on an old project .
965 | If you want a CD copy of this web site , give me a yell .
966 | Amy :-)
967 | The positions needed to be divided to reflect BCF 's .
968 | Hi , i 'm looking to take myself and my best friend and his girl friend and this girl i really like out to dinner for my birthday .
969 | and the people are sweet :)
970 | They chased the Communists out of the capital ( Hanoi ) and retook control .
971 | Maybe Labour .
972 | The employees are really friendly .
973 | We plan to use the same basic form for the Enron guaranty that will be made in favor of EPMI with respect to the seller 's obligations under the PPA .
974 | Great Place !
975 | Since work has gone to hell , I am hoping to find some excitement in the possibility that LSU may play in the Cotton Bowl ( if Rohan " Alabama " Davey shows up for the next 3 games . )
976 | we can go somewhere nice .
977 | i put $ 5 bucks down for it too .
978 | American Food , Soul Food , Mexican , Italian , and Chinese are the options .
979 | depends on the computer
980 | Photo from Technology News
981 | Michelangelo made the marble one but why did he do another if Donatello had already made one ?
982 | You should use the same spreadsheet format used for the 1/29/02 DPR .
983 | Sheridan ,
984 | We will have to correct them after the churn .
985 | ~ CGoehring
986 | EY4096.3 PERFORMANCE 01-Feb-02 P 202,989 $ 195,610
987 | It is now threatening to pull out of the Allawi caretaker government .
988 | Very knowledgeable and friendly design build firm .
989 | My last day in the Portland area will be March 31 , 2001 .
990 | Seth K .
991 | Rude and Untrustworthy
992 | Dawn
993 | Close to my house , this is the only reason I would go to this particular QT .
994 | i wish the other utilities i had to set up had people to work with like this ..
995 | He is regarded as one of the leading Avant - Garde artist of modern calligraphy .
996 | Quality has fallen over the years , but still the best go - to burger place on the East Bay .
997 | Look at a map and you try to figure out how , in fall of 1999 , you could possibly pull off such an operation without Pakistani facilities .
998 | Best ,
999 | We now have over 5000 addresses .
1000 | WE AT HOME LOVE IT AT $ 80 +++
1001 |
--------------------------------------------------------------------------------
/structural-probes/spec.yaml:
--------------------------------------------------------------------------------
1 | dataset:
2 |   observation_fieldnames:
3 |     - index
4 |     - sentence
5 |     - lemma_sentence
6 |     - upos_sentence
7 |     - xpos_sentence
8 |     - morph
9 |     - head_indices
10 |     - governance_relations
11 |     - secondary_relations
12 |     - extra_info
13 |     - embeddings
14 |   corpus:
15 |     root: en_ewt-ud/
16 |     train_path: en_ewt-ud-train.conllu
17 |     dev_path: en_ewt-ud-dev.conllu
18 |     test_path: en_ewt-ud-test.conllu
19 |   embeddings:
20 |     type: token #{token,subword}
21 |     root: .
22 |     train_path: encodings-train.hdf5
23 |     dev_path: encodings-dev.hdf5
24 |     test_path: encodings-test.hdf5
25 |   batch_size: 40
26 | model:
27 |   hidden_dim: 768
28 |   model_type: BERT-disk # BERT-disk, ELMo-disk,
29 |   use_disk: True
30 |   model_layer:
31 | probe:
32 |   task_signature: word_pair # word, word_pair
33 |   task_name: parse-distance
34 |   maximum_rank: 32
35 |   psd_parameters: True
36 |   diagonal: False
37 |   params_path: predictor.params
38 | probe_training:
39 |   epochs: 30
40 |   loss: L1
41 | reporting:
42 |   root: .
43 |   observation_paths:
44 |     train_path: train.observations
45 |     dev_path: dev.observations
46 |     test_path: test.observations
47 |   prediction_paths:
48 |     train_path: train.predictions
49 |     dev_path: dev.predictions
50 |     test_path: test.predictions
51 |   reporting_methods:
52 |     - spearmanr
53 |     - uuas
54 |     - root_acc
55 |
--------------------------------------------------------------------------------