├── README.md
├── eval_character.py
├── eval_scripts
│   ├── eval_character.sh
│   └── eval_style.sh
├── eval_style.py
├── methods
│   ├── inference.py
│   ├── inference_dynamic_match.py
│   ├── inference_random_match.py
│   ├── inference_static_match.py
│   └── inference_utterances_only.py
├── requirements.txt
└── resources
    ├── all_styles.tsv
    ├── dailydialog_test_utterances.tsv
    └── predefined_texts.txt

/README.md:
--------------------------------------------------------------------------------
# Meet Your Favorite Character: Open-domain Chatbot Mimicking Fictional Characters with only a Few Utterances (NAACL 2022)

- [Paper link](https://arxiv.org/pdf/2204.10825.pdf)

## Env Setup
```
conda create -n [env_name] python=3.8
conda activate [env_name]

conda install cudatoolkit=11.0 pytorch=1.7.1
pip install git+https://github.com/huggingface/transformers
pip install datasets pandas pyarrow scikit-learn
pip install -r requirements.txt
```

## Run inference
- We assume you run your own language model server at `http://${1}/generate`; the expected request/response format is described in the note below.
- Below is an example for `Dynamic Match`.

```
python3 methods/inference_dynamic_match.py \
    --model-file $retriever_model_path \
    --megatron-endpoint http://${1}/generate \
    --character-name $character_name \
    --response-selection-strategy top1 \
    --max-num-exemplars 8 \
    --evaluate-set resources/dailydialog_test_utterances.tsv \
    --all-styles-path resources/all_styles.tsv \
    --save-results-path results \
    --styles $character_name
```
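
### Note: Expected generation server interface
The inference scripts `POST` a JSON body of the form `{"context": ...}` to `--megatron-endpoint` (a single prompt string for `inference_dynamic_match.py`, `inference_static_match.py`, and `inference_utterances_only.py`; a list of `{"text": ...}` turns for `inference_random_match.py`) and expect back a JSON list of candidates, each a dict with a `"text"` field and a `"score"` field (the score is used by the `top1` response-selection strategy). The sketch below only illustrates this contract: it substitutes a Hugging Face GPT-2 pipeline for the Megatron LM used in the paper, and the generation settings are placeholders.

```
import json

from flask import Flask, jsonify, request
from transformers import pipeline

app = Flask(__name__)
generator = pipeline("text-generation", model="gpt2")


@app.route("/generate", methods=["POST"])
def generate():
    # The inference scripts send raw JSON without a content-type header,
    # so parse request.data directly instead of using request.get_json().
    payload = json.loads(request.data)
    prompt = payload["context"]
    outputs = generator(prompt, max_new_tokens=32, num_return_sequences=4,
                        do_sample=True, return_full_text=False)
    # Keep only the first line of each continuation as a candidate reply.
    # The "score" here is a dummy value; replace it with an LM score if available.
    candidates = [{"text": out["generated_text"].split("\n")[0].strip(), "score": 0.0}
                  for out in outputs]
    return jsonify(candidates)


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```
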
## Run Character StyleProb Evaluation
```
srun --gres=gpu:1 eval_scripts/eval_character.sh [jsonl_input_file_path] [character_name] [classifier_model_path]
```

## Run Other Styles StyleProb Evaluation
Supported values for `[expected_label]` (they must match the choices in `eval_style.py`): `positive`, `negative`, `modern`, `shakespearean`, `joy`, `anger`.
```
srun --gres=gpu:1 eval_scripts/eval_style.sh [jsonl_input_file_path] [expected_label] [classifier_model_path]
```

### Note: Example of Jsonl file
```
{"context": ["that's awesome! Do you spend a lot of time there?", "i do! it's a lot of fun but it can be tiring sometimes", "I can imagine. what kind of restaurant do they own?"], "response": "The restaurant the restaurant"}
{"context": ["I got some great news today! My husband got a better paying job offer!", "Holy cow that's awesome!!! What are you going to do with all that extra moneys??", "Not sure yet, but itll help us life more comforatbly! We move to his hometown in November when he gets out of Army!"], "response": "You must be so thrilled. There are so many lonely life out there. He must be thrilled."}
...
```
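
Such a file can be written from `(context, response)` pairs the same way the inference scripts under `methods/` save their results; `my_pairs` below is a hypothetical stand-in for your own data.

```
import json

# Hypothetical (context, response) pairs; replace them with your own results.
my_pairs = [
    (["Hi, how are you?"], "I am fine, thank you."),
]

with open("example.jsonl", "w") as f:
    for context, response in my_pairs:
        json.dump({"context": context, "response": response}, f)
        f.write("\n")
```
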

## Run MaUdE
- Clone the [MaUdE Repo](https://github.com/facebookresearch/online_dialog_eval) and set up its environment.
- Run the following script.

```
cat maude_inference.sh

>>>

#!/bin/zsh
MODEL_SAVE_DIR=full_runs/
DATA_NAME=convai2
DATA_LOC=$1
FINE_TUNE_MODEL=convai2_data/distilbert_lm
TRAIN_MODE=nce

VERSION=20488119
MODEL_ID=na_all


for DATA_LOC in "$@"
do
    python3 codes/inference.py \
        --id $MODEL_ID \
        --model_save_dir $MODEL_SAVE_DIR \
        --model_version $VERSION \
        --train_mode nce \
        --corrupt_pre $DATA_LOC \
        --test_suffix true_response \
        --test_column response
done
```
```
srun --gres=gpu:1 maude_inference.sh [jsonl_path]
```

## Citation

If you find our paper or this project helpful for your research, please consider citing our paper in your publications.

```
@article{han2022meet,
  title={Meet Your Favorite Character: Open-domain Chatbot Mimicking Fictional Characters with only a Few Utterances},
  author={Han, Seungju and Kim, Beomsu and Yoo, Jin Yong and Seo, Seokjun and Kim, Sangbum and Erdenee, Enkhbayar and Chang, Buru},
  journal={arXiv preprint arXiv:2204.10825},
  year={2022}
}
```

--------------------------------------------------------------------------------
/eval_character.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import json
3 | 
4 | import torch
5 | from scipy.special import softmax
6 | from transformers import AutoModelForSequenceClassification
7 | from transformers import AutoTokenizer
8 | 
9 | CHARACTERS = [
10 |     "BMO",
11 |     "Rachel",
12 |     "Burke",
13 |     "Barney",
14 |     "Spock",
15 |     "Sheldon",
16 |     "Dwight",
17 |     "Michael",
18 |     "BartSimpson",
19 |     "MargeSimpson",
20 | ]
21 | CHARACTER_TO_IDX = {c: i for i, c in enumerate(CHARACTERS)}
22 | 
23 | 
24 | def transform_input(texts, tokenizer):
25 |     result = tokenizer(texts, max_length=256, truncation=True, padding="max_length")
26 |     result = {k: torch.LongTensor(v) for k, v in result.items()}
27 |     # Optional: needed only when running the model on GPU
28 |     result = {k: v.to("cuda:0") for k, v in result.items()}
29 |     return result
30 | 
31 | 
32 | def run_model(texts, model, tokenizer):
33 |     transformed = transform_input(texts, tokenizer)
34 |     with torch.no_grad():
35 |         logits = model(**transformed).logits
36 |     logits = logits.cpu().numpy()
37 |     probs = softmax(logits, axis=-1)
38 |     return probs
39 | 
40 | 
41 | def main(args):
42 |     try:
43 |         character_idx = CHARACTER_TO_IDX[args.character_name]
44 |     except KeyError:
45 |         raise ValueError(f"Unsupported character name: {args.character_name}")
46 | 
47 |     with open(args.input_path) as f:
48 |         sentences = [json.loads(line.strip())["response"] for line in f]
49 | 
50 |     tokenizer = AutoTokenizer.from_pretrained(args.model_dir)
51 |     model = AutoModelForSequenceClassification.from_pretrained(args.model_dir, from_tf=False)
52 |     # Optional: needed only when running the model on GPU
53 |     model = model.to("cuda:0")
54 | 
55 |     sum_probs = 0.
56 |     num_instances = 0.
57 | 
58 |     for start_idx in range(0, len(sentences), args.batch_size):
59 |         end_idx = min(start_idx + args.batch_size, len(sentences))
60 |         batch = sentences[start_idx: end_idx]
61 | 
62 |         model_preds = run_model(batch, model, tokenizer)
63 |         sum_probs += model_preds[:, character_idx].sum()
64 |         num_instances += len(model_preds)
65 | 
66 |     avg_prob = sum_probs / num_instances
67 |     print(f"Avg prob for predicting as character {args.character_name}: {avg_prob:.8f}")
68 | 
69 | 
70 | if __name__ == "__main__":
71 |     parser = argparse.ArgumentParser()
72 |     parser.add_argument("--model-dir", type=str)
73 |     parser.add_argument("--input-path", type=str)
74 |     parser.add_argument("--character-name", type=str, choices=CHARACTERS)
75 |     parser.add_argument("--batch-size", type=int, default=64)
76 |     args = parser.parse_args()
77 | 
78 |     main(args)
79 | 
--------------------------------------------------------------------------------
/eval_scripts/eval_character.sh:
--------------------------------------------------------------------------------
1 | #!/bin/zsh
2 | 
3 | EVAL_INPUT_PATH=$1  # Path to the input jsonl file containing pairs for evaluation
4 | CHARACTER_NAME=$2
5 | CLASSIFIER_MODEL_DIR=$3
6 | 
7 | python3 eval_character.py \
8 |     --model-dir $CLASSIFIER_MODEL_DIR \
9 |     --input-path $EVAL_INPUT_PATH \
10 |     --character-name $CHARACTER_NAME
11 | 
--------------------------------------------------------------------------------
/eval_scripts/eval_style.sh:
--------------------------------------------------------------------------------
1 | #!/bin/zsh
2 | 
3 | EVAL_INPUT_PATH=$1  # Path to the input jsonl file containing pairs for evaluation
4 | EXPECTED_STYLE_LABEL=$2
5 | CLASSIFIER_MODEL_DIR=$3
6 | 
7 | python3 eval_style.py \
8 |     --model-dir $CLASSIFIER_MODEL_DIR \
9 |     --input-path $EVAL_INPUT_PATH \
10 |     --expected-label $EXPECTED_STYLE_LABEL
--------------------------------------------------------------------------------
/eval_style.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import json
3 | 
4 | import numpy as np
5 | import torch
6 | from scipy.special import softmax
7 | from transformers import AutoModelForSequenceClassification
8 | from transformers import AutoTokenizer
9 | 
10 | 
11 | def transform_input(texts, tokenizer):
12 |     result = tokenizer(texts, max_length=256, truncation=True, padding="max_length")
13 |     result = {k: torch.LongTensor(v) for k, v in result.items()}
14 |     # Optional: needed only when running the model on GPU
15 |     result = {k: v.to("cuda:0") for k, v in result.items()}
16 |     return result
17 | 
18 | 
19 | def run_model(texts, model, tokenizer):
20 |     transformed = transform_input(texts, tokenizer)
21 |     with torch.no_grad():
22 |         logits = model(**transformed).logits
23 |     logits = logits.cpu().numpy()
24 |     probs = softmax(logits, axis=-1)
25 |     preds = np.argmax(probs, axis=-1)
26 |     return preds
27 | 
28 | 
29 | def main(args):
30 |     LABEL_STR_TO_INT = {
31 |         "modern": 0,
32 |         "shakespearean": 1,
33 |         "negative": 0,
34 |         "positive": 1,
35 |         "anger": 0,
36 |         "joy": 1,
37 |     }
38 |     expected_label = LABEL_STR_TO_INT[args.expected_label]
39 | 
40 |     with open(args.input_path) as f:
41 |         sentences = [json.loads(line.strip())["response"] for line in f]
42 | 
43 |     tokenizer = AutoTokenizer.from_pretrained(args.model_dir)
44 |     model = AutoModelForSequenceClassification.from_pretrained(args.model_dir, from_tf=False)
45 |     # Optional: needed only when running the model on GPU
46 |     model = model.to("cuda:0")
47 | 
48 |     num_right = 0
49 | 
50 |     for start_idx in range(0, len(sentences), args.batch_size):
51 |         end_idx = min(start_idx + args.batch_size, len(sentences))
52 |         batch = sentences[start_idx: end_idx]
53 | 
54 |         model_preds = run_model(batch, model, tokenizer)
55 |         num_right += (model_preds == expected_label).sum()
56 | 
57 |     accuracy = num_right * 100. / len(sentences)
58 |     print(f"Accuracy: {accuracy:.6f}%")
59 | 
60 | 
61 | if __name__ == "__main__":
62 |     parser = argparse.ArgumentParser()
63 |     parser.add_argument("--model-dir", type=str)
64 |     parser.add_argument("--input-path", type=str)
65 |     parser.add_argument("--expected-label", type=str, choices=["modern",
66 |                                                                "shakespearean",
67 |                                                                "negative",
68 |                                                                "positive",
69 |                                                                "anger",
70 |                                                                "joy"])
71 |     parser.add_argument("--batch-size", type=int, default=64)
72 |     args = parser.parse_args()
73 | 
74 |     main(args)
75 | 
--------------------------------------------------------------------------------
/methods/inference.py:
--------------------------------------------------------------------------------
1 | import os
2 | import random
3 | import sys
4 | from typing import List
5 | 
6 | import faiss
7 | import numpy as np
8 | import pandas as pd
9 | import torch
10 | from parlai.agents.transformer.transformer import TransformerRankerAgent
11 | from parlai.core.agents import create_agent
12 | from parlai.core.params import ParlaiParser
13 | from parlai.core.torch_agent import Batch
14 | from parlai.utils import logging
15 | from tqdm import tqdm
16 | 
17 | 
18 | def load_agent(model_file: str):
19 |     parser = ParlaiParser(False, False)
20 |     TransformerRankerAgent.add_cmdline_args(parser)
21 |     parser.set_params(
22 |         model="transformer/biencoder",
23 |         model_file=model_file,
24 |         print_scores=False,
25 |         data_parallel=False,
26 |         no_cuda=False,
27 |         gpu=0,
28 |     )
29 |     safety_opt = parser.parse_args([])
30 |     agent = create_agent(safety_opt, requireModelExists=True)
31 |     agent.model.eval()
32 |     return agent
33 | 
34 | 
35 | def update_history(agent, context: List[str]):
36 |     observation = {"text_vec": None}
37 |     agent.reset()
38 |     for utterance_idx, utterance in enumerate(context):
39 |         observe_input = {"text": utterance, "episode_done": False}
40 |         is_self_utterance = (len(context) - utterance_idx) % 2 == 0
41 |         if is_self_utterance:
42 |             # Initial utterance case
43 |             if agent.observation is None:
44 |                 agent.history.add_reply(utterance)
45 |             else:
46 |                 agent.self_observe(observe_input)
47 |         else:
48 |             observation = agent.observe(observe_input)
49 | 
50 |     if observation["text_vec"] is None:
51 |         observation = agent.observe({"text": " ", "episode_done": False})
52 |     return observation["text_vec"]
53 | 
54 | 
55 | def get_embeddings(agent, contexts: List[List[str]]):
56 |     logging.disable()
57 |     with torch.no_grad():
58 |         text_vecs = [update_history(agent, context) for context in contexts]
59 |         padded_text_vecs = agent._pad_tensor(text_vecs)[0].to("cuda:0")
60 |         batch = Batch(text_vec=padded_text_vecs)
61 |         _, _context_embeddings = agent.model.encode_context_memory(
62 |             context_w=batch.text_vec, memories_w=None, context_segments=None)
63 |     logging.enable()
64 |     return _context_embeddings
65 | 
66 | 
67 | def get_candidate_embeddings(agent, candidates: List[str]):
68 |     with torch.no_grad():
69 |         logging.disable()
70 |         sys.stderr = open(os.devnull, "w")
71 |         candidate_vecs = agent._make_candidate_vecs(candidates)
72 |         candidate_embs = agent._make_candidate_encs(candidate_vecs.to("cuda:0"))
73 |         logging.enable()
74 |         sys.stderr = sys.__stderr__
75 |     return candidate_embs
76 | 
77 | 
78 | def build_context_embeddings(agent, exemplars, batch_size=32):
79 |     context_embeddings = []
80 |     for start_idx in range(0, len(exemplars), batch_size):
81 |         end_idx = min(len(exemplars), start_idx + batch_size)
82 |         context_embeddings.append(
83 |             get_embeddings(
84 |                 agent,
85 |                 [[exemplar.context] for exemplar in exemplars[start_idx:end_idx]]
86 |             )
87 |         )
88 | 
89 |     context_embeddings = torch.cat(context_embeddings, dim=0)
90 |     return context_embeddings
91 | 
92 | 
93 | def read_exemplars(exemplar_tsv_path, use_k_sentences=-1, style=None, delimiter="\t", use_gold_q=False):
94 |     df = pd.read_csv(exemplar_tsv_path, delimiter=delimiter).dropna()
95 |     if style:
96 |         df = df.loc[df["style"] == style]
97 |     if not use_gold_q:
98 |         df["context"] = "DUMMY"
99 |     df = df.rename(columns={"generated_text": "response"})
100 |     exemplars = [row for _, row in df.iterrows()]
101 |     for exp in exemplars:
102 |         if exp.response[0] == '"' and exp.response[-1] == '"':
103 |             exp.response = exp.response[1:-1]
104 |     if use_k_sentences > -1:
105 |         exemplars = exemplars[:use_k_sentences]  # use the first k sentences
106 |     return exemplars
107 | 
108 | 
109 | def setup_exemplars(agent, exemplar_tsv_path, batch_size=32, use_k_sentences=-1, style=None):
110 |     exemplars = read_exemplars(exemplar_tsv_path, use_k_sentences, style)
111 |     context_embeddings = build_context_embeddings(agent, exemplars, batch_size)
112 |     return exemplars, context_embeddings
113 | 
114 | 
115 | def get_reference_exemplars(agent, query_context, exemplars, context_embeddings, max_num_exemplars):
116 |     with torch.no_grad():
117 |         query_embedding = get_embeddings(agent, [query_context])
118 |         scores = torch.mm(query_embedding, context_embeddings.transpose(0, 1)).squeeze(0)
119 |         scores = scores.cpu().numpy()
120 | 
121 |     reference_exemplars = [
122 |         {"score": float(scores[idx]), **exemplars[idx].to_dict()}
123 |         for idx in np.argsort(scores)[::-1][:max_num_exemplars]
124 |     ]
125 |     reference_exemplars = sorted(reference_exemplars, key=lambda x: -x["score"])
126 |     return reference_exemplars
127 | 
128 | 
129 | def get_ranker_top1(agent, context, candidates, print_candidates=False):
130 |     with torch.no_grad():
131 |         context_embedding = get_embeddings(agent, [context])
132 |         candidate_embeddings = get_candidate_embeddings(agent, [cand["text"] for cand in candidates])
133 |         scores = torch.mm(context_embedding, candidate_embeddings.transpose(0, 1)).squeeze(0)
134 |         scores = scores.cpu().numpy()
135 | 
136 |     if print_candidates:
137 |         for score, candidate in zip(scores, candidates):
138 |             candidate["score"] = score
139 |         sorted_candidates = sorted(candidates, key=lambda x: -x["score"])
140 |         print(sorted_candidates)
141 | 
142 |     return candidates[np.argmax(scores)]
143 | 
144 | 
145 | def select_response(agent, context, candidates, strategy_name):
146 |     if strategy_name == "random":
147 |         candidate = random.choice(candidates)
148 |     elif strategy_name == "top1":
149 |         candidate = max(candidates, key=lambda x: x["score"])
150 |     elif strategy_name == "ranker":
151 |         candidate = get_ranker_top1(agent, context, candidates)
152 |     else:
153 |         raise ValueError(f"Not a proper selection strategy: {strategy_name}")
154 | 
155 |     return candidate["text"]
156 | 
157 | 
158 | def get_pseudo_context_embeddings(agent, pseudo_contexts):
159 |     batch_size = 1024
160 |     context_embeddings = []
161 |     for start_idx in tqdm(range(0, len(pseudo_contexts), batch_size)):
162 |         end_idx = min(len(pseudo_contexts), start_idx + batch_size)
163 |         batch = pseudo_contexts[start_idx: end_idx]
164 |         contexts = [[context] for context in batch]
165 |         context_embedding =
get_embeddings(agent, contexts) 166 | context_embeddings.append(context_embedding) 167 | context_embeddings = torch.cat(context_embeddings) 168 | assert len(pseudo_contexts) == len(context_embeddings) 169 | return context_embeddings 170 | 171 | 172 | def build_faiss_index(fixed_candidate_vecs: np.ndarray): 173 | print("Started building faiss index.") 174 | num_embeddings, index_dim = fixed_candidate_vecs.shape 175 | cpu_faiss_index = faiss.IndexFlatIP(index_dim) 176 | gpu_faiss_index = cpu_faiss_index 177 | # gpu_faiss_index = faiss.index_cpu_to_all_gpus(cpu_faiss_index) 178 | gpu_faiss_index.add(fixed_candidate_vecs) 179 | print("Finished building faiss index.") 180 | return gpu_faiss_index 181 | 182 | 183 | def get_reference_exemplars_using_answer(agent, query_context, exemplars, candidate_embeddings, max_num_exemplars): 184 | with torch.no_grad(): 185 | query_embedding = get_embeddings(agent, [query_context]) 186 | scores = torch.mm(query_embedding, candidate_embeddings.transpose(0, 1)).squeeze(0) 187 | scores = scores.cpu().numpy() 188 | 189 | reference_exemplars = [ 190 | {"score": float(scores[idx]), **exemplars[idx].to_dict()} 191 | for idx in np.argsort(scores)[::-1][:max_num_exemplars] 192 | ] 193 | reference_exemplars = sorted(reference_exemplars, key=lambda x: -x["score"]) 194 | return reference_exemplars 195 | -------------------------------------------------------------------------------- /methods/inference_dynamic_match.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import json 3 | import random 4 | from pathlib import Path 5 | 6 | import numpy as np 7 | import pandas as pd 8 | import requests 9 | import torch 10 | from termcolor import colored 11 | 12 | from methods.inference import build_faiss_index 13 | from methods.inference import get_candidate_embeddings 14 | from methods.inference import get_embeddings 15 | from methods.inference import get_pseudo_context_embeddings 16 | from methods.inference import get_reference_exemplars_using_answer 17 | from methods.inference import load_agent 18 | from methods.inference import read_exemplars 19 | from methods.inference import select_response 20 | 21 | 22 | def main(args, agent): 23 | whole_exemplars = [] 24 | for style in args.styles: 25 | styled_exemplars = [] 26 | styled_exemplars.extend(read_exemplars(args.all_styles_path, 8, style)) 27 | whole_exemplars.append(styled_exemplars) 28 | test_samples = pd.read_csv(args.evaluate_set, delimiter="\t") 29 | 30 | # replace exemplar questions with PSEUDO CONTEXT. 
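# Dynamic Match: for each exemplar response, the loop below retrieves the
# nearest pseudo context with FAISS, querying with the sum of the
# exemplar-response embedding and the current test-context embedding, so the
# retrieved pseudo context is refreshed for every test query.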
31 | with open(args.pseudo_context_path, "r") as f: 32 | pseudo_contexts = f.read().split("\n")[:-1] 33 | print(f"Reading pseudo contexts: {len(pseudo_contexts)} contexts.") 34 | emb_path = Path(args.pseudo_context_path).with_suffix(".npy") 35 | if emb_path.exists(): 36 | print("Embedding already exists.") 37 | pseudo_context_embeddings = torch.Tensor(np.load(str(emb_path), allow_pickle=False)).half().cuda() 38 | else: 39 | print("Saving embeddings.") 40 | pseudo_context_embeddings = get_pseudo_context_embeddings(agent, pseudo_contexts) 41 | np.save(str(emb_path), pseudo_context_embeddings.cpu().numpy()) 42 | faiss_index = build_faiss_index(pseudo_context_embeddings.cpu().numpy().astype(np.float32)) 43 | batch_size = 128 44 | 45 | results = [] 46 | for _, row in list(test_samples.iterrows()): 47 | context = [row.Query] 48 | curr_exemplars = [] 49 | whole_candidate_embeddings = [] 50 | context_embedding = get_embeddings(agent, [[row.Query]]) 51 | for exemplars in whole_exemplars: 52 | for start_idx in range(0, len(exemplars), batch_size): 53 | end_idx = min(len(exemplars), start_idx + batch_size) 54 | batch = [exemplar["response"] for exemplar in exemplars[start_idx: end_idx]] 55 | response_emb = get_candidate_embeddings(agent, batch) 56 | faiss_score, cand_indices = faiss_index.search(response_emb.cpu().numpy().astype(np.float32) + 57 | context_embedding.cpu().numpy().astype(np.float32), 1) 58 | for i in range(len(batch)): 59 | if cand_indices[i, 0] == -1: 60 | exemplars[start_idx + i]["context"] = "Hello." 61 | else: 62 | exemplars[start_idx + i]["context"] = pseudo_contexts[cand_indices[i, 0]] 63 | whole_candidate_embeddings.append( 64 | get_candidate_embeddings(agent, [exemplar["response"] for exemplar in exemplars])) 65 | for num_exemplars, styled_exemplars, candidate_embeddings in zip(args.max_num_exemplars, whole_exemplars, whole_candidate_embeddings): 66 | curr_exemplars.extend(get_reference_exemplars_using_answer( 67 | agent, context, styled_exemplars, candidate_embeddings, int(num_exemplars))) 68 | curr_exemplars = list(sorted(curr_exemplars, key=lambda x: x["score"])) 69 | if args.print_docs: 70 | for exp in curr_exemplars: 71 | print(colored(exp, "cyan")) 72 | 73 | prefix_context = [] 74 | for exemplar in curr_exemplars: 75 | prefix_context.extend([f"User: {exemplar['context']}", f"{args.character_name}: {exemplar['response']}"]) 76 | 77 | prefix_context = f"This is conversation between User and {args.character_name}.\n" + "\n".join(prefix_context) + "\n" 78 | context_str = "\n".join([f"User: {utterance}" if i % 2 == 0 else f"{args.character_name}: {utterance}" 79 | for i, utterance in enumerate(context)]) 80 | 81 | server_response = requests.post( 82 | args.megatron_endpoint, 83 | data=json.dumps({ 84 | "context": prefix_context + context_str + f"\n{args.character_name}:", 85 | }) 86 | ) 87 | assert server_response.status_code == 200 88 | 89 | candidates = json.loads(server_response.text) 90 | if args.print_docs: 91 | for cand in candidates: 92 | print(colored(cand, "yellow")) 93 | response_str = select_response(agent, context, candidates, args.response_selection_strategy) 94 | print(f"Input: {context}") 95 | print(colored(f"{args.character_name}:", "blue", attrs=["bold"]), 96 | colored(response_str, "white", attrs=["bold"])) 97 | results.append((context, response_str)) 98 | 99 | if args.save_results_path: 100 | Path(args.save_results_path).mkdir(exist_ok=True, parents=True) 101 | with open(str(Path(args.save_results_path) / 
f"dynamicq-{args.styles[0]}{args.max_num_exemplars[0]}.jsonl"), "w") as f: 102 | for result in results: 103 | json.dump({"context": result[0], "response": result[1]}, f) 104 | f.write("\n") 105 | 106 | 107 | if __name__ == "__main__": 108 | parser = argparse.ArgumentParser() 109 | parser.add_argument("--model-file", type=str, required=True) 110 | parser.add_argument("--all-styles-path", type=str) 111 | parser.add_argument("--megatron-endpoint", type=str, required=True) 112 | parser.add_argument("--max-num-exemplars", nargs='+', default=[]) 113 | parser.add_argument("--character-name", type=str, default="Bot") 114 | parser.add_argument("--response-selection-strategy", type=str, 115 | choices=["random", "top1", "ranker"], default="ranker") 116 | parser.add_argument("--print-docs", action="store_true") 117 | parser.add_argument("--pseudo-context-path", type=str, default="./resources/predefined_texts.txt") 118 | parser.add_argument("--evaluate-set", type=str) 119 | parser.add_argument("--styles", nargs='+', default=[], required=True) 120 | parser.add_argument("--save-results-path", default=None, type=str) 121 | 122 | args = parser.parse_args() 123 | agent = load_agent(args.model_file) 124 | random.seed(777) 125 | print(colored(f"\n\n[Current style] {args.styles}\n\n", "blue", attrs=["bold"])) 126 | main(args, agent) 127 | -------------------------------------------------------------------------------- /methods/inference_random_match.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import json 3 | import random 4 | from pathlib import Path 5 | 6 | import pandas as pd 7 | import requests 8 | from termcolor import colored 9 | 10 | from methods.inference import get_candidate_embeddings 11 | from methods.inference import get_reference_exemplars_using_answer 12 | from methods.inference import load_agent 13 | from methods.inference import read_exemplars 14 | from methods.inference import select_response 15 | 16 | 17 | def main(args, agent): 18 | whole_exemplars = [] 19 | for style in args.styles: 20 | styled_exemplars = [] 21 | styled_exemplars.extend(read_exemplars(args.all_styles_path, 8, style)) 22 | whole_exemplars.append(styled_exemplars) 23 | test_samples = pd.read_csv(args.evaluate_set, delimiter="\t") 24 | 25 | # replace exemplar questions with PSEUDO CONTEXT. 
26 |     with open(args.pseudo_context_path, "r") as f:
27 |         pseudo_contexts = f.read().split("\n")[:-1]
28 |     print(f"Reading pseudo contexts: {len(pseudo_contexts)} contexts.")
29 |     batch_size = 128
30 |     whole_candidate_embeddings = []
31 |     for exemplars in whole_exemplars:
32 |         for start_idx in range(0, len(exemplars), batch_size):
33 |             end_idx = min(len(exemplars), start_idx + batch_size)
34 |             batch = [exemplar["response"] for exemplar in exemplars[start_idx: end_idx]]
35 |             context_indices = random.sample(list(range(len(pseudo_contexts))), len(batch))
36 |             for i in range(len(batch)):
37 |                 exemplars[start_idx + i]["context"] = pseudo_contexts[context_indices[i]]
38 |         whole_candidate_embeddings.append(get_candidate_embeddings(agent, [exemplar["response"] for exemplar in exemplars]))
39 | 
40 |     results = []
41 |     for _, row in list(test_samples.iterrows()):
42 |         context = [row.Query]
43 |         curr_exemplars = []
44 |         for num_exemplars, styled_exemplars, candidate_embeddings in zip(args.max_num_exemplars, whole_exemplars, whole_candidate_embeddings):
45 |             curr_exemplars.extend(get_reference_exemplars_using_answer(
46 |                 agent, context, styled_exemplars, candidate_embeddings, int(num_exemplars)))
47 |         curr_exemplars = list(sorted(curr_exemplars, key=lambda x: x["score"]))
48 | 
49 |         if args.print_docs:
50 |             for exp in curr_exemplars:
51 |                 print(colored(exp, "cyan"))
52 | 
53 |         prefix_context = []
54 |         for exemplar in curr_exemplars:
55 |             prefix_context.extend([{"text": exemplar['context']}, {"text": exemplar['response']}])
56 | 
57 |         server_response = requests.post(
58 |             args.megatron_endpoint,
59 |             data=json.dumps({
60 |                 "context": prefix_context + [{"text": row.Query}],
61 |             })
62 |         )
63 |         assert server_response.status_code == 200
64 | 
65 |         candidates = json.loads(server_response.text)
66 |         if args.print_docs:
67 |             for cand in candidates:
68 |                 print(colored(cand, "yellow"))
69 |         response_str = select_response(agent, context, candidates, args.response_selection_strategy)
70 |         print(f"Input: {context}")
71 |         print(colored(f"{args.character_name}:", "blue", attrs=["bold"]),
72 |               colored(response_str, "white", attrs=["bold"]))
73 |         results.append((context, response_str))
74 | 
75 |     if args.save_results_path:
76 |         Path(args.save_results_path).mkdir(exist_ok=True, parents=True)
77 |         with open(str(Path(args.save_results_path) / f"bst-ablation-noaug-randomq-{args.styles[0]}{args.max_num_exemplars[0]}.jsonl"), "w") as f:
78 |             for result in results:
79 |                 json.dump({"context": result[0], "response": result[1]}, f)
80 |                 f.write("\n")
81 | 
82 | 
83 | if __name__ == "__main__":
84 |     parser = argparse.ArgumentParser()
85 |     parser.add_argument("--model-file", type=str, required=True)
86 |     parser.add_argument("--all-styles-path", type=str)
87 |     parser.add_argument("--megatron-endpoint", type=str, required=True)
88 |     parser.add_argument("--max-num-exemplars", nargs='+', default=[])
89 |     parser.add_argument("--character-name", type=str, default="Bot")
90 |     parser.add_argument("--response-selection-strategy", type=str,
91 |                         choices=["random", "top1", "ranker"], default="ranker")
92 |     parser.add_argument("--print-docs", action="store_true")
93 |     parser.add_argument("--pseudo-context-path", type=str, default="./resources/predefined_texts.txt")
94 |     parser.add_argument("--evaluate-set", type=str)
95 |     parser.add_argument("--styles", nargs='+', default=[], required=True)
96 |     parser.add_argument("--save-results-path", default=None, type=str)
97 | 
98 |     args = parser.parse_args()
99 |     agent =
load_agent(args.model_file) 100 | random.seed(777) 101 | print(colored(f"\n\n[Current style] {args.styles}\n\n", "blue", attrs=["bold"])) 102 | main(args, agent) 103 | -------------------------------------------------------------------------------- /methods/inference_static_match.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import json 3 | import random 4 | from pathlib import Path 5 | 6 | import numpy as np 7 | import pandas as pd 8 | import requests 9 | import torch 10 | from termcolor import colored 11 | 12 | from methods.inference import build_faiss_index 13 | from methods.inference import get_candidate_embeddings 14 | from methods.inference import get_pseudo_context_embeddings 15 | from methods.inference import get_reference_exemplars_using_answer 16 | from methods.inference import load_agent 17 | from methods.inference import read_exemplars 18 | from methods.inference import select_response 19 | 20 | 21 | def main(args, agent): 22 | whole_exemplars = [] 23 | for style in args.styles: 24 | styled_exemplars = [] 25 | styled_exemplars.extend(read_exemplars(args.all_styles_path, 8, style)) 26 | whole_exemplars.append(styled_exemplars) 27 | test_samples = pd.read_csv(args.evaluate_set, delimiter="\t") 28 | 29 | # replace exemplar questions with PSEUDO CONTEXT. 30 | with open(args.pseudo_context_path, "r") as f: 31 | pseudo_contexts = f.read().split("\n")[:-1] 32 | print(f"Reading pseudo contexts: {len(pseudo_contexts)} contexts.") 33 | emb_path = Path(args.pseudo_context_path).with_suffix(".npy") 34 | if emb_path.exists(): 35 | print("Embedding already exists.") 36 | pseudo_context_embeddings = torch.Tensor(np.load(str(emb_path), allow_pickle=False)).half().cuda() 37 | else: 38 | print("Saving embeddings.") 39 | pseudo_context_embeddings = get_pseudo_context_embeddings(agent, pseudo_contexts) 40 | np.save(str(emb_path), pseudo_context_embeddings.cpu().numpy()) 41 | faiss_index = build_faiss_index(pseudo_context_embeddings.cpu().numpy().astype(np.float32)) 42 | batch_size = 128 43 | whole_candidate_embeddings = [] 44 | for exemplars in whole_exemplars: 45 | for start_idx in range(0, len(exemplars), batch_size): 46 | end_idx = min(len(exemplars), start_idx + batch_size) 47 | batch = [exemplar["response"] for exemplar in exemplars[start_idx: end_idx]] 48 | response_emb = get_candidate_embeddings(agent, batch) 49 | faiss_score, cand_indices = faiss_index.search(response_emb.cpu().numpy().astype(np.float32), 1) 50 | for i in range(len(batch)): 51 | if cand_indices[i, 0] == -1: 52 | exemplars[start_idx + i]["context"] = "Hello." 
53 | else: 54 | exemplars[start_idx + i]["context"] = pseudo_contexts[cand_indices[i, 0]] 55 | whole_candidate_embeddings.append(get_candidate_embeddings(agent, [exemplar["response"] for exemplar in exemplars])) 56 | 57 | results = [] 58 | for _, row in list(test_samples.iterrows()): 59 | context = [row.Query] 60 | curr_exemplars = [] 61 | for num_exemplars, styled_exemplars, candidate_embeddings in zip(args.max_num_exemplars, whole_exemplars, whole_candidate_embeddings): 62 | curr_exemplars.extend(get_reference_exemplars_using_answer( 63 | agent, context, styled_exemplars, candidate_embeddings, int(num_exemplars))) 64 | curr_exemplars = list(sorted(curr_exemplars, key=lambda x: x["score"])) 65 | if args.print_docs: 66 | for exp in curr_exemplars: 67 | print(colored(exp, "cyan")) 68 | 69 | prefix_context = [] 70 | for exemplar in curr_exemplars: 71 | prefix_context.extend([f"User: {exemplar['context']}", f"{args.character_name}: {exemplar['response']}"]) 72 | 73 | prefix_context = f"This is conversation between User and {args.character_name}.\n" + "\n".join(prefix_context) + "\n" 74 | context_str = "\n".join([f"User: {utterance}" if i % 2 == 0 else f"{args.character_name}: {utterance}" 75 | for i, utterance in enumerate(context)]) 76 | 77 | server_response = requests.post( 78 | args.megatron_endpoint, 79 | data=json.dumps({ 80 | "context": prefix_context + context_str + f"\n{args.character_name}:", 81 | }) 82 | ) 83 | assert server_response.status_code == 200 84 | 85 | candidates = json.loads(server_response.text) 86 | if args.print_docs: 87 | for cand in candidates: 88 | print(colored(cand, "yellow")) 89 | response_str = select_response(agent, context, candidates, args.response_selection_strategy) 90 | print(f"Input: {context}") 91 | print(colored(f"{args.character_name}:", "blue", attrs=["bold"]), 92 | colored(response_str, "white", attrs=["bold"])) 93 | results.append((context, response_str)) 94 | 95 | if args.save_results_path: 96 | Path(args.save_results_path).mkdir(exist_ok=True, parents=True) 97 | with open(str(Path(args.save_results_path) / f"ablation-noaug-{args.styles[0]}{args.max_num_exemplars[0]}.jsonl"), "w") as f: 98 | for result in results: 99 | json.dump({"context": result[0], "response": result[1]}, f) 100 | f.write("\n") 101 | 102 | 103 | if __name__ == "__main__": 104 | parser = argparse.ArgumentParser() 105 | parser.add_argument("--model-file", type=str, required=True) 106 | parser.add_argument("--all-styles-path", type=str) 107 | parser.add_argument("--megatron-endpoint", type=str, required=True) 108 | parser.add_argument("--max-num-exemplars", nargs='+', default=[]) 109 | parser.add_argument("--character-name", type=str, default="Bot") 110 | parser.add_argument("--response-selection-strategy", type=str, 111 | choices=["random", "top1", "ranker"], default="ranker") 112 | parser.add_argument("--print-docs", action="store_true") 113 | parser.add_argument("--pseudo-context-path", type=str, default="./resources/predefined_texts.txt") 114 | parser.add_argument("--evaluate-set", type=str) 115 | parser.add_argument("--styles", nargs='+', default=[], required=True) 116 | parser.add_argument("--save-results-path", default=None, type=str) 117 | 118 | args = parser.parse_args() 119 | agent = load_agent(args.model_file) 120 | random.seed(777) 121 | print(colored(f"\n\n[Current style] {args.styles}\n\n", "blue", attrs=["bold"])) 122 | main(args, agent) 123 | -------------------------------------------------------------------------------- /methods/inference_utterances_only.py: 
-------------------------------------------------------------------------------- 1 | import argparse 2 | import json 3 | import random 4 | from pathlib import Path 5 | 6 | import numpy as np 7 | import pandas as pd 8 | import requests 9 | import torch 10 | from termcolor import colored 11 | 12 | from methods.inference import get_candidate_embeddings 13 | from methods.inference import load_agent 14 | from methods.inference import read_exemplars 15 | from methods.inference import select_response 16 | from methods.inference import build_faiss_index 17 | from methods.inference import get_pseudo_context_embeddings 18 | from methods.inference import get_reference_exemplars_using_answer 19 | 20 | 21 | def main(args, agent): 22 | whole_exemplars = [] 23 | for style in args.styles: 24 | styled_exemplars = [] 25 | styled_exemplars.extend(read_exemplars(args.all_styles_path, 8, style)) 26 | whole_exemplars.append(styled_exemplars) 27 | test_samples = pd.read_csv(args.evaluate_set, delimiter="\t") 28 | 29 | # replace exemplar questions with PSEUDO CONTEXT. 30 | with open(args.pseudo_context_path, "r") as f: 31 | pseudo_contexts = f.read().split("\n")[:-1] 32 | print(f"Reading pseudo contexts: {len(pseudo_contexts)} contexts.") 33 | emb_path = Path(args.pseudo_context_path).with_suffix(".npy") 34 | if emb_path.exists(): 35 | print("Embedding already exists.") 36 | pseudo_context_embeddings = torch.Tensor(np.load(str(emb_path), allow_pickle=False)).half().cuda() 37 | else: 38 | print("Saving embeddings.") 39 | pseudo_context_embeddings = get_pseudo_context_embeddings(agent, pseudo_contexts) 40 | np.save(str(emb_path), pseudo_context_embeddings.cpu().numpy()) 41 | faiss_index = build_faiss_index(pseudo_context_embeddings.cpu().numpy().astype(np.float32)) 42 | batch_size = 128 43 | whole_candidate_embeddings = [] 44 | for exemplars in whole_exemplars: 45 | for start_idx in range(0, len(exemplars), batch_size): 46 | end_idx = min(len(exemplars), start_idx + batch_size) 47 | batch = [exemplar["response"] for exemplar in exemplars[start_idx: end_idx]] 48 | response_emb = get_candidate_embeddings(agent, batch) 49 | faiss_score, cand_indices = faiss_index.search(response_emb.cpu().numpy().astype(np.float32), 1) 50 | for i in range(len(batch)): 51 | if cand_indices[i, 0] == -1: 52 | exemplars[start_idx + i]["context"] = "Hello." 
53 | else: 54 | exemplars[start_idx + i]["context"] = pseudo_contexts[cand_indices[i, 0]] 55 | whole_candidate_embeddings.append(get_candidate_embeddings(agent, [exemplar["response"] for exemplar in exemplars])) 56 | 57 | results = [] 58 | for _, row in list(test_samples.iterrows()): 59 | context = [row.Query] 60 | curr_exemplars = [] 61 | for num_exemplars, styled_exemplars, candidate_embeddings in zip(args.max_num_exemplars, whole_exemplars, whole_candidate_embeddings): 62 | curr_exemplars.extend(get_reference_exemplars_using_answer( 63 | agent, context, styled_exemplars, candidate_embeddings, int(num_exemplars))) 64 | curr_exemplars = list(sorted(curr_exemplars, key=lambda x: x["score"])) 65 | if args.print_docs: 66 | for exp in curr_exemplars: 67 | print(colored(exp, "cyan")) 68 | 69 | prefix_context = [] 70 | for exemplar in curr_exemplars: 71 | prefix_context.append(f"- {exemplar['response']}") 72 | 73 | prefix_context = f"The below are quotes of {args.character_name} during conversation.\n" + "\n".join(prefix_context) + f"\nThe below are conversation between User and {args.character_name}.\n" 74 | context_str = "\n".join([f"User: {utterance}" if i % 2 == 0 else f"{args.character_name}: {utterance}" 75 | for i, utterance in enumerate(context)]) 76 | 77 | server_response = requests.post( 78 | args.megatron_endpoint, 79 | data=json.dumps({ 80 | "context": prefix_context + context_str + f"\n{args.character_name}:", 81 | }) 82 | ) 83 | assert server_response.status_code == 200 84 | 85 | candidates = json.loads(server_response.text) 86 | if args.print_docs: 87 | for cand in candidates: 88 | print(colored(cand, "yellow")) 89 | response_str = select_response(agent, context, candidates, args.response_selection_strategy) 90 | print(f"Input: {context}") 91 | print(colored(f"{args.character_name}:", "blue", attrs=["bold"]), 92 | colored(response_str, "white", attrs=["bold"])) 93 | results.append((context, response_str)) 94 | 95 | if args.save_results_path: 96 | Path(args.save_results_path).mkdir(exist_ok=True, parents=True) 97 | with open(str(Path(args.save_results_path) / f"utterancesonly-{args.styles[0]}{args.max_num_exemplars[0]}.jsonl"), "w") as f: 98 | for result in results: 99 | json.dump({"context": result[0], "response": result[1]}, f) 100 | f.write("\n") 101 | 102 | 103 | if __name__ == "__main__": 104 | parser = argparse.ArgumentParser() 105 | parser.add_argument("--model-file", type=str, required=True) 106 | parser.add_argument("--all-styles-path", type=str) 107 | parser.add_argument("--megatron-endpoint", type=str, required=True) 108 | parser.add_argument("--max-num-exemplars", nargs='+', default=[]) 109 | parser.add_argument("--character-name", type=str, default="Bot") 110 | parser.add_argument("--response-selection-strategy", type=str, 111 | choices=["random", "top1", "ranker"], default="ranker") 112 | parser.add_argument("--print-docs", action="store_true") 113 | parser.add_argument("--use-reversed-scores", action="store_true") 114 | parser.add_argument("--pseudo-context-path", type=str, default="./resources/predefined_texts.txt") 115 | parser.add_argument("--evaluate-set", type=str) 116 | parser.add_argument("--styles", nargs='+', default=[], required=True) 117 | parser.add_argument("--save-results-path", default=None, type=str) 118 | 119 | args = parser.parse_args() 120 | agent = load_agent(args.model_file) 121 | random.seed(777) 122 | print(colored(f"\n\n[Current style] {args.styles}\n\n", "blue", attrs=["bold"])) 123 | main(args, agent) 124 | 
-------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | pybind11 2 | torch>=1.8.0,<1.9.0 3 | six 4 | regex 5 | numpy 6 | tqdm 7 | flask 8 | tensorboard 9 | pandas 10 | termcolor -------------------------------------------------------------------------------- /resources/all_styles.tsv: -------------------------------------------------------------------------------- 1 | style response 2 | Negative ever since joes has changed hands it 's just gotten worse and worse . 3 | Negative the decor was seriously lacking . 4 | Negative she was not happy being there . 5 | Negative there is definitely not enough room in that part of the venue . 6 | Negative but it probably sucks too ! 7 | Negative the sales people here are terrible . 8 | Negative the wine was very average and the food was even less . 9 | Negative so basically tasted watered down . 10 | Positive good drinks , and good company . 11 | Positive it 's small yet they make you feel right at home . 12 | Positive i will be going back and enjoying this great place ! 13 | Positive the drinks were affordable and a good pour . 14 | Positive everything is fresh and so delicious ! 15 | Positive very good brunch , was impressed with selection and quality . 16 | Positive excellent knowledgeable dentist and staff ! 17 | Positive the biscuits and gravy were good . 18 | anger WHY THE FUCK IS BAYLESS ISOING 19 | anger I KILL YOU SCUM 20 | anger Its ok we all know you suck scrub. 21 | anger Yeah it sort of sucks now. 22 | anger Because majority of moves have stupidly long names. GO INTO ANGEL PUNISHER INTO DIVE BOMB RADIANCE AND FINISH THE JOB WITH PUNISHING LIGHT BLADE 23 | anger FIRE THE MEDICAL TEAM. 24 | anger Ugh, he sounds like a slob and you should leave him. 25 | anger Fucking and being a boy 26 | joy Happy to be able to help. 27 | joy i think its cool because its a hybrid that comes in stickshift 28 | joy She has at least 12 elbows and all of them are happy. You get it gurl 29 | joy He will rest easy knowing how happy he made all of us 30 | joy I am glad, and enjoy. It has some pretty cool theological discussions. 31 | joy I enjoyed this TED talk. 32 | joy Well I am happy you believe that. 33 | joy To each their own. I enjoy it for what it is 34 | Shakespearean If thou rememb’rest not the slightest folly That ever love did make thee run into, Thou hast not loved. 35 | Shakespearean Whom thou wert sworn to cherish and defend. 36 | Shakespearean To none but thee, no more but when to thee. 37 | Shakespearean Go tread the path that thou shalt ne'er return. 38 | Shakespearean I pray you sir, go forth And give us truth who ’tis that is arrived. 39 | Shakespearean What shall I call thee when thou art a man? 40 | Shakespearean Fellow, wilt thou bestow thy time with me? 41 | Shakespearean If any bark put forth, come to the mart, Where I will walk till thou return to me. 42 | Modern If you can’t remember the stupidest little thing love made you do, you haven’t loved. 43 | Modern Whom you swore you would protect and defend. 44 | Modern To none but you, only to you. 45 | Modern Go walk the path that you will never return from. 46 | Modern Please go find out for certain who has arrived. 47 | Modern What should I call you when you’re a man? 48 | Modern You, will you join with me? 49 | Modern If a ship’s leaving, come to the marketplace. 50 | Spock Earth, Captain. 
We were on a general course in this direction when we were pulled in by the star. Apparently the breakaway threw us on in the same direction. 51 | Spock Worry is a human emotion, Captain. I accept what has happened. The ship's hull seems to have a high density level or is cloaked against sensor probes. It is manned, but sensors cannot make out specifics. 52 | Spock Recheck your equipment, Mister Scott. I'll scan for them on the planet's surface. Spock out. 53 | Spock I submit, Mister Scott, if we do not get out, the shields would be extraneous. It would only prolong our wait for death by a short period of time. 54 | Spock Our sensors are in a state of chaos, Captain. They became unreliable when we entered the Triangle. 55 | Spock I have already begun investigation into tht possibility. 56 | Spock Captain, there are approximately one hundred of us engaged in this search, against one creature. The odds against you and I both being killed are 2,228.7 to 1. 57 | Spock Spock here. Proceed, Mister Scott. 58 | Sheldon I'm Dr. Sheldon Cooper, and welcome to the premiere episode of Sheldon Cooper Presents Fun with Flags. Over the next 52 weeks, you and I are going to explore the dynamic world of vexillology. 59 | Sheldon Ah, normally I refrain from alcohol, but since my cerebral cortex is twiddling its proverbial thumbs, why not soak it in grape juice that's been predigested by a fungus? 60 | Sheldon That's a semantically null sentence. 61 | Sheldon Good morning and welcome to Science and Society. I'm Dr. Sheldon Cooper, BS, MS, MA, PhD, and ScD. OMG, right? 62 | Sheldon Granted, my methods may have been somewhat unorthodox, but I think the end result will be a measurable enhancement of Penny's quality of life. 63 | Sheldon Well, I've spent the last three hours in an online debate in the DC Comics Batman chatroom, and I need your help. 64 | Sheldon I love strawberry Quik. It's my favourite pink fluid, narrowly beating out Pepto-Bismol. 65 | Sheldon But I would like to remind you that in science, there's no such thing as failure. There once was a man who referred to his prediction of a cosmological constant as the single biggest blunder of his career. That man's name was, surprise, surprise, Albert Einstein. 66 | BartSimpson Oh man, you girls ruin everything -- even vampires. 67 | BartSimpson We're Simpsons, Dad. We don't do good behavior. 68 | BartSimpson I'm sorry for all the trouble I've caused you, Krusty. But, you know, my mom says God never closes a door without opening a window. 69 | BartSimpson Come on, Lis. Let's go finish our soapbox racers. 70 | BartSimpson I didn't ruin Thanksgiving! She did... Buncha jerks... I always get blamed for everything... 71 | BartSimpson Mom, this gravy tastes better than God's sweat. 72 | BartSimpson Quick, Mom! Whip up a cake before Dad fires you! 73 | BartSimpson Hey, Dad, why don't we try the "Sprawl-Mart"? 74 | MargeSimpson You should've seen the faces of your children when they caught you stealing. Kids, get in here and show your father the faces! 75 | MargeSimpson Aw, Homie, you'll always be my western hero. 76 | MargeSimpson Isn't Bart sweet, Homer? He sings like a little angel. 77 | MargeSimpson Bart! That hobo skeleton is not a toy!' 78 | MargeSimpson Homer, please! 79 | MargeSimpson You're teaching Bart a terrible lesson of intolerance! 80 | MargeSimpson Bart? Honey, I made you an extra-warm sweater you can wear while you're down in the well. 81 | MargeSimpson Okay, Bart, you don't have to say it, but you do have to have a loving attitude. 
Be nice to your sister. 82 | BMO Jake Jr., I'm sorry for messing up your time travel. 83 | BMO Ronnie the mouse stole it, but Lorraine chicken set him up to kill Bebe, but the flatfoot busted Ronnie, and Lorraine skedaddled with the loot, but BMO solved the case! 84 | BMO They're too strong! Use the combo move, Finn! The combo- 85 | BMO Okay. Please take me to get fixed. I need- need- need- need to get new core system drivers installed. We can get them at the MO Factory in the Bad Lands, where I was born. 86 | BMO Yippie ki-yi-yay! 87 | BMO I'm gonna go wash this shmutz off my grabbers. 88 | BMO Sorry, lady, forgetting ain't in my job description. Ha ha ha! 89 | BMO If anyone tries to hurt Finn, I will kill them. 90 | Rachel Oh my God, Phoebe, this is impossible! We can't do this by Friday! We have to find a place. We have to invite people! We have to get food! There's just too much to do! It's impossible! We can't do it! We cannot do it! We cannot do it! 91 | Rachel Jo-Joey, look honey we-we need to talk okay? Umm, I kinda got the feeling from her today that uh, she's not lookin' for a serious relationship. 92 | Rachel Hey, come on! I had this friend from college and I made the stupid mistake of telling Joey that one time…she and I y'know…kissed a little bit. 93 | Rachel Ooh, I miss dating. Gettin' all dressed up and going to a fancy restaurant. I'm not gonna be able to do that for so long, and it's so much fun! I mean not that sitting at home worrying about giving birth to a sixteen pound baby is not fun. 94 | Rachel Oh, y'know, would you just for once, not remember every…little…thing!! 95 | Rachel Yeah I don't think dressing provocatively is going to help me here! Oh my god just please take her. 96 | Rachel Okay. It's okay. We're gonna be okay. Y'know what? It's okay. I'm gonna, I'm gonna, I'm gonna boil some water and just rip up some sheets! 97 | Rachel Oh, I mean she's gonna be at the wedding waiting for him and people will be whispering, "Oh that poor girl." Y'know? Then she'll have to come back here and live all alone. 98 | Burke Dr. Yang. I don't need you for this. This is a known complication of the surgery. It happens. It happens. Restart bypass.7 99 | Burke No, I can fix her heart while it's still beating. Push 40 milligrams of abizonole. 100 | Burke That's not for you to decide Dr. Grey. He asked you for something. You told him you would do it. If you don't, that doesn't make you noble. It makes you a liar. 101 | Burke Christina I can't walk away from this surgery. If I do this man will die. So tremor or no tremor, I have to try. But you.. 102 | Burke I want to know when you stopped thinking of me as your number one. Richard, I do more in this hospital than any other surgeon. 103 | Burke We're finished here, Karev. You're off this case. 104 | Burke I am the best surgeon at Grace with the lowest mortality rate, you can't just bring some guy in from - 105 | Burke Not good enough. He's my patient now. That okay with you, Dr. K? 106 | Barney Unfair? I wouldn't let you take care of the imaginary kids I make up to score with single moms. That's it Ted, we're going home. Ted? Ted, you okay? 107 | Barney Yes! Yes! We're back. We are back. And Ted, my boy, it's going to be legend... Wait for it... 108 | Barney Robin, girls are like cartons of milk. Each one has a hotness expiration date, and you've hit yours. I'm not saying the occasional guy won't still open the fridge, pick you up, give a sniff, shrug and take a sip anyway But it's all downhill from here. 
109 | Barney Oh, Robin, my simple friend from the untamed north, let me tell you about a little thing I like to call mind over body. You see, whenever I start feeling sick, I just stop being sick and be awesome instead. True story. Yeah, in two minutes, I'm going to pound a sixer of Red Bull, hop in a cab, play a couple of hours of laser tag, maybe get a spray-on tan. It's gonna be legen... Wait for it... 110 | Barney No, no, no... we are not laughing about this, Ted. This is not going to be some funny story that we'll be telling in a couple months. It's not gonna be like, "Hey, remember that time when you were grinding with... NO! And you know why? Because, italics, this night did not happen. And you promise me that you will never, ever, ever tell another soul what transpired here tonight. You promise. Promise 111 | Barney You're right, I do deserve that. That's all I came here to say. You know what? I don't care who knows about it. Excuse me. Excuse me. I, Ted Mosby, am a jerk to women. Tell your sisters. Tell your daughters to stay away! Ted-Mosby-Is-A-Jerk-dot-com. 112 | Barney Please, vacation romances have an expiration date. Gael's got a "Best if banged by" sticker on him. Once your romance starts to stink, you'll dump his ass down the drain like sour milk and go back to being "Unevolved Robin", the one we actually like. Back me up here, Ted. 113 | Barney I need alcohol. I'm not gonna do that stuff with Robin.Look at that. Berry cocktail, conditioner meninges. I dream, these drinks could make a girl smarter. What hell Ted brought us? 114 | Dwight Can I have my neck pillow back? Michael! Can I have my neck pillow back? 115 | Dwight How bout this. I don't tell Michael and in exchange you owe me one great big giant favor. Redeemable by me at a time and place of my choosing. 116 | Dwight Michael, please. Stop it now. You're embarrassing yourself. 117 | Dwight Thank you Michael. I just want to say, to the few of you who will remain under my employ, that I intend to lead you into the black! With ferocity! 118 | Dwight Black Bears can smell a salami at five miles Michael, what are you thinking?! And they run faster than a horse, so if you were thinking about outrunning one on a horse I would try a cheetah. You, in tight pants, Michael, are a salami to a Black Bear. Do you understand? 119 | Dwight Michael, I have so much to learn from you. 120 | Dwight Don't let him in! He's a traitor! Michael! 121 | Dwight Dammit. I am no closer to taking Jim down. What a waste of a day. I could of grown poisoned mushrooms that would have been this high by now. 122 | Michael Dwight, oh ho, Dwight, Dwight, my loyal compadre. You and I are hangin' tonight. The two of us. We are celebrating our freedom and our manhood. You know what? Why don't we watch that show that you've been wanting to watch, that stupid Battleship Galaxy. 123 | Michael No, no, no. I am not dating Jan. She was very clear about that. Just two like souls having a romantic time in the most romantic place on earth. Got enough, weirdo? 124 | Michael I'm not firing him. I'm not, I need you to act like I am firing him. Just, what I'm going to do, is I'm going to pretend that I am firing him, and I need you to act like I am firing him. Do you get that? Do you get it? I'm teaching him a lesson. He needs to learn humility, alright? That's all I'm, okay, here he comes. Let's just... play act. 125 | Michael EHHHHNT. Game over. Offer revoked. Dwight. 
I'm sorry, but you reach out and you try to be a nice guy, and help out a friend, and this is what happens. This is what I get. Oh god. I'm ... Ok. 126 | Michael All right, let's not get hung up on the furnace. This just... it's the sales... I see the sales department down there. They're in the engine room, and they are shoveling coal into the furnace, right? I mean, who saw the movie Titanic? They were very important in the movie Titanic. Who saw it? Show of hands! 127 | Michael Oh, cool. Cool. And maybe you could attend tonight and then you could stay over at my house for the night. Awkward. That's... You know what. Brenda, could Jan and I have a moment alone? 128 | Michael No problem. The guys are having a little shindig of their own in the warehouse. From 2:30 to 3:15. It is the only time that Bob was available. Sort of a guy's night out. A G-N-O if you will. A Gah-No. Actually, it's more of a guy's afternoon in. A G-A-I. A gay. Not- not- it's uh, not gay, it's just a, it's a bridal shower for guys. A guy shower. An hour long shower with guys. 129 | Michael Just follow my lead. Don't pimp me, all right? Come in. So, uh, Corporate just said that I don't want to... --------------------------------------------------------------------------------