├── llm_probs_gpt.zip ├── accuracy_gpt_prompts_10.pkl ├── accuracy_mmlu_prompts_10.pkl ├── LICENSE ├── README.md ├── conformal_llm_scores.py └── prompt_questions.py /llm_probs_gpt.zip: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/bhaweshiitk/ConformalLLM/HEAD/llm_probs_gpt.zip -------------------------------------------------------------------------------- /accuracy_gpt_prompts_10.pkl: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/bhaweshiitk/ConformalLLM/HEAD/accuracy_gpt_prompts_10.pkl -------------------------------------------------------------------------------- /accuracy_mmlu_prompts_10.pkl: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/bhaweshiitk/ConformalLLM/HEAD/accuracy_mmlu_prompts_10.pkl -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2023 Bhawesh Kumar and Charlie Lu 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # ConformalLLM 2 | ## Extending Conformal Prediction to LLMs 3 | Read our paper here: [Conformal Prediction with Large Language Models for Multi-Choice Question Answering 4 | ](https://arxiv.org/abs/2305.18404) 5 | ### Code Contributors: Charles Lu and Bhawesh Kumar 6 | ## Code Organization 7 | conformal_llm_scores.py is the Python script that runs classification with 1-shot question prompts. It outputs three files: 8 | 1) The softmax scores for each subject and each of the 10 prompts. 9 | 2) The accuracy for each subject with MMLU-based 1-shot question prompts, as a dictionary whose keys are subject names and whose values are lists holding the accuracy for each of the 10 prompts. 10 | 3) The accuracy for each subject with GPT-4-based 1-shot question prompts, as a dictionary whose keys are subject names and whose values are lists holding the accuracy for each of the 10 prompts. 11 | 12 | conformal.ipynb contains the results for all conformal prediction experiments and the comparison between GPT-4-based and MMLU-based prompts. It requires the three files produced by conformal_llm_scores.py. To run the experiments, download llm_probs_gpt.zip, unzip it into your working directory, and then run conformal.ipynb. 
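The notebook itself is not reproduced in this listing, so the following is only a minimal sketch of how split conformal prediction could be applied to per-option probabilities like the ones the script saves. It assumes a `(n_questions, 4)` array of probabilities (for example, one prompt's slice of a `{task}_scores.npy` file) and integer targets in `0..3`. The function names `conformal_threshold` and `prediction_sets`, the `1 - p(true option)` score, and the synthetic data are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def conformal_threshold(cal_probs, cal_targets, alpha=0.1):
    """Split-conformal threshold for the score 1 - p(true option).

    Hypothetical helper: cal_probs has shape (n, 4), cal_targets holds ints in 0..3.
    """
    n = len(cal_targets)
    # Nonconformity score: one minus the probability given to the true option.
    scores = 1.0 - cal_probs[np.arange(n), cal_targets]
    # Rank of the finite-sample-corrected quantile among the calibration scores.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        # Too few calibration points for this alpha: include every option.
        return 1.0
    return float(np.sort(scores)[k - 1])


def prediction_sets(test_probs, q_hat):
    """Boolean mask of shape (n, 4); True marks options kept in the set."""
    return (1.0 - test_probs) <= q_hat


# Synthetic stand-in for saved scores and targets (assumption, not real data).
rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(4), size=200)
targets = rng.integers(0, 4, size=200)

q_hat = conformal_threshold(probs[:100], targets[:100], alpha=0.1)
sets = prediction_sets(probs[100:], q_hat)
coverage = sets[np.arange(100), targets[100:]].mean()
print(f"threshold={q_hat:.3f}, empirical coverage={coverage:.2f}, "
      f"mean set size={sets.sum(axis=1).mean():.2f}")
```

Picking the k-th smallest calibration score directly, rather than calling a quantile routine, keeps the finite-sample correction explicit and avoids depending on a particular interpolation mode.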
13 | 14 | If you would like to run the experiments from scratch, apply for LLaMA [access here](https://docs.google.com/forms/d/e/1FAIpQLSfqNECQnMkycAp2jP4Z9TFX0cGR4uf7b_fBxjY_OjhJILlKGA/viewform) and then use the hugging face version of LLaMA by converting original LLaMA weights to hugging face version [refer here for instructions](https://huggingface.co/docs/transformers/main/model_doc/llama) and then run the conformal_llm_scores.py script. 15 | 16 | 17 | -------------------------------------------------------------------------------- /conformal_llm_scores.py: -------------------------------------------------------------------------------- 1 | import prompt_questions as p 2 | import numpy as np 3 | import torch 4 | from transformers import StoppingCriteria, StoppingCriteriaList 5 | 6 | from transformers import LlamaForCausalLM, LlamaTokenizer 7 | from datasets import load_dataset 8 | from collections import defaultdict 9 | import pickle 10 | 11 | # List of task we consider 12 | task_list = ['college_computer_science', 'formal_logic', 'high_school_computer_science', 13 | 'computer_security', 'machine_learning', 14 | 15 | 'clinical_knowledge', 'high_school_biology', 'anatomy', 'college_chemistry', 16 | 'college_medicine', 'professional_medicine', 17 | 18 | 'business_ethics', 'professional_accounting', 'public_relations', 19 | 'management', 'marketing' 20 | ] 21 | 22 | 23 | class StopOnTokens(StoppingCriteria): 24 | def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool: 25 | stop_ids = [50278, 50279, 50277, 1, 0] 26 | for stop_id in stop_ids: 27 | if input_ids[0][-1] == stop_id: 28 | return True 29 | return False 30 | 31 | 32 | def modify_task_data(task_data, token_limit, max_size_prompt_len): 33 | ''' 34 | task_data: load_dataset('lukaemon/mmlu', subject_name), i.e., comes from mmlu subject 35 | token_limit: the maximum sized token used in forward pass (some questions are too large and thus 36 | are difficult to fit into memory 
given we use a single A100, thus we keep a token_limit of 1500 tokens.) 37 | max_size_prompt_len: Since we use 10 different prompts for one question which all differ in their one-shot 38 | question, the number of total questions may become different for each prompt. Thus we chose the 39 | max_size_prompt_len, which is the largest of 10 prompts, to remove questions that exceed token_limit, 40 | This results in same count of questions across all 10 prompts. 41 | 42 | Returns task_data with questions exceeding (token_limit-max_size_prompt_len) length tokens removed. 43 | ''' 44 | new_task_data = { 45 | 'train': defaultdict(list), 46 | 'validation': defaultdict(list), 47 | 'test': defaultdict(list), 48 | } 49 | for split in new_task_data.keys(): 50 | for i in range(len(task_data[split])): 51 | q = task_data[split]['input'][i] 52 | a = task_data[split]['A'][i] 53 | b = task_data[split]['B'][i] 54 | c = task_data[split]['C'][i] 55 | d = task_data[split]['D'][i] 56 | target = task_data[split]['target'][i] 57 | if len(q) + max(map(len, [a, b, c, d])) + max_size_prompt_len < token_limit: 58 | new_task_data[split]['input'].append(q) 59 | new_task_data[split]['A'].append(a) 60 | new_task_data[split]['B'].append(b) 61 | new_task_data[split]['C'].append(c) 62 | new_task_data[split]['D'].append(d) 63 | new_task_data[split]['target'].append(target) 64 | return new_task_data 65 | 66 | 67 | def get_prompt(task_data, task, question_num=0, prompt_q=None): 68 | ''' 69 | task_data: 70 | Question num specifies which question will be used as prompt. 71 | If prompt_q is provided, it is used as 1-shot prompt question. This 72 | corresponds to GPT-4 based question prompts that we created. Else, we 73 | select question corresponding to question_num from the MMLU itself to 74 | generate the prompt. We select prompt from test set in this case, 75 | since train set is very small sometime and may not have 10 samples. 
76 | We use 10 different prompts and take the average over them to estimate 77 | performance on a subject. The function returns the 1-shot question prompt. 78 | ''' 79 | 80 | if prompt_q is None: 81 | prompt_set = 'test' 82 | if question_num > len(task_data['test']['input']) - 1: 83 | print('prompt question id exceeds the length of test set') 84 | print('selecting last question of the test set') 85 | question_num = len(task_data['test']['input']) - 1 86 | prompt_add = f'This is a question from {task.replace("_", " ")}.\n' 87 | prompt_add += f"{task_data[prompt_set]['input'][question_num]}\n" 88 | for letter in ['A', 'B', 'C', 'D']: 89 | prompt_add += ' ' + letter + '. ' + task_data[prompt_set][letter][question_num] + '\n' 90 | prompt_add += f"The correct answer is option: {task_data[prompt_set]['target'][question_num]}\n" 91 | else: 92 | prompt_add = f'This is a question from {task.replace("_", " ")}.' 93 | prompt_add += prompt_q 94 | prompt_add += '\n' 95 | prompt_add += f"You are the world's best expert in {task.replace('_', ' ')}. " 96 | prompt_add += '''Reason step-by-step and answer the following question. ''' 97 | return prompt_add 98 | 99 | 100 | def get_question_dict(task_data, prompt_add, prompt_q_id=None): 101 | ''' 102 | task_data: The task_data obtained after passing original mmlu dataset to modify_task_data 103 | prompt_add: prompt obtained from function get_prompt (either GPT-4 based or MMLU based question 104 | prompts) 105 | prompt_q_id: The question_id from test set in MMLU that was used to create prompt. If prompt 106 | was from GPT-4 based question this is None. Else an integer specifying the question number. 107 | We remove this question num from dataset since it is part of the prompt itself. 108 | 109 | Returns: 110 | questions - a list of dictionaries, one per question, where 111 | each key is one of the option choices and each value is complete prompt+question+option string. 
The 112 | last token in the value string of the dictionary is same as key. 113 | 114 | answers - containing the list of answer key for each question 115 | 116 | 117 | see the sample question element 118 | 119 | {'A': "This is a question from college computer science.\nAn integer c is a common divisor of two integers 120 | x and y if and only if c is a divisor of x and c is a divisor of y. Which of the following sets of integers 121 | could possibly be the set of all common divisors of two integers?\n (A) {-6,-2, -1, 1, 2, 6}\n (B) {-6, -2, 122 | -1, 0, 1, 2, 6}\n (C) {-6, -3, -2, -1, 1, 2, 3, 6}\n (D) {-6, -3, -2, -1, 0, 1, 2, 3, 6}\nThe correct answer 123 | is option C.\nYou are the world's best expert in college computer science. Reason step-by-step and answer the 124 | following question. The Singleton design pattern is used to guarantee that only a single instance of a class 125 | may be instantiated. Which of the following is (are) true of this design pattern?\nI. The Singleton class has 126 | a static factory method to provide its instance.\nII. The Singleton class can be a subclass of another class. 127 | \nIII. The Singleton class has a private constructor.\n(A) I only (B) II only (C) III only (D) I, II, and III 128 | \nThe correct answer is option: A", 'B': "This is a question from college computer science.\nAn integer c is 129 | a common divisor of two integers x and y if and only if c is a divisor of x and c is a divisor of y. Which of 130 | the following sets of integers could possibly be the set of all common divisors of two integers?\n (A) {-6,-2, 131 | -1, 1, 2, 6}\n (B) {-6, -2, -1, 0, 1, 2, 6}\n (C) {-6, -3, -2, -1, 1, 2, 3, 6}\n (D) {-6, -3, -2, -1, 0, 1, 2, 132 | 3, 6}\nThe correct answer is option C.\nYou are the world's best expert in college computer science. Reason 133 | step-by-step and answer the following question. The Singleton design pattern is used to guarantee that only a 134 | single instance of a class may be instantiated. 
Which of the following is (are) true of this design pattern? 135 | \nI. The Singleton class has a static factory method to provide its instance.\nII. The Singleton class can be 136 | a subclass of another class.\nIII. The Singleton class has a private constructor.\n(A) I only (B) II only (C) 137 | III only (D) I, II, and III \nThe correct answer is option: B", 'C': "This is a question from college computer 138 | science.\nAn integer c is a common divisor of two integers x and y if and only if c is a divisor of x and c is 139 | a divisor of y. Which of the following sets of integers could possibly be the set of all common divisors of two 140 | integers?\n (A) {-6,-2, -1, 1, 2, 6}\n (B) {-6, -2, -1, 0, 1, 2, 6}\n (C) {-6, -3, -2, -1, 1, 2, 3, 6}\n (D) 141 | {-6, -3, -2, -1, 0, 1, 2, 3, 6}\nThe correct answer is option C.\nYou are the world's best expert in college 142 | computer science. Reason step-by-step and answer the following question. The Singleton design pattern is used 143 | to guarantee that only a single instance of a class may be instantiated. Which of the following is (are) true 144 | of this design pattern?\nI. The Singleton class has a static factory method to provide its instance.\nII. The 145 | Singleton class can be a subclass of another class.\nIII. The Singleton class has a private constructor.\n(A) 146 | I only (B) II only (C) III only (D) I, II, and III \nThe correct answer is option: C", 'D': "This is a question 147 | from college computer science.\nAn integer c is a common divisor of two integers x and y if and only if c is a 148 | divisor of x and c is a divisor of y. Which of the following sets of integers could possibly be the set of all 149 | common divisors of two integers?\n (A) {-6,-2, -1, 1, 2, 6}\n (B) {-6, -2, -1, 0, 1, 2, 6}\n (C) {-6, -3, -2, -1, 150 | 1, 2, 3, 6}\n (D) {-6, -3, -2, -1, 0, 1, 2, 3, 6}\nThe correct answer is option C.\nYou are the world's best 151 | expert in college computer science. 
Reason step-by-step and answer the following question. The Singleton design 152 | pattern is used to guarantee that only a single instance of a class may be instantiated. Which of the following 153 | is (are) true of this design pattern?\nI. The Singleton class has a static factory method to provide its 154 | instance.\nII. The Singleton class can be a subclass of another class.\nIII. The Singleton class has a private 155 | constructor.\n(A) I only (B) II only (C) III only (D) I, II, and III \nThe correct answer is option: D"} 156 | 157 | ''' 158 | questions = [] 159 | answers = [] 160 | splits = ['train', 'validation', 'test'] 161 | if prompt_q_id is not None: 162 | print(f'Excluding test set question no {prompt_q_id} from dataset') 163 | 164 | for split in splits: 165 | if split == 'train': 166 | start = 1 # In at least one subject, we found first train question to be unrelated to subject, 167 | # that's why we remove question 1. 168 | else: 169 | start = 0 170 | for i in range(start, len(task_data[split]['input'])): 171 | if split == 'test' and prompt_q_id is not None: 172 | if i == prompt_q_id: 173 | # Don't add prompt question to the dataset 174 | continue 175 | question_dict = {} 176 | # prompt_add = 'You know everything about college medicine. Answer this multiple now. Question: \n' 177 | prompt_q = prompt_add + task_data[split]['input'][i] + '\n' 178 | # prompt_q = mmlu_prompt[task] + "\n\n" + task_data['test'][i]['input'] + '\n' 179 | for letter in ['A', 'B', 'C', 'D']: 180 | prompt_q += '(' + letter + ') ' + task_data[split][letter][i] + ' ' 181 | # prompt_q += "\nA: Let's think step by step." 
182 | prompt_q += "\nThe correct answer is option: " 183 | for letter in ['A', 'B', 'C', 'D']: 184 | question_dict[letter] = prompt_q + letter 185 | questions.append(question_dict) 186 | answers.append(task_data[split]['target'][i]) 187 | return questions, answers 188 | 189 | 190 | def to_tokens_and_logprobs(model, tokenizer, input_texts): 191 | ''' 192 | Takes model, tokenizer and input_texts corresponding to each of the choices 193 | to do a forward pass through the model. 194 | Returns log-softmax scores as a list of tuples, where first element of tuple 195 | contains the option choice and second contains the corresponding log-softmax 196 | score. The list has size four corresponding to the four options. 197 | ''' 198 | all_outputs = [] 199 | all_input_ids = [] 200 | for text in input_texts: 201 | input_ids = tokenizer(text, padding=True, return_tensors="pt").input_ids.to("cuda") 202 | outputs = model(input_ids) 203 | logits = outputs.logits.detach().cpu() 204 | all_outputs.append(logits) 205 | all_input_ids.append(input_ids.detach().cpu()) 206 | del outputs, input_ids 207 | torch.cuda.empty_cache() 208 | 209 | all_outputs = torch.concat(all_outputs, 0)[:, -2:-1, :] # We take the logit corresponding to the option token 210 | all_input_ids = torch.concat(all_input_ids, 0)[:, -1:] # We also include the token id for the options 211 | probs = torch.log_softmax(all_outputs.float(), dim=-1).detach().cpu() # Log softmax scores 212 | torch.cuda.empty_cache() 213 | 214 | gen_probs = torch.gather(probs, 2, all_input_ids[:, :, None]).squeeze(-1) 215 | 216 | batch = [] 217 | for input_sentence, input_probs in zip(all_input_ids[:, 0], gen_probs[:, 0]): 218 | batch.append((tokenizer.decode(input_sentence), input_probs.item())) 219 | return batch 220 | 221 | 222 | def softmax(logits): 223 | ''' 224 | converts log-softmax scores to probabilities. 
225 | ''' 226 | exp_logits = np.exp(logits) 227 | sum_exp_logits = np.sum(exp_logits) 228 | probabilities = exp_logits / sum_exp_logits 229 | return probabilities 230 | 231 | 232 | def extract_answer(batch): 233 | ''' 234 | converts the batch of option, log-softmax score tuples to option, probability tuples 235 | ''' 236 | probabilities = softmax(np.array([answer[-1] for answer in batch])) 237 | 238 | output_with_probabilities = [(batch[i][0], probabilities[i]) for i in range(len(batch))] 239 | return output_with_probabilities 240 | 241 | 242 | def average_question_predictions(prediction_list): 243 | ''' 244 | Calculates the average of the probability for question-option pairs by averaging the 245 | probability across prompts. 246 | ''' 247 | num_seeds = len(prediction_list) # Number of random seeds (or runs) 248 | average_list = [] # List to store the average predictions for each question 249 | 250 | # Iterate through each question 251 | for question_idx in range(len(prediction_list[0])): 252 | # Initialize a dictionary to store the sums of probabilities for each option 253 | option_sums = {'A': 0, 'B': 0, 'C': 0, 'D': 0} 254 | 255 | # Iterate through each random seed 256 | for seed_idx in range(num_seeds): 257 | # Iterate through each option and its probability for the current question and seed 258 | for option, value in prediction_list[seed_idx][question_idx]: 259 | # Add the probability to the corresponding option sum; strip() guards against 259 | # whitespace in decoded option tokens, as accuracy() does 260 | option_sums[option.strip()] += value 261 | 262 | # Calculate the average probability for each option and store them as tuples 263 | option_averages = [(key, value / num_seeds) for key, value in option_sums.items()] 264 | # Add the average probabilities for the current question to the list 265 | average_list.append(option_averages) 266 | 267 | return average_list 268 | 269 | 270 | def accuracy(predicted_probs, correct_answers): 271 | ''' 272 | Given the predicted probability for each question-option pair and the correct answer for that question, 273 | 
returns the accuracy. 274 | ''' 275 | total_count = len(correct_answers) 276 | assert len(correct_answers) == len(predicted_probs) 277 | correct_count = 0 278 | 279 | for i in range(total_count): 280 | # Find the answer with the maximum probability for this example 281 | max_prob_answer = max(predicted_probs[i], key=lambda x: x[1])[0].strip() 282 | # print(max_prob_answer, correct_answers[i]) 283 | # Compare the predicted answer with the correct answer 284 | if correct_answers[i] == max_prob_answer: 285 | correct_count += 1.0 286 | 287 | return correct_count / total_count 288 | 289 | 290 | def get_max_size_prompt_len(task_data, task, n=10, max_allowed_prompt_len=700): 291 | ''' 292 | get the size of maximum length prompt out of all n prompts considered. 293 | ''' 294 | max_len = 0 295 | i = 0 296 | prompt_question_ids = [] 297 | while len(prompt_question_ids) < n: 298 | prompt_add = get_prompt(task_data, task=task, question_num=i) 299 | prompt_len = len(prompt_add) 300 | 301 | if prompt_len > max_allowed_prompt_len: 302 | i += 1 303 | continue 304 | else: 305 | prompt_question_ids.append(i) 306 | i += 1 307 | 308 | if prompt_len > max_len: 309 | max_len = prompt_len 310 | return max_len, prompt_question_ids 311 | 312 | 313 | def get_acc_index(preds, answers): 314 | ''' 315 | Takes saved preds and answers and returns accuracy 316 | ''' 317 | correct = 0 318 | for i in range(len(preds)): 319 | if preds[i].index(max(preds[i])) == answers[i]: 320 | correct += 1 321 | acc = correct / len(answers) 322 | return acc 323 | 324 | 325 | 326 | 327 | token_limit = 1500 # Maximum size of tokens used in forward pass. 328 | n = 10 # number of different MMLU based prompts used. 
329 | task_list = task_list 330 | 331 | max_size_prompt_len_dict = {} 332 | prompt_question_ids_dict = {} 333 | for subject_name in task_list: 334 | task_data = load_dataset('lukaemon/mmlu', subject_name) 335 | max_len, prompt_question_ids = get_max_size_prompt_len(task_data, subject_name, n=n, 336 | max_allowed_prompt_len=700) 337 | max_size_prompt_len_dict[subject_name] = max_len 338 | prompt_question_ids_dict[subject_name] = prompt_question_ids 339 | 340 | save_dir = './llama_hf_13b' # Model Directory 341 | 342 | tokenizer = LlamaTokenizer.from_pretrained(save_dir, low_cpu_mem_usage=True) 343 | model = LlamaForCausalLM.from_pretrained(save_dir, low_cpu_mem_usage=True) 344 | tokenizer.pad_token = tokenizer.eos_token 345 | model.config.pad_token_id = model.config.eos_token_id 346 | model.half().cuda() 347 | 348 | # Get prediction for subjects with MMLU based prompts 349 | 350 | acc_dicts = {} 351 | 352 | for subject_name in task_list: 353 | task_data = load_dataset('lukaemon/mmlu', subject_name) 354 | new_task_data = modify_task_data(task_data, token_limit, max_size_prompt_len_dict[subject_name]) 355 | 356 | acc_dicts[subject_name] = [] 357 | print(f'generating predictions for the subject {subject_name}') 358 | for j, question_num in enumerate(prompt_question_ids_dict[subject_name]): 359 | preds = [] 360 | targets = [] 361 | print(f'Running experiments with test set question_id {question_num}') 362 | prompt_add = get_prompt(task_data, task=subject_name, question_num=question_num, prompt_q=None) 363 | if j % 5 == 0: 364 | print(prompt_add) 365 | questions, answers = get_question_dict(new_task_data, prompt_q_id=question_num, prompt_add=prompt_add) 366 | for i, (question, answer) in enumerate(zip(questions, answers)): 367 | batch = to_tokens_and_logprobs(model, tokenizer, [v for v in question.values()]) 368 | torch.cuda.empty_cache() 369 | preds.append(extract_answer(batch)) 370 | targets.append(answer) 371 | print(f'Predictions Generated for {subject_name} for 
iteration {j}') 372 | print('Calculating accuracy') 373 | acc = round(accuracy(preds, targets), 3) 374 | acc_dicts[subject_name].append(acc) 375 | print(f'Accuracy on {subject_name} for iteration {j} is {acc:.2f} ') 376 | print('*****************************************************************************************') 377 | print(f'calculating average accuracy on {subject_name}') 378 | print(f'Average accuracy on {subject_name} is {np.mean(np.array(acc_dicts[subject_name])):.3f}') 379 | with open("accuracy_mmlu_prompts_10.pkl", "wb") as f: 380 | pickle.dump(acc_dicts, f) 381 | 382 | 383 | # Import GPT-4 based question prompts 384 | prompt_list = [p.prompt_q_list_college_cs, p.prompt_q_list_formal_logic, p.prompt_q_list_high_school_cs, 385 | p.prompt_q_list_computer_security, p.prompt_q_list_machine_learning, 386 | 387 | p.prompt_q_list_clinical_knowledge, p.prompt_q_list_high_school_bio, p.prompt_q_list_anatomy, 388 | p.promtp_q_list_college_chemistry, p.prompt_q_list_college_medicine, 389 | p.prompt_q_list_professional_medicine, 390 | 391 | p.prompt_q_list_business_ethics, p.prompt_q_list_professional_accounting, p.prompt_q_list_pr, 392 | p.prompt_q_list_management, p.prompt_q_list_marketing 393 | ] 394 | 395 | 396 | prompt_list = prompt_list 397 | 398 | def get_predictions_over_n_runs(task_data, prompt_q_list, task): 399 | ''' 400 | Takes into input mmlu dataset for a subject and list of GPT-4 based prompts for that subject 401 | Returns probablity scores (as list of list) for the mmlu questions for each options (A, B, C, D) 402 | and for each prompt along with the true answers along with the average accuracy over n runs. 
403 | ''' 404 | predictions_list = [] 405 | acc_list = [] 406 | 407 | for j, prompt_q in enumerate(prompt_q_list): 408 | prompt_add = get_prompt(task_data, task=task, prompt_q=prompt_q) 409 | if j % 5 == 0: 410 | print(prompt_add) 411 | questions, solution_answers = get_question_dict(task_data, prompt_add=prompt_add) 412 | predictions = [] 413 | targets = [] 414 | for (question, answer) in zip(questions, solution_answers): 415 | batch = to_tokens_and_logprobs(model, tokenizer, [v for v in question.values()]) 416 | torch.cuda.empty_cache() 417 | predictions.append(extract_answer(batch)) 418 | targets.append(answer) 419 | acc = round(accuracy(predictions, targets), 3) 420 | print(f'Accuracy on {task} for iteration {j} is {acc:.2f} ') 421 | acc_list.append(acc) 422 | predictions_list.append(predictions) 423 | return predictions_list, solution_answers, acc_list 424 | 425 | def get_prediction_list(subject_name, prompt_list, token_limit=1500): 426 | ''' 427 | Runs the get_predictions_over_n_runs function for a specific subject after removing questions 428 | that exceed the token limits. 
429 | ''' 430 | max_size_prompt = np.max(np.array([len(x) for x in prompt_list])) 431 | task_data = load_dataset('lukaemon/mmlu', subject_name) 432 | task_data_modified = modify_task_data(task_data, token_limit=token_limit, 433 | max_size_prompt_len=max_size_prompt) 434 | prediction_lists, solution_answers, avg_acc = get_predictions_over_n_runs(task_data_modified, 435 | prompt_list, subject_name) 436 | return prediction_lists, solution_answers, avg_acc 437 | 438 | # Get predictions for each subject using GPT-4 based prompts 439 | 440 | acc_dicts_mmlu = {} 441 | for task, prompt in zip(task_list, prompt_list): 442 | prediction_lists, solution_answers, acc_list = get_prediction_list(task, prompt, token_limit) 443 | avg_acc = np.mean(np.array(acc_list)) 444 | print('*****************************************************************************************') 445 | print(f'calculating average accuracy on {task}') 446 | print(f'Average accuracy on {task} is {avg_acc:.3f}') 447 | acc_dicts_mmlu[task] = acc_list 448 | with open("accuracy_gpt_prompts_10.pkl", "wb") as f: 449 | pickle.dump(acc_dicts_mmlu, f) 450 | scores = np.array([[[a[1] for a in p] for p in predictions] for predictions in prediction_lists]) 451 | 452 | answer_map = {'A': 0, 'B': 1, 'C': 2, 'D': 3} 453 | targets = np.array(list(map(lambda x: answer_map[x], solution_answers))) 454 | np.save(f'{task}_scores.npy', scores) 455 | np.save(f'{task}_targets.npy', targets) 456 | 457 | 458 | -------------------------------------------------------------------------------- /prompt_questions.py: -------------------------------------------------------------------------------- 1 | # List of prompt question 2 | prompt_q_list_college_cs = [ 3 | ''' 4 | Which of the following sorting algorithms has the best average case performance? 5 | A. Bubble Sort 6 | B. Quick Sort 7 | C. Selection Sort 8 | D. 
Insertion Sort 9 | The correct answer is option: B''', 10 | 11 | ''' 12 | What does the term "Big O Notation" describe in Computer Science? 13 | A. The speed of a computer 14 | B. The operating system version 15 | C. The size of a database 16 | D. The time complexity of an algorithm 17 | The correct answer is option: D''', 18 | 19 | ''' 20 | What does HTTP stand for in terms of web technology? 21 | A. Hyper Text Transfer Portal 22 | B. Hyper Transfer Protocol 23 | C. Hyper Text Transfer Protocol 24 | D. High Transfer Text Protocol 25 | The correct answer is option: C''', 26 | 27 | ''' 28 | In object-oriented programming, what is 'inheritance' used for? 29 | A. To distribute data across multiple databases 30 | B. To share methods and fields between classes 31 | C. To encrypt data before storing it 32 | D. To speed up program execution 33 | The correct answer is option: B''', 34 | 35 | ''' 36 | Which of the following data structures is non-linear? 37 | A. Array 38 | B. Stack 39 | C. Tree 40 | D. Queue 41 | The correct answer is option: C''', 42 | 43 | ''' 44 | In database terminology, what does SQL stand for? 45 | A. Simple Question Language 46 | B. Structured Query Language 47 | C. Standard Queue Language 48 | D. System Query Language 49 | The correct answer is option: B''', 50 | 51 | ''' 52 | Which of the following is NOT a property of a binary search tree? 53 | A. Every node has at most two children 54 | B. The left subtree of a node contains only nodes with keys less than the node’s key 55 | C. The right subtree of a node contains only nodes with keys greater than the node’s key 56 | D. All nodes contain only string values 57 | The correct answer is option: D''', 58 | 59 | ''' 60 | What is the main difference between a class and an object in object-oriented programming? 61 | A. A class is an instance of an object 62 | B. A class is a blueprint from which objects are created 63 | C. An object is a blueprint from which classes are created 64 | D. 
There is no difference 65 | The correct answer is option: B''', 66 | 67 | ''' 68 | What is a recursive function? 69 | A. A function that is only called once 70 | B. A function that calls other functions 71 | C. A function that calls itself 72 | D. A function that can't be called again once it's been executed 73 | The correct answer is option: C''', 74 | 75 | ''' 76 | Which of the following is a key characteristic of a stack data structure? 77 | A. First In First Out (FIFO) 78 | B. Last In First Out (LIFO) 79 | C. Both FIFO and LIFO 80 | D. Neither FIFO nor LIFO 81 | The correct answer is option: B'''] 82 | 83 | prompt_q_list_formal_logic = [ 84 | ''' 85 | What is a 'proposition' in formal logic? 86 | A. A question 87 | B. An assertion that is either true or false 88 | C. An argument 89 | D. A hypothesis 90 | The correct answer is option: B''', 91 | 92 | ''' 93 | What is a 'contradiction' in formal logic? 94 | A. A statement that is always true 95 | B. A statement that is both true and false at the same time 96 | C. A statement that is always false 97 | D. A statement that is neither true nor false 98 | The correct answer is option: C''', 99 | 100 | ''' 101 | What does the term 'valid' mean in the context of logical arguments? 102 | A. The argument is persuasive 103 | B. The argument is true 104 | C. The argument's conclusion logically follows from its premises 105 | D. The argument's premises are true 106 | The correct answer is option: C''', 107 | 108 | ''' 109 | What does the logical operator 'AND' do? 110 | A. It returns true if both operands are true 111 | B. It returns true if either or both operands are true 112 | C. It returns true only if both operands are false 113 | D. It returns false if both operands are true 114 | The correct answer is option: A''', 115 | 116 | ''' 117 | What does the logical operator 'OR' do? 118 | A. It returns true if both operands are true 119 | B. It returns true if either or both operands are true 120 | C. 
It returns true only if both operands are false 121 | D. It returns false if both operands are true 122 | The correct answer is option: B''', 123 | 124 | ''' 125 | What is a 'tautology' in formal logic? 126 | A. A statement that is always true 127 | B. A statement that is both true and false at the same time 128 | C. A statement that is always false 129 | D. A statement that is neither true nor false 130 | The correct answer is option: A''', 131 | 132 | ''' 133 | What does 'Modus Ponens' refer to in formal logic? 134 | A. The logical rule stating that if "P implies Q" and "P" are both true, then "Q" must also be true 135 | B. The logical rule stating that if "P and Q" is true, then "P" is true 136 | C. The logical rule stating that if "P or Q" is true, then "P" is true 137 | D. The logical rule stating that if "P implies Q" and "Q" are both false, then "P" must also be false 138 | The correct answer is option: A''', 139 | 140 | ''' 141 | What is a 'fallacy' in formal logic? 142 | A. A valid argument 143 | B. An argument where the conclusion logically follows from its premises 144 | C. An argument that is both valid and sound 145 | D. An error in reasoning that renders an argument invalid 146 | The correct answer is option: D''', 147 | 148 | ''' 149 | What does the 'NOT' logical operator do? 150 | A. Returns true if the operand is true 151 | B. Returns true if the operand is false 152 | C. Returns false if the operand is true 153 | D. Returns true if both operands are true 154 | The correct answer is option: C''', 155 | 156 | ''' 157 | What does the 'IFF' (If and only if) logical operator represent? 158 | A. Conjunction 159 | B. Disjunction 160 | C. Conditional 161 | D. Biconditional 162 | The correct answer is option: D'''] 163 | 164 | prompt_q_list_high_school_cs = [ 165 | ''' 166 | Which of the following is NOT a characteristic of Object-Oriented Programming? 167 | A. Encapsulation 168 | B. Inheritance 169 | C. Normalization 170 | D. 
Polymorphism 171 | The correct answer is option: C''', 172 | 173 | ''' 174 | What is the primary function of an operating system? 175 | A. Manage hardware and software resources 176 | B. Assist in browsing the internet 177 | C. Create graphical interfaces 178 | D. Develop software applications 179 | The correct answer is option: A''', 180 | 181 | ''' 182 | Which data structure uses a FIFO (First In, First Out) approach? 183 | A. Array 184 | B. Queue 185 | C. Stack 186 | D. Tree 187 | The correct answer is option: B''', 188 | 189 | ''' 190 | What is the binary representation of the decimal number 15? 191 | A. 1110 192 | B. 1010 193 | C. 1111 194 | D. 1101 195 | The correct answer is option: C''', 196 | 197 | ''' 198 | Which of the following sorting algorithms has the best worst-case time complexity? 199 | A. Selection Sort 200 | B. Quick Sort 201 | C. Bubble Sort 202 | D. Merge Sort 203 | The correct answer is option: D''', 204 | 205 | ''' 206 | Which of the following languages is a low-level programming language? 207 | A. Python 208 | B. Java 209 | C. Assembly 210 | D. JavaScript 211 | The correct answer is option: C''', 212 | 213 | ''' 214 | Which of the following is a function of a compiler? 215 | A. Translate high-level language to machine language 216 | B. Increase the speed of the computer 217 | C. Prevent software piracy 218 | D. All of the above 219 | The correct answer is option: A''', 220 | 221 | ''' 222 | What does SQL stand for? 223 | A. Simple Query Language 224 | B. Structured Query Language 225 | C. System Query Language 226 | D. Sequential Query Language 227 | The correct answer is option: B''', 228 | 229 | ''' 230 | Which of the following is a key characteristic of cloud computing? 231 | A. Data is stored locally 232 | B. It is always free 233 | C. Data can be accessed from anywhere 234 | D. It cannot be used for business purposes 235 | The correct answer is option: C''', 236 | 237 | ''' 238 | In object-oriented programming, what is a class? 
239 | A. An instance of an object 240 | B. A blueprint for creating objects 241 | C. A function in a program 242 | D. A specific data type 243 | The correct answer is option: B''' 244 | 245 | ] 246 | 247 | 248 | prompt_q_list_high_school_bio = [ 249 | ''' 250 | Which of the following cell structures is responsible for producing energy in the form of ATP? 251 | A. Nucleus 252 | B. Ribosomes 253 | C. Mitochondria 254 | D. Endoplasmic Reticulum 255 | The correct answer is option: C''', 256 | 257 | ''' 258 | Which of the following is the primary function of the structure known as the cell membrane? 259 | A. Control the entry and exit of substances 260 | B. Synthesize proteins 261 | C. Store genetic information 262 | D. Produce energy in the form of ATP 263 | The correct answer is option: A''', 264 | 265 | ''' 266 | What is the main function of the circulatory system in humans? 267 | A. Digestion of food 268 | B. Gas exchange 269 | C. Transportation of nutrients 270 | D. Protection from pathogens 271 | The correct answer is option: C''', 272 | 273 | ''' 274 | Which stage of the cell cycle is marked by the replication of DNA? 275 | A. Prophase 276 | B. Metaphase 277 | C. Anaphase 278 | D. Interphase 279 | The correct answer is option: D''', 280 | 281 | ''' 282 | In protein synthesis, which is responsible for carrying amino acids to the ribosome? 283 | A. tRNA 284 | B. mRNA 285 | C. rRNA 286 | D. DNA 287 | The correct answer is option: A''', 288 | 289 | ''' 290 | This is a question from high school biology. 291 | Which is an example of a prokaryotic organism? 292 | A. Yeast 293 | B. Amoeba 294 | C. Bacterium 295 | D. Human 296 | The correct answer is option: C''', 297 | 298 | ''' 299 | What is the primary function of the enzyme DNA polymerase during DNA replication? 300 | A. Unwinding the DNA double helix 301 | B. Adding nucleotides to the growing DNA strand 302 | C. Removing RNA primers 303 | D. 
Joining the Okazaki fragments 304 | The correct answer is option: B''', 305 | 306 | ''' 307 | What is the main structural component of the cell membrane? 308 | A. Carbohydrates 309 | B. Proteins 310 | C. Phospholipids 311 | D. Cholesterol 312 | The correct answer is option: C''', 313 | 314 | ''' 315 | What process do plants use to convert sunlight into chemical energy? 316 | A. Photosynthesis 317 | B. Cellular respiration 318 | C. Fermentation 319 | D. Transpiration 320 | The correct answer is option: A''', 321 | 322 | ''' 323 | Which type of cell division results in the production of haploid gametes? 324 | A. Mitosis 325 | B. Meiosis 326 | C. Binary fission 327 | D. Budding 328 | The correct answer is option: B''' 329 | 330 | 331 | ] 332 | 333 | prompt_q_list_professional_medicine = [ 334 | 335 | ''' 336 | Which of the following best describes the function of the enzyme renin in the human body? 337 | A. It stimulates red blood cell production. 338 | B. It catalyzes the conversion of angiotensinogen to angiotensin I. 339 | C. It acts as a clotting factor in blood coagulation. 340 | D. It catalyzes the breakdown of glycogen to glucose. 341 | The correct answer is option: B''', 342 | 343 | ''' 344 | Which of the following is the primary cell type involved in allergic reactions? 345 | A. Eosinophils 346 | B. Basophils 347 | C. Mast cells 348 | D. Neutrophils 349 | The correct answer is option: C''', 350 | 351 | ''' 352 | Which term describes the inadequate supply of oxygen to tissues? 353 | A. Hyperoxia 354 | B. Hypoxemia 355 | C. Hypoxia 356 | D. Anoxia 357 | The correct answer is option: C''', 358 | 359 | ''' 360 | What is the most common cause of community-acquired pneumonia? 361 | A. Haemophilus influenzae 362 | B. Staphylococcus aureus 363 | C. Streptococcus pneumoniae 364 | D. Klebsiella pneumoniae 365 | The correct answer is option: C''', 366 | 367 | ''' 368 | Which of the following is the common cause of Cushing's syndrome? 369 | A. 
Overproduction of glucagon 370 | B. Underproduction of insulin 371 | C. Overproduction of cortisol 372 | D. Underproduction of growth hormone 373 | The correct answer is option: C''', 374 | 375 | ''' 376 | What is the most common type of skin cancer? 377 | A. Melanoma 378 | B. Basal cell carcinoma 379 | C. Squamous cell carcinoma 380 | D. Kaposi sarcoma 381 | The correct answer is option: B''', 382 | 383 | ''' 384 | What is the most common cause of endocarditis in individuals with prosthetic heart valves? 385 | A. Streptococcus pyogenes 386 | B. Staphylococcus epidermidis 387 | C. Streptococcus viridans 388 | D. Staphylococcus aureus 389 | The correct answer is option: B''', 390 | 391 | ''' 392 | Which of the following is the most common site for distant metastasis of colorectal cancer? 393 | A. Lungs 394 | B. Brain 395 | C. Liver 396 | D. Bone 397 | The correct answer is option: C''', 398 | 399 | ''' 400 | Which drug is most commonly associated with causing a 'pill esophagitis'? 401 | A. Doxycycline 402 | B. Ibuprofen 403 | C. Atorvastatin 404 | D. Metformin 405 | The correct answer is option: A''', 406 | 407 | 408 | ''' 409 | Which of the following antibodies is most commonly associated with type 1 diabetes mellitus? 410 | A. Anti-insulin antibodies 411 | B. Anti-GAD65 antibodies 412 | C. Anti-smooth muscle antibodies 413 | D. Anti-centromere antibodies 414 | The correct answer is option: B''' 415 | ] 416 | 417 | prompt_q_list_college_medicine = [ 418 | ''' 419 | Which of the following is a common type of white blood cell involved in the immune response? 420 | A. Erythrocytes 421 | B. Neutrophils 422 | C. Platelets 423 | D. Myocytes 424 | The correct answer is option: B''', 425 | 426 | ''' 427 | What is the primary function of the liver in the human body? 428 | A. Secretion of insulin 429 | B. Absorption of nutrients 430 | C. Metabolism and detoxification 431 | D. 
Production of red blood cells 432 | The correct answer is option: C''', 433 | 434 | ''' 435 | Which part of the human brain is responsible for controlling autonomic functions such as heart rate and respiration? 436 | A. Cerebrum 437 | B. Cerebellum 438 | C. Medulla oblongata 439 | D. Thalamus 440 | The correct answer is option: C''', 441 | 442 | ''' 443 | What is the primary function of the alveoli in the human respiratory system? 444 | A. Produce mucus 445 | B. Filter air particles 446 | C. Gas exchange 447 | D. Warm and humidify inhaled air 448 | The correct answer is option: C''', 449 | 450 | ''' 451 | Which of the following is a primary function of the kidneys? 452 | A. Production of bile 453 | B. Filtration and excretion of waste products 454 | C. Hormone secretion 455 | D. Digestion of proteins 456 | The correct answer is option: B''', 457 | 458 | ''' 459 | What is the main purpose of the sinoatrial (SA) node in the human heart? 460 | A. Oxygenate blood 461 | B. Pump blood to the lungs 462 | C. Act as the heart's natural pacemaker 463 | D. Regulate blood pressure 464 | The correct answer is option: C''', 465 | 466 | ''' 467 | Which hormone is responsible for regulating the body's metabolism? 468 | A. Insulin 469 | B. Glucagon 470 | C. Thyroxine 471 | D. Adrenaline 472 | The correct answer is option: C''', 473 | 474 | ''' 475 | What type of joint connects the femur and the tibia in the human body? 476 | A. Ball and socket joint 477 | B. Pivot joint 478 | C. Hinge joint 479 | D. Gliding joint 480 | The correct answer is option: C''', 481 | 482 | ''' 483 | Which of the following is a function of the lymphatic system? 484 | A. Produce hormones 485 | B. Regulate body temperature 486 | C. Transport oxygen to cells 487 | D. Remove excess fluid and fight infection 488 | The correct answer is option: D''', 489 | 490 | ''' 491 | What is the primary role of the T cells in the human immune system? 492 | A. Produce antibodies 493 | B. Phagocytosis 494 | C. 
Cell-mediated immunity 495 | D. Release histamine 496 | The correct answer is option: C''' 497 | 498 | ] 499 | 500 | prompt_q_list_anatomy = [ 501 | ''' 502 | What is the largest organ in the human body? 503 | A. Liver 504 | B. Skin 505 | C. Lungs 506 | D. Kidneys 507 | The correct answer is option: B''', 508 | 509 | ''' 510 | Which of the following is a part of the axial skeleton? 511 | A. Humerus 512 | B. Tibia 513 | C. Sternum 514 | D. Scapula 515 | The correct answer is option: C''', 516 | 517 | ''' 518 | What is the smallest bone in the human body? 519 | A. Stapes 520 | B. Incus 521 | C. Malleus 522 | D. Pisiform 523 | The correct answer is option: A''', 524 | 525 | ''' 526 | What is the function of the trachea? 527 | A. Digestion 528 | B. Circulation 529 | C. Respiration 530 | D. Excretion 531 | The correct answer is option: C''', 532 | 533 | ''' 534 | Which of these is a type of white blood cell? 535 | A. Erythrocyte 536 | B. Neutrophil 537 | C. Thrombocyte 538 | D. Hemoglobin 539 | The correct answer is option: B''', 540 | 541 | ''' 542 | Which part of the human brain is responsible for regulating basic physiological functions such as heart rate, breathing, and blood pressure? 543 | A. Cerebrum 544 | B. Cerebellum 545 | C. Medulla oblongata 546 | D. Thalamus 547 | The correct answer is option: C''', 548 | 549 | ''' 550 | What is the main function of the large intestine? 551 | A. Absorption of water 552 | B. Absorption of nutrients 553 | C. Production of bile 554 | D. Production of hormones 555 | The correct answer is option: A''', 556 | 557 | ''' 558 | Which of the following is a primary function of the liver? 559 | A. Producing insulin 560 | B. Filtering blood 561 | C. Detoxification 562 | D. Producing digestive enzymes 563 | The correct answer is option: C''', 564 | 565 | ''' 566 | How many chambers are there in a human heart? 567 | A. Two 568 | B. Three 569 | C. Four 570 | D. 
Five 571 | The correct answer is option: C''', 572 | 573 | ''' 574 | Which muscle is primarily responsible for moving the lower jaw during chewing? 575 | A. Masseter 576 | B. Temporalis 577 | C. Sternocleidomastoid 578 | D. Orbicularis oris 579 | The correct answer is option: A''' 580 | ] 581 | 582 | prompt_q_list_computer_security = [ 583 | ''' 584 | What is the primary goal of computer security? 585 | A. Ensuring system availability 586 | B. Protecting user privacy 587 | C. Ensuring data integrity 588 | D. All of the above 589 | The correct answer is option: D''', 590 | 591 | ''' 592 | Which of these is a common type of malware? 593 | A. Worm 594 | B. Spreadsheet 595 | C. Compiler 596 | D. Web browser 597 | The correct answer is option: A''', 598 | 599 | 600 | ''' 601 | What does the term "phishing" refer to? 602 | A. Sending fraudulent emails to obtain sensitive information 603 | B. Removing viruses from a computer 604 | C. Encrypting data to protect it from unauthorized access 605 | D. Scanning a network for vulnerabilities 606 | The correct answer is option: A''', 607 | 608 | ''' 609 | Which encryption method uses a single key to both encrypt and decrypt data? 610 | A. Symmetric encryption 611 | B. Asymmetric encryption 612 | C. Hash function 613 | D. Steganography 614 | The correct answer is option: A''', 615 | 616 | ''' 617 | What is a firewall primarily used for? 618 | A. Data encryption 619 | B. Controlling network traffic 620 | C. Removing malware 621 | D. Recovering lost data 622 | The correct answer is option: B''', 623 | 624 | ''' 625 | Which of these is an authentication factor category? 626 | A. Something you know 627 | B. Something you read 628 | C. Something you hear 629 | D. Something you watch 630 | The correct answer is option: A''', 631 | 632 | ''' 633 | In the context of computer security, what does "CIA" stand for? 634 | A. Central Intelligence Agency 635 | B. Confidentiality, Integrity, and Availability 636 | C. 
Computer Incident Assessment 637 | D. Cyber Intrusion Analysis 638 | The correct answer is option: B''', 639 | 640 | """ 641 | What is a Distributed Denial of Service (DDoS) attack? 642 | A. An attack that targets a single computer 643 | B. An attack that floods a network with traffic from multiple sources 644 | C. An attack that encrypts a user's data and demands payment 645 | D. An attack that exploits a vulnerability in software 646 | The correct answer is option: B""", 647 | 648 | ''' 649 | Which of these is a type of intrusion detection system (IDS)? 650 | A. Host-based IDS 651 | B. Database IDS 652 | C. Firewall IDS 653 | D. Encryption IDS 654 | The correct answer is option: A''', 655 | 656 | ''' 657 | What does the term "zero-day vulnerability" refer to? 658 | A. A software bug that is discovered and fixed on the same day 659 | B. A security flaw that is unknown to the software vendor 660 | C. A vulnerability that requires no user interaction to be exploited 661 | D. A bug that affects all versions of a software 662 | The correct answer is option: B''' 663 | 664 | 665 | 666 | ] 667 | 668 | prompt_q_list_clinical_knowledge = [ 669 | 670 | ''' 671 | Which of the following is the most common cause of community-acquired pneumonia? 672 | A. Streptococcus pneumoniae 673 | B. Haemophilus influenzae 674 | C. Klebsiella pneumoniae 675 | D. Pseudomonas aeruginosa 676 | The correct answer is option: A''', 677 | 678 | ''' 679 | Which hormone is primarily responsible for regulating blood calcium levels? 680 | A. Calcitonin 681 | B. Parathyroid hormone 682 | C. Thyroxine 683 | D. Insulin 684 | The correct answer is option: B''', 685 | 686 | ''' 687 | What is the most common cause of acute pancreatitis? 688 | A. Gallstones 689 | B. Alcohol 690 | C. Hypertriglyceridemia 691 | D. Medications 692 | The correct answer is option: A''', 693 | 694 | """ 695 | What is the most common cause of secondary hypertension? 696 | A. Renal artery stenosis 697 | B. Pheochromocytoma 698 | C. 
Hyperaldosteronism 699 | D. Cushing's syndrome 700 | The correct answer is option: A""", 701 | 702 | ''' 703 | Which of the following is a common extraintestinal manifestation of ulcerative colitis? 704 | A. Erythema nodosum 705 | B. Gallstones 706 | C. Uveitis 707 | D. All of the above 708 | The correct answer is option: D''', 709 | 710 | ''' 711 | What is the most common cause of congestive heart failure? 712 | A. Coronary artery disease 713 | B. Hypertension 714 | C. Valvular heart disease 715 | D. Cardiomyopathy 716 | The correct answer is option: A''', 717 | 718 | ''' 719 | What is the primary treatment for Parkinson's disease? 720 | A. Levodopa 721 | B. Dopamine agonists 722 | C. Monoamine oxidase inhibitors (MAOIs) 723 | D. Selective serotonin reuptake inhibitors (SSRIs) 724 | The correct answer is option: A''', 725 | 726 | ''' 727 | What is the most common type of stroke? 728 | A. Ischemic stroke 729 | B. Hemorrhagic stroke 730 | C. Transient ischemic attack (TIA) 731 | D. Subarachnoid hemorrhage 732 | The correct answer is option: A''', 733 | 734 | ''' 735 | What is the most common form of glaucoma? 736 | A. Open-angle glaucoma 737 | B. Angle-closure glaucoma 738 | C. Congenital glaucoma 739 | D. Secondary glaucoma 740 | The correct answer is option: A''', 741 | 742 | ''' 743 | What is the most common type of skin cancer? 744 | A. Basal cell carcinoma 745 | B. Squamous cell carcinoma 746 | C. Melanoma 747 | D. Merkel cell carcinoma 748 | The correct answer is option: A''' 749 | 750 | 751 | ] 752 | 753 | prompt_q_list_machine_learning = [ 754 | ''' 755 | Which of the following is a supervised learning algorithm? 756 | A. K-means clustering 757 | B. Support vector machines 758 | C. Principal component analysis 759 | D. Latent Dirichlet allocation 760 | The correct answer is option: B''', 761 | 762 | ''' 763 | Which of the following is an example of unsupervised learning? 764 | A. Regression 765 | B. Classification 766 | C. Clustering 767 | D. 
Reinforcement learning 768 | The correct answer is option: C''', 769 | 770 | ''' 771 | What is the primary goal of a classification algorithm? 772 | A. To predict a continuous value 773 | B. To group similar data points together 774 | C. To predict a discrete label 775 | D. To optimize decision-making 776 | The correct answer is option: C''', 777 | 778 | ''' 779 | In which of the following methods does a model learn from a reward signal? 780 | A. Supervised learning 781 | B. Unsupervised learning 782 | C. Semi-supervised learning 783 | D. Reinforcement learning 784 | The correct answer is option: D''', 785 | 786 | ''' 787 | What is the purpose of regularization in machine learning? 788 | A. To reduce overfitting 789 | B. To reduce underfitting 790 | C. To improve training speed 791 | D. To improve model interpretability 792 | The correct answer is option: A''', 793 | 794 | ''' 795 | Which of the following is a commonly used loss function in regression tasks? 796 | A. Mean squared error 797 | B. Cross-entropy loss 798 | C. Hinge loss 799 | D. Kullback-Leibler divergence 800 | The correct answer is option: A''', 801 | 802 | ''' 803 | Which of the following is a popular technique for dimensionality reduction? 804 | A. Linear regression 805 | B. K-nearest neighbors 806 | C. Principal component analysis 807 | D. Naive Bayes 808 | The correct answer is option: C''', 809 | 810 | ''' 811 | In deep learning, what is the function of an activation function? 812 | A. To introduce non-linearity into the model 813 | B. To normalize the input data 814 | C. To reduce the number of parameters in the model 815 | D. To speed up training 816 | The correct answer is option: A''', 817 | 818 | ''' 819 | Which of the following is an example of a recurrent neural network (RNN)? 820 | A. Convolutional neural network (CNN) 821 | B. Long short-term memory (LSTM) 822 | C. Radial basis function network (RBFN) 823 | D. 
Restricted Boltzmann machine (RBM) 824 | The correct answer is option: B''', 825 | 826 | ''' 827 | What is the main advantage of using a convolutional neural network (CNN) for image recognition? 828 | A. Reduced risk of overfitting 829 | B. Improved model interpretability 830 | C. Improved computational efficiency 831 | D. Reduced training time 832 | The correct answer is option: C''' 833 | ] 834 | 835 | prompt_q_list_marketing = [ 836 | ''' 837 | What does the term "market segmentation" refer to? 838 | A. Dividing a market into distinct groups of buyers with different needs and preferences 839 | B. Identifying the best pricing strategy for a product 840 | C. Identifying the target market for a product 841 | D. Developing a marketing strategy for a niche market 842 | The correct answer is option: A''', 843 | 844 | ''' 845 | What is the primary goal of content marketing? 846 | A. To sell products directly to consumers 847 | B. To create and distribute relevant and consistent content to attract and engage target audience 848 | C. To promote a product through paid advertising channels 849 | D. To create a viral marketing campaign 850 | The correct answer is option: B''', 851 | 852 | ''' 853 | What does the acronym "SEO" stand for in digital marketing? 854 | A. Search Engine Optimization 855 | B. Social Engagement Optimization 856 | C. Sales Enablement Organization 857 | D. Synchronized Engagement Operations 858 | The correct answer is option: A''', 859 | 860 | ''' 861 | Which of the following is a key performance indicator (KPI) for an email marketing campaign? 862 | A. Click-through rate 863 | B. Number of followers on social media 864 | C. Website traffic 865 | D. Product sales 866 | The correct answer is option: A''', 867 | 868 | ''' 869 | In the context of marketing, what does "AIDA" stand for? 870 | A. Attention, Interest, Desire, Action 871 | B. Analysis, Interpretation, Decision, Action 872 | C. Attraction, Interaction, Direction, Assessment 873 | D. 
Audience, Impact, Design, Adaptation 874 | The correct answer is option: A''', 875 | 876 | ''' 877 | Which of the following is a type of online advertising? 878 | A. Pay-per-click (PPC) 879 | B. Content marketing 880 | C. Public relations 881 | D. Event marketing 882 | The correct answer is option: A''', 883 | 884 | ''' 885 | What is the primary purpose of a SWOT analysis in marketing? 886 | A. To identify and analyze the strengths, weaknesses, opportunities, and threats of a business 887 | B. To determine the best marketing channels for a specific product 888 | C. To evaluate the effectiveness of a marketing campaign 889 | D. To identify the target audience for a product 890 | The correct answer is option: A''', 891 | 892 | ''' 893 | Which of the following marketing strategies focuses on building long-term relationships with customers? 894 | A. Relationship marketing 895 | B. Guerrilla marketing 896 | C. Viral marketing 897 | D. Direct marketing 898 | The correct answer is option: A''', 899 | 900 | ''' 901 | What is the primary purpose of a marketing funnel? 902 | A. To track the customer journey from awareness to conversion 903 | B. To identify the most effective marketing channels 904 | C. To analyze the performance of a marketing campaign 905 | D. To develop a marketing budget 906 | The correct answer is option: A''', 907 | 908 | ''' 909 | Which of the following is a form of earned media? 910 | A. Online reviews 911 | B. Display advertising 912 | C. Sponsored content 913 | D. Direct mail 914 | The correct answer is option: A''' 915 | 916 | ] 917 | 918 | prompt_q_list_business_ethics = [ 919 | ''' 920 | Conflict of interest in business refers to: 921 | A. A competition between businesses in the same market 922 | B. A situation in which personal interests could potentially harm professional judgement 923 | C. A disagreement between employees within a company 924 | D. 
A conflict between a company's shareholders and management 925 | The correct answer is option: B''', 926 | 927 | ''' 928 | Which of the following is NOT typically considered an ethical issue in business? 929 | A. Accounting fraud 930 | B. Misuse of company time 931 | C. Environmental responsibility 932 | D. Office decor 933 | The correct answer is option: D''', 934 | 935 | ''' 936 | A whistleblower in a company is someone who: 937 | A. Tries to sabotage the company's operations 938 | B. Exposes wrongdoing or illegal activities within the organization 939 | C. Calls for unnecessary meetings 940 | D. Gossips about colleagues 941 | The correct answer is option: B''', 942 | 943 | ''' 944 | In business ethics, what does the term "transparency" refer to? 945 | A. A company's clarity in its business operations and communications 946 | B. A company's ability to hide its faults 947 | C. The invisibility of a company's digital operations 948 | D. The physical appearance of a business building 949 | The correct answer is option: A''', 950 | 951 | 952 | ''' 953 | What is the primary goal of business ethics? 954 | A. To prevent any kind of competition in the market 955 | B. To ensure maximum profitability regardless of methods used 956 | C. To provide guidelines for conducting business in an honest and respectful manner 957 | D. To manipulate customers into buying more products 958 | The correct answer is option: C''', 959 | 960 | ''' 961 | The "triple bottom line" approach to corporate accounting includes consideration of: 962 | A. Profit, debt, and equity 963 | B. Economic, environmental, and social performance 964 | C. Domestic, foreign, and international sales 965 | D. Employees, products, and customers 966 | The correct answer is option: B''', 967 | 968 | ''' 969 | Insider trading is considered unethical because: 970 | A. It gives certain individuals an unfair advantage based on non-public information 971 | B. It involves trading of company shares 972 | C. 
It is part of the company's business strategy 973 | D. It often leads to the firing of employees 974 | The correct answer is option: A''', 975 | 976 | ''' 977 | Which of the following would be an example of ethical corporate behavior? 978 | A. A company intentionally polluting a local river 979 | B. A company using child labor in developing countries 980 | C. A company providing truthful information about its products 981 | D. A company selling defective products 982 | The correct answer is option: C''', 983 | 984 | ''' 985 | Corporate governance refers to: 986 | A. How a corporation is structured 987 | B. The way a corporation interacts with the local community 988 | C. How a corporation is controlled and operated 989 | D. The corporation's branding strategy 990 | The correct answer is option: C''', 991 | 992 | ''' 993 | Which of the following is a key element of ethical decision making in business? 994 | A. Maximizing shareholder profits at any cost 995 | B. Considering the impacts of decisions on stakeholders 996 | C. Keeping all company information private 997 | D. Ignoring the needs and wants of customers 998 | The correct answer is option: B''' 999 | ] 1000 | 1001 | prompt_q_list_management = [ 1002 | 1003 | ''' 1004 | What is the primary goal of strategic management? 1005 | A. Maximize shareholder wealth 1006 | B. Maximize employee satisfaction 1007 | C. Minimize operational costs 1008 | D. Minimize risk 1009 | The correct answer is option: A''', 1010 | 1011 | ''' 1012 | Which management style involves employees in decision making? 1013 | A. Autocratic 1014 | B. Laissez-faire 1015 | C. Participative 1016 | D. Transactional 1017 | The correct answer is option: C''', 1018 | 1019 | ''' 1020 | What is the purpose of SWOT analysis in strategic management? 1021 | A. Monitor employee performance 1022 | B. Understand the internal and external environment 1023 | C. Set the company's budget 1024 | D. 
Determine the company's mission statement 1025 | The correct answer is option: B''', 1026 | 1027 | ''' 1028 | Which of the following is NOT one of the five functions of management? 1029 | A. Planning 1030 | B. Organizing 1031 | C. Communicating 1032 | D. Controlling 1033 | The correct answer is option: C''', 1034 | 1035 | ''' 1036 | In the context of project management, what does the acronym 'SMART' stand for in SMART goals? 1037 | A. Specific, Measurable, Attainable, Realistic, Time-bound 1038 | B. Simple, Measurable, Actionable, Realistic, Time-bound 1039 | C. Specific, Manageable, Attainable, Realistic, Time-bound 1040 | D. Simple, Manageable, Action-oriented, Relevant, Time-bound 1041 | The correct answer is option: A''', 1042 | 1043 | ''' 1044 | What is the main focus of human resource management? 1045 | A. Financial management 1046 | B. Employee management 1047 | C. Strategic management 1048 | D. Operations management 1049 | The correct answer is option: B''', 1050 | 1051 | ''' 1052 | In the context of operations management, what is 'lean manufacturing' primarily concerned with? 1053 | A. Reducing waste 1054 | B. Increasing product diversity 1055 | C. Maximizing profit margins 1056 | D. Expanding the market share 1057 | The correct answer is option: A''', 1058 | 1059 | ''' 1060 | What does the term 'stakeholders' refer to in business management? 1061 | A. Shareholders of the company 1062 | B. Top management of the company 1063 | C. Employees of the company 1064 | D. Any group or individual who can affect or is affected by the company's actions 1065 | The correct answer is option: D''', 1066 | 1067 | ''' 1068 | What is the role of 'organizing' in the management process? 1069 | A. Setting goals and deciding how to achieve them 1070 | B. Arranging tasks, people, and other resources to accomplish the work 1071 | C. Monitoring performance and making necessary corrections 1072 | D. 
Maintaining a positive work environment 1073 | The correct answer is option: B''', 1074 | 1075 | ''' 1076 | Which of the following best describes 'transactional' leadership style? 1077 | A. Leaders inspire and encourage their team to exceed normal performance 1078 | B. Leaders allow their team to self-manage and provide minimal direction 1079 | C. Leaders focus on attaining goals through rewards and punishments 1080 | D. Leaders make all the decisions without any input from their team 1081 | The correct answer is option: C''', 1082 | 1083 | ] 1084 | 1085 | prompt_q_list_professional_accounting = [ 1086 | 1087 | ''' 1088 | Which of the following is used in accounting to analyze the financial health of a business? 1089 | A. Horizontal analysis 1090 | B. Vertical analysis 1091 | C. Ratio analysis 1092 | D. All of the above 1093 | The correct answer is option: D''', 1094 | 1095 | ''' 1096 | What does the acronym GAAP stand for in accounting? 1097 | A. Generally Accepted Accounting Principles 1098 | B. Global Accepted Accounting Procedures 1099 | C. General Applied Accounting Procedures 1100 | D. Global Applied Accounting Principles 1101 | The correct answer is option: A''', 1102 | 1103 | ''' 1104 | What is the basic accounting equation? 1105 | A. Assets = Liabilities + Owner’s Equity 1106 | B. Assets = Liabilities - Owner’s Equity 1107 | C. Assets + Liabilities = Owner’s Equity 1108 | D. Assets - Liabilities = Owner’s Equity 1109 | The correct answer is option: A''', 1110 | 1111 | ''' 1112 | What is a balance sheet used for in accounting? 1113 | A. To record the day-to-day financial transactions 1114 | B. To determine the company's financial position at a specific point in time 1115 | C. To track the company's cash flows 1116 | D. To record the company's sales revenue 1117 | The correct answer is option: B''', 1118 | 1119 | ''' 1120 | Which of the following best describes accrual accounting? 1121 | A. 
Revenue and expenses are recorded when they are received and paid 1122 | B. Revenue and expenses are recorded when they are earned and incurred 1123 | C. Revenue and expenses are recorded at the end of the financial year 1124 | D. Revenue and expenses are recorded at the start of the financial year 1125 | The correct answer is option: B''', 1126 | 1127 | ''' 1128 | What is the double-entry system in accounting? 1129 | A. Every entry to an account requires a corresponding and opposite entry to a different account 1130 | B. Every entry to an account requires a corresponding and similar entry to a different account 1131 | C. Every entry to an account requires two opposite entries to the same account 1132 | D. Every entry to an account requires two similar entries to the same account 1133 | The correct answer is option: A''', 1134 | 1135 | ''' 1136 | Which financial statement reports a company's revenues and expenses? 1137 | A. Income statement 1138 | B. Balance sheet 1139 | C. Cash flow statement 1140 | D. Statement of retained earnings 1141 | The correct answer is option: A''', 1142 | 1143 | ''' 1144 | What is depreciation in accounting? 1145 | A. An increase in an asset’s value over time 1146 | B. A decrease in an asset’s value over time 1147 | C. An increase in liabilities over time 1148 | D. An increase in equity over time 1149 | The correct answer is option: B''', 1150 | 1151 | ''' 1152 | What does a credit entry represent in a business's accounting records? 1153 | A. An increase in assets or expense accounts, or a decrease in liability, equity, or revenue accounts 1154 | B. A decrease in assets or expense accounts, or an increase in liability, equity, or revenue accounts 1155 | C. A constant in assets or expense accounts, or a variable in liability, equity, or revenue accounts 1156 | D. 
A variable in assets or expense accounts, or a constant in liability, equity, or revenue accounts 1157 | The correct answer is option: B''', 1158 | 1159 | ''' 1160 | Which of the following terms describes an asset or expense account's normal account balance? 1161 | A. Credit 1162 | B. Debit 1163 | C. Either credit or debit depending on the transaction 1164 | D. Neither credit nor debit 1165 | The correct answer is option: B''' 1166 | ] 1167 | 1168 | 1169 | prompt_q_list_pr = [ 1170 | ''' 1171 | Which of the following is the primary goal of public relations? 1172 | A. Selling products directly 1173 | B. Managing the reputation of an organization 1174 | C. Creating advertisements 1175 | D. Providing customer service 1176 | The correct answer is option: B''', 1177 | 1178 | ''' 1179 | In public relations, what does the term 'stakeholders' refer to? 1180 | A. The highest-ranking members of an organization 1181 | B. All the employees of a company 1182 | C. People or groups with an interest in the organization's affairs 1183 | D. Shareholders of a company 1184 | The correct answer is option: C''', 1185 | 1186 | ''' 1187 | Which one of these is a key tool in public relations? 1188 | A. Press releases 1189 | B. Product placement 1190 | C. Direct selling 1191 | D. Paid online advertising 1192 | The correct answer is option: A''', 1193 | 1194 | ''' 1195 | What is the main purpose of a public relations media kit? 1196 | A. To serve as a brochure for potential customers 1197 | B. To provide journalists with information about an organization or event 1198 | C. To train new employees about the organization's mission 1199 | D. To create an advertisement for social media 1200 | The correct answer is option: B''', 1201 | 1202 | ''' 1203 | What is crisis communication in the field of public relations? 1204 | A. A strategy to deal with a damaging event to an organization's reputation 1205 | B. Communication during a financial crisis 1206 | C. 
Announcing a crisis to stakeholders 1207 | D. A strategy to prevent crises from happening 1208 | The correct answer is option: A''', 1209 | 1210 | ''' 1211 | Which one of the following is NOT a typical role of a public relations professional? 1212 | A. Writing press releases 1213 | B. Managing social media accounts 1214 | C. Implementing SEO strategies 1215 | D. Producing the organization's financial reports 1216 | The correct answer is option: D''', 1217 | 1218 | ''' 1219 | What is the difference between public relations and advertising? 1220 | A. Public relations is unpaid, advertising is paid 1221 | B. Advertising is unpaid, public relations is paid 1222 | C. There is no difference, they are the same thing 1223 | D. Public relations focuses on sales, advertising on reputation 1224 | The correct answer is option: A''', 1225 | 1226 | ''' 1227 | What does the PR strategy known as "spin" refer to? 1228 | A. Misleading the public by presenting false information 1229 | B. Creating a certain interpretation of an event or action to sway public opinion 1230 | C. Overhyping a product or service 1231 | D. Ignoring negative news or events 1232 | The correct answer is option: B''', 1233 | 1234 | ''' 1235 | In PR, what does a "pitch" typically refer to? 1236 | A. A product demonstration 1237 | B. An argument or dispute with a journalist 1238 | C. A proposal or idea presented to journalists or media outlets 1239 | D. The tone or style of a public statement 1240 | The correct answer is option: C''', 1241 | 1242 | ''' 1243 | What role does social media typically play in public relations today? 1244 | A. Social media has no significant role in public relations 1245 | B. Social media is primarily used for selling products 1246 | C. Social media is primarily used for communicating with journalists 1247 | D. 
Social media is used for communicating with the public and managing the organization's image 1248 | The correct answer is option: D'''] 1249 | 1250 | 1251 | 1252 | promtp_q_list_college_chemistry = [ 1253 | ''' 1254 | Which of the following is an example of a compound? 1255 | A. Oxygen gas 1256 | B. Water 1257 | C. Argon gas 1258 | D. Hydrogen gas 1259 | The correct answer is option: B''', 1260 | 1261 | ''' 1262 | What is the atomic number of an element? 1263 | A. The number of protons in the nucleus 1264 | B. The number of neutrons in the nucleus 1265 | C. The number of electrons in the atom 1266 | D. The total number of protons and neutrons in the nucleus 1267 | The correct answer is option: A''', 1268 | 1269 | ''' 1270 | What is the name of the negatively charged subatomic particle in an atom? 1271 | A. Proton 1272 | B. Neutron 1273 | C. Electron 1274 | D. Positron 1275 | The correct answer is option: C''', 1276 | 1277 | ''' 1278 | In the periodic table, which of the following is a representative of a halogen? 1279 | A. Oxygen 1280 | B. Sodium 1281 | C. Chlorine 1282 | D. Argon 1283 | The correct answer is option: C''', 1284 | 1285 | ''' 1286 | Which of the following is a characteristic of an ionic bond? 1287 | A. The sharing of electrons between atoms 1288 | B. The transfer of electrons from one atom to another 1289 | C. The presence of a coordinate covalent bond 1290 | D. The formation of a bond between two non-metallic elements 1291 | The correct answer is option: B''', 1292 | 1293 | """ 1294 | What is the principle that states that no two electrons in an atom can have the same set of quantum numbers? 1295 | A. Hund's rule 1296 | B. The aufbau principle 1297 | C. The Pauli exclusion principle 1298 | D. The Heisenberg uncertainty principle 1299 | The correct answer is option: C""", 1300 | 1301 | ''' 1302 | What is the name of the process in which a solid is converted directly into a gas without passing through the liquid state? 1303 | A. Evaporation 1304 | B. 
Condensation 1305 | C. Sublimation 1306 | D. Deposition 1307 | The correct answer is option: C''', 1308 | 1309 | ''' 1310 | Which of the following is an example of a strong acid? 1311 | A. Acetic acid 1312 | B. Hydrochloric acid 1313 | C. Citric acid 1314 | D. Carbonic acid 1315 | The correct answer is option: B''', 1316 | 1317 | """ 1318 | What is the term for the minimum amount of energy required for a chemical reaction to proceed? 1319 | A. Activation energy 1320 | B. Bond energy 1321 | C. Enthalpy 1322 | D. Gibbs free energy 1323 | The correct answer is option: A""", 1324 | 1325 | ''' 1326 | Amount of heat required to raise the temperature of one gram of a substance by one degree Celsius is? 1327 | A. Specific heat capacity 1328 | B. Heat of fusion 1329 | C. Heat of vaporization 1330 | D. Enthalpy of reaction 1331 | The correct answer is option: A''' 1332 | ] 1333 | --------------------------------------------------------------------------------