├── .gitignore ├── CLAIR_preferences.ipynb ├── LICENSE ├── README.md ├── cache.tar.gz └── images ├── apo-github.png ├── clair-github.png ├── github-clair-notebook.png └── performance-boost.png /.gitignore: -------------------------------------------------------------------------------- 1 | cache/ -------------------------------------------------------------------------------- /CLAIR_preferences.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Contrastive Learning from AI Revisions (CLAIR)\n", 8 | "This notebook accompanies the \"[Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment](https://arxiv.org/abs/2408.06266v1)\" paper.\n", 9 | "\n", 10 | "In this notebook, we will create preference pairs for alignment through contrastive revisions. We use an LLM behind an API for the revision process, but we've cached the results so you can run the notebook without an API key." 11 | ] 12 | }, 13 | { 14 | "cell_type": "markdown", 15 | "metadata": {}, 16 | "source": [ 17 | "" 18 | ] 19 | }, 20 | { 21 | "cell_type": "code", 22 | "execution_count": null, 23 | "metadata": {}, 24 | "outputs": [], 25 | "source": [ 26 | "!pip install datasets" 27 | ] 28 | }, 29 | { 30 | "cell_type": "code", 31 | "execution_count": 3, 32 | "metadata": {}, 33 | "outputs": [ 34 | { 35 | "name": "stderr", 36 | "output_type": "stream", 37 | "text": [ 38 | "/env/lib/conda/trl-karel/lib/python3.10/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. 
See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n", 39 | " from .autonotebook import tqdm as notebook_tqdm\n" 40 | ] 41 | } 42 | ], 43 | "source": [ 44 | "import requests\n", 45 | "from joblib import Memory\n", 46 | "import datasets\n", 47 | "import pandas as pd" 48 | ] 49 | }, 50 | { 51 | "cell_type": "code", 52 | "execution_count": 2, 53 | "metadata": {}, 54 | "outputs": [], 55 | "source": [ 56 | "# Set your API Key. If you don't change this variable, we will use caching.\n", 57 | "API_KEY = 'your-openai-api-key'\n", 58 | "\n", 59 | "# Change model and kwargs. Only tested on this exact model.\n", 60 | "model_name = 'gpt-4-0125-preview'\n", 61 | "model_url = 'https://api.openai.com/v1/chat/completions'\n", 62 | "model_kwargs = {\n", 63 | " 'max_tokens': 4096,\n", 64 | " 'temperature': .7\n", 65 | "}" 66 | ] 67 | }, 68 | { 69 | "cell_type": "code", 70 | "execution_count": 15, 71 | "metadata": {}, 72 | "outputs": [ 73 | { 74 | "name": "stdout", 75 | "output_type": "stream", 76 | "text": [ 77 | "cache/\n", 78 | "cache/joblib/\n", 79 | "cache/joblib/CLAIR_preferences/\n", 80 | "cache/joblib/CLAIR_preferences/query_chat_model/\n", 81 | "cache/joblib/CLAIR_preferences/query_chat_model/5c48bee118c640c35d6ddb1c9f5aad76/\n", 82 | "cache/joblib/CLAIR_preferences/query_chat_model/5c48bee118c640c35d6ddb1c9f5aad76/metadata.json\n", 83 | "cache/joblib/CLAIR_preferences/query_chat_model/5c48bee118c640c35d6ddb1c9f5aad76/output.pkl\n", 84 | "cache/joblib/CLAIR_preferences/query_chat_model/b3782bbc8aa8cb3bcafb65a1f63e45c1/\n", 85 | "cache/joblib/CLAIR_preferences/query_chat_model/b3782bbc8aa8cb3bcafb65a1f63e45c1/metadata.json\n", 86 | "cache/joblib/CLAIR_preferences/query_chat_model/b3782bbc8aa8cb3bcafb65a1f63e45c1/output.pkl\n", 87 | "cache/joblib/CLAIR_preferences/query_chat_model/71f6587e578726353beb1a979163c162/\n", 88 | "cache/joblib/CLAIR_preferences/query_chat_model/71f6587e578726353beb1a979163c162/output.pkl\n", 89 | 
"cache/joblib/CLAIR_preferences/query_chat_model/71f6587e578726353beb1a979163c162/metadata.json\n", 90 | "cache/joblib/CLAIR_preferences/query_chat_model/ed39ac7228e1cf303cc3fb64d33f6239/\n", 91 | "cache/joblib/CLAIR_preferences/query_chat_model/ed39ac7228e1cf303cc3fb64d33f6239/output.pkl\n", 92 | "cache/joblib/CLAIR_preferences/query_chat_model/ed39ac7228e1cf303cc3fb64d33f6239/metadata.json\n", 93 | "cache/joblib/CLAIR_preferences/query_chat_model/2c21b538571ba66c76ea680a3080b3a8/\n", 94 | "cache/joblib/CLAIR_preferences/query_chat_model/2c21b538571ba66c76ea680a3080b3a8/metadata.json\n", 95 | "cache/joblib/CLAIR_preferences/query_chat_model/2c21b538571ba66c76ea680a3080b3a8/output.pkl\n", 96 | "cache/joblib/CLAIR_preferences/query_chat_model/0401e23c9c74865594818e148c796043/\n", 97 | "cache/joblib/CLAIR_preferences/query_chat_model/0401e23c9c74865594818e148c796043/output.pkl\n", 98 | "cache/joblib/CLAIR_preferences/query_chat_model/0401e23c9c74865594818e148c796043/metadata.json\n", 99 | "cache/joblib/CLAIR_preferences/query_chat_model/5215d6a342d83047bb7e00a8fa51de4b/\n", 100 | "cache/joblib/CLAIR_preferences/query_chat_model/5215d6a342d83047bb7e00a8fa51de4b/metadata.json\n", 101 | "cache/joblib/CLAIR_preferences/query_chat_model/5215d6a342d83047bb7e00a8fa51de4b/output.pkl\n", 102 | "cache/joblib/CLAIR_preferences/query_chat_model/4132a4b77dac4d9f04873e07d2caae06/\n", 103 | "cache/joblib/CLAIR_preferences/query_chat_model/4132a4b77dac4d9f04873e07d2caae06/metadata.json\n", 104 | "cache/joblib/CLAIR_preferences/query_chat_model/4132a4b77dac4d9f04873e07d2caae06/output.pkl\n", 105 | "cache/joblib/CLAIR_preferences/query_chat_model/8e5f73cb4cebc1cf4883d4c44154179a/\n", 106 | "cache/joblib/CLAIR_preferences/query_chat_model/8e5f73cb4cebc1cf4883d4c44154179a/metadata.json\n", 107 | "cache/joblib/CLAIR_preferences/query_chat_model/8e5f73cb4cebc1cf4883d4c44154179a/output.pkl\n", 108 | "cache/joblib/CLAIR_preferences/query_chat_model/113a851541a62f046d3b62d295920e1c/\n", 109 
| "cache/joblib/CLAIR_preferences/query_chat_model/113a851541a62f046d3b62d295920e1c/output.pkl\n", 110 | "cache/joblib/CLAIR_preferences/query_chat_model/113a851541a62f046d3b62d295920e1c/metadata.json\n", 111 | "cache/joblib/CLAIR_preferences/query_chat_model/60c78454e64f2af0b229641b353466df/\n", 112 | "cache/joblib/CLAIR_preferences/query_chat_model/60c78454e64f2af0b229641b353466df/output.pkl\n", 113 | "cache/joblib/CLAIR_preferences/query_chat_model/60c78454e64f2af0b229641b353466df/metadata.json\n", 114 | "cache/joblib/CLAIR_preferences/query_chat_model/8cd2bb111f19296e6f2443c823bb28e3/\n", 115 | "cache/joblib/CLAIR_preferences/query_chat_model/8cd2bb111f19296e6f2443c823bb28e3/metadata.json\n", 116 | "cache/joblib/CLAIR_preferences/query_chat_model/8cd2bb111f19296e6f2443c823bb28e3/output.pkl\n", 117 | "cache/joblib/CLAIR_preferences/query_chat_model/941ad34ae42a38b8b104eb2306c87503/\n", 118 | "cache/joblib/CLAIR_preferences/query_chat_model/941ad34ae42a38b8b104eb2306c87503/output.pkl\n", 119 | "cache/joblib/CLAIR_preferences/query_chat_model/941ad34ae42a38b8b104eb2306c87503/metadata.json\n", 120 | "cache/joblib/CLAIR_preferences/query_chat_model/067d8d9ec4d4d0d3a41fa2ab8aeb79dd/\n", 121 | "cache/joblib/CLAIR_preferences/query_chat_model/067d8d9ec4d4d0d3a41fa2ab8aeb79dd/metadata.json\n", 122 | "cache/joblib/CLAIR_preferences/query_chat_model/067d8d9ec4d4d0d3a41fa2ab8aeb79dd/output.pkl\n", 123 | "cache/joblib/CLAIR_preferences/query_chat_model/70fce2c6dd42d48796d207602b993991/\n", 124 | "cache/joblib/CLAIR_preferences/query_chat_model/70fce2c6dd42d48796d207602b993991/output.pkl\n", 125 | "cache/joblib/CLAIR_preferences/query_chat_model/70fce2c6dd42d48796d207602b993991/metadata.json\n", 126 | "cache/joblib/CLAIR_preferences/query_chat_model/485820758b7d066796f50bc841854da5/\n", 127 | "cache/joblib/CLAIR_preferences/query_chat_model/485820758b7d066796f50bc841854da5/metadata.json\n", 128 | 
"cache/joblib/CLAIR_preferences/query_chat_model/485820758b7d066796f50bc841854da5/output.pkl\n", 129 | "cache/joblib/CLAIR_preferences/query_chat_model/b333765d6d727089100309db9b221734/\n", 130 | "cache/joblib/CLAIR_preferences/query_chat_model/b333765d6d727089100309db9b221734/metadata.json\n", 131 | "cache/joblib/CLAIR_preferences/query_chat_model/b333765d6d727089100309db9b221734/output.pkl\n", 132 | "cache/joblib/CLAIR_preferences/query_chat_model/a61ff4255f44bc7e9eb4072bdd7e05af/\n", 133 | "cache/joblib/CLAIR_preferences/query_chat_model/a61ff4255f44bc7e9eb4072bdd7e05af/metadata.json\n", 134 | "cache/joblib/CLAIR_preferences/query_chat_model/a61ff4255f44bc7e9eb4072bdd7e05af/output.pkl\n", 135 | "cache/joblib/CLAIR_preferences/query_chat_model/bba07a4fc243d2eed5a4b36b667cb01d/\n", 136 | "cache/joblib/CLAIR_preferences/query_chat_model/bba07a4fc243d2eed5a4b36b667cb01d/metadata.json\n", 137 | "cache/joblib/CLAIR_preferences/query_chat_model/bba07a4fc243d2eed5a4b36b667cb01d/output.pkl\n", 138 | "cache/joblib/CLAIR_preferences/query_chat_model/27846542dbcfe2d7ca61dad3d43549a4/\n", 139 | "cache/joblib/CLAIR_preferences/query_chat_model/27846542dbcfe2d7ca61dad3d43549a4/output.pkl\n", 140 | "cache/joblib/CLAIR_preferences/query_chat_model/27846542dbcfe2d7ca61dad3d43549a4/metadata.json\n", 141 | "cache/joblib/CLAIR_preferences/query_chat_model/func_code.py\n" 142 | ] 143 | } 144 | ], 145 | "source": [ 146 | "# Set up the cache\n", 147 | "import os\n", 148 | "if not os.path.isfile('cache.tar.gz'):\n", 149 | " !wget https://github.com/ContextualAI/CLAIR_and_APO/raw/master/cache.tar.gz\n", 150 | "!tar -xzvf cache.tar.gz " 151 | ] 152 | }, 153 | { 154 | "cell_type": "markdown", 155 | "metadata": {}, 156 | "source": [ 157 | "Helper functions that can be ignored for now:" 158 | ] 159 | }, 160 | { 161 | "cell_type": "code", 162 | "execution_count": 13, 163 | "metadata": {}, 164 | "outputs": [], 165 | "source": [ 166 | "# Setup the cache directory\n", 167 | "memory = 
Memory(\"./cache\", verbose=0)\n", 168 | "\n", 169 | "# Get joblib cache working in notebooks\n", 170 | "# source: https://stackoverflow.com/questions/75202475/joblib-persistence-across-sessions-machines\n", 171 | "def cache(mem, module, **mem_kwargs):\n", 172 | " def cache_(f):\n", 173 | " f.__module__ = module\n", 174 | " f.__qualname__ = f.__name__\n", 175 | " return mem.cache(f, **mem_kwargs)\n", 176 | " return cache_\n", 177 | "\n", 178 | "# extract existing preferences from ultrafeedback\n", 179 | "def get_preferences_from_ultrafeedback(dataset):\n", 180 | " instruction = []\n", 181 | " chosen = []\n", 182 | " rejected = []\n", 183 | "\n", 184 | " for _, row in dataset.iterrows():\n", 185 | " response = [x[\"response\"] for x in row[\"completions\"]]\n", 186 | " score = [x[\"overall_score\"] for x in row[\"completions\"]]\n", 187 | "\n", 188 | " if len(score):\n", 189 | " chosen_index = score.index(max(score))\n", 190 | " rejected_index = score.index(min(score))\n", 191 | "\n", 192 | " instruction.append(row[\"instruction\"])\n", 193 | " chosen.append(response[chosen_index])\n", 194 | " rejected.append(response[rejected_index])\n", 195 | "\n", 196 | " return pd.DataFrame.from_dict({\n", 197 | " \"text\": instruction,\n", 198 | " \"rejected\": rejected,\n", 199 | " \"chosen\": chosen \n", 200 | " })\n", 201 | "\n", 202 | "# Visualize a preference triple\n", 203 | "def visualize_triple(triple: dict):\n", 204 | " print('---TEXT (first 400 characters):\\n')\n", 205 | " print(triple['text'][:400])\n", 206 | " print('---REJECTED (first 400 characters):\\n')\n", 207 | " print(triple['rejected'][:400])\n", 208 | " print('---CHOSEN (first 400 characters):\\n')\n", 209 | " print(triple['chosen'][:400])\n", 210 | " if 'rational' in triple:\n", 211 | " print('---REVISION RATIONAL (first 400 characters):\\n')\n", 212 | " print(triple['rational'][:400])\n", 213 | "\n", 214 | "\n", 215 | "@cache(memory, \"CLAIR_preferences\")\n", 216 | "def query_chat_model(user_prompt, 
system_prompt='', url='https://api.openai.com/v1/chat/completions', model_name='gpt-4-0125-preview'):\n", 217 | " print(f\"Querying {model_name} API at {url}...\")\n", 218 | " headers = {\n", 219 | " \"Content-Type\": \"application/json\",\n", 220 | " \"Authorization\": f\"Bearer {API_KEY}\"\n", 221 | " }\n", 222 | " data = {\n", 223 | " \"model\": model_name,\n", 224 | " \"messages\": [\n", 225 | " {\"role\": \"system\", \"content\": system_prompt},\n", 226 | " {\"role\": \"user\", \"content\": user_prompt}\n", 227 | " ],\n", 228 | " **model_kwargs\n", 229 | " }\n", 230 | " response = requests.post(url, headers=headers, json=data)\n", 231 | " \n", 232 | " if response.status_code == 200:\n", 233 | " result = response.json()\n", 234 | " return result\n", 235 | " else:\n", 236 | " raise Exception(f\"API request failed with status code {response.status_code}: {response.text}\")\n", 237 | "\n", 238 | "# Parse revisions\n", 239 | "def get_revision_from_response(response):\n", 240 | " raw_completion = response['choices'][0]['message']['content']\n", 241 | "\n", 242 | " if \"**Corrected Student Solution:**\" in raw_completion:\n", 243 | " splits = raw_completion.split(\"**Corrected Student Solution:**\")\n", 244 | " elif \"{corrected_student_solution}:\" in raw_completion:\n", 245 | " splits = raw_completion.split(\"{corrected_student_solution}:\")\n", 246 | " elif \"{corrected_student_solution}\" in raw_completion:\n", 247 | " splits = raw_completion.split(\"{corrected_student_solution}\")\n", 248 | " elif \"**Worsened Student Solution:**\" in raw_completion:\n", 249 | " splits = raw_completion.split(\"**Worsened Student Solution:**\")\n", 250 | " elif \"{worsened_student_solution}:\" in raw_completion:\n", 251 | " splits = raw_completion.split(\"{worsened_student_solution}:\")\n", 252 | " elif \"{worsened_student_solution}\" in raw_completion:\n", 253 | " splits = raw_completion.split(\"{worsened_student_solution}\")\n", 254 | " \n", 255 | " if len(splits) >= 2: \n", 
256 | " edit = splits[1]\n", 257 | " edit = edit.strip('\\n\\n').strip()\n", 258 | "\n", 259 | " rational = splits[0]\n", 260 | " if '{teacher_reasoning}' in rational:\n", 261 | " rational = rational.split('{teacher_reasoning}')[1].strip(':').strip()\n", 262 | " rational = rational.strip('\\n\\n').strip()\n", 263 | " else:\n", 264 | " raise Exception('Failed to parse response')\n", 265 | " return edit, rational\n", 266 | "\n" 267 | ] 268 | }, 269 | { 270 | "cell_type": "markdown", 271 | "metadata": {}, 272 | "source": [ 273 | "## Load data\n", 274 | "We will load data from an existing dataset. Alternatively, you can use a language model to generate your own data.\n", 275 | "\n", 276 | "In this notebook, we will use prompts and existing responses from the UltraFeedback dataset. UltraFeedback already contains preference pairs, against which we can compare our CLAIR preferences." 277 | ] 278 | }, 279 | { 280 | "cell_type": "code", 281 | "execution_count": 25, 282 | "metadata": {}, 283 | "outputs": [], 284 | "source": [ 285 | "ultraf = datasets.load_dataset('openbmb/UltraFeedback')['train'].to_pandas()\n", 286 | "ultraf = ultraf[:20] # take first 20 examples\n", 287 | "ultraf = get_preferences_from_ultrafeedback(ultraf)" 288 | ] 289 | }, 290 | { 291 | "cell_type": "code", 292 | "execution_count": 26, 293 | "metadata": {}, 294 | "outputs": [ 295 | { 296 | "name": "stdout", 297 | "output_type": "stream", 298 | "text": [ 299 | "---TEXT (first 400 characters):\n", 300 | "\n", 301 | "Identify the interrelated economic, political, and social factors that contributed to the stock market crash of 1929, including but not limited to the impact of World War I on the global economy, the role of government policies such as the Smoot-Hawley Tariff Act, the effects of speculative investment practices and margin trading, and the socioeconomic disparities of the time period. 
Additionally,\n", 302 | "---REJECTED (first 400 characters):\n", 303 | "\n", 304 | "Sure, I'd be happy to help you learn about the causes and effects of the 1929 stock market crash and how it compares to other financial crises.\n", 305 | "\n", 306 | "The stock market crash of 1929 was a significant event that occurred on October 29, known as Black Tuesday. This event marked the beginning of the Great Depression, a period of economic downturn and unemployment that lasted for nearly a decade. There were\n", 307 | "---CHOSEN (first 400 characters):\n", 308 | "\n", 309 | "The stock market crash of 1929 was a result of a complex interplay of economic, political, and social factors. The impact of World War I on the global economy, government policies such as the Smoot-Hawley Tariff Act, speculative investment practices, and socioeconomic disparities all contributed to the crisis.\n", 310 | "The aftermath of World War I left many European countries in a state of economic ruin. T\n" 311 | ] 312 | } 313 | ], 314 | "source": [ 315 | "# look at an example\n", 316 | "visualize_triple(ultraf.loc[2])" 317 | ] 318 | }, 319 | { 320 | "cell_type": "markdown", 321 | "metadata": {}, 322 | "source": [ 323 | "## Revise and Improve\n", 324 | "We will contrastively revise and improve answers using an LLM over API. We'll start from the `rejected` answers in UltraFeedback." 325 | ] 326 | }, 327 | { 328 | "cell_type": "code", 329 | "execution_count": 27, 330 | "metadata": {}, 331 | "outputs": [ 332 | { 333 | "name": "stdout", 334 | "output_type": "stream", 335 | "text": [ 336 | "You are a teacher and your task is to minimally improve a student's answer. I will give you a {{task}} and a {{student_solution}}. Your job is to revise the {{student_solution}} such that it is clearer, more correct, and more engaging. Copy all non-corrected parts of the student's answer. 
Do not allude to the {{corrected_student_solution}} being a revision or a correction in your final solution.\n", 337 | "\n", 338 | "{{task}}: {input}\n", 339 | "\n", 340 | "{{student_solution}}: {output}\n", 341 | "\n", 342 | "-----------------\n", 343 | "\n", 344 | "Let's first think step by step with a {{teacher_reasoning}} to decide how to improve the {{student_solution}}, then give the {{corrected_student_solution}}. Mention the {{teacher_reasoning}} and {{corrected_student_solution}} identifiers to structure your answer.\n", 345 | "\n", 346 | "\n" 347 | ] 348 | } 349 | ], 350 | "source": [ 351 | "revision_template = \"\"\"You are a teacher and your task is to minimally improve a student's answer. I will give you a {{task}} and a {{student_solution}}. Your job is to revise the {{student_solution}} such that it is clearer, more correct, and more engaging. Copy all non-corrected parts of the student's answer. Do not allude to the {{corrected_student_solution}} being a revision or a correction in your final solution.\\n\\n{{task}}: {input}\\n\\n{{student_solution}}: {output}\\n\\n-----------------\\n\\nLet's first think step by step with a {{teacher_reasoning}} to decide how to improve the {{student_solution}}, then give the {{corrected_student_solution}}. Mention the {{teacher_reasoning}} and {{corrected_student_solution}} identifiers to structure your answer.\\n\\n\"\"\"\n", 352 | "\n", 353 | "# let's look at the revision prompt\n", 354 | "print(revision_template)" 355 | ] 356 | }, 357 | { 358 | "cell_type": "code", 359 | "execution_count": 28, 360 | "metadata": {}, 361 | "outputs": [ 362 | { 363 | "name": "stdout", 364 | "output_type": "stream", 365 | "text": [ 366 | "You are a teacher and your task is to minimally improve a student's answer. I will give you a {task} and a {student_solution}. Your job is to revise the {student_solution} such that it is clearer, more correct, and more engaging. Copy all non-corrected parts of the student's answer. 
Do not allude to the {corrected_student_solution} being a revision or a correction in your final solution.\n", 367 | "\n", 368 | "{task}: Identify the interrelated economic, political, and social factors that contributed to the stock market crash of 1929, including but not limited to the impact of World War I on the global economy, the role of government policies such as the Smoot-Hawley Tariff Act, the effects of speculative investment practices and margin trading, and the socioeconomic disparities of the time period. Additionally, provide a comparative analysis of the causes and effects of the 1929 stock market crash with other notable financial crises in history, such as the Great Depression of the 1930s and the 2008 global financial crisis. Use primary and secondary sources to support your analysis and provide data visualizations, such as graphs and charts, to illustrate your findings.\n", 369 | "\n", 370 | "{student_solution}: Sure, I'd be happy to help you learn about the causes and effects of the 1929 stock market crash and how it compares to other financial crises.\n", 371 | "\n", 372 | "The stock market crash of 1929 was a significant event that occurred on October 29, known as Black Tuesday. This event marked the beginning of the Great Depression, a period of economic downturn and unemployment that lasted for nearly a decade. There were several interrelated economic, political, and social factors that contributed to the 1929 stock market crash.\n", 373 | "\n", 374 | "Firstly, the impact of World War I on the global economy played a significant role. After the war, many European countries were left with large debts and were unable to invest in their economies. This led to a decrease in demand for American products and a decrease in international trade.\n", 375 | "\n", 376 | "In terms of government policies, the Smoot-Hawley Tariff Act of 1930 is often cited as a contributing factor to the crash. 
This act increased tariffs on foreign goods, which ultimately reduced international trade and led to retaliatory tariffs from other countries. This trade policy had negative effects on U.S. exporters and contributed to the economic downturn.\n", 377 | "\n", 378 | "Secondly, speculative investment practices and margin trading were also contributing factors. Margin trading allowed investors to borrow money to purchase stocks, which led to an artificial increase in demand for stocks. This, in turn, created a speculative bubble that eventually burst, leading to a significant decline in stock prices.\n", 379 | "\n", 380 | "Additionally, socioeconomic disparities of the time period also played a role in the crash. The 1920s was a period of significant wealth inequality, where the wealthy few were benefiting from the boom while the rest of the population struggled.\n", 381 | "\n", 382 | "A comparative analysis of the causes and effects of the 1929 stock market crash with other notable financial crises in history, such as the Great Depression of the 1930s and the 2008 global financial crisis, shows several similarities. In both cases, there were interrelated economic factors, such as the role of government policies and market speculation, that contributed to the severity of the financial crisis.\n", 383 | "\n", 384 | "The 1929 stock market crash and the subsequent Great Depression also had significant social impacts, including high unemployment rates and a loss of public confidence in the economy. Similarly, the 2008 global financial crisis resulted in a significant loss of wealth and the failure of several large financial institutions.\n", 385 | "\n", 386 | "In conclusion, multiple economic, political, and social factors contributed to the stock market crash of 1929, which ultimately led to the Great Depression. This event serves as a stark example of the negative consequences of interrelated economic factors, government policies, and market speculation. 
By understanding the causes and effects of the 1929 stock market crash, we can better prepare for and prevent similar financial crises in the future.\n", 387 | "\n", 388 | "-----------------\n", 389 | "\n", 390 | "Let's first think step by step with a {teacher_reasoning} to decide how to improve the {student_solution}, then give the {corrected_student_solution}. Mention the {teacher_reasoning} and {corrected_student_solution} identifiers to structure your answer.\n", 391 | "\n", 392 | "\n" 393 | ] 394 | } 395 | ], 396 | "source": [ 397 | "# create the prompts using data from ultrafeedback\n", 398 | "prompts = []\n", 399 | "prompts_text = []\n", 400 | "prompts_rejected = []\n", 401 | "\n", 402 | "for _, triple in ultraf.iterrows():\n", 403 | " prompts.append(revision_template.format(input=triple['text'], output=triple['rejected']))\n", 404 | " prompts_text.append(triple['text'])\n", 405 | " prompts_rejected.append(triple['rejected'])\n", 406 | "\n", 407 | "# visualize prompt\n", 408 | "print(prompts[2])\n", 409 | " " 410 | ] 411 | }, 412 | { 413 | "cell_type": "code", 414 | "execution_count": 29, 415 | "metadata": {}, 416 | "outputs": [], 417 | "source": [ 418 | "# query the API and parse the responses\n", 419 | "# For the first 20 examples, this code will rely on cached revisions\n", 420 | "preferences = []\n", 421 | "for prompt, prompt_text, prompt_rejected in zip(prompts, prompts_text, prompts_rejected):\n", 422 | " try:\n", 423 | " response = query_chat_model(prompt, model_name=model_name, url=model_url)\n", 424 | " revision, revision_rational = get_revision_from_response(response)\n", 425 | "\n", 426 | " preferences.append({\n", 427 | " 'text': prompt_text,\n", 428 | " 'rejected': prompt_rejected,\n", 429 | " 'chosen': revision,\n", 430 | " 'rational': revision_rational,\n", 431 | " })\n", 432 | " except Exception as e:\n", 433 | " # don't block on exception\n", 434 | " print(e)\n", 435 | "\n", 436 | "# Turn this into a dataset\n", 437 | "ultraf_revisions = 
pd.DataFrame.from_records(preferences)" 438 | ] 439 | }, 440 | { 441 | "cell_type": "code", 442 | "execution_count": 30, 443 | "metadata": {}, 444 | "outputs": [ 445 | { 446 | "name": "stdout", 447 | "output_type": "stream", 448 | "text": [ 449 | "---TEXT (first 400 characters):\n", 450 | "\n", 451 | "Identify the interrelated economic, political, and social factors that contributed to the stock market crash of 1929, including but not limited to the impact of World War I on the global economy, the role of government policies such as the Smoot-Hawley Tariff Act, the effects of speculative investment practices and margin trading, and the socioeconomic disparities of the time period. Additionally,\n", 452 | "---REJECTED (first 400 characters):\n", 453 | "\n", 454 | "Sure, I'd be happy to help you learn about the causes and effects of the 1929 stock market crash and how it compares to other financial crises.\n", 455 | "\n", 456 | "The stock market crash of 1929 was a significant event that occurred on October 29, known as Black Tuesday. This event marked the beginning of the Great Depression, a period of economic downturn and unemployment that lasted for nearly a decade. There were\n", 457 | "---CHOSEN (first 400 characters):\n", 458 | "\n", 459 | "The stock market crash of 1929, culminating on October 29, known as \"Black Tuesday,\" was a pivotal event marking the onset of the Great Depression. This period of economic distress, characterized by widespread unemployment and hardship, was influenced by a complex interplay of economic, political, and social factors.\n", 460 | "\n", 461 | "**Economic Factors:** The aftermath of World War I significantly strained the gl\n", 462 | "---REVISION RATIONAL (first 400 characters):\n", 463 | "\n", 464 | "The student's answer provides a decent overview of the stock market crash of 1929, linking it to economic, political, and social factors. 
However, there are several areas where improvements could make the answer clearer, more correct, and more engaging:\n", 465 | "\n", 466 | "1. **Accuracy and Detail:** The student mentions the Smoot-Hawley Tariff Act as a contributing factor to the crash but inaccurately dates it to 1\n" 467 | ] 468 | } 469 | ], 470 | "source": [ 471 | "# Visualize an edit\n", 472 | "visualize_triple(ultraf_revisions.loc[2])" 473 | ] 474 | }, 475 | { 476 | "cell_type": "code", 477 | "execution_count": 31, 478 | "metadata": {}, 479 | "outputs": [ 480 | { 481 | "data": { 482 | "text/html": [ 483 | "
\n",
| | text | rejected | chosen | rational |
|---|---|---|---|---|
| 0 | Can you write a C++ program that prompts the u... | Sure, here is the program using the C++11 algo... | ```cpp\n#include <iostream>\n#include <string>... | The student's solution seems to misunderstand ... |
| 1 | Suppose you are a content creator and want to ... | To use GPT for generating compelling titles an... | To use GPT for automatically generating compel... | The student's solution provides a comprehensiv... |
| 2 | Identify the interrelated economic, political,... | Sure, I'd be happy to help you learn about the... | The stock market crash of 1929, culminating on... | The student's answer provides a decent overvie... |
| 3 | How can I convert the decimal number 31 to bin... | Sure, I'd be happy to help with that! \n\nIn J... | Sure, I'd be happy to help with that!\n\nIn Ja... | The student's solution contains a couple of in... |
| 4 | Can you modify the C++ code provided below to ... | Sure, I can help you with that. Here's how you... | To generate the first 20 Fibonacci numbers in ... | The student's solution provides a basic outlin... |