├── 01_Keyword_Search.ipynb ├── 02_Embeddings.ipynb ├── 03_Dense_Retrieval.ipynb ├── 04_Rerank.ipynb ├── 05_Generative_Search.ipynb ├── Images ├── Dense_Retrival.png ├── Dense_Retrival_Weakness.png ├── Embeddings.png ├── Evaluating_Search_Systems.png ├── Hybrid_Search.png ├── Keyword_Search.png ├── LLM_for_keyword_search.png ├── Limitations_of_Keyword_Matching.png ├── Re-Rank.png └── Vector_Databases.png ├── components.png ├── cover.jpg └── readme.md /01_Keyword_Search.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "45b98f11", 6 | "metadata": {}, 7 | "source": [ 8 | "# Keyword Search" 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "id": "06335224", 14 | "metadata": {}, 15 | "source": [ 16 | "## Setup\n", 17 | "\n", 18 | "Load needed API keys and relevant Python libraries." 19 | ] 20 | }, 21 | { 22 | "cell_type": "code", 23 | "execution_count": null, 24 | "id": "c54004a6", 25 | "metadata": { 26 | "height": 47 27 | }, 28 | "outputs": [], 29 | "source": [ 30 | "# !pip install cohere\n", 31 | "# !pip install weaviate-client" 32 | ] 33 | }, 34 | { 35 | "cell_type": "code", 36 | "execution_count": null, 37 | "id": "45ff94ae-8603-431a-acee-0ec3a39a1056", 38 | "metadata": { 39 | "height": 81 40 | }, 41 | "outputs": [], 42 | "source": [ 43 | "import os\n", 44 | "from dotenv import load_dotenv, find_dotenv\n", 45 | "_ = load_dotenv(find_dotenv()) # read local .env file" 46 | ] 47 | }, 48 | { 49 | "cell_type": "markdown", 50 | "id": "c23677b6", 51 | "metadata": {}, 52 | "source": [ 53 | "Let's start by importing Weaviate to access the Wikipedia database." 
54 | ] 55 | }, 56 | { 57 | "cell_type": "code", 58 | "execution_count": null, 59 | "id": "f44886bf-f8cf-4f90-83ae-2299ec448e16", 60 | "metadata": { 61 | "height": 98 62 | }, 63 | "outputs": [], 64 | "source": [ 65 | "import weaviate\n", 66 | "auth_config = weaviate.auth.AuthApiKey(\n", 67 | " api_key=os.environ['WEAVIATE_API_KEY'])\n", 68 | "\n" 69 | ] 70 | }, 71 | { 72 | "cell_type": "code", 73 | "execution_count": null, 74 | "id": "750583ec-7263-4927-b484-9818fba3318b", 75 | "metadata": { 76 | "height": 149 77 | }, 78 | "outputs": [], 79 | "source": [ 80 | "client = weaviate.Client(\n", 81 | " url=os.environ['WEAVIATE_API_URL'],\n", 82 | " auth_client_secret=auth_config,\n", 83 | " additional_headers={\n", 84 | " \"X-Cohere-Api-Key\": os.environ['COHERE_API_KEY'],\n", 85 | " }\n", 86 | ")" 87 | ] 88 | }, 89 | { 90 | "cell_type": "code", 91 | "execution_count": null, 92 | "id": "2efcdfdc-1566-4046-95d8-f9f2f3a36088", 93 | "metadata": { 94 | "height": 30 95 | }, 96 | "outputs": [], 97 | "source": [ 98 | "client.is_ready() " 99 | ] 100 | }, 101 | { 102 | "cell_type": "markdown", 103 | "id": "78acc1ed", 104 | "metadata": {}, 105 | "source": [ 106 | "# Keyword Search" 107 | ] 108 | }, 109 | { 110 | "cell_type": "code", 111 | "execution_count": null, 112 | "id": "017cd922-c729-4aa7-8696-d11d28eb7288", 113 | "metadata": { 114 | "height": 421 115 | }, 116 | "outputs": [], 117 | "source": [ 118 | "def keyword_search(query,\n", 119 | " results_lang='en',\n", 120 | " properties = [\"title\",\"url\",\"text\"],\n", 121 | " num_results=3):\n", 122 | "\n", 123 | " where_filter = {\n", 124 | " \"path\": [\"lang\"],\n", 125 | " \"operator\": \"Equal\",\n", 126 | " \"valueString\": results_lang\n", 127 | " }\n", 128 | " \n", 129 | " response = (\n", 130 | " client.query.get(\"Articles\", properties)\n", 131 | " .with_bm25(\n", 132 | " query=query\n", 133 | " )\n", 134 | " .with_where(where_filter)\n", 135 | " .with_limit(num_results)\n", 136 | " .do()\n", 137 | " )\n", 138 | "\n", 
139 | " result = response['data']['Get']['Articles']\n", 140 | " return result" 141 | ] 142 | }, 143 | { 144 | "cell_type": "code", 145 | "execution_count": null, 146 | "id": "5a9e70d9-326f-443c-915a-5a97c0dfe54b", 147 | "metadata": { 148 | "collapsed": true, 149 | "height": 81, 150 | "jupyter": { 151 | "outputs_hidden": true 152 | } 153 | }, 154 | "outputs": [], 155 | "source": [ 156 | "query = \"What is the most viewed televised event?\"\n", 157 | "keyword_search_results = keyword_search(query)\n", 158 | "print(keyword_search_results)" 159 | ] 160 | }, 161 | { 162 | "cell_type": "markdown", 163 | "id": "b15686a7", 164 | "metadata": {}, 165 | "source": [ 166 | "### Try modifying the search options\n", 167 | "- Other languages to try: `en, de, fr, es, it, ja, ar, zh, ko, hi`" 168 | ] 169 | }, 170 | { 171 | "cell_type": "code", 172 | "execution_count": null, 173 | "id": "260397fc-2799-426d-8b82-afbe9c3a126e", 174 | "metadata": { 175 | "height": 47 176 | }, 177 | "outputs": [], 178 | "source": [ 179 | "properties = [\"text\", \"title\", \"url\", \n", 180 | " \"views\", \"lang\"]" 181 | ] 182 | }, 183 | { 184 | "cell_type": "code", 185 | "execution_count": null, 186 | "id": "2a2daa0b-dd23-4ffd-8b3d-d08b73e052b9", 187 | "metadata": { 188 | "height": 166 189 | }, 190 | "outputs": [], 191 | "source": [ 192 | "def print_result(result):\n", 193 | " \"\"\" Print results with colorful formatting \"\"\"\n", 194 | " for i,item in enumerate(result):\n", 195 | " print(f'item {i}')\n", 196 | " for key in item.keys():\n", 197 | " print(f\"{key}:{item.get(key)}\")\n", 198 | " print()\n", 199 | " print()" 200 | ] 201 | }, 202 | { 203 | "cell_type": "code", 204 | "execution_count": null, 205 | "id": "6bdb1635-adad-411b-8059-d6b8b251fb2c", 206 | "metadata": { 207 | "collapsed": true, 208 | "height": 30, 209 | "jupyter": { 210 | "outputs_hidden": true 211 | }, 212 | "scrolled": true 213 | }, 214 | "outputs": [], 215 | "source": [ 216 | "print_result(keyword_search_results)" 217 | ] 218 
| }, 219 | { 220 | "cell_type": "code", 221 | "execution_count": null, 222 | "id": "2350fa0e-f896-42af-bcb1-a7577f9c9be1", 223 | "metadata": { 224 | "height": 81 225 | }, 226 | "outputs": [], 227 | "source": [ 228 | "query = \"What is the most viewed televised event?\"\n", 229 | "keyword_search_results = keyword_search(query, results_lang='de')\n", 230 | "print_result(keyword_search_results)" 231 | ] 232 | }, 233 | { 234 | "cell_type": "code", 235 | "execution_count": null, 236 | "id": "99269036-ec4e-4852-85fc-19593df6e638", 237 | "metadata": { 238 | "height": 30 239 | }, 240 | "outputs": [], 241 | "source": [] 242 | }, 243 | { 244 | "cell_type": "code", 245 | "execution_count": null, 246 | "id": "64bb115e-bfe4-4d0c-99a6-a8ef980365f7", 247 | "metadata": { 248 | "height": 30 249 | }, 250 | "outputs": [], 251 | "source": [] 252 | }, 253 | { 254 | "cell_type": "markdown", 255 | "id": "f40ca604-5ab7-4f35-86b1-0420842b6b6b", 256 | "metadata": {}, 257 | "source": [ 258 | "## How to get your own API key\n", 259 | "\n", 260 | "For this course, an API key is provided for you. If you would like to develop projects with Cohere's API outside of this classroom, you can register for an API key [here](https://dashboard.cohere.ai/welcome/register?utm_source=partner&utm_medium=website&utm_campaign=DeeplearningAI)." 
261 | ] 262 | } 263 | ], 264 | "metadata": { 265 | "kernelspec": { 266 | "display_name": "Python 3 (ipykernel)", 267 | "language": "python", 268 | "name": "python3" 269 | }, 270 | "language_info": { 271 | "codemirror_mode": { 272 | "name": "ipython", 273 | "version": 3 274 | }, 275 | "file_extension": ".py", 276 | "mimetype": "text/x-python", 277 | "name": "python", 278 | "nbconvert_exporter": "python", 279 | "pygments_lexer": "ipython3", 280 | "version": "3.9.17" 281 | } 282 | }, 283 | "nbformat": 4, 284 | "nbformat_minor": 5 285 | } 286 | -------------------------------------------------------------------------------- /02_Embeddings.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "11cc2042", 6 | "metadata": {}, 7 | "source": [ 8 | "# Lesson 2: Embeddings" 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "id": "14c47bc7", 14 | "metadata": {}, 15 | "source": [ 16 | "### Setup\n", 17 | "Load needed API keys and relevant Python libraries." 
18 | ] 19 | }, 20 | { 21 | "cell_type": "code", 22 | "execution_count": null, 23 | "id": "8831c1e6", 24 | "metadata": { 25 | "height": 30 26 | }, 27 | "outputs": [], 28 | "source": [ 29 | "# !pip install cohere umap-learn altair datasets" 30 | ] 31 | }, 32 | { 33 | "cell_type": "code", 34 | "execution_count": null, 35 | "id": "23d4483b", 36 | "metadata": { 37 | "height": 81 38 | }, 39 | "outputs": [], 40 | "source": [ 41 | "import os\n", 42 | "from dotenv import load_dotenv, find_dotenv\n", 43 | "_ = load_dotenv(find_dotenv()) # read local .env file" 44 | ] 45 | }, 46 | { 47 | "cell_type": "code", 48 | "execution_count": null, 49 | "id": "14cee683", 50 | "metadata": { 51 | "height": 47 52 | }, 53 | "outputs": [], 54 | "source": [ 55 | "import cohere\n", 56 | "co = cohere.Client(os.environ['COHERE_API_KEY'])" 57 | ] 58 | }, 59 | { 60 | "cell_type": "code", 61 | "execution_count": null, 62 | "id": "7cd186b9", 63 | "metadata": { 64 | "height": 30 65 | }, 66 | "outputs": [], 67 | "source": [ 68 | "import pandas as pd" 69 | ] 70 | }, 71 | { 72 | "cell_type": "markdown", 73 | "id": "a66e9572", 74 | "metadata": {}, 75 | "source": [ 76 | "## Word Embeddings\n", 77 | "\n", 78 | "Consider a very small dataset of three words." 
79 | ] 80 | }, 81 | { 82 | "cell_type": "code", 83 | "execution_count": null, 84 | "id": "3ab6a806", 85 | "metadata": { 86 | "height": 149 87 | }, 88 | "outputs": [], 89 | "source": [ 90 | "three_words = pd.DataFrame({'text':\n", 91 | " [\n", 92 | " 'joy',\n", 93 | " 'happiness',\n", 94 | " 'potato'\n", 95 | " ]})\n", 96 | "\n", 97 | "three_words" 98 | ] 99 | }, 100 | { 101 | "cell_type": "markdown", 102 | "id": "e27c4adb", 103 | "metadata": {}, 104 | "source": [ 105 | "Let's create the embeddings for the three words:" 106 | ] 107 | }, 108 | { 109 | "cell_type": "code", 110 | "execution_count": null, 111 | "id": "22d30ec7", 112 | "metadata": { 113 | "height": 64 114 | }, 115 | "outputs": [], 116 | "source": [ 117 | "three_words_emb = co.embed(texts=list(three_words['text']),\n", 118 | " model='embed-english-v2.0').embeddings" 119 | ] 120 | }, 121 | { 122 | "cell_type": "code", 123 | "execution_count": null, 124 | "id": "5f179777", 125 | "metadata": { 126 | "height": 64 127 | }, 128 | "outputs": [], 129 | "source": [ 130 | "word_1 = three_words_emb[0]\n", 131 | "word_2 = three_words_emb[1]\n", 132 | "word_3 = three_words_emb[2]" 133 | ] 134 | }, 135 | { 136 | "cell_type": "code", 137 | "execution_count": null, 138 | "id": "69da1290", 139 | "metadata": { 140 | "height": 30 141 | }, 142 | "outputs": [], 143 | "source": [ 144 | "word_1[:10]" 145 | ] 146 | }, 147 | { 148 | "cell_type": "markdown", 149 | "id": "81adb0d9", 150 | "metadata": {}, 151 | "source": [ 152 | "## Sentence Embeddings" 153 | ] 154 | }, 155 | { 156 | "cell_type": "markdown", 157 | "id": "575da3a8", 158 | "metadata": {}, 159 | "source": [ 160 | "Consider a very small dataset of eight sentences." 
161 | ] 162 | }, 163 | { 164 | "cell_type": "code", 165 | "execution_count": null, 166 | "id": "ca6539bc", 167 | "metadata": { 168 | "height": 234 169 | }, 170 | "outputs": [], 171 | "source": [ 172 | "sentences = pd.DataFrame({'text':\n", 173 | " [\n", 174 | " 'Where is the world cup?',\n", 175 | " 'The world cup is in Qatar',\n", 176 | " 'What color is the sky?',\n", 177 | " 'The sky is blue',\n", 178 | " 'Where does the bear live?',\n", 179 | " 'The bear lives in the woods',\n", 180 | " 'What is an apple?',\n", 181 | " 'An apple is a fruit',\n", 182 | " ]})\n", 183 | "\n", 184 | "sentences" 185 | ] 186 | }, 187 | { 188 | "cell_type": "markdown", 189 | "id": "05733ed4", 190 | "metadata": {}, 191 | "source": [ 192 | "Let's create the embeddings for the eight sentences:" 193 | ] 194 | }, 195 | { 196 | "cell_type": "code", 197 | "execution_count": null, 198 | "id": "ef89a105", 199 | "metadata": { 200 | "height": 132 201 | }, 202 | "outputs": [], 203 | "source": [ 204 | "emb = co.embed(texts=list(sentences['text']),\n", 205 | " model='embed-english-v2.0').embeddings\n", 206 | "\n", 207 | "# Explore the first 3 entries of each sentence's embedding:\n", 208 | "for e in emb:\n", 209 | " print(e[:3])" 210 | ] 211 | }, 212 | { 213 | "cell_type": "code", 214 | "execution_count": null, 215 | "id": "2c33c078", 216 | "metadata": { 217 | "height": 30 218 | }, 219 | "outputs": [], 220 | "source": [ 221 | "len(emb[0])" 222 | ] 223 | }, 224 | { 225 | "cell_type": "code", 226 | "execution_count": null, 227 | "id": "a2b96e44", 228 | "metadata": { 229 | "height": 47 230 | }, 231 | "outputs": [], 232 | "source": [ 233 | "#import umap\n", 234 | "#import altair as alt" 235 | ] 236 | }, 237 | { 238 | "cell_type": "code", 239 | "execution_count": null, 240 | "id": "eeb8c945", 241 | "metadata": { 242 | "height": 30 243 | }, 244 | "outputs": [], 245 | "source": [ 246 | "from utils import umap_plot" 247 | ] 248 | }, 249 | { 250 | "cell_type": "code", 251 | "execution_count": 
null, 252 | "id": "de8a8509", 253 | "metadata": { 254 | "height": 30 255 | }, 256 | "outputs": [], 257 | "source": [ 258 | "chart = umap_plot(sentences, emb)" 259 | ] 260 | }, 261 | { 262 | "cell_type": "code", 263 | "execution_count": null, 264 | "id": "93a581c4", 265 | "metadata": { 266 | "height": 30 267 | }, 268 | "outputs": [], 269 | "source": [ 270 | "chart.interactive()" 271 | ] 272 | }, 273 | { 274 | "cell_type": "markdown", 275 | "id": "3cfb0192", 276 | "metadata": {}, 277 | "source": [ 278 | "## Articles Embeddings" 279 | ] 280 | }, 281 | { 282 | "cell_type": "code", 283 | "execution_count": null, 284 | "id": "dfff2ced", 285 | "metadata": { 286 | "height": 64 287 | }, 288 | "outputs": [], 289 | "source": [ 290 | "import pandas as pd\n", 291 | "wiki_articles = pd.read_pickle('wikipedia.pkl')\n", 292 | "wiki_articles" 293 | ] 294 | }, 295 | { 296 | "cell_type": "code", 297 | "execution_count": null, 298 | "id": "e9bde94a", 299 | "metadata": { 300 | "height": 47 301 | }, 302 | "outputs": [], 303 | "source": [ 304 | "import numpy as np\n", 305 | "from utils import umap_plot_big" 306 | ] 307 | }, 308 | { 309 | "cell_type": "code", 310 | "execution_count": null, 311 | "id": "874cf116", 312 | "metadata": { 313 | "height": 115 314 | }, 315 | "outputs": [], 316 | "source": [ 317 | "articles = wiki_articles[['title', 'text']]\n", 318 | "embeds = np.array([d for d in wiki_articles['emb']])\n", 319 | "\n", 320 | "chart = umap_plot_big(articles, embeds)\n", 321 | "chart.interactive()" 322 | ] 323 | }, 324 | { 325 | "cell_type": "code", 326 | "execution_count": null, 327 | "id": "387c8901", 328 | "metadata": { 329 | "height": 30 330 | }, 331 | "outputs": [], 332 | "source": [] 333 | }, 334 | { 335 | "cell_type": "code", 336 | "execution_count": null, 337 | "id": "cc3c9b2d", 338 | "metadata": { 339 | "height": 30 340 | }, 341 | "outputs": [], 342 | "source": [] 343 | }, 344 | { 345 | "cell_type": "code", 346 | "execution_count": null, 347 | "id": "d8b338c7", 348 | 
"metadata": { 349 | "height": 30 350 | }, 351 | "outputs": [], 352 | "source": [] 353 | }, 354 | { 355 | "cell_type": "code", 356 | "execution_count": null, 357 | "id": "b9e30d7b", 358 | "metadata": { 359 | "height": 30 360 | }, 361 | "outputs": [], 362 | "source": [] 363 | }, 364 | { 365 | "cell_type": "code", 366 | "execution_count": null, 367 | "id": "ec708a6b", 368 | "metadata": { 369 | "height": 30 370 | }, 371 | "outputs": [], 372 | "source": [] 373 | } 374 | ], 375 | "metadata": { 376 | "kernelspec": { 377 | "display_name": "Python 3 (ipykernel)", 378 | "language": "python", 379 | "name": "python3" 380 | }, 381 | "language_info": { 382 | "codemirror_mode": { 383 | "name": "ipython", 384 | "version": 3 385 | }, 386 | "file_extension": ".py", 387 | "mimetype": "text/x-python", 388 | "name": "python", 389 | "nbconvert_exporter": "python", 390 | "pygments_lexer": "ipython3", 391 | "version": "3.9.17" 392 | } 393 | }, 394 | "nbformat": 4, 395 | "nbformat_minor": 5 396 | } 397 | -------------------------------------------------------------------------------- /03_Dense_Retrieval.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "272e659b", 6 | "metadata": {}, 7 | "source": [ 8 | "# Dense Retrieval" 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "id": "a00e3a75", 14 | "metadata": {}, 15 | "source": [ 16 | "## Setup\n", 17 | "\n", 18 | "Load needed API keys and relevant Python libaries." 
19 | ] 20 | }, 21 | { 22 | "cell_type": "code", 23 | "execution_count": null, 24 | "id": "84214662", 25 | "metadata": {}, 26 | "outputs": [], 27 | "source": [ 28 | "# !pip install cohere \n", 29 | "# !pip install weaviate-client Annoy" 30 | ] 31 | }, 32 | { 33 | "cell_type": "code", 34 | "execution_count": null, 35 | "id": "e876e494-818f-439e-a1be-b50535cf09f5", 36 | "metadata": {}, 37 | "outputs": [], 38 | "source": [ 39 | "import os\n", 40 | "from dotenv import load_dotenv, find_dotenv\n", 41 | "_ = load_dotenv(find_dotenv()) # read local .env file" 42 | ] 43 | }, 44 | { 45 | "cell_type": "code", 46 | "execution_count": null, 47 | "id": "dc7f9f83-c295-4114-aca4-8b0ddbc64ddb", 48 | "metadata": {}, 49 | "outputs": [], 50 | "source": [ 51 | "import cohere\n", 52 | "co = cohere.Client(os.environ['COHERE_API_KEY'])" 53 | ] 54 | }, 55 | { 56 | "cell_type": "code", 57 | "execution_count": null, 58 | "id": "cfa7aa9f-e3e8-44d9-a4ac-461d71d8c7f4", 59 | "metadata": {}, 60 | "outputs": [], 61 | "source": [ 62 | "import weaviate\n", 63 | "auth_config = weaviate.auth.AuthApiKey(\n", 64 | " api_key=os.environ['WEAVIATE_API_KEY'])" 65 | ] 66 | }, 67 | { 68 | "cell_type": "code", 69 | "execution_count": null, 70 | "id": "d8474f81-011c-485a-9f24-827f81165fee", 71 | "metadata": {}, 72 | "outputs": [], 73 | "source": [ 74 | "client = weaviate.Client(\n", 75 | " url=os.environ['WEAVIATE_API_URL'],\n", 76 | " auth_client_secret=auth_config,\n", 77 | " additional_headers={\n", 78 | " \"X-Cohere-Api-Key\": os.environ['COHERE_API_KEY'],\n", 79 | " }\n", 80 | ")\n", 81 | "client.is_ready() #check if True" 82 | ] 83 | }, 84 | { 85 | "cell_type": "markdown", 86 | "id": "75df20b5", 87 | "metadata": {}, 88 | "source": [ 89 | "## Part 1: Vector Database for Semantic Search" 90 | ] 91 | }, 92 | { 93 | "cell_type": "code", 94 | "execution_count": null, 95 | "id": "1a44f246-2fdb-4986-9a23-401f22825647", 96 | "metadata": {}, 97 | "outputs": [], 98 | "source": [ 99 | "def dense_retrieval(query, 
\n", 100 | " results_lang='en', \n", 101 | " properties = [\"text\", \"title\", \"url\", \"views\", \"lang\", \"_additional {distance}\"],\n", 102 | " num_results=5):\n", 103 | "\n", 104 | " nearText = {\"concepts\": [query]}\n", 105 | " \n", 106 | " # To filter by language\n", 107 | " where_filter = {\n", 108 | " \"path\": [\"lang\"],\n", 109 | " \"operator\": \"Equal\",\n", 110 | " \"valueString\": results_lang\n", 111 | " }\n", 112 | " response = (\n", 113 | " client.query\n", 114 | " .get(\"Articles\", properties)\n", 115 | " .with_near_text(nearText)\n", 116 | " .with_where(where_filter)\n", 117 | " .with_limit(num_results)\n", 118 | " .do()\n", 119 | " )\n", 120 | "\n", 121 | " result = response['data']['Get']['Articles']\n", 122 | "\n", 123 | " return result" 124 | ] 125 | }, 126 | { 127 | "cell_type": "code", 128 | "execution_count": null, 129 | "id": "3c7d2791", 130 | "metadata": {}, 131 | "outputs": [], 132 | "source": [ 133 | "from utils import print_result" 134 | ] 135 | }, 136 | { 137 | "cell_type": "markdown", 138 | "id": "344956dc", 139 | "metadata": {}, 140 | "source": [ 141 | "### Basic Query" 142 | ] 143 | }, 144 | { 145 | "cell_type": "code", 146 | "execution_count": null, 147 | "id": "ca9b7b93-dfcc-464a-ac9e-41bfa8117a99", 148 | "metadata": {}, 149 | "outputs": [], 150 | "source": [ 151 | "query = \"Who wrote Hamlet?\"\n", 152 | "dense_retrieval_results = dense_retrieval(query)\n", 153 | "print_result(dense_retrieval_results)" 154 | ] 155 | }, 156 | { 157 | "cell_type": "markdown", 158 | "id": "319111ec", 159 | "metadata": {}, 160 | "source": [ 161 | "### Medium Query" 162 | ] 163 | }, 164 | { 165 | "cell_type": "code", 166 | "execution_count": null, 167 | "id": "1c1145db-f3c0-4178-8f76-69bd78a8b779", 168 | "metadata": { 169 | "jupyter": { 170 | "outputs_hidden": true 171 | } 172 | }, 173 | "outputs": [], 174 | "source": [ 175 | "query = \"What is the capital of Canada?\"\n", 176 | "dense_retrieval_results = dense_retrieval(query)\n", 177 | 
"print_result(dense_retrieval_results)" 178 | ] 179 | }, 180 | { 181 | "cell_type": "code", 182 | "execution_count": null, 183 | "id": "dda4f775-d95a-4408-872b-a0d9cf0c7a68", 184 | "metadata": { 185 | "jupyter": { 186 | "outputs_hidden": true 187 | } 188 | }, 189 | "outputs": [], 190 | "source": [ 191 | "from utils import keyword_search\n", 192 | "\n", 193 | "query = \"What is the capital of Canada?\"\n", 194 | "keyword_search_results = keyword_search(query, client)\n", 195 | "print_result(keyword_search_results)" 196 | ] 197 | }, 198 | { 199 | "cell_type": "markdown", 200 | "id": "529e8a97", 201 | "metadata": {}, 202 | "source": [ 203 | "### Complicated Query" 204 | ] 205 | }, 206 | { 207 | "cell_type": "code", 208 | "execution_count": null, 209 | "id": "9deb9d52-6da8-48f3-92e2-baf922afdbb9", 210 | "metadata": {}, 211 | "outputs": [], 212 | "source": [ 213 | "from utils import keyword_search\n", 214 | "\n", 215 | "query = \"Tallest person in history?\"\n", 216 | "keyword_search_results = keyword_search(query, client)\n", 217 | "print_result(keyword_search_results)" 218 | ] 219 | }, 220 | { 221 | "cell_type": "code", 222 | "execution_count": null, 223 | "id": "334fd577-5832-4ccc-9ac1-1ed32c6913ab", 224 | "metadata": { 225 | "jupyter": { 226 | "outputs_hidden": true 227 | } 228 | }, 229 | "outputs": [], 230 | "source": [ 231 | "query = \"Tallest person in history\"\n", 232 | "dense_retrieval_results = dense_retrieval(query)\n", 233 | "print_result(dense_retrieval_results)" 234 | ] 235 | }, 236 | { 237 | "cell_type": "code", 238 | "execution_count": null, 239 | "id": "a52ea636-1343-4587-85ce-0c291e5fecbc", 240 | "metadata": { 241 | "jupyter": { 242 | "outputs_hidden": true 243 | } 244 | }, 245 | "outputs": [], 246 | "source": [ 247 | "query = \"أطول رجل في التاريخ\"\n", 248 | "dense_retrieval_results = dense_retrieval(query)\n", 249 | "print_result(dense_retrieval_results)" 250 | ] 251 | }, 252 | { 253 | "cell_type": "code", 254 | "execution_count": null, 255 | "id": 
"134a5df0-3b43-4f03-8078-cfad28dcbaf5", 256 | "metadata": { 257 | "jupyter": { 258 | "outputs_hidden": true 259 | } 260 | }, 261 | "outputs": [], 262 | "source": [ 263 | "query = \"film about a time travel paradox\"\n", 264 | "dense_retrieval_results = dense_retrieval(query)\n", 265 | "print_result(dense_retrieval_results)" 266 | ] 267 | }, 268 | { 269 | "cell_type": "markdown", 270 | "id": "d3ed0397", 271 | "metadata": {}, 272 | "source": [ 273 | "## Part 2: Building Semantic Search from Scratch\n", 274 | "\n", 275 | "### Get the text archive:" 276 | ] 277 | }, 278 | { 279 | "cell_type": "code", 280 | "execution_count": null, 281 | "id": "b1782247-3b70-4698-8863-7b2e73b10bb5", 282 | "metadata": {}, 283 | "outputs": [], 284 | "source": [ 285 | "from annoy import AnnoyIndex\n", 286 | "import numpy as np\n", 287 | "import pandas as pd\n", 288 | "import re" 289 | ] 290 | }, 291 | { 292 | "cell_type": "code", 293 | "execution_count": null, 294 | "id": "664b5661-8e1c-4858-bbd9-5370737318e0", 295 | "metadata": {}, 296 | "outputs": [], 297 | "source": [ 298 | "text = \"\"\"\n", 299 | "Interstellar is a 2014 epic science fiction film co-written, directed, and produced by Christopher Nolan.\n", 300 | "It stars Matthew McConaughey, Anne Hathaway, Jessica Chastain, Bill Irwin, Ellen Burstyn, Matt Damon, and Michael Caine.\n", 301 | "Set in a dystopian future where humanity is struggling to survive, the film follows a group of astronauts who travel through a wormhole near Saturn in search of a new home for mankind.\n", 302 | "\n", 303 | "Brothers Christopher and Jonathan Nolan wrote the screenplay, which had its origins in a script Jonathan developed in 2007.\n", 304 | "Caltech theoretical physicist and 2017 Nobel laureate in Physics[4] Kip Thorne was an executive producer, acted as a scientific consultant, and wrote a tie-in book, The Science of Interstellar.\n", 305 | "Cinematographer Hoyte van Hoytema shot it on 35 mm movie film in the Panavision anamorphic format and IMAX 
70 mm.\n", 306 | "Principal photography began in late 2013 and took place in Alberta, Iceland, and Los Angeles.\n", 307 | "Interstellar uses extensive practical and miniature effects and the company Double Negative created additional digital effects.\n", 308 | "\n", 309 | "Interstellar premiered on October 26, 2014, in Los Angeles.\n", 310 | "In the United States, it was first released on film stock, expanding to venues using digital projectors.\n", 311 | "The film had a worldwide gross over $677 million (and $773 million with subsequent re-releases), making it the tenth-highest grossing film of 2014.\n", 312 | "It received acclaim for its performances, direction, screenplay, musical score, visual effects, ambition, themes, and emotional weight.\n", 313 | "It has also received praise from many astronomers for its scientific accuracy and portrayal of theoretical astrophysics. Since its premiere, Interstellar gained a cult following,[5] and now is regarded by many sci-fi experts as one of the best science-fiction films of all time.\n", 314 | "Interstellar was nominated for five awards at the 87th Academy Awards, winning Best Visual Effects, and received numerous other accolades\"\"\"" 315 | ] 316 | }, 317 | { 318 | "cell_type": "markdown", 319 | "id": "55f38f89", 320 | "metadata": {}, 321 | "source": [ 322 | "### Chunking: " 323 | ] 324 | }, 325 | { 326 | "cell_type": "code", 327 | "execution_count": null, 328 | "id": "44d495f8-8db7-4aee-bfe6-5f3cd46b2217", 329 | "metadata": {}, 330 | "outputs": [], 331 | "source": [ 332 | "# Split into a list of sentences\n", 333 | "texts = text.split('.')\n", 334 | "\n", 335 | "# Clean up to remove empty spaces and new lines\n", 336 | "texts = np.array([t.strip(' \\n') for t in texts])" 337 | ] 338 | }, 339 | { 340 | "cell_type": "code", 341 | "execution_count": null, 342 | "id": "c378fcac-9b19-451c-ab68-8c6b69b7ac2d", 343 | "metadata": {}, 344 | "outputs": [], 345 | "source": [ 346 | "texts" 347 | ] 348 | }, 349 | { 350 | 
"cell_type": "code", 351 | "execution_count": null, 352 | "id": "3f850709", 353 | "metadata": {}, 354 | "outputs": [], 355 | "source": [ 356 | "# Split into a list of paragraphs\n", 357 | "texts = text.split('\\n\\n')\n", 358 | "\n", 359 | "# Clean up to remove empty spaces and new lines\n", 360 | "texts = np.array([t.strip(' \\n') for t in texts])" 361 | ] 362 | }, 363 | { 364 | "cell_type": "code", 365 | "execution_count": null, 366 | "id": "7b41bc71", 367 | "metadata": {}, 368 | "outputs": [], 369 | "source": [ 370 | "texts" 371 | ] 372 | }, 373 | { 374 | "cell_type": "code", 375 | "execution_count": null, 376 | "id": "4f4e167d", 377 | "metadata": {}, 378 | "outputs": [], 379 | "source": [ 380 | "# Split into a list of sentences\n", 381 | "texts = text.split('.')\n", 382 | "\n", 383 | "# Clean up to remove empty spaces and new lines\n", 384 | "texts = np.array([t.strip(' \\n') for t in texts])" 385 | ] 386 | }, 387 | { 388 | "cell_type": "code", 389 | "execution_count": null, 390 | "id": "907ad923-a086-4746-b3c8-bdd7e5ff30ad", 391 | "metadata": {}, 392 | "outputs": [], 393 | "source": [ 394 | "title = 'Interstellar (film)'\n", 395 | "\n", 396 | "texts = np.array([f\"{title} {t}\" for t in texts])" 397 | ] 398 | }, 399 | { 400 | "cell_type": "code", 401 | "execution_count": null, 402 | "id": "630eb9b9-eced-4716-8684-ad028a437057", 403 | "metadata": {}, 404 | "outputs": [], 405 | "source": [ 406 | "texts" 407 | ] 408 | }, 409 | { 410 | "cell_type": "markdown", 411 | "id": "3a26b43a", 412 | "metadata": {}, 413 | "source": [ 414 | "### Get the embeddings:" 415 | ] 416 | }, 417 | { 418 | "cell_type": "code", 419 | "execution_count": null, 420 | "id": "fbbc0303-e929-4eb4-aa95-0c41042e19c9", 421 | "metadata": {}, 422 | "outputs": [], 423 | "source": [ 424 | "response = co.embed(\n", 425 | " texts=texts.tolist()\n", 426 | ").embeddings" 427 | ] 428 | }, 429 | { 430 | "cell_type": "code", 431 | "execution_count": null, 432 | "id": "84d7961f-5075-4549-84ac-a013d875b709", 
433 | "metadata": {}, 434 | "outputs": [], 435 | "source": [ 436 | "embeds = np.array(response)\n", 437 | "embeds.shape" 438 | ] 439 | }, 440 | { 441 | "cell_type": "markdown", 442 | "id": "39e19dee", 443 | "metadata": {}, 444 | "source": [ 445 | "### Create the search index:" 446 | ] 447 | }, 448 | { 449 | "cell_type": "code", 450 | "execution_count": null, 451 | "id": "7cadadeb-f74f-4145-8a62-de184ffd9ead", 452 | "metadata": {}, 453 | "outputs": [], 454 | "source": [ 455 | "search_index = AnnoyIndex(embeds.shape[1], 'angular')\n", 456 | "# Add all the vectors to the search index\n", 457 | "for i in range(len(embeds)):\n", 458 | " search_index.add_item(i, embeds[i])\n", 459 | "\n", 460 | "search_index.build(10) # 10 trees\n", 461 | "search_index.save('test.ann')" 462 | ] 463 | }, 464 | { 465 | "cell_type": "code", 466 | "execution_count": null, 467 | "id": "86e7267a-d130-4ee2-8498-4ba2ad2eebd0", 468 | "metadata": {}, 469 | "outputs": [], 470 | "source": [ 471 | "pd.set_option('display.max_colwidth', None)\n", 472 | "\n", 473 | "def search(query):\n", 474 | "\n", 475 | " # Get the query's embedding\n", 476 | " query_embed = co.embed(texts=[query]).embeddings\n", 477 | "\n", 478 | " # Retrieve the nearest neighbors\n", 479 | " similar_item_ids = search_index.get_nns_by_vector(query_embed[0],\n", 480 | " 3,\n", 481 | " include_distances=True)\n", 482 | " # Format the results\n", 483 | " results = pd.DataFrame(data={'texts': texts[similar_item_ids[0]],\n", 484 | " 'distance': similar_item_ids[1]})\n", 485 | "\n", 486 | " print(texts[similar_item_ids[0]])\n", 487 | " \n", 488 | " return results" 489 | ] 490 | }, 491 | { 492 | "cell_type": "code", 493 | "execution_count": null, 494 | "id": "6b932b43-5fa4-4f23-a627-29deb80f0bd0", 495 | "metadata": {}, 496 | "outputs": [], 497 | "source": [ 498 | "query = \"How much did the film make?\"\n", 499 | "search(query)" 500 | ] 501 | }, 502 | { 503 | "cell_type": "code", 504 | "execution_count": null, 505 | "id": 
"a1e4c088-aecd-4e7d-9475-79fdf4f1e3ff", 506 | "metadata": {}, 507 | "outputs": [], 508 | "source": [] 509 | }, 510 | { 511 | "cell_type": "code", 512 | "execution_count": null, 513 | "id": "e3502566-f976-4f29-9ac3-ceb8251506f3", 514 | "metadata": {}, 515 | "outputs": [], 516 | "source": [] 517 | }, 518 | { 519 | "cell_type": "code", 520 | "execution_count": null, 521 | "id": "b8301f90-f6e8-4f2e-a3f0-c528824e797a", 522 | "metadata": {}, 523 | "outputs": [], 524 | "source": [] 525 | }, 526 | { 527 | "cell_type": "code", 528 | "execution_count": null, 529 | "id": "72b88367-b571-4e50-aa4f-c754fc31ab7f", 530 | "metadata": {}, 531 | "outputs": [], 532 | "source": [] 533 | }, 534 | { 535 | "cell_type": "code", 536 | "execution_count": null, 537 | "id": "d0e71d39-8f59-4d0e-a2fd-aa0eb0cd1b84", 538 | "metadata": {}, 539 | "outputs": [], 540 | "source": [] 541 | }, 542 | { 543 | "cell_type": "code", 544 | "execution_count": null, 545 | "id": "ded6ecfa-91f4-48c2-9237-6904f7f630d4", 546 | "metadata": {}, 547 | "outputs": [], 548 | "source": [] 549 | }, 550 | { 551 | "cell_type": "code", 552 | "execution_count": null, 553 | "id": "87ef1f57-be42-47aa-a132-0731aeaef4fc", 554 | "metadata": {}, 555 | "outputs": [], 556 | "source": [] 557 | }, 558 | { 559 | "cell_type": "code", 560 | "execution_count": null, 561 | "id": "aed0699a-ce3c-4529-b596-b17ff9da51e1", 562 | "metadata": {}, 563 | "outputs": [], 564 | "source": [] 565 | }, 566 | { 567 | "cell_type": "code", 568 | "execution_count": null, 569 | "id": "b0e5ddb7-41fa-4eae-b077-2264be76c258", 570 | "metadata": {}, 571 | "outputs": [], 572 | "source": [] 573 | }, 574 | { 575 | "cell_type": "code", 576 | "execution_count": null, 577 | "id": "5821387b-fd64-4e12-8541-d7bca94ba9d9", 578 | "metadata": {}, 579 | "outputs": [], 580 | "source": [] 581 | }, 582 | { 583 | "cell_type": "code", 584 | "execution_count": null, 585 | "id": "f96720ad-4b11-4038-b7a4-0d2ea5a34379", 586 | "metadata": {}, 587 | "outputs": [], 588 | "source": [] 589 | } 
590 | ], 591 | "metadata": { 592 | "kernelspec": { 593 | "display_name": "Python 3 (ipykernel)", 594 | "language": "python", 595 | "name": "python3" 596 | }, 597 | "language_info": { 598 | "codemirror_mode": { 599 | "name": "ipython", 600 | "version": 3 601 | }, 602 | "file_extension": ".py", 603 | "mimetype": "text/x-python", 604 | "name": "python", 605 | "nbconvert_exporter": "python", 606 | "pygments_lexer": "ipython3", 607 | "version": "3.10.9" 608 | } 609 | }, 610 | "nbformat": 4, 611 | "nbformat_minor": 5 612 | } 613 | -------------------------------------------------------------------------------- /04_Rerank.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "0ead654a", 6 | "metadata": {}, 7 | "source": [ 8 | "# ReRank" 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "id": "99f6a6f7", 14 | "metadata": {}, 15 | "source": [ 16 | "## Setup\n", 17 | "\n", 18 | "Load needed API keys and relevant Python libraries." 
19 | ] 20 | }, 21 | { 22 | "cell_type": "code", 23 | "execution_count": 2, 24 | "id": "f350cd1b", 25 | "metadata": {}, 26 | "outputs": [], 27 | "source": [ 28 | "# !pip install cohere \n", 29 | "# !pip install weaviate-client" 30 | ] 31 | }, 32 | { 33 | "cell_type": "code", 34 | "execution_count": 71, 35 | "id": "b2febbb9-27dd-4209-838a-99b4f9cdf51b", 36 | "metadata": {}, 37 | "outputs": [], 38 | "source": [ 39 | "import os\n", 40 | "from dotenv import load_dotenv, find_dotenv\n", 41 | "_ = load_dotenv(find_dotenv()) # read local .env file" 42 | ] 43 | }, 44 | { 45 | "cell_type": "code", 46 | "execution_count": 72, 47 | "id": "dab2ecba-3403-4317-86ef-bd6d92a6cb46", 48 | "metadata": {}, 49 | "outputs": [], 50 | "source": [ 51 | "import cohere\n", 52 | "co = cohere.Client(os.environ['COHERE_API_KEY'])" 53 | ] 54 | }, 55 | { 56 | "cell_type": "code", 57 | "execution_count": 73, 58 | "id": "30737b1b-e4c8-4bd0-a04b-c2ce70d28821", 59 | "metadata": {}, 60 | "outputs": [], 61 | "source": [ 62 | "import weaviate\n", 63 | "auth_config = weaviate.auth.AuthApiKey(\n", 64 | " api_key=os.environ['WEAVIATE_API_KEY'])" 65 | ] 66 | }, 67 | { 68 | "cell_type": "code", 69 | "execution_count": 74, 70 | "id": "8781f638-17c7-4ab7-86b5-3763d4d5abad", 71 | "metadata": {}, 72 | "outputs": [], 73 | "source": [ 74 | "client = weaviate.Client(\n", 75 | " url=os.environ['WEAVIATE_API_URL'],\n", 76 | " auth_client_secret=auth_config,\n", 77 | " additional_headers={\n", 78 | " \"X-Cohere-Api-Key\": os.environ['COHERE_API_KEY'],\n", 79 | " }\n", 80 | ")" 81 | ] 82 | }, 83 | { 84 | "cell_type": "markdown", 85 | "id": "ffcc8e5e", 86 | "metadata": {}, 87 | "source": [ 88 | "## Dense Retrieval" 89 | ] 90 | }, 91 | { 92 | "cell_type": "code", 93 | "execution_count": 77, 94 | "id": "b8561fbf-035e-4856-a97f-8eda21d32a81", 95 | "metadata": {}, 96 | "outputs": [], 97 | "source": [ 98 | "from utils import dense_retrieval" 99 | ] 100 | }, 101 | { 102 | "cell_type": "code", 103 | "execution_count": 78, 104 | 
"id": "15694a5c-3525-49cc-b5e9-d1c34ae0fbe9", 105 | "metadata": {}, 106 | "outputs": [], 107 | "source": [ 108 | "query = \"What is the capital of Canada?\"" 109 | ] 110 | }, 111 | { 112 | "cell_type": "code", 113 | "execution_count": 126, 114 | "id": "6dfede25-8a43-41c9-9328-d331695c4fcb", 115 | "metadata": {}, 116 | "outputs": [], 117 | "source": [ 118 | "dense_retrieval_results = dense_retrieval(query, client)" 119 | ] 120 | }, 121 | { 122 | "cell_type": "code", 123 | "execution_count": 127, 124 | "id": "1822cc6c-ddc2-4938-b746-7cda2506d51e", 125 | "metadata": {}, 126 | "outputs": [], 127 | "source": [ 128 | "from utils import print_result" 129 | ] 130 | }, 131 | { 132 | "cell_type": "code", 133 | "execution_count": 128, 134 | "id": "2990c5c4-1b63-453e-8dd8-8568cb7872f5", 135 | "metadata": {}, 136 | "outputs": [ 137 | { 138 | "name": "stdout", 139 | "output_type": "stream", 140 | "text": [ 141 | "item 0\n", 142 | "_additional:{'distance': -150.8129}\n", 143 | "\n", 144 | "lang:en\n", 145 | "\n", 146 | "text:The governor general of the province had designated Kingston as the capital in 1841. However, the major population centres of Toronto and Montreal, as well as the former capital of Lower Canada, Quebec City, all had legislators dissatisfied with Kingston. Anglophone merchants in Quebec were the main group supportive of the Kingston arrangement. In 1842, a vote rejected Kingston as the capital, and study of potential candidates included the then-named Bytown, but that option proved less popular than Toronto or Montreal. In 1843, a report of the Executive Council recommended Montreal as the capital as a more fortifiable location and commercial centre, however, the Governor General refused to execute a move without a parliamentary vote. 
In 1844, the Queen's acceptance of a parliamentary vote moved the capital to Montreal.\n", 147 | "\n", 148 | "title:Ottawa\n", 149 | "\n", 150 | "url:https://en.wikipedia.org/wiki?curid=22219\n", 151 | "\n", 152 | "views:2000\n", 153 | "\n", 154 | "\n", 155 | "item 1\n", 156 | "_additional:{'distance': -150.29314}\n", 157 | "\n", 158 | "lang:en\n", 159 | "\n", 160 | "text:For brief periods, Toronto was twice the capital of the united Province of Canada: first from 1849 to 1852, following unrest in Montreal, and later 1856–1858. After this date, Quebec was designated as the capital until 1866 (one year before Canadian Confederation). Since then, the capital of Canada has remained Ottawa, Ontario.\n", 161 | "\n", 162 | "title:Toronto\n", 163 | "\n", 164 | "url:https://en.wikipedia.org/wiki?curid=64646\n", 165 | "\n", 166 | "views:3000\n", 167 | "\n", 168 | "\n", 169 | "item 2\n", 170 | "_additional:{'distance': -150.03601}\n", 171 | "\n", 172 | "lang:en\n", 173 | "\n", 174 | "text:Selection of Ottawa as the capital of Canada predates the Confederation of Canada. The selection was contentious and not straightforward, with the parliament of the United Province of Canada holding more than 200 votes over several decades to attempt to settle on a legislative solution to the location of the capital.\n", 175 | "\n", 176 | "title:Ottawa\n", 177 | "\n", 178 | "url:https://en.wikipedia.org/wiki?curid=22219\n", 179 | "\n", 180 | "views:2000\n", 181 | "\n", 182 | "\n", 183 | "item 3\n", 184 | "_additional:{'distance': -149.92947}\n", 185 | "\n", 186 | "lang:en\n", 187 | "\n", 188 | "text:Until the late 18th century Québec was the most populous city in present-day Canada. As of the census of 1790, Montreal surpassed it with 18,000 inhabitants, but Quebec (pop. 14,000) remained the administrative capital of New France. It was then made the capital of Lower Canada by the Constitutional Act of 1791. 
From 1841 to 1867, the capital of the Province of Canada rotated between Kingston, Montreal, Toronto, Ottawa and Quebec City (from 1852 to 1856 and from 1859 to 1866).\n", 189 | "\n", 190 | "title:Quebec City\n", 191 | "\n", 192 | "url:https://en.wikipedia.org/wiki?curid=100727\n", 193 | "\n", 194 | "views:2000\n", 195 | "\n", 196 | "\n", 197 | "item 4\n", 198 | "_additional:{'distance': -149.7189}\n", 199 | "\n", 200 | "lang:en\n", 201 | "\n", 202 | "text:The Quebec Conference on Canadian Confederation was held in the city in 1864. In 1867, Queen Victoria chose Ottawa as the definite capital of the Dominion of Canada, while Quebec City was confirmed as the capital of the newly created province of Quebec.\n", 203 | "\n", 204 | "title:Quebec City\n", 205 | "\n", 206 | "url:https://en.wikipedia.org/wiki?curid=100727\n", 207 | "\n", 208 | "views:2000\n", 209 | "\n", 210 | "\n", 211 | "item 5\n", 212 | "_additional:{'distance': -149.35098}\n", 213 | "\n", 214 | "lang:en\n", 215 | "\n", 216 | "text:Montreal was the capital of the Province of Canada from 1844 to 1849, but lost its status when a Tory mob burnt down the Parliament building to protest the passage of the Rebellion Losses Bill. Thereafter, the capital rotated between Quebec City and Toronto until in 1857, Queen Victoria herself established Ottawa as the capital due to strategic reasons. The reasons were twofold. First, because it was located more in the interior of the Province of Canada, it was less susceptible to attack from the United States. Second, and perhaps more importantly, because it lay on the border between French and English Canada, Ottawa was seen as a compromise between Montreal, Toronto, Kingston and Quebec City, which were all vying to become the young nation's official capital. 
Ottawa retained the status as capital of Canada when the Province of Canada joined with Nova Scotia and New Brunswick to form the Dominion of Canada in 1867.\n", 217 | "\n", 218 | "title:Montreal\n", 219 | "\n", 220 | "url:https://en.wikipedia.org/wiki?curid=7954681\n", 221 | "\n", 222 | "views:3000\n", 223 | "\n", 224 | "\n", 225 | "item 6\n", 226 | "_additional:{'distance': -148.69257}\n", 227 | "\n", 228 | "lang:en\n", 229 | "\n", 230 | "text:Ottawa was chosen as the capital for two primary reasons. First, Ottawa's isolated location, surrounded by dense forest far from the Canada–US border and situated on a cliff face, would make it more defensible from attack. Second, Ottawa was approximately midway between Toronto and Kingston (in Canada West) and Montreal and Quebec City (in Canada East) making the selection an important political compromise.\n", 231 | "\n", 232 | "title:Ottawa\n", 233 | "\n", 234 | "url:https://en.wikipedia.org/wiki?curid=22219\n", 235 | "\n", 236 | "views:2000\n", 237 | "\n", 238 | "\n", 239 | "item 7\n", 240 | "_additional:{'distance': -148.68246}\n", 241 | "\n", 242 | "lang:en\n", 243 | "\n", 244 | "text:A number of buildings across Canada are reserved by the Crown for the use of the monarch and his viceroys. Each is called \"Government House\", but may be customarily known by some specific name. The sovereign's and governor general's official residences are Rideau Hall in Ottawa and the Citadelle in Quebec City. Each of these royal seats holds pieces from the Crown Collection. 
Further, though neither was ever used for their intended purpose, Hatley Castle in British Columbia was purchased in 1940 by King George VI in Right of Canada to use as his home during the course of the Second World War and the Emergency Government Headquarters, built in 1959 at CFS Carp and decommissioned in 1994, included a residential apartment for the sovereign or governor general in the case of a nuclear attack on Ottawa.\n", 245 | "\n", 246 | "title:Monarchy of Canada\n", 247 | "\n", 248 | "url:https://en.wikipedia.org/wiki?curid=56504\n", 249 | "\n", 250 | "views:2000\n", 251 | "\n", 252 | "\n", 253 | "item 8\n", 254 | "_additional:{'distance': -148.63625}\n", 255 | "\n", 256 | "lang:en\n", 257 | "\n", 258 | "text:Ottawa (, ; ) is the capital city of Canada. It is located at the confluence of the Ottawa River and the Rideau River in the southern portion of the province of Ontario. Ottawa borders Gatineau, Quebec, and forms the core of the Ottawa–Gatineau census metropolitan area (CMA) and the National Capital Region (NCR). Ottawa had a city population of 1,017,449 and a metropolitan population of 1,488,307, making it the fourth-largest city and fourth-largest metropolitan area in Canada.\n", 259 | "\n", 260 | "title:Ottawa\n", 261 | "\n", 262 | "url:https://en.wikipedia.org/wiki?curid=22219\n", 263 | "\n", 264 | "views:2000\n", 265 | "\n", 266 | "\n", 267 | "item 9\n", 268 | "_additional:{'distance': -148.31482}\n", 269 | "\n", 270 | "lang:en\n", 271 | "\n", 272 | "text:Toronto, the capital of Ontario, is the centre of Canada's financial services and banking industry. Neighbouring cities are home to product distribution, IT centres, and manufacturing industries. 
Canada's Federal Government is the largest single employer in the National Capital Region, which centres on the border cities of Ontario's Ottawa and Quebec's Gatineau.\n", 273 | "\n", 274 | "title:Ontario\n", 275 | "\n", 276 | "url:https://en.wikipedia.org/wiki?curid=22218\n", 277 | "\n", 278 | "views:3000\n", 279 | "\n", 280 | "\n", 281 | "item 10\n", 282 | "_additional:{'distance': -147.95712}\n", 283 | "\n", 284 | "lang:en\n", 285 | "\n", 286 | "text:Ottawa is the political centre of Canada and headquarters to the federal government. The city houses numerous foreign embassies, key buildings, organizations, and institutions of Canada's government, including the Parliament of Canada, the Supreme Court, the residence of Canada's viceroy, and Office of the Prime Minister.\n", 287 | "\n", 288 | "title:Ottawa\n", 289 | "\n", 290 | "url:https://en.wikipedia.org/wiki?curid=22219\n", 291 | "\n", 292 | "views:2000\n", 293 | "\n", 294 | "\n", 295 | "item 11\n", 296 | "_additional:{'distance': -147.67686}\n", 297 | "\n", 298 | "lang:en\n", 299 | "\n", 300 | "text:Canada is a country in North America. Its ten provinces and three territories extend from the Atlantic Ocean to the Pacific Ocean and northward into the Arctic Ocean, covering over , making it the world's second-largest country by total area. Its southern and western border with the United States, stretching , is the world's longest binational land border. Canada's capital is Ottawa, and its three largest metropolitan areas are Toronto, Montreal, and Vancouver.\n", 301 | "\n", 302 | "title:Canada\n", 303 | "\n", 304 | "url:https://en.wikipedia.org/wiki?curid=5042916\n", 305 | "\n", 306 | "views:4000\n", 307 | "\n", 308 | "\n", 309 | "item 12\n", 310 | "_additional:{'distance': -147.3634}\n", 311 | "\n", 312 | "lang:en\n", 313 | "\n", 314 | "text:In 1849, after violence in Montreal a series of votes was held, with Kingston and Bytown both again considered as capitals. 
However, the successful proposal was for two cities to share capital status, and the legislature to alternate sitting in each: Quebec City and Toronto, in a policy known as perambulation. Logistical difficulties made this an unpopular arrangement, and although an 1856 vote passed for the lower house of parliament to relocate permanently to Quebec City, the upper house refused to approve funding.\n", 315 | "\n", 316 | "title:Ottawa\n", 317 | "\n", 318 | "url:https://en.wikipedia.org/wiki?curid=22219\n", 319 | "\n", 320 | "views:2000\n", 321 | "\n", 322 | "\n", 323 | "item 13\n", 324 | "_additional:{'distance': -146.92775}\n", 325 | "\n", 326 | "lang:en\n", 327 | "\n", 328 | "text:Montreal has the second-largest economy of Canadian cities based on GDP and the largest in Quebec. In 2014, Metropolitan Montreal was responsible for of Quebec's GDP. The city is today an important centre of commerce, finance, industry, technology, culture, world affairs and is the headquarters of the Montreal Exchange. In recent decades, the city was widely seen as weaker than that of Toronto and other major Canadian cities, but it has recently experienced a revival.\n", 329 | "\n", 330 | "title:Montreal\n", 331 | "\n", 332 | "url:https://en.wikipedia.org/wiki?curid=7954681\n", 333 | "\n", 334 | "views:3000\n", 335 | "\n", 336 | "\n", 337 | "item 14\n", 338 | "_additional:{'distance': -146.91176}\n", 339 | "\n", 340 | "lang:en\n", 341 | "\n", 342 | "text:Quebec City ( or ; ), officially Québec (), is the capital city of the Canadian province of Quebec. As of July 2021, the city had a population of 549,459, and the metropolitan area had a population of 839,311. It is the eleventh-largest city and the seventh-largest metropolitan area in Canada. It is also the second-largest city in the province after Montreal. 
It has a humid continental climate with warm summers coupled with cold and snowy winters.\n", 343 | "\n", 344 | "title:Quebec City\n", 345 | "\n", 346 | "url:https://en.wikipedia.org/wiki?curid=100727\n", 347 | "\n", 348 | "views:2000\n", 349 | "\n", 350 | "\n", 351 | "item 15\n", 352 | "_additional:{'distance': -146.8559}\n", 353 | "\n", 354 | "lang:en\n", 355 | "\n", 356 | "text:The prime minister of Canada () is the head of government of Canada. Under the Westminster system, the prime minister governs with the confidence of a majority the elected House of Commons; as such, the prime minister typically sits as a member of Parliament (MP) and leads the largest party or a coalition of parties. As first minister, the prime minister selects ministers to form the Cabinet, and serves as its chair. Constitutionally, the Crown exercises executive power on the advice of the Cabinet, which is collectively responsible to the House of Commons.\n", 357 | "\n", 358 | "title:Prime Minister of Canada\n", 359 | "\n", 360 | "url:https://en.wikipedia.org/wiki?curid=24135\n", 361 | "\n", 362 | "views:2000\n", 363 | "\n", 364 | "\n", 365 | "item 16\n", 366 | "_additional:{'distance': -146.75198}\n", 367 | "\n", 368 | "lang:en\n", 369 | "\n", 370 | "text:St. John's, the capital and largest city of Newfoundland and Labrador, is Canada's 22nd-largest census metropolitan area and it is home to about 40% of the province's population. St. 
John's is the seat of the House of Assembly of Newfoundland and Labrador as well as the jurisdiction's highest court, the Newfoundland and Labrador Court of Appeal.\n", 371 | "\n", 372 | "title:Newfoundland and Labrador\n", 373 | "\n", 374 | "url:https://en.wikipedia.org/wiki?curid=21980\n", 375 | "\n", 376 | "views:2000\n", 377 | "\n", 378 | "\n", 379 | "item 17\n", 380 | "_additional:{'distance': -146.7267}\n", 381 | "\n", 382 | "lang:en\n", 383 | "\n", 384 | "text:London was named for the British capital of London by John Graves Simcoe, who also named the local river the Thames, in 1793. Simcoe had intended London to be the capital of Upper Canada. Guy Carleton (Governor Dorchester) rejected that plan after the War of 1812, but accepted Simcoe's second choice, the present site of Toronto, to become the capital city of what would become the Province of Ontario, at Confederation, on 1 July 1867.\n", 385 | "\n", 386 | "title:London, Ontario\n", 387 | "\n", 388 | "url:https://en.wikipedia.org/wiki?curid=40353\n", 389 | "\n", 390 | "views:2000\n", 391 | "\n", 392 | "\n", 393 | "item 18\n", 394 | "_additional:{'distance': -146.70326}\n", 395 | "\n", 396 | "lang:en\n", 397 | "\n", 398 | "text:The funding impasse led to the ending of the legislature's role in determining the seat of government. The legislature requested the Queen make the determination of the seat of government. The Queen then acted on the advice of her governor general Edmund Head, who, after reviewing proposals from various cities, selected the recently renamed Ottawa. The Queen sent a letter to colonial authorities selecting Ottawa as the capital, effective December 31, 1857. George Brown, briefly a co-premier of the Province of Canada, attempted to reverse this decision, but was unsuccessful. The Queen's choice was ratified by the Parliament in 1859, with Quebec serving as interim capital from 1859 to 1865. 
The relocation process began in 1865, with the first session of Parliament held in the new buildings in 1866, and the buildings were generally well received by legislators.\n", 399 | "\n", 400 | "title:Ottawa\n", 401 | "\n", 402 | "url:https://en.wikipedia.org/wiki?curid=22219\n", 403 | "\n", 404 | "views:2000\n", 405 | "\n", 406 | "\n", 407 | "item 19\n", 408 | "_additional:{'distance': -146.62202}\n", 409 | "\n", 410 | "lang:en\n", 411 | "\n", 412 | "text:By federal law (Air Canada Public Participation Act), Air Canada has been obligated to keep its head office in Montreal. Its corporate headquarters is Air Canada Centre (French: \"Centre Air Canada\"), also known as La Rondelle (\"The Puck\" in French), a 7-storey building located on the grounds of Montréal–Trudeau International Airport in Saint-Laurent.\n", 413 | "\n", 414 | "title:Air Canada\n", 415 | "\n", 416 | "url:https://en.wikipedia.org/wiki?curid=145623\n", 417 | "\n", 418 | "views:2000\n", 419 | "\n", 420 | "\n", 421 | "item 20\n", 422 | "_additional:{'distance': -146.50558}\n", 423 | "\n", 424 | "lang:en\n", 425 | "\n", 426 | "text:Montreal ( ; officially Montréal, ) is the second-most populous city in Canada and most populous city in the Canadian province of Quebec. Founded in 1642 as \"Ville-Marie\", or \"City of Mary\", it is named after Mount Royal, the triple-peaked hill around which the early city of Ville-Marie is built. The city is centred on the Island of Montreal, which obtained its name from the same origin as the city, and a few much smaller peripheral islands, the largest of which is Île Bizard. 
The city is east of the national capital Ottawa, and southwest of the provincial capital, Quebec City.\n", 427 | "\n", 428 | "title:Montreal\n", 429 | "\n", 430 | "url:https://en.wikipedia.org/wiki?curid=7954681\n", 431 | "\n", 432 | "views:3000\n", 433 | "\n", 434 | "\n", 435 | "item 21\n", 436 | "_additional:{'distance': -146.47559}\n", 437 | "\n", 438 | "lang:en\n", 439 | "\n", 440 | "text:The administrative region in which it is situated is officially referred to as Capitale-Nationale, and the term \"national capital\" is used to refer to Quebec City itself at the provincial level.\n", 441 | "\n", 442 | "title:Quebec City\n", 443 | "\n", 444 | "url:https://en.wikipedia.org/wiki?curid=100727\n", 445 | "\n", 446 | "views:2000\n", 447 | "\n", 448 | "\n", 449 | "item 22\n", 450 | "_additional:{'distance': -146.45479}\n", 451 | "\n", 452 | "lang:en\n", 453 | "\n", 454 | "text:Air Canada is the flag carrier and the largest airline of Canada by size and passengers carried. Air Canada maintains its headquarters in the borough of Saint-Laurent, Montreal, Quebec. The airline, founded in 1937, provides scheduled and charter air transport for passengers and cargo to 222 destinations worldwide. It is a founding member of the Star Alliance. Air Canada's major hubs are at Montréal–Trudeau International Airport (YUL), Toronto Pearson International Airport (YYZ), Calgary International Airport (YYC), and Vancouver International Airport (YVR). The airline's regional service is Air Canada Express.\n", 455 | "\n", 456 | "title:Air Canada\n", 457 | "\n", 458 | "url:https://en.wikipedia.org/wiki?curid=145623\n", 459 | "\n", 460 | "views:2000\n", 461 | "\n", 462 | "\n", 463 | "item 23\n", 464 | "_additional:{'distance': -146.34396}\n", 465 | "\n", 466 | "lang:en\n", 467 | "\n", 468 | "text:Across the Ottawa River, which forms the border between Ontario and Quebec, lies the city of Gatineau, itself the result of amalgamation of the former Quebec cities of Hull and Aylmer. 
Although formally and administratively separate cities in two separate provinces, Ottawa and Gatineau (along with a number of nearby municipalities) collectively constitute the National Capital Region, which is considered a single metropolitan area. One federal Crown corporation, the National Capital Commission, or NCC, has significant land holdings in both cities, including sites of historical and touristic importance. The NCC, through its responsibility for planning and development of these lands, has a key role in shaping the development of the city. Around the main urban area is an extensive greenbelt, administered by the NCC for conservation and leisure, and comprising mostly forest, farmland and marshland.\n", 469 | "\n", 470 | "title:Ottawa\n", 471 | "\n", 472 | "url:https://en.wikipedia.org/wiki?curid=22219\n", 473 | "\n", 474 | "views:2000\n", 475 | "\n", 476 | "\n", 477 | "item 24\n", 478 | "_additional:{'distance': -146.22684}\n", 479 | "\n", 480 | "lang:en\n", 481 | "\n", 482 | "text:As the national capital of Canada, tourism is an important part of Ottawa's economy, particularly after the 150th anniversary of Canada which was centred in Ottawa. The lead-up to the festivities saw much investment in civic infrastructure, upgrades to tourist infrastructure and increases in national cultural attractions. The National Capital Region annually attracts an estimated 22 million tourists, who spend about 2.2 billion dollars and support 30,600 jobs directly.\n", 483 | "\n", 484 | "title:Ottawa\n", 485 | "\n", 486 | "url:https://en.wikipedia.org/wiki?curid=22219\n", 487 | "\n", 488 | "views:2000\n", 489 | "\n", 490 | "\n", 491 | "item 25\n", 492 | "_additional:{'distance': -146.18494}\n", 493 | "\n", 494 | "lang:en\n", 495 | "\n", 496 | "text:In Canada, Calgary has the second-highest concentration of head offices in Canada (behind Toronto), the most head offices per capita, and the highest head office revenue per capita. 
Some large employers with Calgary head offices include Canada Safeway Limited, Westfair Foods Ltd., Suncor Energy, Agrium, Flint Energy Services Ltd., Shaw Communications, and Canadian Pacific Railway. CPR moved its head office from Montreal in 1996 and Imperial Oil moved from Toronto in 2005. Encana's new 58-floor corporate headquarters, the Bow, became the tallest building in Canada outside of Toronto. In 2001, the city became the corporate headquarters of the TSX Venture Exchange.\n", 497 | "\n", 498 | "title:Calgary\n", 499 | "\n", 500 | "url:https://en.wikipedia.org/wiki?curid=15895358\n", 501 | "\n", 502 | "views:2000\n", 503 | "\n", 504 | "\n", 505 | "item 26\n", 506 | "_additional:{'distance': -146.10544}\n", 507 | "\n", 508 | "lang:en\n", 509 | "\n", 510 | "text:The Crown is the pinnacle of the Canadian Forces, with the constitution placing the monarch in the position of commander-in-chief of the entire force, though the governor general carries out the duties attached to the position and also bears the title of \"Commander-in-Chief in and over Canada\". Further, included in Canada's constitution are the various treaties between the Crown and Canadian First Nations, Inuit, and Métis peoples, who view these documents as agreements directly and only between themselves and the reigning monarch, illustrating the relationship between sovereign and aboriginals.\n", 511 | "\n", 512 | "title:Monarchy of Canada\n", 513 | "\n", 514 | "url:https://en.wikipedia.org/wiki?curid=56504\n", 515 | "\n", 516 | "views:2000\n", 517 | "\n", 518 | "\n", 519 | "item 27\n", 520 | "_additional:{'distance': -146.09323}\n", 521 | "\n", 522 | "lang:en\n", 523 | "\n", 524 | "text:The city currently has one professional team, the baseball team Capitales de Québec, which plays in the Frontier League in downtown's Stade Canac. The team was established in 1999 and originally played in the Northern League. It has seven league titles, won in 2006, 2009, 2010, 2011, 2012, 2013 and 2017. 
A professional basketball team, the Quebec Kebs, played in National Basketball League of Canada in 2011 but folded before the 2012 season, and a semi-professional soccer team, the Dynamo de Québec, played in the Première ligue de soccer du Québec, until 2019.\n", 525 | "\n", 526 | "title:Quebec City\n", 527 | "\n", 528 | "url:https://en.wikipedia.org/wiki?curid=100727\n", 529 | "\n", 530 | "views:2000\n", 531 | "\n", 532 | "\n", 533 | "item 28\n", 534 | "_additional:{'distance': -146.02489}\n", 535 | "\n", 536 | "lang:en\n", 537 | "\n", 538 | "text:Quebec ( ; ) is one of the thirteen provinces and territories of Canada. It is the largest province by area and the second-largest by population. Much of the population lives in urban areas along the St. Lawrence River, between the most populous city, Montreal, and the provincial capital, Quebec City. Quebec is the home of the Québécois nation. Located in Central Canada, the province shares land borders with Ontario to the west, Newfoundland and Labrador to the northeast, New Brunswick to the southeast, and a coastal border with Nunavut; in the south it borders Maine, New Hampshire, Vermont, and New York in the United States.\n", 539 | "\n", 540 | "title:Quebec\n", 541 | "\n", 542 | "url:https://en.wikipedia.org/wiki?curid=7954867\n", 543 | "\n", 544 | "views:3000\n", 545 | "\n", 546 | "\n", 547 | "item 29\n", 548 | "_additional:{'distance': -146.00629}\n", 549 | "\n", 550 | "lang:en\n", 551 | "\n", 552 | "text:The monarchy of Canada is Canada's form of government embodied by the Canadian sovereign and head of state. It is at the core of Canada's constitutional federal structure and Westminster-style parliamentary democracy. The monarchy is the foundation of the executive (King-in-Council), legislative (King-in-Parliament), and judicial (King-on-the-Bench) branches of both federal and provincial jurisdictions. 
The king of Canada since 8 September 2022 has been Charles III.\n", 553 | "\n", 554 | "title:Monarchy of Canada\n", 555 | "\n", 556 | "url:https://en.wikipedia.org/wiki?curid=56504\n", 557 | "\n", 558 | "views:2000\n", 559 | "\n", 560 | "\n", 561 | "item 30\n", 562 | "_additional:{'distance': -145.96207}\n", 563 | "\n", 564 | "lang:en\n", 565 | "\n", 566 | "text:Because the prime minister is in practice the most politically powerful member of the Canadian government, they are sometimes erroneously referred to as Canada's head of state, when, in fact, that role belongs to the Canadian monarch, represented by the governor general. The prime minister is, instead, the head of government and is responsible for advising the Crown on how to exercise much of the royal prerogative and its executive powers, which are governed by the constitution and its conventions. However, the function of the prime minister has evolved with increasing power. Today, per the doctrines of constitutional monarchy, the advice given by the prime minister is ordinarily binding, meaning the prime minister effectively carries out those duties ascribed to the sovereign or governor general, leaving the latter to act in predominantly ceremonial fashions. As such, the prime minister, supported by the Office of the Prime Minister (PMO), controls the appointments of many key figures in Canada's system of governance, including the governor general, the Cabinet, justices of the Supreme Court, senators, heads of Crown corporations, ambassadors and high commissioners, the provincial lieutenant governors, and approximately 3,100 other positions. 
Further, the prime minister plays a prominent role in the legislative process—with the majority of bills put before Parliament originating in the Cabinet—and the leadership of the Canadian Armed Forces.\n", 567 | "\n", 568 | "title:Prime Minister of Canada\n", 569 | "\n", 570 | "url:https://en.wikipedia.org/wiki?curid=24135\n", 571 | "\n", 572 | "views:2000\n", 573 | "\n", 574 | "\n", 575 | "item 31\n", 576 | "_additional:{'distance': -145.96161}\n", 577 | "\n", 578 | "lang:en\n", 579 | "\n", 580 | "text:Canada is a parliamentary democracy and a constitutional monarchy in the Westminster tradition. The country's head of government is the prime minister, who holds office by virtue of their ability to command the confidence of the elected House of Commons, and is appointed by the governor general, representing the monarch of Canada, the head of state. The country is a Commonwealth realm and is officially bilingual (English and French) at the federal level. It ranks among the highest in international measurements of government transparency, civil liberties, quality of life, economic freedom, education, gender equality and environmental sustainability. It is one of the world's most ethnically diverse and multicultural nations, the product of large-scale immigration. Canada's long and complex relationship with the United States has had a significant impact on its economy and culture.\n", 581 | "\n", 582 | "title:Canada\n", 583 | "\n", 584 | "url:https://en.wikipedia.org/wiki?curid=5042916\n", 585 | "\n", 586 | "views:4000\n", 587 | "\n", 588 | "\n", 589 | "item 32\n", 590 | "_additional:{'distance': -145.95244}\n", 591 | "\n", 592 | "lang:en\n", 593 | "\n", 594 | "text:As access to new lands remained problematic because they were still monopolized by the Clique du Château, an exodus of Canadiens towards New England began and went on for the next one hundred years. 
This phenomenon is known as the Grande Hémorragie and greatly threatened the survival of the Canadien nation. The massive British immigration ordered from London that soon followed the failed rebellion compounded this problem. In order to combat this, the Church adopted the revenge of the cradle policy. In 1844, the capital of the Province of Canada was moved from Kingston to Montreal.\n", 595 | "\n", 596 | "title:Quebec\n", 597 | "\n", 598 | "url:https://en.wikipedia.org/wiki?curid=7954867\n", 599 | "\n", 600 | "views:3000\n", 601 | "\n", 602 | "\n", 603 | "item 33\n", 604 | "_additional:{'distance': -145.82161}\n", 605 | "\n", 606 | "lang:en\n", 607 | "\n", 608 | "text:Ottawa is known as the most educated city in Canada, with over half the population having graduated from college and/or university. Ottawa has the highest per capita concentration of engineers, scientists, and residents with PhDs in Canada. The city has two main public universities, and two main public colleges.\n", 609 | "\n", 610 | "title:Ottawa\n", 611 | "\n", 612 | "url:https://en.wikipedia.org/wiki?curid=22219\n", 613 | "\n", 614 | "views:2000\n", 615 | "\n", 616 | "\n", 617 | "item 34\n", 618 | "_additional:{'distance': -145.80045}\n", 619 | "\n", 620 | "lang:en\n", 621 | "\n", 622 | "text:Ottawa's primary employers are the Public Service of Canada and the high-tech industry, although tourism and healthcare also represent increasingly sizeable economic activities. The federal government is the city's largest employer, employing over 116,000 individuals from the National Capital Region. The national headquarters for many federal departments are in Ottawa, particularly throughout Centretown and in the Terrasses de la Chaudière and Place du Portage complexes in Hull. The National Defence Headquarters in Ottawa is the main command centre for the Canadian Armed Forces and hosts the Department of National Defence. 
During the summer, the city hosts the Ceremonial Guard, which performs functions such as the Changing the Guard.\n", 623 | "\n", 624 | "title:Ottawa\n", 625 | "\n", 626 | "url:https://en.wikipedia.org/wiki?curid=22219\n", 627 | "\n", 628 | "views:2000\n", 629 | "\n", 630 | "\n", 631 | "item 35\n", 632 | "_additional:{'distance': -145.77827}\n", 633 | "\n", 634 | "lang:en\n", 635 | "\n", 636 | "text:Historically the commercial capital of Canada, Montreal was surpassed in population and in economic strength by Toronto in the 1970s. It remains an important centre of commerce, aerospace, transport, finance, pharmaceuticals, technology, design, education, art, culture, tourism, food, fashion, video game development, film, and world affairs. Montreal is the location of the headquarters of the International Civil Aviation Organization, and was named a UNESCO City of Design in 2006. In 2017, Montreal was ranked the 12th-most liveable city in the world by the Economist Intelligence Unit in its annual Global Liveability Ranking, although it slipped to rank 40 in the 2021 index, primarily due to stress on the healthcare system from the COVID-19 pandemic. It is regularly ranked as a top ten city in the world to be a university student in the QS World University Rankings.\n", 637 | "\n", 638 | "title:Montreal\n", 639 | "\n", 640 | "url:https://en.wikipedia.org/wiki?curid=7954681\n", 641 | "\n", 642 | "views:3000\n", 643 | "\n", 644 | "\n", 645 | "item 36\n", 646 | "_additional:{'distance': -145.69058}\n", 647 | "\n", 648 | "lang:en\n", 649 | "\n", 650 | "text:Alberta's capital city, Edmonton, is located at about the geographic centre of the province. It is the most northerly major city in Canada and serves as a gateway and hub for resource development in northern Canada. With its proximity to Canada's largest oil fields, the region has most of western Canada's oil refinery capacity. 
Calgary is about south of Edmonton and north of Montana, surrounded by extensive ranching country. Almost 75% of the province's population lives in the Calgary–Edmonton Corridor. The land grant policy to the railways served as a means to populate the province in its early years.\n", 651 | "\n", 652 | "title:Alberta\n", 653 | "\n", 654 | "url:https://en.wikipedia.org/wiki?curid=717\n", 655 | "\n", 656 | "views:2000\n", 657 | "\n", 658 | "\n", 659 | "item 37\n", 660 | "_additional:{'distance': -145.42075}\n", 661 | "\n", 662 | "lang:en\n", 663 | "\n", 664 | "text:Most jobs in Quebec City are concentrated in public administration, defence, services, commerce, transport and tourism. As the provincial capital, the city benefits from being a regional administrative and services centre: apropos, the provincial government is the largest employer in the city, employing 27,900 people as of 2007. CHUQ (the local hospital network) is the city's largest institutional employer, with more than 10,000 employees in 2007. The unemployment rate in June 2018 was 3.8%, below the national average (6.0%) and the second-lowest of Canada's 34 largest cities, behind Peterborough (2.7%).\n", 665 | "\n", 666 | "title:Quebec City\n", 667 | "\n", 668 | "url:https://en.wikipedia.org/wiki?curid=100727\n", 669 | "\n", 670 | "views:2000\n", 671 | "\n", 672 | "\n", 673 | "item 38\n", 674 | "_additional:{'distance': -145.40527}\n", 675 | "\n", 676 | "lang:en\n", 677 | "\n", 678 | "text:Toronto is an international centre for business and finance. Generally considered the financial and industrial capital of Canada, Toronto has a high concentration of banks and brokerage firms on Bay Street in the Financial District. The Toronto Stock Exchange is the world's seventh-largest stock exchange by market capitalization. 
The five largest financial institutions of Canada, collectively known as the Big Five, have national offices in Toronto.\n", 679 | "\n", 680 | "title:Toronto\n", 681 | "\n", 682 | "url:https://en.wikipedia.org/wiki?curid=64646\n", 683 | "\n", 684 | "views:3000\n", 685 | "\n", 686 | "\n", 687 | "item 39\n", 688 | "_additional:{'distance': -145.37503}\n", 689 | "\n", 690 | "lang:en\n", 691 | "\n", 692 | "text:Along with concrete high-rises such as Édifice Marie-Guyart and Le Concorde on parliament hill (see List of tallest buildings in Quebec City), the city's skyline is dominated by the massive Château Frontenac hotel, perched on top of Cap-Diamant. It was designed by architect Bruce Price, as one of a series of \"château\" style hotels built for the Canadian Pacific Railway company. The railway company sought to encourage luxury tourism and bring wealthy travellers to its trains.\n", 693 | "\n", 694 | "title:Quebec City\n", 695 | "\n", 696 | "url:https://en.wikipedia.org/wiki?curid=100727\n", 697 | "\n", 698 | "views:2000\n", 699 | "\n", 700 | "\n", 701 | "item 40\n", 702 | "_additional:{'distance': -145.36432}\n", 703 | "\n", 704 | "lang:en\n", 705 | "\n", 706 | "text:The name \"Canada\" refers to this settlement. Although the Acadian settlement at Port-Royal was established three years earlier, Quebec came to be known as the cradle of North America's Francophone population. The place seemed favourable to the establishment of a permanent colony.\n", 707 | "\n", 708 | "title:Quebec City\n", 709 | "\n", 710 | "url:https://en.wikipedia.org/wiki?curid=100727\n", 711 | "\n", 712 | "views:2000\n", 713 | "\n", 714 | "\n", 715 | "item 41\n", 716 | "_additional:{'distance': -145.33707}\n", 717 | "\n", 718 | "lang:en\n", 719 | "\n", 720 | "text:British Columbia's capital is Victoria, located at the southeastern tip of Vancouver Island. Only a narrow strip of Vancouver Island, from Campbell River to Victoria, is significantly populated. 
Much of the western part of Vancouver Island and the rest of the coast is covered by temperate rainforest.\n", 721 | "\n", 722 | "title:British Columbia\n", 723 | "\n", 724 | "url:https://en.wikipedia.org/wiki?curid=3392\n", 725 | "\n", 726 | "views:3000\n", 727 | "\n", 728 | "\n", 729 | "item 42\n", 730 | "_additional:{'distance': -145.33646}\n", 731 | "\n", 732 | "lang:en\n", 733 | "\n", 734 | "text:The King of Canada has delegated his prerogative to grant armorial bearings to the Governor General of Canada. Canada has its own Chief Herald and Herald Chancellor. The Canadian Heraldic Authority, the governmental agency which is responsible for creating arms and promoting Canadian heraldry, is situated at Rideau Hall.\n", 735 | "\n", 736 | "title:Coat of arms\n", 737 | "\n", 738 | "url:https://en.wikipedia.org/wiki?curid=55284\n", 739 | "\n", 740 | "views:2000\n", 741 | "\n", 742 | "\n", 743 | "item 43\n", 744 | "_additional:{'distance': -145.27737}\n", 745 | "\n", 746 | "lang:en\n", 747 | "\n", 748 | "text:Generally, Canadian provinces have steadily grown in population along with Canada. However, some provinces such as Saskatchewan, Prince Edward Island and Newfoundland and Labrador have experienced long periods of stagnation or population decline. Ontario and Quebec have always been the two biggest provinces in Canada, with together over 60% of the population at any given time. 
The population of the West relative to Canada as a whole has steadily grown over time, while that of Atlantic Canada has declined.\n", 749 | "\n", 750 | "title:Provinces and territories of Canada\n", 751 | "\n", 752 | "url:https://en.wikipedia.org/wiki?curid=75763\n", 753 | "\n", 754 | "views:3000\n", 755 | "\n", 756 | "\n", 757 | "item 44\n", 758 | "_additional:{'distance': -145.24377}\n", 759 | "\n", 760 | "lang:en\n", 761 | "\n", 762 | "text:As Canada's capital, Ottawa has played host to a number of significant cultural events in Canadian history, including the first visit of the reigning Canadian sovereign—King George VI, with his consort, Queen Elizabeth—to his parliament, on 19 May 1939. VE Day was marked with a large celebration on 8 May 1945, the first raising of the country's new national flag took place on 15 February 1965, and the centennial of Confederation was celebrated on 1 July 1967. Queen Elizabeth II was in Ottawa on 17 April 1982, to issue a royal proclamation of the enactment of the Constitution Act. In 1983, Prince Charles and Diana Princess of Wales came to Ottawa for a state dinner hosted by then Prime Minister Pierre Trudeau. In 2011, Ottawa was selected as the first city to receive Prince William, Duke of Cambridge, and Catherine, Duchess of Cambridge during their tour of Canada.\n", 763 | "\n", 764 | "title:Ottawa\n", 765 | "\n", 766 | "url:https://en.wikipedia.org/wiki?curid=22219\n", 767 | "\n", 768 | "views:2000\n", 769 | "\n", 770 | "\n", 771 | "item 45\n", 772 | "_additional:{'distance': -145.18558}\n", 773 | "\n", 774 | "lang:en\n", 775 | "\n", 776 | "text:By 1951, Montreal's population had surpassed one million. However, Toronto's growth had begun challenging Montreal's status as the economic capital of Canada. Indeed, the volume of stocks traded at the Toronto Stock Exchange had already surpassed that traded at the Montreal Stock Exchange in the 1940s. The Saint Lawrence Seaway opened in 1959, allowing vessels to bypass Montreal. 
In time, this development led to the end of the city's economic dominance as businesses moved to other areas. During the 1960s, there was continued growth as Canada's tallest skyscrapers, new expressways and the subway system known as the Montreal Metro were finished during this time. Montreal also held the World's Fair of 1967, better known as Expo67.\n", 777 | "\n", 778 | "title:Montreal\n", 779 | "\n", 780 | "url:https://en.wikipedia.org/wiki?curid=7954681\n", 781 | "\n", 782 | "views:3000\n", 783 | "\n", 784 | "\n", 785 | "item 46\n", 786 | "_additional:{'distance': -145.11502}\n", 787 | "\n", 788 | "lang:en\n", 789 | "\n", 790 | "text:Ontario ( ; ) is one of the thirteen provinces and territories of Canada. Located in Central Canada, it is Canada's most populous province, with 38.3 percent of the country's population, and is the second-largest province by total area (after Quebec). Ontario is Canada's fourth-largest jurisdiction in total area when the territories of the Northwest Territories and Nunavut are included. It is home to the nation's capital city, Ottawa, and the nation's most populous city, Toronto, which is Ontario's provincial capital.\n", 791 | "\n", 792 | "title:Ontario\n", 793 | "\n", 794 | "url:https://en.wikipedia.org/wiki?curid=22218\n", 795 | "\n", 796 | "views:3000\n", 797 | "\n", 798 | "\n", 799 | "item 47\n", 800 | "_additional:{'distance': -145.09877}\n", 801 | "\n", 802 | "lang:en\n", 803 | "\n", 804 | "text:In the heart of downtown are the British Columbia Parliament Buildings, The Empress Hotel, Victoria Police Department Station Museum, the gothic Christ Church Cathedral, and the Royal British Columbia Museum/IMAX National Geographic Theatre, with large exhibits on local Aboriginal peoples, natural history, and modern history, along with travelling international exhibits. In addition, the heart of downtown also has the Maritime Museum of British Columbia, Emily Carr House, Victoria Bug Zoo, and Market Square. 
The oldest (and most intact) Chinatown in Canada is within downtown. The Art Gallery of Greater Victoria is close to downtown in the Rockland neighbourhood several city blocks from Craigdarroch Castle built by industrialist Robert Dunsmuir and Government House, the official residence of the Lieutenant-Governor of British Columbia.\n", 805 | "\n", 806 | "title:Victoria, British Columbia\n", 807 | "\n", 808 | "url:https://en.wikipedia.org/wiki?curid=32388\n", 809 | "\n", 810 | "views:2000\n", 811 | "\n", 812 | "\n", 813 | "item 48\n", 814 | "_additional:{'distance': -145.04091}\n", 815 | "\n", 816 | "lang:en\n", 817 | "\n", 818 | "text:Toronto ( ; or ) is the capital city of the Canadian province of Ontario. With a recorded population of 2,794,356 in 2021, it is the most populous city in Canada and the fourth most populous city in North America. The city is the anchor of the Golden Horseshoe, an urban agglomeration of 9,765,188 people (as of 2021) surrounding the western end of Lake Ontario, while the Greater Toronto Area proper had a 2021 population of 6,712,341. Toronto is an international centre of business, finance, arts, sports and culture, and is recognized as one of the most multicultural and cosmopolitan cities in the world.\n", 819 | "\n", 820 | "title:Toronto\n", 821 | "\n", 822 | "url:https://en.wikipedia.org/wiki?curid=64646\n", 823 | "\n", 824 | "views:3000\n", 825 | "\n", 826 | "\n", 827 | "item 49\n", 828 | "_additional:{'distance': -145.01453}\n", 829 | "\n", 830 | "lang:en\n", 831 | "\n", 832 | "text:The prime minister is supported by the Prime Minister's Office and heads the Privy Council Office. The prime minister also effectively appoints individuals to the Senate of Canada and to the Supreme Court of Canada and other federal courts, along with choosing the leaders and boards, as required under law, of various Crown corporations. 
Under the \"Constitution Act, 1867\", government power is vested in the monarch (who is the head of state), but in practice the role of the monarch—or their representative, the governor general (or the administrator)—is largely ceremonial and only exercised on the advice of a Cabinet minister. The prime minister also provides advice to the monarch of Canada for the selection of the governor general.\n", 833 | "\n", 834 | "title:Prime Minister of Canada\n", 835 | "\n", 836 | "url:https://en.wikipedia.org/wiki?curid=24135\n", 837 | "\n", 838 | "views:2000\n", 839 | "\n", 840 | "\n", 841 | "item 50\n", 842 | "_additional:{'distance': -144.99521}\n", 843 | "\n", 844 | "lang:en\n", 845 | "\n", 846 | "text:The national flag of Canada (), often simply referred to as the Canadian flag or, unofficially, as the Maple Leaf or \"\" (; ), consists of a red field with a white square at its centre in the ratio of , in which is featured a stylized, red, 11-pointed maple leaf charged in the centre. It is the first flag to have been adopted by both houses of Parliament and officially proclaimed by the Canadian monarch as the country's official national flag. The flag has become the predominant and most recognizable national symbol of Canada.\n", 847 | "\n", 848 | "title:Flag of Canada\n", 849 | "\n", 850 | "url:https://en.wikipedia.org/wiki?curid=97066\n", 851 | "\n", 852 | "views:2000\n", 853 | "\n", 854 | "\n", 855 | "item 51\n", 856 | "_additional:{'distance': -144.94948}\n", 857 | "\n", 858 | "lang:en\n", 859 | "\n", 860 | "text:Since the days of King Louis XIV, the monarch is the fount of all honours in Canada and the orders, decorations, and medals form \"an integral element of the Crown.\" Hence, the insignia and medallions for these awards bear a crown, cypher, and/or portrait of the monarch. Similarly, the country's heraldic authority was created by the Queen and, operating under the authority of the governor general, grants new coats of arms, flags, and badges in Canada. 
Use of the royal crown in such symbols is a gift from the monarch showing royal support and/or association, and requires her approval before being added.\n", 861 | "\n", 862 | "title:Monarchy of Canada\n", 863 | "\n", 864 | "url:https://en.wikipedia.org/wiki?curid=56504\n", 865 | "\n", 866 | "views:2000\n", 867 | "\n", 868 | "\n", 869 | "item 52\n", 870 | "_additional:{'distance': -144.89908}\n", 871 | "\n", 872 | "lang:en\n", 873 | "\n", 874 | "text:The Canadiens have developed strong rivalries with two fellow Original Six franchises, with whom they frequently shared divisions and competed in post-season play. The oldest is with the Toronto Maple Leafs, who first faced the Canadiens as the Toronto Arenas in 1917. The teams met 16 times in the playoffs, including five Stanley Cup Finals. Featuring the two largest cities in Canada and two of the largest fanbases in the league, the rivalry is sometimes dramatized as being emblematic of Canada's English and French linguistic divide. From 1938 to 1970, they were the only two Canadian teams in the league.\n", 875 | "\n", 876 | "title:Montreal Canadiens\n", 877 | "\n", 878 | "url:https://en.wikipedia.org/wiki?curid=42966\n", 879 | "\n", 880 | "views:2000\n", 881 | "\n", 882 | "\n", 883 | "item 53\n", 884 | "_additional:{'distance': -144.89122}\n", 885 | "\n", 886 | "lang:en\n", 887 | "\n", 888 | "text:Canadian Pacific Railway (CPR), headquartered in Calgary, Alberta, was founded here in 1881. Its corporate headquarters occupied Windsor Station at 910 Peel Street until 1995. With the Port of Montreal kept open year-round by icebreakers, lines to Eastern Canada became surplus, and now Montreal is the railway's eastern and intermodal freight terminus. CPR connects at Montreal with the Port of Montreal, the Delaware and Hudson Railway to New York, the Quebec Gatineau Railway to Quebec City and Buckingham, the Central Maine and Quebec Railway to Halifax, and Canadian National Railway (CN). 
The CPR's flagship train, \"The Canadian\", ran daily from Windsor Station to Vancouver, but in 1978 all passenger services were transferred to Via. Since 1990, \"The Canadian\" has terminated in Toronto instead of in Montreal.\n", 889 | "\n", 890 | "title:Montreal\n", 891 | "\n", 892 | "url:https://en.wikipedia.org/wiki?curid=7954681\n", 893 | "\n", 894 | "views:3000\n", 895 | "\n", 896 | "\n", 897 | "item 54\n", 898 | "_additional:{'distance': -144.88953}\n", 899 | "\n", 900 | "lang:en\n", 901 | "\n", 902 | "text:Calgary was designated as one of the cultural capitals of Canada in 2012. While many Calgarians continue to live in the city's suburbs, more central neighbourhoods such as Kensington, Inglewood, Forest Lawn, Bridgeland, Marda Loop, the Mission District, and especially the Beltline, have become more popular and density in those areas has increased.\n", 903 | "\n", 904 | "title:Calgary\n", 905 | "\n", 906 | "url:https://en.wikipedia.org/wiki?curid=15895358\n", 907 | "\n", 908 | "views:2000\n", 909 | "\n", 910 | "\n", 911 | "item 55\n", 912 | "_additional:{'distance': -144.88878}\n", 913 | "\n", 914 | "lang:en\n", 915 | "\n", 916 | "text:Nova Scotia's capital and largest municipality is Halifax, which is home to over 45% of the province's population as of the 2021 census. 
Halifax is the thirteenth-largest census metropolitan area in Canada, the largest municipality in Atlantic Canada, and Canada's second-largest coastal municipality after Vancouver.\n", 917 | "\n", 918 | "title:Nova Scotia\n", 919 | "\n", 920 | "url:https://en.wikipedia.org/wiki?curid=21184\n", 921 | "\n", 922 | "views:2000\n", 923 | "\n", 924 | "\n", 925 | "item 56\n", 926 | "_additional:{'distance': -144.88113}\n", 927 | "\n", 928 | "lang:en\n", 929 | "\n", 930 | "text:6.313 million Toronto ; 4.277 million Montreal ; 2.632 million Vancouver ; 1.611 million Calgary ; 1.519 million Edmonton ; 1.423 million OTTAWA (capital) (2022)\n", 931 | "\n", 932 | "title:Urban area\n", 933 | "\n", 934 | "url:https://en.wikipedia.org/wiki?curid=764593\n", 935 | "\n", 936 | "views:3000\n", 937 | "\n", 938 | "\n", 939 | "item 57\n", 940 | "_additional:{'distance': -144.85863}\n", 941 | "\n", 942 | "lang:en\n", 943 | "\n", 944 | "text:Centretown is next to downtown, which includes a substantial economic and architectural government presence across multiple branches of government. The legislature's work takes place in the parliamentary precinct, which includes buildings on Parliament Hill and others downtown, such as the Senate of Canada Building. Important buildings in the executive branch include the Office of the Prime Minister and Privy Council as well as many civil service buildings. The Supreme Court of Canada building can also be found in this area.\n", 945 | "\n", 946 | "title:Ottawa\n", 947 | "\n", 948 | "url:https://en.wikipedia.org/wiki?curid=22219\n", 949 | "\n", 950 | "views:2000\n", 951 | "\n", 952 | "\n", 953 | "item 58\n", 954 | "_additional:{'distance': -144.79507}\n", 955 | "\n", 956 | "lang:en\n", 957 | "\n", 958 | "text:With an area of , Newfoundland is the world's 16th-largest island, Canada's fourth-largest island, and the largest Canadian island outside the North. The provincial capital, St. 
John's, is located on the southeastern coast of the island; Cape Spear, just south of the capital, is the easternmost point of North America, excluding Greenland. It is common to consider all directly neighbouring islands such as New World, Twillingate, Fogo and Bell Island to be 'part of Newfoundland' (i.e., distinct from Labrador). By that classification, Newfoundland and its associated small islands have a total area of .\n", 959 | "\n", 960 | "title:Newfoundland (island)\n", 961 | "\n", 962 | "url:https://en.wikipedia.org/wiki?curid=26304966\n", 963 | "\n", 964 | "views:2000\n", 965 | "\n", 966 | "\n", 967 | "item 59\n", 968 | "_additional:{'distance': -144.78793}\n", 969 | "\n", 970 | "lang:en\n", 971 | "\n", 972 | "text:Canada is one of the oldest continuing monarchies in the world. Initially established in the 16th century, monarchy in Canada has evolved through a continuous succession of initially French and later British sovereigns into the independent Canadian sovereigns of today. The institution that is Canada's system of constitutional monarchy is sometimes colloquially referred to as the \"Maple Crown\".\n", 973 | "\n", 974 | "title:Monarchy of Canada\n", 975 | "\n", 976 | "url:https://en.wikipedia.org/wiki?curid=56504\n", 977 | "\n", 978 | "views:2000\n", 979 | "\n", 980 | "\n", 981 | "item 60\n", 982 | "_additional:{'distance': -144.77293}\n", 983 | "\n", 984 | "lang:en\n", 985 | "\n", 986 | "text:Edmonton ( ) is the capital city of the Canadian province of Alberta. Edmonton is situated on the North Saskatchewan River and is the centre of the Edmonton Metropolitan Region, which is surrounded by Alberta's central region. 
The city anchors the north end of what Statistics Canada defines as the \"Calgary–Edmonton Corridor\".\n", 987 | "\n", 988 | "title:Edmonton\n", 989 | "\n", 990 | "url:https://en.wikipedia.org/wiki?curid=95405\n", 991 | "\n", 992 | "views:2000\n", 993 | "\n", 994 | "\n", 995 | "item 61\n", 996 | "_additional:{'distance': -144.767}\n", 997 | "\n", 998 | "lang:en\n", 999 | "\n", 1000 | "text:Montreal-based CN was formed in 1919 by the Canadian government following a series of country-wide rail bankruptcies. It was formed from the Grand Trunk, Midland and Canadian Northern Railways, and has risen to become CPR's chief rival in freight carriage in Canada. Like the CPR, CN divested itself of passenger services in favour of Via. CN's flagship train, the \"Super Continental\", ran daily from Central Station to Vancouver and subsequently became a Via train in 1978. It was eliminated in 1990 in favour of rerouting \"The Canadian\".\n", 1001 | "\n", 1002 | "title:Montreal\n", 1003 | "\n", 1004 | "url:https://en.wikipedia.org/wiki?curid=7954681\n", 1005 | "\n", 1006 | "views:3000\n", 1007 | "\n", 1008 | "\n", 1009 | "item 62\n", 1010 | "_additional:{'distance': -144.69998}\n", 1011 | "\n", 1012 | "lang:en\n", 1013 | "\n", 1014 | "text:In Canada, these head of state powers belong to the monarch as part of the royal prerogative, but the Governor General has been permitted to exercise them since 1947 and has done so since the 1970s.\n", 1015 | "\n", 1016 | "title:Head of state\n", 1017 | "\n", 1018 | "url:https://en.wikipedia.org/wiki?curid=13456\n", 1019 | "\n", 1020 | "views:2000\n", 1021 | "\n", 1022 | "\n", 1023 | "item 63\n", 1024 | "_additional:{'distance': -144.66977}\n", 1025 | "\n", 1026 | "lang:en\n", 1027 | "\n", 1028 | "text:Nunavut comprises a major portion of Northern Canada and most of the Arctic Archipelago. Its vast territory makes it the fifth-largest country subdivision in the world, as well as North America's second-largest (after Greenland). 
The capital Iqaluit (formerly Frobisher Bay), on Baffin Island in the east, was chosen by a capital plebiscite in 1995. Other major communities include the regional centres of Rankin Inlet and Cambridge Bay.\n", 1029 | "\n", 1030 | "title:Nunavut\n", 1031 | "\n", 1032 | "url:https://en.wikipedia.org/wiki?curid=7129693\n", 1033 | "\n", 1034 | "views:2000\n", 1035 | "\n", 1036 | "\n", 1037 | "item 64\n", 1038 | "_additional:{'distance': -144.66476}\n", 1039 | "\n", 1040 | "lang:en\n", 1041 | "\n", 1042 | "text:The vast majority of Canada's population is concentrated in areas close to the Canada–US border. Its four largest provinces by area (Quebec, Ontario, British Columbia and Alberta) are also (with Quebec and Ontario switched in order) its most populous; together they account for 86% of the country's population. The territories (the Northwest Territories, Nunavut and Yukon) account for over a third of Canada's area but are only home to 0.3% of its population, which skews the national population density value.\n", 1043 | "\n", 1044 | "title:Provinces and territories of Canada\n", 1045 | "\n", 1046 | "url:https://en.wikipedia.org/wiki?curid=75763\n", 1047 | "\n", 1048 | "views:3000\n", 1049 | "\n", 1050 | "\n", 1051 | "item 65\n", 1052 | "_additional:{'distance': -144.66019}\n", 1053 | "\n", 1054 | "lang:en\n", 1055 | "\n", 1056 | "text:As King of Canada, Charles III is the head of state for the Government of Alberta. His duties in Alberta are carried out by Lieutenant Governor Salma Lakhani. The King and lieutenant governor are figureheads whose actions are highly restricted by custom and constitutional convention. The lieutenant governor handles numerous honorific duties in the name of the King. The government is headed by the premier. The premier is normally a member of the Legislative Assembly, and draws all the members of the Cabinet from among the members of the Legislative Assembly. 
The City of Edmonton is the seat of the provincial government—the capital of Alberta. The current premier is Danielle Smith, who was sworn in on October 11th, 2022.\n", 1057 | "\n", 1058 | "title:Alberta\n", 1059 | "\n", 1060 | "url:https://en.wikipedia.org/wiki?curid=717\n", 1061 | "\n", 1062 | "views:2000\n", 1063 | "\n", 1064 | "\n", 1065 | "item 66\n", 1066 | "_additional:{'distance': -144.62424}\n", 1067 | "\n", 1068 | "lang:en\n", 1069 | "\n", 1070 | "text:Toronto is the capital of Ontario with the Ontario Legislative Building, often metonymically known as Queen's Park after the street and park surrounding it, being located in downtown Toronto. Most of the provincial government offices are also located in downtown Toronto.\n", 1071 | "\n", 1072 | "title:Greater Toronto Area\n", 1073 | "\n", 1074 | "url:https://en.wikipedia.org/wiki?curid=266720\n", 1075 | "\n", 1076 | "views:2000\n", 1077 | "\n", 1078 | "\n", 1079 | "item 67\n", 1080 | "_additional:{'distance': -144.60934}\n", 1081 | "\n", 1082 | "lang:en\n", 1083 | "\n", 1084 | "text:Montreal was referred to as \"Canada's Cultural Capital\" by \"Monocle\" magazine. The city is Canada's centre for French-language television productions, radio, theatre, film, multimedia, and print publishing. Montreal's many cultural communities have given it a distinct local culture. Montreal was designated as the World Book Capital for the year 2005 by UNESCO.\n", 1085 | "\n", 1086 | "title:Montreal\n", 1087 | "\n", 1088 | "url:https://en.wikipedia.org/wiki?curid=7954681\n", 1089 | "\n", 1090 | "views:3000\n", 1091 | "\n", 1092 | "\n", 1093 | "item 68\n", 1094 | "_additional:{'distance': -144.6033}\n", 1095 | "\n", 1096 | "lang:en\n", 1097 | "\n", 1098 | "text:From 2007 to 2011, Winnipeg was the \"murder capital\" of Canada, with the highest per-capita rate of homicides; as of 2019 it is in second place, behind Thunder Bay. Winnipeg had the 13-highest violent crime index in Canada, and the highest robbery rate. 
Winnipeg was the \"violent crime capital\" of Canada in 2020 according to the Statistics Canada police-reported violent crime severity index. Despite high overall violent crime rates, crime in Winnipeg is mostly concentrated in the inner city, which makes up only 19% of the population but was the site of 86.4% of the city's shootings, 66.5% of the robberies, 63.3% of the homicides and 59.5% of the sexual assaults in 2012.\n", 1099 | "\n", 1100 | "title:Winnipeg\n", 1101 | "\n", 1102 | "url:https://en.wikipedia.org/wiki?curid=100730\n", 1103 | "\n", 1104 | "views:2000\n", 1105 | "\n", 1106 | "\n", 1107 | "item 69\n", 1108 | "_additional:{'distance': -144.5503}\n", 1109 | "\n", 1110 | "lang:en\n", 1111 | "\n", 1112 | "text:The second largest concentration of British Columbia population is at the southern tip of Vancouver Island, which is made up of the 13 municipalities of Greater Victoria, Victoria, Saanich, Esquimalt, Oak Bay, View Royal, Highlands, Colwood, Langford, Central Saanich/Saanichton, North Saanich, Sidney, Metchosin, Sooke, which are part of the Capital Regional District. The metropolitan area also includes several Indian reserves (the governments of which are not part of the regional district). 
Almost half of the Vancouver Island population is in Greater Victoria.\n", 1113 | "\n", 1114 | "title:British Columbia\n", 1115 | "\n", 1116 | "url:https://en.wikipedia.org/wiki?curid=3392\n", 1117 | "\n", 1118 | "views:3000\n", 1119 | "\n", 1120 | "\n", 1121 | "item 70\n", 1122 | "_additional:{'distance': -144.51027}\n", 1123 | "\n", 1124 | "lang:en\n", 1125 | "\n", 1126 | "text:Although it has been argued that the term \"head of state\" is a republican one inapplicable in a constitutional monarchy such as Canada, where the monarch is the embodiment of the state and thus cannot be head of it, the sovereign is regarded by official government sources, judges, constitutional scholars, and pollsters as the head of state, while the governor general and lieutenant governors are all only representatives of, and thus equally subordinate to, that figure. Some governors general, their staff, government publications, and constitutional scholars like Ted McWhinney and C. E. S. Franks have, however, referred to the position of governor general as that of Canada's head of state, though sometimes qualifying the assertion with \"de facto\" or \"effective\"; Franks has hence recommended that the governor general be named officially as the head of state. Still others view the role of head of state as being shared by both the sovereign and her viceroys. Since 1927, governors general have been received on state visits abroad as though they were heads of state.\n", 1127 | "\n", 1128 | "title:Monarchy of Canada\n", 1129 | "\n", 1130 | "url:https://en.wikipedia.org/wiki?curid=56504\n", 1131 | "\n", 1132 | "views:2000\n", 1133 | "\n", 1134 | "\n", 1135 | "item 71\n", 1136 | "_additional:{'distance': -144.49982}\n", 1137 | "\n", 1138 | "lang:en\n", 1139 | "\n", 1140 | "text:The city is home to the Toronto Stock Exchange, the headquarters of Canada's five largest banks, and the headquarters of many large Canadian and multinational corporations. 
Its economy is highly diversified with strengths in technology, design, financial services, life sciences, education, arts, fashion, aerospace, environmental innovation, food services, and tourism. Toronto is the third-largest tech hub in North America after Silicon Valley and New York City, and the fastest growing.\n", 1141 | "\n", 1142 | "title:Toronto\n", 1143 | "\n", 1144 | "url:https://en.wikipedia.org/wiki?curid=64646\n", 1145 | "\n", 1146 | "views:3000\n", 1147 | "\n", 1148 | "\n", 1149 | "item 72\n", 1150 | "_additional:{'distance': -144.48755}\n", 1151 | "\n", 1152 | "lang:en\n", 1153 | "\n", 1154 | "text:The Port of Montreal is one of the largest inland ports in the world handling 26 million tonnes of cargo annually. As one of the most important ports in Canada, it remains a transshipment point for grain, sugar, petroleum products, machinery, and consumer goods. For this reason, Montreal is the railway hub of Canada and has always been an extremely important rail city; it is home to the headquarters of the Canadian National Railway, and was home to the headquarters of the Canadian Pacific Railway until 1995.\n", 1155 | "\n", 1156 | "title:Montreal\n", 1157 | "\n", 1158 | "url:https://en.wikipedia.org/wiki?curid=7954681\n", 1159 | "\n", 1160 | "views:3000\n", 1161 | "\n", 1162 | "\n", 1163 | "item 73\n", 1164 | "_additional:{'distance': -144.47365}\n", 1165 | "\n", 1166 | "lang:en\n", 1167 | "\n", 1168 | "text:Potential CFL expansion markets are the Maritimes, Quebec City, Saskatoon, London, and Windsor, all of which have been lobbying for Canadian Football League franchises in recent years. 
During the 1970s and 1980s, Harold Ballard attempted multiple times to secure a second CFL team for Toronto (either by way of expansion or by relocating the Hamilton Tiger-Cats), under the premise that Canada's largest city could support two teams.\n", 1169 | "\n", 1170 | "title:Canadian Football League\n", 1171 | "\n", 1172 | "url:https://en.wikipedia.org/wiki?curid=56802\n", 1173 | "\n", 1174 | "views:2000\n", 1175 | "\n", 1176 | "\n", 1177 | "item 74\n", 1178 | "_additional:{'distance': -144.4472}\n", 1179 | "\n", 1180 | "lang:en\n", 1181 | "\n", 1182 | "text:The city's landmarks include the Château Frontenac hotel that dominates the skyline and the Citadelle of Quebec, an intact fortress that forms the centrepiece of the ramparts surrounding the old city and includes a secondary royal residence. The National Assembly of Quebec (provincial legislature), the Musée national des beaux-arts du Québec (\"National Museum of Fine Arts of Quebec\"), and the Musée de la civilisation (\"Museum of Civilization\") are found within or near Vieux-Québec.\n", 1183 | "\n", 1184 | "title:Quebec City\n", 1185 | "\n", 1186 | "url:https://en.wikipedia.org/wiki?curid=100727\n", 1187 | "\n", 1188 | "views:2000\n", 1189 | "\n", 1190 | "\n", 1191 | "item 75\n", 1192 | "_additional:{'distance': -144.42685}\n", 1193 | "\n", 1194 | "lang:en\n", 1195 | "\n", 1196 | "text:The city itself was not attacked during the War of 1812, when the United States again attempted to annex Canadian lands. Amid fears of another American attack on Quebec City, construction of the Citadelle of Quebec began in 1820. The Americans did not attack Canada after the War of 1812, but the Citadelle continued to house a large British garrison until 1871. 
It is still in use by the military and is also a tourist attraction.\n", 1197 | "\n", 1198 | "title:Quebec City\n", 1199 | "\n", 1200 | "url:https://en.wikipedia.org/wiki?curid=100727\n", 1201 | "\n", 1202 | "views:2000\n", 1203 | "\n", 1204 | "\n", 1205 | "item 76\n", 1206 | "_additional:{'distance': -144.36205}\n", 1207 | "\n", 1208 | "lang:en\n", 1209 | "\n", 1210 | "text:Ottawa has the most educated population among Canadian cities and is home to a number of colleges and universities, research and cultural institutions, including the University of Ottawa, Carleton University, Algonquin College, the National Arts Centre, the National Gallery of Canada; and numerous national museums, monuments, and historic sites. It is one of the most visited cities in Canada, with over 11 million visitors in 2018.\n", 1211 | "\n", 1212 | "title:Ottawa\n", 1213 | "\n", 1214 | "url:https://en.wikipedia.org/wiki?curid=22219\n", 1215 | "\n", 1216 | "views:2000\n", 1217 | "\n", 1218 | "\n", 1219 | "item 77\n", 1220 | "_additional:{'distance': -144.33684}\n", 1221 | "\n", 1222 | "lang:en\n", 1223 | "\n", 1224 | "text:Montreal was incorporated as a city in 1832. The opening of the Lachine Canal permitted ships to bypass the unnavigable Lachine Rapids, while the construction of the Victoria Bridge established Montreal as a major railway hub. The leaders of Montreal's business community had started to build their homes in the Golden Square Mile from about 1850. By 1860, it was the largest municipality in British North America and the undisputed economic and cultural centre of Canada.\n", 1225 | "\n", 1226 | "title:Montreal\n", 1227 | "\n", 1228 | "url:https://en.wikipedia.org/wiki?curid=7954681\n", 1229 | "\n", 1230 | "views:3000\n", 1231 | "\n", 1232 | "\n", 1233 | "item 78\n", 1234 | "_additional:{'distance': -144.23914}\n", 1235 | "\n", 1236 | "lang:en\n", 1237 | "\n", 1238 | "text:For over a century and a half, Montreal was the industrial and financial centre of Canada. 
This legacy has left a variety of buildings including factories, elevators, warehouses, mills, and refineries, that today provide an invaluable insight into the city's history, especially in the downtown area and the Old Port area. There are 50 National Historic Sites of Canada, more than any other city.\n", 1239 | "\n", 1240 | "title:Montreal\n", 1241 | "\n", 1242 | "url:https://en.wikipedia.org/wiki?curid=7954681\n", 1243 | "\n", 1244 | "views:3000\n", 1245 | "\n", 1246 | "\n", 1247 | "item 79\n", 1248 | "_additional:{'distance': -144.23871}\n", 1249 | "\n", 1250 | "lang:en\n", 1251 | "\n", 1252 | "text:Incorporated as a town in 1892 with a population of 700 and then as a city in 1904 with a population of 8,350, Edmonton became the capital of Alberta when the province was formed a year later, on September 1, 1905. In November 1905, the Canadian Northern Railway (CNR) arrived in Edmonton, accelerating growth.\n", 1253 | "\n", 1254 | "title:Edmonton\n", 1255 | "\n", 1256 | "url:https://en.wikipedia.org/wiki?curid=95405\n", 1257 | "\n", 1258 | "views:2000\n", 1259 | "\n", 1260 | "\n", 1261 | "item 80\n", 1262 | "_additional:{'distance': -144.18848}\n", 1263 | "\n", 1264 | "lang:en\n", 1265 | "\n", 1266 | "text:The nationalist movement in Quebec, particularly after the election of the \"Parti Québécois\" in 1976, contributed to driving many businesses and English-speaking people out of Quebec to Ontario, and as a result, Toronto surpassed Montreal as the largest city and economic centre of Canada. 
Depressed economic conditions in the Maritime Provinces have also resulted in de-population of those provinces in the 20th century, with heavy migration into Ontario.\n", 1267 | "\n", 1268 | "title:Ontario\n", 1269 | "\n", 1270 | "url:https://en.wikipedia.org/wiki?curid=22218\n", 1271 | "\n", 1272 | "views:3000\n", 1273 | "\n", 1274 | "\n", 1275 | "item 81\n", 1276 | "_additional:{'distance': -144.17209}\n", 1277 | "\n", 1278 | "lang:en\n", 1279 | "\n", 1280 | "text:Shortly after Canadian Confederation in 1867, the need for distinctive Canadian flags emerged. The first Canadian flag was that then used as the flag of the Governor General of Canada, a Union Flag with a shield in the centre bearing the quartered arms of Ontario, Quebec, Nova Scotia and New Brunswick, surrounded by a wreath of maple leaves. In 1870, the Red Ensign, with the addition of the Canadian composite shield in the fly, began to be used unofficially on land and sea and was known as the \"Canadian Red Ensign\". As new provinces joined the Confederation, their arms were added to the shield. In 1892, the British admiralty approved the use of the Red Ensign for Canadian use at sea.\n", 1281 | "\n", 1282 | "title:Flag of Canada\n", 1283 | "\n", 1284 | "url:https://en.wikipedia.org/wiki?curid=97066\n", 1285 | "\n", 1286 | "views:2000\n", 1287 | "\n", 1288 | "\n", 1289 | "item 82\n", 1290 | "_additional:{'distance': -144.15552}\n", 1291 | "\n", 1292 | "lang:en\n", 1293 | "\n", 1294 | "text:Established in 1975, the Great Canadian Theatre Company specializes in the production of Canadian plays at a local level. The cities museum landscape is notable for containing six of Canada's nine national museums, the Canada Agriculture and Food Museum, the Canada Aviation and Space Museum, the Canada Science and Technology Museum, Canadian Museum of Nature, Canadian War Museum and National Gallery of Canada. 
The National Gallery of Canada; designed by famous architect Moshe Safdie, it is a permanent home to the \"Maman\" sculpture. The Canadian War Museum houses over 3.75 million artifacts and was moved to an expanded facility in 2005. The Canadian Museum of Nature was built in 1905, and underwent a major renovation between 2004 and 2010, leading to a centrepiece Blue Whale skeleton, and the creation of a monthly nightclub experience, \"Nature Nocturne\".\n", 1295 | "\n", 1296 | "title:Ottawa\n", 1297 | "\n", 1298 | "url:https://en.wikipedia.org/wiki?curid=22219\n", 1299 | "\n", 1300 | "views:2000\n", 1301 | "\n", 1302 | "\n", 1303 | "item 83\n", 1304 | "_additional:{'distance': -144.15211}\n", 1305 | "\n", 1306 | "lang:en\n", 1307 | "\n", 1308 | "text:It is also notable that three cities (Montreal, Toronto, and Vancouver) are from Canada and three other cities (Beijing, Hong Kong, and Shanghai) are from the People's Republic of China. No other countries are represented by more than one city.\n", 1309 | "\n", 1310 | "title:Monopoly (game)\n", 1311 | "\n", 1312 | "url:https://en.wikipedia.org/wiki?curid=19692\n", 1313 | "\n", 1314 | "views:2000\n", 1315 | "\n", 1316 | "\n", 1317 | "item 84\n", 1318 | "_additional:{'distance': -144.14737}\n", 1319 | "\n", 1320 | "lang:en\n", 1321 | "\n", 1322 | "text:Themes of nature, pioneers, trappers, and traders played an important part in the early development of Canadian symbolism. Modern symbols emphasize the country's geography, cold climate, lifestyles and the Canadianization of traditional European and Indigenous symbols. The use of the maple leaf as a Canadian symbol dates to the early 18th century. The maple leaf is depicted on Canada's current and previous flags, and on the Arms of Canada. 
Canada's official tartan, known as the \"maple leaf tartan\", has four colours that reflect the colours of the maple leaf as it changes through the seasons—green in the spring, gold in the early autumn, red at the first frost, and brown after falling. The Arms of Canada are closely modelled after the royal coat of arms of the United Kingdom with French and distinctive Canadian elements replacing or added to those derived from the British version.\n", 1323 | "\n", 1324 | "title:Canada\n", 1325 | "\n", 1326 | "url:https://en.wikipedia.org/wiki?curid=5042916\n", 1327 | "\n", 1328 | "views:4000\n", 1329 | "\n", 1330 | "\n", 1331 | "item 85\n", 1332 | "_additional:{'distance': -144.126}\n", 1333 | "\n", 1334 | "lang:en\n", 1335 | "\n", 1336 | "text:The 2021 Canadian census enumerated a total population of 36,991,981, an increase of around 5.2 percent over the 2016 figure. The main drivers of population growth are immigration and, to a lesser extent, natural growth. Canada has one of the highest per-capita immigration rates in the world, driven mainly by economic policy and also family reunification. A record number of 405,000 immigrants were admitted to Canada in 2021. New immigrants settle mostly in major urban areas in the country, such as Toronto, Montreal and Vancouver. Canada also accepts large numbers of refugees, accounting for over 10 percent of annual global refugee resettlements; it resettled more than 28,000 in 2018.\n", 1337 | "\n", 1338 | "title:Canada\n", 1339 | "\n", 1340 | "url:https://en.wikipedia.org/wiki?curid=5042916\n", 1341 | "\n", 1342 | "views:4000\n", 1343 | "\n", 1344 | "\n", 1345 | "item 86\n", 1346 | "_additional:{'distance': -144.10197}\n", 1347 | "\n", 1348 | "lang:en\n", 1349 | "\n", 1350 | "text:A highly developed country, Canada has the 24th highest nominal per capita income globally and the sixteenth-highest ranking on the Human Development Index. 
Its advanced economy is the eighth-largest in the world, relying chiefly upon its abundant natural resources and well-developed international trade networks. Canada is part of several major international and intergovernmental institutions or groupings including the United Nations, NATO, the G7, the Group of Ten, the G20, the Organisation for Economic Co-operation and Development (OECD), the World Trade Organization (WTO), the Commonwealth of Nations, the Arctic Council, the , the Asia-Pacific Economic Cooperation forum, and the Organization of American States.\n", 1351 | "\n", 1352 | "title:Canada\n", 1353 | "\n", 1354 | "url:https://en.wikipedia.org/wiki?curid=5042916\n", 1355 | "\n", 1356 | "views:4000\n", 1357 | "\n", 1358 | "\n", 1359 | "item 87\n", 1360 | "_additional:{'distance': -144.0854}\n", 1361 | "\n", 1362 | "lang:en\n", 1363 | "\n", 1364 | "text:Toronto became the capital of the province of Ontario after its official creation in 1867. The seat of government of the Ontario Legislature is at Queen's Park. Because of its provincial capital status, the city was also the location of Government House, the residence of the viceregal representative of the Crown in right of Ontario.\n", 1365 | "\n", 1366 | "title:Toronto\n", 1367 | "\n", 1368 | "url:https://en.wikipedia.org/wiki?curid=64646\n", 1369 | "\n", 1370 | "views:3000\n", 1371 | "\n", 1372 | "\n", 1373 | "item 88\n", 1374 | "_additional:{'distance': -144.06784}\n", 1375 | "\n", 1376 | "lang:en\n", 1377 | "\n", 1378 | "text:The Canadian Heraldic Authority (CHA) grants former prime ministers an augmentation of honour on the coat of arms of those who apply for them. The heraldic badge, referred to by the CHA as the \"mark of the Prime Ministership of Canada\", consists of four red maple leaves joined at the stem on a white field (\"Argent four maple leaves conjoined in cross at the stem Gules\"); the augmentation is usually a canton or centred in the chief. 
Joe Clark, Pierre Trudeau, John Turner, Brian Mulroney, Kim Campbell, Jean Chrétien and Paul Martin were granted arms with the augmentation.\n", 1379 | "\n", 1380 | "title:Prime Minister of Canada\n", 1381 | "\n", 1382 | "url:https://en.wikipedia.org/wiki?curid=24135\n", 1383 | "\n", 1384 | "views:2000\n", 1385 | "\n", 1386 | "\n", 1387 | "item 89\n", 1388 | "_additional:{'distance': -144.02658}\n", 1389 | "\n", 1390 | "lang:en\n", 1391 | "\n", 1392 | "text:In 2011, Canadian forces participated in the NATO-led intervention into the Libyan Civil War, and also became involved in battling the Islamic State insurgency in Iraq in the mid-2010s. The COVID-19 pandemic in Canada began on January 27, 2020, with wide social and economic disruption. In 2021, the remains of hundreds of Indigenous people were discovered near the former sites of Canadian Indian residential schools. Administered by the Canadian Catholic Church and funded by the Canadian government from 1828 to 1997, these boarding schools attempted to assimilate Indigenous children into Euro-Canadian culture.\n", 1393 | "\n", 1394 | "title:Canada\n", 1395 | "\n", 1396 | "url:https://en.wikipedia.org/wiki?curid=5042916\n", 1397 | "\n", 1398 | "views:4000\n", 1399 | "\n", 1400 | "\n", 1401 | "item 90\n", 1402 | "_additional:{'distance': -144.01222}\n", 1403 | "\n", 1404 | "lang:en\n", 1405 | "\n", 1406 | "text:Ottawa is headquarters to numerous major medical organizations and institutions such as Canadian Red Cross, Canadian Blood Services, Health Canada, Canadian Medical Association, Royal College of Physicians and Surgeons of Canada, Canadian Nurses Association, and the Medical Council of Canada.\n", 1407 | "\n", 1408 | "title:Ottawa\n", 1409 | "\n", 1410 | "url:https://en.wikipedia.org/wiki?curid=22219\n", 1411 | "\n", 1412 | "views:2000\n", 1413 | "\n", 1414 | "\n", 1415 | "item 91\n", 1416 | "_additional:{'distance': -144.00433}\n", 1417 | "\n", 1418 | "lang:en\n", 1419 | "\n", 1420 | "text:The skyline has 
been controlled by building height restrictions originally implemented to keep Parliament Hill and the Peace Tower at visible from most parts of the city. Today, several buildings are slightly taller than the Peace Tower, with the tallest being the Claridge Icon at 143 metres. Many federal buildings in the National Capital Region are managed by Public Works Canada, which leads to heritage conservation in its renovations and management of buildings, such as the renovation of the Senate Building. Most of the federal land in the region is managed by the National Capital Commission; its control of much undeveloped land and appropriations powers gives the NCC a great deal of influence over the city's development.\n", 1421 | "\n", 1422 | "title:Ottawa\n", 1423 | "\n", 1424 | "url:https://en.wikipedia.org/wiki?curid=22219\n", 1425 | "\n", 1426 | "views:2000\n", 1427 | "\n", 1428 | "\n", 1429 | "item 92\n", 1430 | "_additional:{'distance': -143.96895}\n", 1431 | "\n", 1432 | "lang:en\n", 1433 | "\n", 1434 | "text:Canada's constitution is based on the Westminster parliamentary model, wherein the role of the King is both legal and practical, but not political. The sovereign is vested with all the powers of state, collectively known as the royal prerogative, leading the populace to be considered subjects of the Crown. However, as the sovereign's power stems from the people and the monarch is a constitutional one, he or she does not rule alone, as in an absolute monarchy. Instead, the Crown is regarded as a corporation sole, with the monarch being the centre of a construct in which the power of the whole is shared by multiple institutions of government—the executive, legislative, and judicial—acting under the sovereign's authority, which is entrusted for exercise by the politicians (the elected and appointed parliamentarians and the ministers of the Crown generally drawn from among them) and the judges and justices of the peace. 
The monarchy has thus been described as the underlying principle of Canada's institutional unity and the monarch as a \"guardian of constitutional freedoms\" whose \"job is to ensure that the political process remains intact and is allowed to function.\"\n", 1435 | "\n", 1436 | "title:Monarchy of Canada\n", 1437 | "\n", 1438 | "url:https://en.wikipedia.org/wiki?curid=56504\n", 1439 | "\n", 1440 | "views:2000\n", 1441 | "\n", 1442 | "\n", 1443 | "item 93\n", 1444 | "_additional:{'distance': -143.92328}\n", 1445 | "\n", 1446 | "lang:en\n", 1447 | "\n", 1448 | "text:While the monarchy is the source of authority in Canada, in practice its position is mainly symbolic. The use of the executive powers is directed by the Cabinet, a committee of ministers of the Crown responsible to the elected House of Commons and chosen and headed by the prime minister (at present Justin Trudeau), the head of government. The governor general or monarch may, though, in certain crisis situations exercise their power without ministerial advice. To ensure the stability of government, the governor general will usually appoint as prime minister the individual who is the current leader of the political party that can obtain the confidence of a plurality in the House of Commons. The Prime Minister's Office (PMO) is thus one of the most powerful institutions in government, initiating most legislation for parliamentary approval and selecting for appointment by the Crown, besides the aforementioned, the governor general, lieutenant governors, senators, federal court judges, and heads of Crown corporations and government agencies. 
The leader of the party with the second-most seats usually becomes the leader of the Official Opposition and is part of an adversarial parliamentary system intended to keep the government in check.\n", 1449 | "\n", 1450 | "title:Canada\n", 1451 | "\n", 1452 | "url:https://en.wikipedia.org/wiki?curid=5042916\n", 1453 | "\n", 1454 | "views:4000\n", 1455 | "\n", 1456 | "\n", 1457 | "item 94\n", 1458 | "_additional:{'distance': -143.91855}\n", 1459 | "\n", 1460 | "lang:en\n", 1461 | "\n", 1462 | "text:The Constitution of Canada is the supreme law of the country, and consists of written text and unwritten conventions. The \"Constitution Act, 1867\" (known as the British North America Act, 1867 prior to 1982), affirmed governance based on parliamentary precedent and divided powers between the federal and provincial governments. The Statute of Westminster, 1931 granted full autonomy, and the \"Constitution Act, 1982\" ended all legislative ties to Britain, as well as adding a constitutional amending formula and the \"Canadian Charter of Rights and Freedoms\". The \"Charter\" guarantees basic rights and freedoms that usually cannot be over-ridden by any government—though a notwithstanding clause allows Parliament and the provincial legislatures to override certain sections of the \"Charter\" for a period of five years.\n", 1463 | "\n", 1464 | "title:Canada\n", 1465 | "\n", 1466 | "url:https://en.wikipedia.org/wiki?curid=5042916\n", 1467 | "\n", 1468 | "views:4000\n", 1469 | "\n", 1470 | "\n", 1471 | "item 95\n", 1472 | "_additional:{'distance': -143.89182}\n", 1473 | "\n", 1474 | "lang:en\n", 1475 | "\n", 1476 | "text:The main symbol of the monarchy is the sovereign himself, described as \"the personal expression of the Crown in Canada,\" and his image is thus used to signify Canadian sovereignty and government authority—his image, for instance, appearing on currency, and his portrait in government buildings. 
The sovereign is further both mentioned in and the subject of songs, loyal toasts, and salutes. A royal cypher, appearing on buildings and official seals, or a crown, seen on provincial and national coats of arms, as well as police force and Canadian Forces regimental and maritime badges and rank insignia, is also used to illustrate the monarchy as the locus of authority, the latter without referring to any specific monarch.\n", 1477 | "\n", 1478 | "title:Monarchy of Canada\n", 1479 | "\n", 1480 | "url:https://en.wikipedia.org/wiki?curid=56504\n", 1481 | "\n", 1482 | "views:2000\n", 1483 | "\n", 1484 | "\n", 1485 | "item 96\n", 1486 | "_additional:{'distance': -143.87836}\n", 1487 | "\n", 1488 | "lang:en\n", 1489 | "\n", 1490 | "text:In addition to the market's local media services, Ottawa is home to several national media operations, including CPAC (Canada's national legislature broadcaster) and the parliamentary bureau staff of virtually all of Canada's major newsgathering organizations in television, radio and print. The city is also home to the head office of the Canadian Broadcasting Corporation.\n", 1491 | "\n", 1492 | "title:Ottawa\n", 1493 | "\n", 1494 | "url:https://en.wikipedia.org/wiki?curid=22219\n", 1495 | "\n", 1496 | "views:2000\n", 1497 | "\n", 1498 | "\n", 1499 | "item 97\n", 1500 | "_additional:{'distance': -143.83647}\n", 1501 | "\n", 1502 | "lang:en\n", 1503 | "\n", 1504 | "text:Other prominent symbols include the national motto \"\" (\"From Sea to Sea\"), the sports of ice hockey and lacrosse, the beaver, Canada goose, common loon, Canadian horse, the Royal Canadian Mounted Police, the Canadian Rockies, and more recently the totem pole and Inuksuk. Material items such as Canadian beer, maple syrup, tuques, canoes, nanaimo bars, butter tarts and the Quebec dish of poutine are defined as uniquely Canadian. Canadian coins feature many of these symbols: the loon on the $1 coin, the Arms of Canada on the 50¢ piece, the beaver on the nickel. 
The penny, removed from circulation in 2013, featured the maple leaf. An image of the previous monarch, Queen Elizabeth II, appears on $20 bank notes, and on the obverse of all current Canadian coins.\n", 1505 | "\n", 1506 | "title:Canada\n", 1507 | "\n", 1508 | "url:https://en.wikipedia.org/wiki?curid=5042916\n", 1509 | "\n", 1510 | "views:4000\n", 1511 | "\n", 1512 | "\n", 1513 | "item 98\n", 1514 | "_additional:{'distance': -143.83636}\n", 1515 | "\n", 1516 | "lang:en\n", 1517 | "\n", 1518 | "text:Canada has a parliamentary system within the context of a constitutional monarchy—the monarchy of Canada being the foundation of the executive, legislative, and judicial branches. The reigning monarch is , who is also monarch of 14 other Commonwealth countries and each of Canada's 10 provinces. The person who is the Canadian monarch is the same as the British monarch, although the two institutions are separate. The monarch appoints a representative, the governor general, with the advice of the prime minister, to carry out most of their federal royal duties in Canada.\n", 1519 | "\n", 1520 | "title:Canada\n", 1521 | "\n", 1522 | "url:https://en.wikipedia.org/wiki?curid=5042916\n", 1523 | "\n", 1524 | "views:4000\n", 1525 | "\n", 1526 | "\n", 1527 | "item 99\n", 1528 | "_additional:{'distance': -143.83032}\n", 1529 | "\n", 1530 | "lang:en\n", 1531 | "\n", 1532 | "text:Several companies are headquartered in Greater Montreal Area including Rio Tinto Alcan, Bombardier Inc., Canadian National Railway, CGI Group, Air Canada, Air Transat, CAE, Saputo, Cirque du Soleil, Stingray Group, Quebecor, Ultramar, Kruger Inc., Jean Coutu Group, Uniprix, Proxim, Domtar, Le Château, Power Corporation, Cellcom Communications, Bell Canada. 
Standard Life, Hydro-Québec, AbitibiBowater, Pratt and Whitney Canada, Molson, Tembec, Canada Steamship Lines, Fednav, Alimentation Couche-Tard, SNC-Lavalin, MEGA Brands, Aeroplan, Agropur, Metro Inc., Laurentian Bank of Canada, National Bank of Canada, Transat A.T., Via Rail, GardaWorld, Novacam Technologies, SOLABS, Dollarama, Rona and the Caisse de dépôt et placement du Québec.\n", 1533 | "\n", 1534 | "title:Montreal\n", 1535 | "\n", 1536 | "url:https://en.wikipedia.org/wiki?curid=7954681\n", 1537 | "\n", 1538 | "views:3000\n", 1539 | "\n", 1540 | "\n" 1541 | ] 1542 | } 1543 | ], 1544 | "source": [ 1545 | "print_result(dense_retrieval_results)" 1546 | ] 1547 | }, 1548 | { 1549 | "cell_type": "markdown", 1550 | "id": "db449134", 1551 | "metadata": {}, 1552 | "source": [ 1553 | "## Improving Keyword Search with ReRank" 1554 | ] 1555 | }, 1556 | { 1557 | "cell_type": "code", 1558 | "execution_count": 84, 1559 | "id": "8071c68a-6dec-47f9-b5e1-473f9acdc83f", 1560 | "metadata": {}, 1561 | "outputs": [], 1562 | "source": [ 1563 | "from utils import keyword_search" 1564 | ] 1565 | }, 1566 | { 1567 | "cell_type": "code", 1568 | "execution_count": 85, 1569 | "id": "aa47e41a-5988-4405-af7b-c7cb7382eed9", 1570 | "metadata": {}, 1571 | "outputs": [], 1572 | "source": [ 1573 | "query_1 = \"What is the capital of Canada?\"" 1574 | ] 1575 | }, 1576 | { 1577 | "cell_type": "code", 1578 | "execution_count": 112, 1579 | "id": "e851efa5-10c7-4f98-85f1-2a1c565d9723", 1580 | "metadata": {}, 1581 | "outputs": [ 1582 | { 1583 | "name": "stdout", 1584 | "output_type": "stream", 1585 | "text": [ 1586 | "i:0\n", 1587 | "Monarchy of Canada\n", 1588 | "i:1\n", 1589 | "Early modern period\n", 1590 | "i:2\n", 1591 | "Flag of Canada\n", 1592 | "i:3\n", 1593 | "Flag of Canada\n", 1594 | "i:4\n", 1595 | "Prime Minister of Canada\n", 1596 | "i:5\n", 1597 | "Hamilton, Ontario\n", 1598 | "i:6\n", 1599 | "Liberal Party of Canada\n", 1600 | "i:7\n", 1601 | "Stephen Harper\n", 1602 | "i:8\n", 1603 | 
"Monarchy of Canada\n", 1604 | "i:9\n", 1605 | "Flag of Canada\n", 1606 | "i:10\n", 1607 | "Order of Canada\n", 1608 | "i:11\n", 1609 | "University of Toronto\n", 1610 | "i:12\n", 1611 | "Newfoundland (island)\n", 1612 | "i:13\n", 1613 | "Liberal Party of Canada\n", 1614 | "i:14\n", 1615 | "Newfoundland (island)\n", 1616 | "i:15\n", 1617 | "Flag of Canada\n", 1618 | "i:16\n", 1619 | "North American Free Trade Agreement\n", 1620 | "i:17\n", 1621 | "Pea\n", 1622 | "i:18\n", 1623 | "Monarchy of Canada\n", 1624 | "i:19\n", 1625 | "Prime Minister of Canada\n", 1626 | "i:20\n", 1627 | "Hamilton, Ontario\n", 1628 | "i:21\n", 1629 | "Aesop's Fables\n", 1630 | "i:22\n", 1631 | "Revolutions of 1989\n", 1632 | "i:23\n", 1633 | "R.S.C. Anderlecht\n", 1634 | "i:24\n", 1635 | "Hudson's Bay Company\n", 1636 | "i:25\n", 1637 | "Liberal Party of Canada\n", 1638 | "i:26\n", 1639 | "2020–21 NBA season\n", 1640 | "i:27\n", 1641 | "Filibuster\n", 1642 | "i:28\n", 1643 | "Hardcore punk\n", 1644 | "i:29\n", 1645 | "Early modern period\n", 1646 | "i:30\n", 1647 | "Skopje\n", 1648 | "i:31\n", 1649 | "Venture capital\n", 1650 | "i:32\n", 1651 | "Wakanda\n", 1652 | "i:33\n", 1653 | "Arjuna\n", 1654 | "i:34\n", 1655 | "Luhansk\n", 1656 | "i:35\n", 1657 | "Arlington National Cemetery\n", 1658 | "i:36\n", 1659 | "North American Free Trade Agreement\n", 1660 | "i:37\n", 1661 | "Global North and Global South\n", 1662 | "i:38\n", 1663 | "Shia–Sunni relations\n", 1664 | "i:39\n", 1665 | "Jacob Zuma\n", 1666 | "i:40\n", 1667 | "Early modern period\n", 1668 | "i:41\n", 1669 | "Maui\n", 1670 | "i:42\n", 1671 | "Gerhard Schröder\n", 1672 | "i:43\n", 1673 | "Revolutions of 1989\n", 1674 | "i:44\n", 1675 | "Earl Warren\n", 1676 | "i:45\n", 1677 | "Mary Celeste\n", 1678 | "i:46\n", 1679 | "Exodus: Gods and Kings\n", 1680 | "i:47\n", 1681 | "Phnom Penh\n", 1682 | "i:48\n", 1683 | "Quebec\n", 1684 | "i:49\n", 1685 | "Air Canada\n", 1686 | "i:50\n", 1687 | "Americas\n", 1688 | "i:51\n", 1689 | "Canada\n", 
1690 | "i:52\n", 1691 | "Indigenous peoples\n", 1692 | "i:53\n", 1693 | "Toronto Blue Jays\n", 1694 | "i:54\n", 1695 | "Canada men's national soccer team\n", 1696 | "i:55\n", 1697 | "Métis\n", 1698 | "i:56\n", 1699 | "Monarchy of Canada\n", 1700 | "i:57\n", 1701 | "Capitol Records\n", 1702 | "i:58\n", 1703 | "Air Canada\n", 1704 | "i:59\n", 1705 | "Canadian Broadcasting Corporation\n", 1706 | "i:60\n", 1707 | "Canada\n", 1708 | "i:61\n", 1709 | "Air Canada\n", 1710 | "i:62\n", 1711 | "Air Canada\n", 1712 | "i:63\n", 1713 | "Order of Canada\n", 1714 | "i:64\n", 1715 | "Canada\n", 1716 | "i:65\n", 1717 | "Order of Canada\n", 1718 | "i:66\n", 1719 | "Toronto\n", 1720 | "i:67\n", 1721 | "Air Canada\n", 1722 | "i:68\n", 1723 | "Ontario\n", 1724 | "i:69\n", 1725 | "Canada\n", 1726 | "i:70\n", 1727 | "United States men's national soccer team\n", 1728 | "i:71\n", 1729 | "Canada\n", 1730 | "i:72\n", 1731 | "George VI\n", 1732 | "i:73\n", 1733 | "Canada men's national soccer team\n", 1734 | "i:74\n", 1735 | "Canada\n", 1736 | "i:75\n", 1737 | "Steve Nash\n", 1738 | "i:76\n", 1739 | "Monarchy of Canada\n", 1740 | "i:77\n", 1741 | "Canada men's national soccer team\n", 1742 | "i:78\n", 1743 | "Pierre Trudeau\n", 1744 | "i:79\n", 1745 | "Pierre Trudeau\n", 1746 | "i:80\n", 1747 | "Air Canada\n", 1748 | "i:81\n", 1749 | "Air Canada\n", 1750 | "i:82\n", 1751 | "Monarchy of Canada\n", 1752 | "i:83\n", 1753 | "Air Canada\n", 1754 | "i:84\n", 1755 | "Air Canada\n", 1756 | "i:85\n", 1757 | "Air Canada\n", 1758 | "i:86\n", 1759 | "Canada\n", 1760 | "i:87\n", 1761 | "Walmart\n", 1762 | "i:88\n", 1763 | "Canadian Broadcasting Corporation\n", 1764 | "i:89\n", 1765 | "Canada\n", 1766 | "i:90\n", 1767 | "Air Canada\n", 1768 | "i:91\n", 1769 | "Wayne Gretzky\n", 1770 | "i:92\n", 1771 | "Canada\n", 1772 | "i:93\n", 1773 | "Underground Railroad\n", 1774 | "i:94\n", 1775 | "Ottawa\n", 1776 | "i:95\n", 1777 | "Montreal\n", 1778 | "i:96\n", 1779 | "Joni Mitchell\n", 1780 | "i:97\n", 1781 | "Air 
Canada\n", 1782 | "i:98\n", 1783 | "Conservative Party of Canada\n", 1784 | "i:99\n", 1785 | "Canada\n", 1786 | "i:100\n", 1787 | "Montreal\n", 1788 | "i:101\n", 1789 | "Millennials\n", 1790 | "i:102\n", 1791 | "Charles de Gaulle\n", 1792 | "i:103\n", 1793 | "Canada\n", 1794 | "i:104\n", 1795 | "Monarchy of Canada\n", 1796 | "i:105\n", 1797 | "Beaver\n", 1798 | "i:106\n", 1799 | "Canada men's national soccer team\n", 1800 | "i:107\n", 1801 | "Canada men's national soccer team\n", 1802 | "i:108\n", 1803 | "Winnipeg\n", 1804 | "i:109\n", 1805 | "Reindeer\n", 1806 | "i:110\n", 1807 | "Monarchy of Canada\n", 1808 | "i:111\n", 1809 | "Order of Canada\n", 1810 | "i:112\n", 1811 | "Country music\n", 1812 | "i:113\n", 1813 | "Platinum Jubilee of Elizabeth II\n", 1814 | "i:114\n", 1815 | "Quebec\n", 1816 | "i:115\n", 1817 | "Monarchy of Canada\n", 1818 | "i:116\n", 1819 | "Nestlé\n", 1820 | "i:117\n", 1821 | "Air Canada\n", 1822 | "i:118\n", 1823 | "Canada\n", 1824 | "i:119\n", 1825 | "Air Canada\n", 1826 | "i:120\n", 1827 | "Air Canada\n", 1828 | "i:121\n", 1829 | "Canada\n", 1830 | "i:122\n", 1831 | "McGill University\n", 1832 | "i:123\n", 1833 | "Ice hockey\n", 1834 | "i:124\n", 1835 | "Air Canada\n", 1836 | "i:125\n", 1837 | "Canada\n", 1838 | "i:126\n", 1839 | "Canada\n", 1840 | "i:127\n", 1841 | "Monarchy of Canada\n", 1842 | "i:128\n", 1843 | "Supertramp\n", 1844 | "i:129\n", 1845 | "Canada\n", 1846 | "i:130\n", 1847 | "Sildenafil\n", 1848 | "i:131\n", 1849 | "Justin Trudeau\n", 1850 | "i:132\n", 1851 | "Presbyterianism\n", 1852 | "i:133\n", 1853 | "Order of Canada\n", 1854 | "i:134\n", 1855 | "Provinces and territories of Canada\n", 1856 | "i:135\n", 1857 | "Canada\n", 1858 | "i:136\n", 1859 | "Canada men's national soccer team\n", 1860 | "i:137\n", 1861 | "Air Canada\n", 1862 | "i:138\n", 1863 | "North America\n", 1864 | "i:139\n", 1865 | "Canada\n", 1866 | "i:140\n", 1867 | "Flag of Canada\n", 1868 | "i:141\n", 1869 | "Constitution\n", 1870 | "i:142\n", 1871 | 
"Quebec\n", 1872 | "i:143\n", 1873 | "Degrassi: The Next Generation\n", 1874 | "i:144\n", 1875 | "Celine Dion\n", 1876 | "i:145\n", 1877 | "Quebec\n", 1878 | "i:146\n", 1879 | "Air Canada\n", 1880 | "i:147\n", 1881 | "Canadian Broadcasting Corporation\n", 1882 | "i:148\n", 1883 | "Mohammed bin Salman\n", 1884 | "i:149\n", 1885 | "Monarch butterfly\n", 1886 | "i:150\n", 1887 | "Canada\n", 1888 | "i:151\n", 1889 | "University of Toronto\n", 1890 | "i:152\n", 1891 | "Beaufort scale\n", 1892 | "i:153\n", 1893 | "Canada\n", 1894 | "i:154\n", 1895 | "Conservative Party of Canada\n", 1896 | "i:155\n", 1897 | "Assisted suicide\n", 1898 | "i:156\n", 1899 | "Liberal Party of Canada\n", 1900 | "i:157\n", 1901 | "Fleur-de-lis\n", 1902 | "i:158\n", 1903 | "Bobcat\n", 1904 | "i:159\n", 1905 | "Air Canada\n", 1906 | "i:160\n", 1907 | "Embraer E-Jet family\n", 1908 | "i:161\n", 1909 | "Canada\n", 1910 | "i:162\n", 1911 | "Canada men's national soccer team\n", 1912 | "i:163\n", 1913 | "Santa Claus\n", 1914 | "i:164\n", 1915 | "Canada\n", 1916 | "i:165\n", 1917 | "Monarchy of Canada\n", 1918 | "i:166\n", 1919 | "Canada\n", 1920 | "i:167\n", 1921 | "Canada\n", 1922 | "i:168\n", 1923 | "Great Reset\n", 1924 | "i:169\n", 1925 | "Stephen Harper\n", 1926 | "i:170\n", 1927 | "Methodism\n", 1928 | "i:171\n", 1929 | "De Havilland Canada Dash 8\n", 1930 | "i:172\n", 1931 | "Montreal\n", 1932 | "i:173\n", 1933 | "Martin Van Buren\n", 1934 | "i:174\n", 1935 | "Monarchy of Canada\n", 1936 | "i:175\n", 1937 | "Air Canada\n", 1938 | "i:176\n", 1939 | "Sidney Crosby\n", 1940 | "i:177\n", 1941 | "Rudyard Kipling\n", 1942 | "i:178\n", 1943 | "Canada\n", 1944 | "i:179\n", 1945 | "Kyoto Protocol\n", 1946 | "i:180\n", 1947 | "Quebec\n", 1948 | "i:181\n", 1949 | "Liberal Party of Canada\n", 1950 | "i:182\n", 1951 | "North America\n", 1952 | "i:183\n", 1953 | "Alphonso Davies\n", 1954 | "i:184\n", 1955 | "Liberal Party of Canada\n", 1956 | "i:185\n", 1957 | "Boeing F/A-18E/F Super Hornet\n", 1958 | 
"i:186\n", 1959 | "Underground Railroad\n", 1960 | "i:187\n", 1961 | "Canada men's national soccer team\n", 1962 | "i:188\n", 1963 | "Canada\n", 1964 | "i:189\n", 1965 | "The Marshall Mathers LP\n", 1966 | "i:190\n", 1967 | "2022 FIFA World Cup\n", 1968 | "i:191\n", 1969 | "History of slavery\n", 1970 | "i:192\n", 1971 | "War of 1812\n", 1972 | "i:193\n", 1973 | "Stephen Harper\n", 1974 | "i:194\n", 1975 | "Monarchy of Canada\n", 1976 | "i:195\n", 1977 | "Air Canada\n", 1978 | "i:196\n", 1979 | "History of Ukraine\n", 1980 | "i:197\n", 1981 | "Conservative Party of Canada\n", 1982 | "i:198\n", 1983 | "Commonwealth realm\n", 1984 | "i:199\n", 1985 | "Monarchy of Canada\n", 1986 | "i:200\n", 1987 | "Bryan Adams\n", 1988 | "i:201\n", 1989 | "Air Canada\n", 1990 | "i:202\n", 1991 | "Ottawa\n", 1992 | "i:203\n", 1993 | "Lynx\n", 1994 | "i:204\n", 1995 | "Air Canada\n", 1996 | "i:205\n", 1997 | "Air Canada\n", 1998 | "i:206\n", 1999 | "Air Canada\n", 2000 | "i:207\n", 2001 | "Methodism\n", 2002 | "i:208\n", 2003 | "Canada men's national soccer team\n", 2004 | "i:209\n", 2005 | "Calgary\n", 2006 | "i:210\n", 2007 | "Monarchy of Canada\n", 2008 | "i:211\n", 2009 | "Canada\n", 2010 | "i:212\n", 2011 | "Canada men's national soccer team\n", 2012 | "i:213\n", 2013 | "Canada\n", 2014 | "i:214\n", 2015 | "The Home Depot\n", 2016 | "i:215\n", 2017 | "Conservatism\n", 2018 | "i:216\n", 2019 | "Air Canada\n", 2020 | "i:217\n", 2021 | "Air Canada\n", 2022 | "i:218\n", 2023 | "Monarchy of Canada\n", 2024 | "i:219\n", 2025 | "Montreal\n", 2026 | "i:220\n", 2027 | "ICICI Bank\n", 2028 | "i:221\n", 2029 | "Doctor of Medicine\n", 2030 | "i:222\n", 2031 | "Great Plains\n", 2032 | "i:223\n", 2033 | "Mennonites\n", 2034 | "i:224\n", 2035 | "Union Jack\n", 2036 | "i:225\n", 2037 | "Canada\n", 2038 | "i:226\n", 2039 | "Air Canada\n", 2040 | "i:227\n", 2041 | "Capitol Records\n", 2042 | "i:228\n", 2043 | "Ontario\n", 2044 | "i:229\n", 2045 | "Regiment\n", 2046 | "i:230\n", 2047 | "Air 
Canada\n", 2048 | "i:231\n", 2049 | "Eugene Levy\n", 2050 | "i:232\n", 2051 | "Air Canada\n", 2052 | "i:233\n", 2053 | "North America\n", 2054 | "i:234\n", 2055 | "Reindeer\n", 2056 | "i:235\n", 2057 | "Liberal Party of Canada\n", 2058 | "i:236\n", 2059 | "Scotland\n", 2060 | "i:237\n", 2061 | "Tim Hortons\n", 2062 | "i:238\n", 2063 | "Chrystia Freeland\n", 2064 | "i:239\n", 2065 | "Air Canada\n", 2066 | "i:240\n", 2067 | "Canada men's national soccer team\n", 2068 | "i:241\n", 2069 | "Alphonso Davies\n", 2070 | "i:242\n", 2071 | "Air Canada\n", 2072 | "i:243\n", 2073 | "Common European Framework of Reference for Languages\n", 2074 | "i:244\n", 2075 | "Canada men's national soccer team\n", 2076 | "i:245\n", 2077 | "Provinces and territories of Canada\n", 2078 | "i:246\n", 2079 | "Canada\n", 2080 | "i:247\n", 2081 | "Ice hockey\n", 2082 | "i:248\n", 2083 | "Flag of Canada\n", 2084 | "i:249\n", 2085 | "Order of Canada\n", 2086 | "i:250\n", 2087 | "Quebec\n", 2088 | "i:251\n", 2089 | "Thanksgiving\n", 2090 | "i:252\n", 2091 | "Provinces and territories of Canada\n", 2092 | "i:253\n", 2093 | "Canada men's national soccer team\n", 2094 | "i:254\n", 2095 | "North American Free Trade Agreement\n", 2096 | "i:255\n", 2097 | "Provinces and territories of Canada\n", 2098 | "i:256\n", 2099 | "Alberta\n", 2100 | "i:257\n", 2101 | "Canada\n", 2102 | "i:258\n", 2103 | "Conservative Party of Canada\n", 2104 | "i:259\n", 2105 | "Calgary\n", 2106 | "i:260\n", 2107 | "Steve Nash\n", 2108 | "i:261\n", 2109 | "Canada\n", 2110 | "i:262\n", 2111 | "Al Capone\n", 2112 | "i:263\n", 2113 | "Monarchy of Canada\n", 2114 | "i:264\n", 2115 | "Conservative Party of Canada\n", 2116 | "i:265\n", 2117 | "Prime Minister of Canada\n", 2118 | "i:266\n", 2119 | "Underground Railroad\n", 2120 | "i:267\n", 2121 | "Air Canada\n", 2122 | "i:268\n", 2123 | "Heavy water\n", 2124 | "i:269\n", 2125 | "Boeing P-8 Poseidon\n", 2126 | "i:270\n", 2127 | "Alphonso Davies\n", 2128 | "i:271\n", 2129 | "Winnipeg\n", 
2130 | "i:272\n", 2131 | "Flag of Canada\n", 2132 | "i:273\n", 2133 | "12 Rules for Life\n", 2134 | "i:274\n", 2135 | "Provinces and territories of Canada\n", 2136 | "i:275\n", 2137 | "Air Canada\n", 2138 | "i:276\n", 2139 | "Sidney Crosby\n", 2140 | "i:277\n", 2141 | "Canada\n", 2142 | "i:278\n", 2143 | "Calgary\n", 2144 | "i:279\n", 2145 | "Prime Minister of Canada\n", 2146 | "i:280\n", 2147 | "Christopher Plummer\n", 2148 | "i:281\n", 2149 | "Lynx\n", 2150 | "i:282\n", 2151 | "Wolf\n", 2152 | "i:283\n", 2153 | "Warner Music Group\n", 2154 | "i:284\n", 2155 | "Canada\n", 2156 | "i:285\n", 2157 | "Americas\n", 2158 | "i:286\n", 2159 | "Prince Edward, Duke of Kent and Strathearn\n", 2160 | "i:287\n", 2161 | "Canada\n", 2162 | "i:288\n", 2163 | "Provinces and territories of Canada\n", 2164 | "i:289\n", 2165 | "Provinces and territories of Canada\n", 2166 | "i:290\n", 2167 | "Canada\n", 2168 | "i:291\n", 2169 | "Liberal Party of Canada\n", 2170 | "i:292\n", 2171 | "Ontario\n", 2172 | "i:293\n", 2173 | "Pierre Trudeau\n", 2174 | "i:294\n", 2175 | "Three Days Grace\n", 2176 | "i:295\n", 2177 | "Canada\n", 2178 | "i:296\n", 2179 | "Reindeer\n", 2180 | "i:297\n", 2181 | "Newfoundland and Labrador\n", 2182 | "i:298\n", 2183 | "Millennials\n", 2184 | "i:299\n", 2185 | "London, Ontario\n", 2186 | "i:300\n", 2187 | "De Havilland Canada Dash 8\n", 2188 | "i:301\n", 2189 | "North America\n", 2190 | "i:302\n", 2191 | "Canada\n", 2192 | "i:303\n", 2193 | "COVID-19 pandemic\n", 2194 | "i:304\n", 2195 | "Canada men's national soccer team\n", 2196 | "i:305\n", 2197 | "Hyundai Motor Company\n", 2198 | "i:306\n", 2199 | "Monarchy of Canada\n", 2200 | "i:307\n", 2201 | "Volkswagen\n", 2202 | "i:308\n", 2203 | "Canada\n", 2204 | "i:309\n", 2205 | "War of 1812\n", 2206 | "i:310\n", 2207 | "Roaring Twenties\n", 2208 | "i:311\n", 2209 | "Countries of the United Kingdom\n", 2210 | "i:312\n", 2211 | "Gujarati people\n", 2212 | "i:313\n", 2213 | "McGill University\n", 2214 | "i:314\n", 2215 
| "Canada\n", 2216 | "i:315\n", 2217 | "McGill University\n", 2218 | "i:316\n", 2219 | "Greater Toronto Area\n", 2220 | "i:317\n", 2221 | "Great Replacement\n", 2222 | "i:318\n", 2223 | "Monarchy of Canada\n", 2224 | "i:319\n", 2225 | "Canada\n", 2226 | "i:320\n", 2227 | "British Empire\n", 2228 | "i:321\n", 2229 | "Juris Doctor\n", 2230 | "i:322\n", 2231 | "Liberal Party of Canada\n", 2232 | "i:323\n", 2233 | "2022 Atlantic hurricane season\n", 2234 | "i:324\n", 2235 | "Canada\n", 2236 | "i:325\n", 2237 | "North American Free Trade Agreement\n", 2238 | "i:326\n", 2239 | "Monarchy of Canada\n", 2240 | "i:327\n", 2241 | "Sidney Crosby\n", 2242 | "i:328\n", 2243 | "Canada men's national soccer team\n", 2244 | "i:329\n", 2245 | "Beluga whale\n", 2246 | "i:330\n", 2247 | "Baby boomers\n", 2248 | "i:331\n", 2249 | "Canada men's national soccer team\n", 2250 | "i:332\n", 2251 | "'Ndrangheta\n", 2252 | "i:333\n", 2253 | "Commonwealth realm\n", 2254 | "i:334\n", 2255 | "Rugby union\n", 2256 | "i:335\n", 2257 | "Newfoundland and Labrador\n", 2258 | "i:336\n", 2259 | "Canada\n", 2260 | "i:337\n", 2261 | "Alberta\n", 2262 | "i:338\n", 2263 | "Iroquois\n", 2264 | "i:339\n", 2265 | "British Empire\n", 2266 | "i:340\n", 2267 | "Monarchy of Canada\n", 2268 | "i:341\n", 2269 | "Air Canada\n", 2270 | "i:342\n", 2271 | "Canada men's national soccer team\n", 2272 | "i:343\n", 2273 | "Avril Lavigne\n", 2274 | "i:344\n", 2275 | "Methodism\n", 2276 | "i:345\n", 2277 | "William Howard Taft\n", 2278 | "i:346\n", 2279 | "Monarchy of Canada\n", 2280 | "i:347\n", 2281 | "Acronym\n", 2282 | "i:348\n", 2283 | "Pat Benatar\n", 2284 | "i:349\n", 2285 | "Ontario\n", 2286 | "i:350\n", 2287 | "Monarchy of Canada\n", 2288 | "i:351\n", 2289 | "Flag of Canada\n", 2290 | "i:352\n", 2291 | "Connor McDavid\n", 2292 | "i:353\n", 2293 | "Foreigner (band)\n", 2294 | "i:354\n", 2295 | "Canada men's national soccer team\n", 2296 | "i:355\n", 2297 | "Schitt's Creek\n", 2298 | "i:356\n", 2299 | "2001: A Space 
Odyssey (film)\n", 2300 | "i:357\n", 2301 | "Petroleum\n", 2302 | "i:358\n", 2303 | "Calgary\n", 2304 | "i:359\n", 2305 | "Mikhail Baryshnikov\n", 2306 | "i:360\n", 2307 | "Nova Scotia\n", 2308 | "i:361\n", 2309 | "Chelsea Manning\n", 2310 | "i:362\n", 2311 | "Monarchy of Canada\n", 2312 | "i:363\n", 2313 | "Monarchy of Canada\n", 2314 | "i:364\n", 2315 | "Chiropractic\n", 2316 | "i:365\n", 2317 | "Liberal Party of Canada\n", 2318 | "i:366\n", 2319 | "Pierre Trudeau\n", 2320 | "i:367\n", 2321 | "The Batman (film)\n", 2322 | "i:368\n", 2323 | "Folk music\n", 2324 | "i:369\n", 2325 | "Canada men's national soccer team\n", 2326 | "i:370\n", 2327 | "Def Leppard\n", 2328 | "i:371\n", 2329 | "War of 1812\n", 2330 | "i:372\n", 2331 | "Def Leppard\n", 2332 | "i:373\n", 2333 | "Prime Minister of Canada\n", 2334 | "i:374\n", 2335 | "Canada\n", 2336 | "i:375\n", 2337 | "Dunkin' Donuts\n", 2338 | "i:376\n", 2339 | "ABBA\n", 2340 | "i:377\n", 2341 | "Order of Canada\n", 2342 | "i:378\n", 2343 | "Pierre Trudeau\n", 2344 | "i:379\n", 2345 | "Commonwealth realm\n", 2346 | "i:380\n", 2347 | "Nope (film)\n", 2348 | "i:381\n", 2349 | "Habeas corpus\n", 2350 | "i:382\n", 2351 | "Flag of Canada\n", 2352 | "i:383\n", 2353 | "Coronation Street\n", 2354 | "i:384\n", 2355 | "Jim Carrey\n", 2356 | "i:385\n", 2357 | "Celine Dion\n", 2358 | "i:386\n", 2359 | "Electronic cigarette\n", 2360 | "i:387\n", 2361 | "Venture capital\n", 2362 | "i:388\n", 2363 | "Air Canada\n", 2364 | "i:389\n", 2365 | "Ethnic group\n", 2366 | "i:390\n", 2367 | "Border Collie\n", 2368 | "i:391\n", 2369 | "Order of Canada\n", 2370 | "i:392\n", 2371 | "Air Canada\n", 2372 | "i:393\n", 2373 | "Nickelback\n", 2374 | "i:394\n", 2375 | "Ottawa\n", 2376 | "i:395\n", 2377 | "Air Canada\n", 2378 | "i:396\n", 2379 | "Trade union\n", 2380 | "i:397\n", 2381 | "Prince Edward Island\n", 2382 | "i:398\n", 2383 | "Air Canada\n", 2384 | "i:399\n", 2385 | "Nova Scotia\n", 2386 | "i:400\n", 2387 | "Donald Sutherland\n", 2388 | 
"i:401\n", 2389 | "School shooting\n", 2390 | "i:402\n", 2391 | "Ice hockey\n", 2392 | "i:403\n", 2393 | "Kyoto Protocol\n", 2394 | "i:404\n", 2395 | "Canada\n", 2396 | "i:405\n", 2397 | "Ontario\n", 2398 | "i:406\n", 2399 | "Lady-in-waiting\n", 2400 | "i:407\n", 2401 | "Ottawa\n", 2402 | "i:408\n", 2403 | "Conservative Party of Canada\n", 2404 | "i:409\n", 2405 | "Heartland (Canadian TV series)\n", 2406 | "i:410\n", 2407 | "Air Canada\n", 2408 | "i:411\n", 2409 | "IKEA\n", 2410 | "i:412\n", 2411 | "Tristan Thompson\n", 2412 | "i:413\n", 2413 | "Fringe (TV series)\n", 2414 | "i:414\n", 2415 | "Suzuki\n", 2416 | "i:415\n", 2417 | "Air Canada\n", 2418 | "i:416\n", 2419 | "Provinces and territories of Canada\n", 2420 | "i:417\n", 2421 | "National Hockey League\n", 2422 | "i:418\n", 2423 | "Russell Williams (criminal)\n", 2424 | "i:419\n", 2425 | "Coat of arms\n", 2426 | "i:420\n", 2427 | "ICICI Bank\n", 2428 | "i:421\n", 2429 | "Canadian Broadcasting Corporation\n", 2430 | "i:422\n", 2431 | "Kevin O'Leary\n", 2432 | "i:423\n", 2433 | "University of Toronto\n", 2434 | "i:424\n", 2435 | "Dodge\n", 2436 | "i:425\n", 2437 | "Proud Boys\n", 2438 | "i:426\n", 2439 | "Air Canada\n", 2440 | "i:427\n", 2441 | "Family Guy\n", 2442 | "i:428\n", 2443 | "Canada\n", 2444 | "i:429\n", 2445 | "Union Jack\n", 2446 | "i:430\n", 2447 | "Canada men's national soccer team\n", 2448 | "i:431\n", 2449 | "Flag of Canada\n", 2450 | "i:432\n", 2451 | "Air Canada\n", 2452 | "i:433\n", 2453 | "Credit card\n", 2454 | "i:434\n", 2455 | "Canada\n", 2456 | "i:435\n", 2457 | "Justin Trudeau\n", 2458 | "i:436\n", 2459 | "Acre\n", 2460 | "i:437\n", 2461 | "McGill University\n", 2462 | "i:438\n", 2463 | "Conservation status\n", 2464 | "i:439\n", 2465 | "YMCA\n", 2466 | "i:440\n", 2467 | "Canadian dollar\n", 2468 | "i:441\n", 2469 | "ITER\n", 2470 | "i:442\n", 2471 | "The Mist (film)\n", 2472 | "i:443\n", 2473 | "The Big Bang Theory\n", 2474 | "i:444\n", 2475 | "Air Canada\n", 2476 | "i:445\n", 2477 | 
"Fair use\n", 2478 | "i:446\n", 2479 | "Provinces and territories of Canada\n", 2480 | "i:447\n", 2481 | "Osteopathy\n", 2482 | "i:448\n", 2483 | "Midsomer Murders\n", 2484 | "i:449\n", 2485 | "North American Free Trade Agreement\n", 2486 | "i:450\n", 2487 | "Rolling Stone's 100 Greatest Artists of All Time\n", 2488 | "i:451\n", 2489 | "Union Jack\n", 2490 | "i:452\n", 2491 | "Commonwealth Games\n", 2492 | "i:453\n", 2493 | "Canada\n", 2494 | "i:454\n", 2495 | "Canada\n", 2496 | "i:455\n", 2497 | "London, Ontario\n", 2498 | "i:456\n", 2499 | "Super Bowl LV\n", 2500 | "i:457\n", 2501 | "Mackenzie Phillips\n", 2502 | "i:458\n", 2503 | "Toyota\n", 2504 | "i:459\n", 2505 | "Alberta\n", 2506 | "i:460\n", 2507 | "Toronto Pearson International Airport\n", 2508 | "i:461\n", 2509 | "Mennonites\n", 2510 | "i:462\n", 2511 | "Adobe Inc.\n", 2512 | "i:463\n", 2513 | "Vancouver\n", 2514 | "i:464\n", 2515 | "Canada men's national soccer team\n", 2516 | "i:465\n", 2517 | "Canada\n", 2518 | "i:466\n", 2519 | "Columbia Records\n", 2520 | "i:467\n", 2521 | "Newfoundland (island)\n", 2522 | "i:468\n", 2523 | "Winnipeg\n", 2524 | "i:469\n", 2525 | "Justin Trudeau\n", 2526 | "i:470\n", 2527 | "Victoria, British Columbia\n", 2528 | "i:471\n", 2529 | "The Lord of the Rings: The Fellowship of the Ring\n", 2530 | "i:472\n", 2531 | "Air Canada\n", 2532 | "i:473\n", 2533 | "Toronto Blue Jays\n", 2534 | "i:474\n", 2535 | "Toronto Pearson International Airport\n", 2536 | "i:475\n", 2537 | "Calgary\n", 2538 | "i:476\n", 2539 | "Bell UH-1 Iroquois\n", 2540 | "i:477\n", 2541 | "Nicolas Sarkozy\n", 2542 | "i:478\n", 2543 | "Fake news\n", 2544 | "i:479\n", 2545 | "Ottawa\n", 2546 | "i:480\n", 2547 | "Marshall Plan\n", 2548 | "i:481\n", 2549 | "Canada\n", 2550 | "i:482\n", 2551 | "Winnipeg\n", 2552 | "i:483\n", 2553 | "Marshmello\n", 2554 | "i:484\n", 2555 | "Conservative Party of Canada\n", 2556 | "i:485\n", 2557 | "Canadian Broadcasting Corporation\n", 2558 | "i:486\n", 2559 | "Yellowstone 
(American TV series)\n", 2560 | "i:487\n", 2561 | "Edmonton\n", 2562 | "i:488\n", 2563 | "Sitting Bull\n", 2564 | "i:489\n", 2565 | "North American Free Trade Agreement\n", 2566 | "i:490\n", 2567 | "Canada men's national soccer team\n", 2568 | "i:491\n", 2569 | "Order of Canada\n", 2570 | "i:492\n", 2571 | "Iroquois\n", 2572 | "i:493\n", 2573 | "Prince Edward Island\n", 2574 | "i:494\n", 2575 | "Homosexuality\n", 2576 | "i:495\n", 2577 | "Canada\n", 2578 | "i:496\n", 2579 | "Quebec City\n", 2580 | "i:497\n", 2581 | "Prime Minister of Canada\n", 2582 | "i:498\n", 2583 | "Martin Short\n", 2584 | "i:499\n", 2585 | "Church of England\n" 2586 | ] 2587 | } 2588 | ], 2589 | "source": [ 2590 | "query_1 = \"What is the capital of Canada?\"\n", 2591 | "results = keyword_search(query_1,\n", 2592 | " client,\n", 2593 | " properties=[\"text\", \"title\", \"url\", \"views\", \"lang\", \"_additional {distance}\"],\n", 2594 | " num_results=3\n", 2595 | " )\n", 2596 | "\n", 2597 | "for i, result in enumerate(results):\n", 2598 | " print(f\"i:{i}\")\n", 2599 | " print(result.get('title'))\n", 2600 | " print(result.get('text'))" 2601 | ] 2602 | }, 2603 | { 2604 | "cell_type": "code", 2605 | "execution_count": null, 2606 | "id": "6e1b2d2c", 2607 | "metadata": {}, 2608 | "outputs": [], 2609 | "source": [ 2610 | "query_1 = \"What is the capital of Canada?\"\n", 2611 | "results = keyword_search(query_1,\n", 2612 | " client,\n", 2613 | " properties=[\"text\", \"title\", \"url\", \"views\", \"lang\", \"_additional {distance}\"],\n", 2614 | " num_results=500\n", 2615 | " )\n", 2616 | "\n", 2617 | "for i, result in enumerate(results):\n", 2618 | " print(f\"i:{i}\")\n", 2619 | " print(result.get('title'))\n", 2620 | " #print(result.get('text'))" 2621 | ] 2622 | }, 2623 | { 2624 | "cell_type": "code", 2625 | "execution_count": 113, 2626 | "id": "b38761f8-32b1-4b44-be97-0884894cf6b3", 2627 | "metadata": {}, 2628 | "outputs": [], 2629 | "source": [ 2630 | "def rerank_responses(query, responses, 
num_responses=10):\n", 2631 | " reranked_responses = co.rerank(\n", 2632 | " model = 'rerank-english-v2.0',\n", 2633 | " query = query,\n", 2634 | " documents = responses,\n", 2635 | " top_n = num_responses,\n", 2636 | " )\n", 2637 | " return reranked_responses" 2638 | ] 2639 | }, 2640 | { 2641 | "cell_type": "code", 2642 | "execution_count": 114, 2643 | "id": "02d3e55c-0a5b-4b3a-9a59-3f7164927dc0", 2644 | "metadata": {}, 2645 | "outputs": [], 2646 | "source": [ 2647 | "texts = [result.get('text') for result in results]\n", 2648 | "reranked_text = rerank_responses(query_1, texts)" 2649 | ] 2650 | }, 2651 | { 2652 | "cell_type": "code", 2653 | "execution_count": 115, 2654 | "id": "6b3a380b-cebf-47da-956d-dc62dc53e5a0", 2655 | "metadata": {}, 2656 | "outputs": [ 2657 | { 2658 | "name": "stdout", 2659 | "output_type": "stream", 2660 | "text": [ 2661 | "i:0\n", 2662 | "RerankResult\n", 2663 | "\n", 2664 | "i:1\n", 2665 | "RerankResult\n", 2666 | "\n", 2667 | "i:2\n", 2668 | "RerankResult\n", 2669 | "\n", 2670 | "i:3\n", 2671 | "RerankResult\n", 2672 | "\n", 2673 | "i:4\n", 2674 | "RerankResult\n", 2675 | "\n", 2676 | "i:5\n", 2677 | "RerankResult\n", 2678 | "\n", 2679 | "i:6\n", 2680 | "RerankResult\n", 2681 | "\n", 2682 | "i:7\n", 2683 | "RerankResult\n", 2684 | "\n", 2685 | "i:8\n", 2686 | "RerankResult\n", 2687 | "\n", 2688 | "i:9\n", 2689 | "RerankResult\n", 2690 | "\n" 2691 | ] 2692 | } 2693 | ], 2694 | "source": [ 2695 | "for i, rerank_result in enumerate(reranked_text):\n", 2696 | " print(f\"i:{i}\")\n", 2697 | " print(f\"{rerank_result}\")\n", 2698 | " print()" 2699 | ] 2700 | }, 2701 | { 2702 | "cell_type": "markdown", 2703 | "id": "f6cbb081", 2704 | "metadata": {}, 2705 | "source": [ 2706 | "## Improving Dense Retrieval with ReRank" 2707 | ] 2708 | }, 2709 | { 2710 | "cell_type": "code", 2711 | "execution_count": 116, 2712 | "id": "be2e5378-ea37-4726-b3c3-5875d46759e7", 2713 | "metadata": {}, 2714 | "outputs": [], 2715 | "source": [ 2716 | "from utils import 
dense_retrieval" 2717 | ] 2718 | }, 2719 | { 2720 | "cell_type": "code", 2721 | "execution_count": 117, 2722 | "id": "d5af11ea-6c30-4303-8c9e-8a5510e046bb", 2723 | "metadata": {}, 2724 | "outputs": [], 2725 | "source": [ 2726 | "query_2 = \"Who is the tallest person in history?\"" 2727 | ] 2728 | }, 2729 | { 2730 | "cell_type": "code", 2731 | "execution_count": 130, 2732 | "id": "4da5c744-01b8-4780-a615-0a5edf9bfbd6", 2733 | "metadata": {}, 2734 | "outputs": [], 2735 | "source": [ 2736 | "results = dense_retrieval(query_2,client)" 2737 | ] 2738 | }, 2739 | { 2740 | "cell_type": "code", 2741 | "execution_count": 132, 2742 | "id": "9e4540d8-ed5e-4f97-8802-6d39a52b8964", 2743 | "metadata": {}, 2744 | "outputs": [ 2745 | { 2746 | "name": "stdout", 2747 | "output_type": "stream", 2748 | "text": [ 2749 | "i:0\n", 2750 | "Robert Wadlow\n", 2751 | "\n", 2752 | "i:1\n", 2753 | "Manute Bol\n", 2754 | "\n", 2755 | "i:2\n", 2756 | "Sultan Kösen\n", 2757 | "\n", 2758 | "i:3\n", 2759 | "Sultan Kösen\n", 2760 | "\n", 2761 | "i:4\n", 2762 | "Netherlands\n", 2763 | "\n", 2764 | "i:5\n", 2765 | "Robert Wadlow\n", 2766 | "\n", 2767 | "i:6\n", 2768 | "Randy Johnson\n", 2769 | "\n", 2770 | "i:7\n", 2771 | "Manute Bol\n", 2772 | "\n", 2773 | "i:8\n", 2774 | "Harald Hardrada\n", 2775 | "\n", 2776 | "i:9\n", 2777 | "Manute Bol\n", 2778 | "\n" 2779 | ] 2780 | } 2781 | ], 2782 | "source": [ 2783 | "for i, result in enumerate(results):\n", 2784 | " print(f\"i:{i}\")\n", 2785 | " print(result.get('title'))\n", 2786 | " print(result.get('text'))\n", 2787 | " print()" 2788 | ] 2789 | }, 2790 | { 2791 | "cell_type": "code", 2792 | "execution_count": 121, 2793 | "id": "d269db28-15aa-426a-a993-14275a36ca09", 2794 | "metadata": {}, 2795 | "outputs": [], 2796 | "source": [ 2797 | "texts = [result.get('text') for result in results]\n", 2798 | "reranked_text = rerank_responses(query_2, texts)" 2799 | ] 2800 | }, 2801 | { 2802 | "cell_type": "code", 2803 | "execution_count": 122, 2804 | "id": 
"aa7aca9b-bdc0-4c08-9615-1a7408854cb4", 2805 | "metadata": {}, 2806 | "outputs": [ 2807 | { 2808 | "name": "stdout", 2809 | "output_type": "stream", 2810 | "text": [ 2811 | "i:0\n", 2812 | "RerankResult\n", 2813 | "\n", 2814 | "i:1\n", 2815 | "RerankResult\n", 2816 | "\n", 2817 | "i:2\n", 2818 | "RerankResult\n", 2819 | "\n", 2820 | "i:3\n", 2821 | "RerankResult\n", 2822 | "\n", 2823 | "i:4\n", 2824 | "RerankResult\n", 2825 | "\n" 2826 | ] 2827 | } 2828 | ], 2829 | "source": [ 2830 | "for i, rerank_result in enumerate(reranked_text):\n", 2831 | " print(f\"i:{i}\")\n", 2832 | " print(f\"{rerank_result}\")\n", 2833 | " print()" 2834 | ] 2835 | }, 2836 | { 2837 | "cell_type": "code", 2838 | "execution_count": null, 2839 | "id": "ef1a763a-1b1d-4d4a-99b4-26ab341663e9", 2840 | "metadata": {}, 2841 | "outputs": [], 2842 | "source": [] 2843 | }, 2844 | { 2845 | "cell_type": "code", 2846 | "execution_count": null, 2847 | "id": "b9000db7-3202-436f-a542-ae20b8879ea7", 2848 | "metadata": {}, 2849 | "outputs": [], 2850 | "source": [] 2851 | }, 2852 | { 2853 | "cell_type": "code", 2854 | "execution_count": null, 2855 | "id": "08880d64-a78e-4871-bae3-e75ab88ac3ad", 2856 | "metadata": {}, 2857 | "outputs": [], 2858 | "source": [] 2859 | }, 2860 | { 2861 | "cell_type": "code", 2862 | "execution_count": null, 2863 | "id": "bd4bb517-e5a7-4a4f-bc2b-cb4ea2fad2bc", 2864 | "metadata": {}, 2865 | "outputs": [], 2866 | "source": [] 2867 | }, 2868 | { 2869 | "cell_type": "code", 2870 | "execution_count": null, 2871 | "id": "c297e7ba-5a95-412e-9a4b-b5f214570cfe", 2872 | "metadata": {}, 2873 | "outputs": [], 2874 | "source": [] 2875 | }, 2876 | { 2877 | "cell_type": "code", 2878 | "execution_count": null, 2879 | "id": "93763474-ea16-4a2e-a0ff-08c088f8c708", 2880 | "metadata": {}, 2881 | "outputs": [], 2882 | "source": [] 2883 | }, 2884 | { 2885 | "cell_type": "code", 2886 | "execution_count": null, 2887 | "id": "f8cbf0cd-1150-47d4-a33b-966267960dfc", 2888 | "metadata": {}, 2889 | "outputs": 
[], 2890 | "source": [] 2891
| }, 2972 | { 2973 | "cell_type": "code", 2974 | "execution_count": null, 2975 | "id": "d5707147-2c0c-4055-a179-e653ad9533c9", 2976 | "metadata": {}, 2977 | "outputs": [], 2978 | "source": [] 2979 | }, 2980 | { 2981 | "cell_type": "code", 2982 | "execution_count": null, 2983 | "id": "a1c633a4-daea-4d8a-983d-807fc612874b", 2984 | "metadata": {}, 2985 | "outputs": [], 2986 | "source": [] 2987 | } 2988 | ], 2989 | "metadata": { 2990 | "kernelspec": { 2991 | "display_name": "Python 3 (ipykernel)", 2992 | "language": "python", 2993 | "name": "python3" 2994 | }, 2995 | "language_info": { 2996 | "codemirror_mode": { 2997 | "name": "ipython", 2998 | "version": 3 2999 | }, 3000 | "file_extension": ".py", 3001 | "mimetype": "text/x-python", 3002 | "name": "python", 3003 | "nbconvert_exporter": "python", 3004 | "pygments_lexer": "ipython3", 3005 | "version": "3.10.9" 3006 | } 3007 | }, 3008 | "nbformat": 4, 3009 | "nbformat_minor": 5 3010 | } 3011 | -------------------------------------------------------------------------------- /05_Generative_Search.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "6822588f", 6 | "metadata": {}, 7 | "source": [ 8 | "# Generating Answers" 9 | ] 10 | }, 11 | { 12 | "cell_type": "code", 13 | "execution_count": null, 14 | "id": "ee9744da-72bf-4228-a75c-d98648eb13f9", 15 | "metadata": {}, 16 | "outputs": [], 17 | "source": [ 18 | "question = \"Are side projects important when you are starting to learn about AI?\"" 19 | ] 20 | }, 21 | { 22 | "cell_type": "code", 23 | "execution_count": 2, 24 | "id": "3c644ca0-e1d2-42e4-a668-12e3c80413ed", 25 | "metadata": {}, 26 | "outputs": [], 27 | "source": [ 28 | "text = \"\"\"\n", 29 | "The rapid rise of AI has led to a rapid rise in AI jobs, and many people are building exciting careers in this field. A career is a decades-long journey, and the path is not always straightforward. 
Over many years, I’ve been privileged to see thousands of students as well as engineers in companies large and small navigate careers in AI. In this and the next few letters, I’d like to share a few thoughts that might be useful in charting your own course.\n", 30 | "\n", 31 | "Three key steps of career growth are learning (to gain technical and other skills), working on projects (to deepen skills, build a portfolio, and create impact) and searching for a job. These steps stack on top of each other:\n", 32 | "\n", 33 | "Initially, you focus on gaining foundational technical skills.\n", 34 | "After having gained foundational skills, you lean into project work. During this period, you’ll probably keep learning.\n", 35 | "Later, you might occasionally carry out a job search. Throughout this process, you’ll probably continue to learn and work on meaningful projects.\n", 36 | "These phases apply in a wide range of professions, but AI involves unique elements. For example:\n", 37 | "\n", 38 | "AI is nascent, and many technologies are still evolving. While the foundations of machine learning and deep learning are maturing — and coursework is an efficient way to master them — beyond these foundations, keeping up-to-date with changing technology is more important in AI than fields that are more mature.\n", 39 | "Project work often means working with stakeholders who lack expertise in AI. This can make it challenging to find a suitable project, estimate the project’s timeline and return on investment, and set expectations. In addition, the highly iterative nature of AI projects leads to special challenges in project management: How can you come up with a plan for building a system when you don’t know in advance how long it will take to achieve the target accuracy? 
Even after the system has hit the target, further iteration may be necessary to address post-deployment drift.\n", 40 | "While searching for a job in AI can be similar to searching for a job in other sectors, there are some differences. Many companies are still trying to figure out which AI skills they need and how to hire people who have them. Things you’ve worked on may be significantly different than anything your interviewer has seen, and you’re more likely to have to educate potential employers about some elements of your work.\n", 41 | "Throughout these steps, a supportive community is a big help. Having a group of friends and allies who can help you — and whom you strive to help — makes the path easier. This is true whether you’re taking your first steps or you’ve been on the journey for years.\n", 42 | "\n", 43 | "I’m excited to work with all of you to grow the global AI community, and that includes helping everyone in our community develop their careers. I’ll dive more deeply into these topics in the next few weeks.\n", 44 | "\n", 45 | "Last week, I wrote about key steps for building a career in AI: learning technical skills, doing project work, and searching for a job, all of which is supported by being part of a community. In this letter, I’d like to dive more deeply into the first step.\n", 46 | "\n", 47 | "More papers have been published on AI than any person can read in a lifetime. So, in your efforts to learn, it’s critical to prioritize topic selection. I believe the most important topics for a technical career in machine learning are:\n", 48 | "\n", 49 | "Foundational machine learning skills. For example, it’s important to understand models such as linear regression, logistic regression, neural networks, decision trees, clustering, and anomaly detection. 
Beyond specific models, it’s even more important to understand the core concepts behind how and why machine learning works, such as bias/variance, cost functions, regularization, optimization algorithms, and error analysis.\n", 50 | "Deep learning. This has become such a large fraction of machine learning that it’s hard to excel in the field without some understanding of it! It’s valuable to know the basics of neural networks, practical skills for making them work (such as hyperparameter tuning), convolutional networks, sequence models, and transformers.\n", 51 | "Math relevant to machine learning. Key areas include linear algebra (vectors, matrices, and various manipulations of them) as well as probability and statistics (including discrete and continuous probability, standard probability distributions, basic rules such as independence and Bayes rule, and hypothesis testing). In addition, exploratory data analysis (EDA) — using visualizations and other methods to systematically explore a dataset — is an underrated skill. I’ve found EDA particularly useful in data-centric AI development, where analyzing errors and gaining insights can really help drive progress! Finally, a basic intuitive understanding of calculus will also help. In a previous letter, I described how the math needed to do machine learning well has been changing. For instance, although some tasks require calculus, improved automatic differentiation software makes it possible to invent and implement new neural network architectures without doing any calculus. This was almost impossible a decade ago.\n", 52 | "Software development. While you can get a job and make huge contributions with only machine learning modeling skills, your job opportunities will increase if you can also write good software to implement complex AI systems. 
These skills include programming fundamentals, data structures (especially those that relate to machine learning, such as data frames), algorithms (including those related to databases and data manipulation), software design, familiarity with Python, and familiarity with key libraries such as TensorFlow or PyTorch, and scikit-learn.\n", 53 | "This is a lot to learn! Even after you master everything in this list, I hope you’ll keep learning and continue to deepen your technical knowledge. I’ve known many machine learning engineers who benefitted from deeper skills in an application area such as natural language processing or computer vision, or in a technology area such as probabilistic graphical models or building scalable software systems.\n", 54 | "\n", 55 | "How do you gain these skills? There’s a lot of good content on the internet, and in theory reading dozens of web pages could work. But when the goal is deep understanding, reading disjointed web pages is inefficient because they tend to repeat each other, use inconsistent terminology (which slows you down), vary in quality, and leave gaps. That’s why a good course — in which a body of material has been organized into a coherent and logical form — is often the most time-efficient way to master a meaningful body of knowledge. When you’ve absorbed the knowledge available in courses, you can switch over to research papers and other resources.\n", 56 | "\n", 57 | "Finally, keep in mind that no one can cram everything they need to know over a weekend or even a month. Everyone I know who’s great at machine learning is a lifelong learner. In fact, given how quickly our field is changing, there’s little choice but to keep learning if you want to keep up. How can you maintain a steady pace of learning for years? I’ve written about the value of habits. 
If you cultivate the habit of learning a little bit every week, you can make significant progress with what feels like less effort.\n", 58 | "\n", 59 | "In the last two letters, I wrote about developing a career in AI and shared tips for gaining technical skills. This time, I’d like to discuss an important step in building a career: project work.\n", 60 | "\n", 61 | "It goes without saying that we should only work on projects that are responsible and ethical, and that benefit people. But those limits leave a large variety to choose from. I wrote previously about how to identify and scope AI projects. This and next week’s letter have a different emphasis: picking and executing projects with an eye toward career development.\n", 62 | "\n", 63 | "A fruitful career will include many projects, hopefully growing in scope, complexity, and impact over time. Thus, it is fine to start small. Use early projects to learn and gradually step up to bigger projects as your skills grow.\n", 64 | "\n", 65 | "When you’re starting out, don’t expect others to hand great ideas or resources to you on a platter. Many people start by working on small projects in their spare time. With initial successes — even small ones — under your belt, your growing skills increase your ability to come up with better ideas, and it becomes easier to persuade others to help you step up to bigger projects.\n", 66 | "\n", 67 | "What if you don’t have any project ideas? Here are a few ways to generate them:\n", 68 | "\n", 69 | "Join existing projects. If you find someone else with an idea, ask to join their project.\n", 70 | "Keep reading and talking to people. I come up with new ideas whenever I spend a lot of time reading, taking courses, or talking with domain experts. I’m confident that you will, too.\n", 71 | "Focus on an application area. 
Many researchers are trying to advance basic AI technology — say, by inventing the next generation of transformers or further scaling up language models — so, while this is an exciting direction, it is hard. But the variety of applications to which machine learning has not yet been applied is vast! I’m fortunate to have been able to apply neural networks to everything from autonomous helicopter flight to online advertising, partly because I jumped in when relatively few people were working on those applications. If your company or school cares about a particular application, explore the possibilities for machine learning. That can give you a first look at a potentially creative application — one where you can do unique work — that no one else has done yet.\n", 72 | "Develop a side hustle. Even if you have a full-time job, a fun project that may or may not develop into something bigger can stir the creative juices and strengthen bonds with collaborators. When I was a full-time professor, working on online education wasn’t part of my “job” (which was doing research and teaching classes). It was a fun hobby that I often worked on out of passion for education. My early experiences recording videos at home helped me later in working on online education in a more substantive way. Silicon Valley abounds with stories of startups that started as side projects. So long as it doesn’t create a conflict with your employer, these projects can be a stepping stone to something significant.\n", 73 | "Given a few project ideas, which one should you jump into? Here’s a quick checklist of factors to consider:\n", 74 | "\n", 75 | "Will the project help you grow technically? Ideally, it should be challenging enough to stretch your skills but not so hard that you have little chance of success. This will put you on a path toward mastering ever-greater technical complexity.\n", 76 | "Do you have good teammates to work with? If not, are there people you can discuss things with? 
We learn a lot from the people around us, and good collaborators will have a huge impact on your growth.\n", 77 | "Can it be a stepping stone? If the project is successful, will its technical complexity and/or business impact make it a meaningful stepping stone to larger projects? (If the project is bigger than those you’ve worked on before, there’s a good chance it could be such a stepping stone.)\n", 78 | "Finally, avoid analysis paralysis. It doesn’t make sense to spend a month deciding whether to work on a project that would take a week to complete. You'll work on multiple projects over the course of your career, so you’ll have ample opportunity to refine your thinking on what’s worthwhile. Given the huge number of possible AI projects, rather than the conventional “ready, aim, fire” approach, you can accelerate your progress with “ready, fire, aim.”\n", 79 | "\n", 80 | "\"\"\"" 81 | ] 82 | }, 83 | { 84 | "cell_type": "markdown", 85 | "id": "f464683c", 86 | "metadata": {}, 87 | "source": [ 88 | "## Setup\n", 89 | "\n", 90 | "Load needed API keys and relevant Python libraries." 
91 | ] 92 | }, 93 | { 94 | "cell_type": "code", 95 | "execution_count": 3, 96 | "id": "5ad188a7-9b22-41cf-be13-9b9aa88b1d8b", 97 | "metadata": {}, 98 | "outputs": [], 99 | "source": [ 100 | "import os\n", 101 | "from dotenv import load_dotenv, find_dotenv\n", 102 | "_ = load_dotenv(find_dotenv()) # read local .env file" 103 | ] 104 | }, 105 | { 106 | "cell_type": "code", 107 | "execution_count": null, 108 | "id": "4b8e6539", 109 | "metadata": {}, 110 | "outputs": [], 111 | "source": [ 112 | "import cohere\n", 113 | "\n", 114 | "import numpy as np\n", 115 | "import warnings\n", 116 | "warnings.filterwarnings('ignore')" 117 | ] 118 | }, 119 | { 120 | "cell_type": "markdown", 121 | "id": "26b2d46b", 122 | "metadata": {}, 123 | "source": [ 124 | "## Chunking" 125 | ] 126 | }, 127 | { 128 | "cell_type": "code", 129 | "execution_count": 4, 130 | "id": "6048a9ee-d489-437c-8dd6-38c228b59b06", 131 | "metadata": {}, 132 | "outputs": [], 133 | "source": [ 134 | "# Split into a list of paragraphs\n", 135 | "texts = text.split('\\n\\n')\n", 136 | "\n", 137 | "# Clean up to remove empty spaces and new lines\n", 138 | "texts = np.array([t.strip(' \\n') for t in texts if t])" 139 | ] 140 | }, 141 | { 142 | "cell_type": "code", 143 | "execution_count": 6, 144 | "id": "1ce659d9-b665-4559-804e-509001ee39e7", 145 | "metadata": {}, 146 | "outputs": [ 147 | { 148 | "data": { 149 | "text/plain": [ 150 | "array(['The rapid rise of AI has led to a rapid rise in AI jobs, and many people are building exciting careers in this field. A career is a decades-long journey, and the path is not always straightforward. Over many years, I’ve been privileged to see thousands of students as well as engineers in companies large and small navigate careers in AI. 
In this and the next few letters, I’d like to share a few thoughts that might be useful in charting your own course.',\n", 151 | " 'Three key steps of career growth are learning (to gain technical and other skills), working on projects (to deepen skills, build a portfolio, and create impact) and searching for a job. These steps stack on top of each other:',\n", 152 | " 'Initially, you focus on gaining foundational technical skills.\\nAfter having gained foundational skills, you lean into project work. During this period, you’ll probably keep learning.\\nLater, you might occasionally carry out a job search. Throughout this process, you’ll probably continue to learn and work on meaningful projects.\\nThese phases apply in a wide range of professions, but AI involves unique elements. For example:'],\n", 153 | " dtype='