├── .gitignore
├── LICENSE
├── Question_Answering_with_ALBERT.ipynb
└── README.md

/.gitignore:
--------------------------------------------------------------------------------
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2020 Ming

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
--------------------------------------------------------------------------------
/Question_Answering_with_ALBERT.ipynb:
--------------------------------------------------------------------------------
{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "name": "Question Answering with ALBERT.ipynb",
      "provenance": [],
      "private_outputs": true,
      "collapsed_sections": [],
      "toc_visible": true,
      "machine_shape": "hm",
      "authorship_tag": "ABX9TyPtt7t+plApQ1pDExTO4vM/",
      "include_colab_link": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "accelerator": "GPU"
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/github/spark-ming/albert-qa-demo/blob/master/Question_Answering_with_ALBERT.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "1qfQAtRsMVl7",
        "colab_type": "text"
      },
      "source": [
        "# Reading Comprehension with ALBERT (and similar)\n",
        "\n",
        "Author: [@techno246](https://twitter.com/techno246)\n",
        "\n",
        "GitHub Repo: https://github.com/spark-ming/albert-qa-demo/\n",
        "\n",
        "Blog Post: https://www.spark64.com/post/machine-comprehension\n",
        "\n",
        "\n",
        "## Introduction\n",
        "\n",
        "Reading comprehension, otherwise known as question answering, is one of the tasks that NLP tries to solve. The goal of this task is to answer an arbitrary question given a context. For instance, given the following context:\n",
        "\n",
        "> New Zealand (Māori: Aotearoa) is a sovereign island country in the southwestern Pacific Ocean. It has a total land area of 268,000 square kilometres (103,500 sq mi), and a population of 4.9 million. New Zealand's capital city is Wellington, and its most populous city is Auckland.\n",
        "\n",
        "We ask the question\n",
        "\n",
        "> How many people live in New Zealand?\n",
        "\n",
        "We expect the QA system to respond with something like this:\n",
        "\n",
        "> 4.9 million\n",
        "\n",
        "Since 2017, transformer models have been shown to outperform existing approaches for this task. Many pretrained transformer models exist, including BERT, GPT-2, and XLNet. One of the newcomers to the group is ALBERT (A Lite BERT), which was published in September 2019. The research group claims that it outperforms BERT with far fewer parameters (and shorter training and inference times).\n",
        "\n",
        "This tutorial demonstrates how you can fine-tune ALBERT for the task of QnA and use it for inference. For this tutorial, we will use the transformers library built by [Hugging Face](https://huggingface.co/), which is an extremely nice implementation of the transformer models (including ALBERT) in both TensorFlow and PyTorch. You can just use a fine-tuned model from their [model repository](https://huggingface.co/models) (which I encourage in general, to save money and reduce emissions).\n",
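        "\n",
        "As a taste of that route, here is a minimal sketch (it assumes the high-level `pipeline` API available in recent versions of the library, and uses the same SQuAD-finetuned checkpoint that appears later in this notebook):\n",
        "\n",
        "```python\n",
        "from transformers import pipeline\n",
        "\n",
        "# Any SQuAD-finetuned checkpoint from the model repository should work here\n",
        "qa = pipeline('question-answering', model='ktrapeznikov/albert-xlarge-v2-squad-v2')\n",
        "\n",
        "qa(question='How many people live in New Zealand?',\n",
        "   context='New Zealand has a population of 4.9 million.')\n",
        "# => {'score': ..., 'start': ..., 'end': ..., 'answer': '4.9 million.'}\n",
        "```\n",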
        "\n",
        "However, for educational purposes, I will also show you how to finetune it yourself so you can adapt it for your own data.\n",
        "\n",
        "Note that the goal of this is not to build an optimised, production-ready system, but to demonstrate the concept with as little code as possible. Therefore a lot of code will be retrofitted for this purpose.\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "sBBHbGvQN5vX",
        "colab_type": "text"
      },
      "source": [
        "## 1.0 Setup\n",
        "\n",
        "Let's check out what kind of GPU our friends at Google gave us. This notebook should be configured to give you a P100 😃 (saved in metadata)"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "frTeTcy4WdbY",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "!nvidia-smi"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "D5RImM3oWbrZ",
        "colab_type": "text"
      },
      "source": [
        "First, we clone the Hugging Face transformers library from GitHub.\n",
        "\n",
        "Note that we check out a specific commit only because I've tested this notebook against it."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "QOAoUwBFMQCg",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "!git clone https://github.com/huggingface/transformers \\\n",
        "&& cd transformers \\\n",
        "&& git checkout a3085020ed0d81d4903c50967687192e3101e770"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "TRZned-8WJrj",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "!pip install ./transformers\n",
        "!pip install tensorboardX"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "UHCuzhPptH0M",
        "colab_type": "text"
      },
      "source": [
        "## 2.0 Train Model\n",
        "\n",
        "This is where we can train our own model. Note that you can skip this step if you don't want to wait 1.5 hours!"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "OaQGsAiWXcnd",
        "colab_type": "text"
      },
      "source": [
        "### 2.1 Get Training and Evaluation Data\n",
        "\n",
        "The SQuAD dataset contains question/answer pairs for training the ALBERT model on the QA task.\n",
        "\n",
        "Now get the SQuAD V2.0 dataset. `train-v2.0.json` is for training and `dev-v2.0.json` is for evaluation, to see how well your model trained.\n",
        "\n",
        "Read more about this dataset here: https://rajpurkar.github.io/SQuAD-explorer/"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "dI6e-PfOXSnO",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "!mkdir dataset \\\n",
        "&& cd dataset \\\n",
        "&& wget https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v2.0.json \\\n",
        "&& wget https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v2.0.json"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "dZ87q93GDeeL",
        "colab_type": "text"
      },
      "source": [
        "### 2.2 Run training\n",
        "\n",
        "We can now train the model with the training set.\n",
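        "\n",
        "If you'd like to sanity-check what the model will be trained on first, here's a minimal sketch (paths assume the `dataset` directory created above):\n",
        "\n",
        "```python\n",
        "import json\n",
        "\n",
        "# Peek at the first article/paragraph/question of SQuAD v2.0\n",
        "with open('/content/dataset/train-v2.0.json') as f:\n",
        "    squad = json.load(f)\n",
        "\n",
        "paragraph = squad['data'][0]['paragraphs'][0]\n",
        "print(paragraph['context'][:200])\n",
        "print(paragraph['qas'][0]['question'])\n",
        "print(paragraph['qas'][0]['answers'])\n",
        "```\n",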
\n", 190 | "\n", 191 | "### Notes about parameters:\n", 192 | "`per_gpu_train_batch_size` specifies the number of training examples per iteration per GPU. *In general*, higher means more accuracy and faster training. However, the biggest limitation is the size of the GPU. 12 is what I use for a GPU with 16GB memory. \n", 193 | "\n", 194 | "`save_steps` specifies number of steps before it outputs a checkpoint file. I've increased it to save disk space.\n", 195 | "\n", 196 | "`num_train_epochs` I recommend two epochs here. It's currently set to one for the purpose of time\n", 197 | "\n", 198 | "`version_2_with_negative` is required for SQuAD V2.0. If training with V1.1, take out this flag\n", 199 | "\n", 200 | "Warning: it takes about 1.5 hours to train an epoch! If you don't want to wait this long, feel free to skip this step and note the comment in the code to use a pretrained model!" 201 | ] 202 | }, 203 | { 204 | "cell_type": "code", 205 | "metadata": { 206 | "id": "-Eg53t3QXZAb", 207 | "colab_type": "code", 208 | "colab": {} 209 | }, 210 | "source": [ 211 | "!export SQUAD_DIR=/content/dataset \\\n", 212 | "&& python transformers/examples/run_squad.py \\\n", 213 | " --model_type albert \\\n", 214 | " --model_name_or_path albert-base-v2 \\\n", 215 | " --do_train \\\n", 216 | " --do_eval \\\n", 217 | " --do_lower_case \\\n", 218 | " --train_file $SQUAD_DIR/train-v2.0.json \\\n", 219 | " --predict_file $SQUAD_DIR/dev-v2.0.json \\\n", 220 | " --per_gpu_train_batch_size 12 \\\n", 221 | " --learning_rate 3e-5 \\\n", 222 | " --num_train_epochs 1.0 \\\n", 223 | " --max_seq_length 384 \\\n", 224 | " --doc_stride 128 \\\n", 225 | " --output_dir /content/model_output \\\n", 226 | " --save_steps 1000 \\\n", 227 | " --threads 4 \\\n", 228 | " --version_2_with_negative " 229 | ], 230 | "execution_count": 0, 231 | "outputs": [] 232 | }, 233 | { 234 | "cell_type": "markdown", 235 | "metadata": { 236 | "id": "-JCNRkQwUD56", 237 | "colab_type": "text" 238 | }, 239 | "source": [ 240 | "## 3.0 Setup prediction code\n", 241 | "\n", 242 | "Now we can use the Hugging Face library to make predictions using our newly trained model. Note that a lot of the code is pulled from `run_squad.py` in the Hugging Face repository, with all the training parts removed. 
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "-JCNRkQwUD56",
        "colab_type": "text"
      },
      "source": [
        "## 3.0 Setup prediction code\n",
        "\n",
        "Now we can use the Hugging Face library to make predictions using our newly trained model. Note that a lot of the code is pulled from `run_squad.py` in the Hugging Face repository, with all the training parts removed. This modified code allows us to run predictions on questions and contexts passed in directly as strings, rather than in the .json format used by the training/test sets.\n",
        "\n",
        "NOTE: if you decided to train your own model, change the flag `use_own_model` to `True`.\n"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "qp0Pq9z9Y4S0",
        "colab_type": "code",
        "cellView": "code",
        "colab": {}
      },
      "source": [
        "import os\n",
        "import torch\n",
        "import time\n",
        "from torch.utils.data import DataLoader, RandomSampler, SequentialSampler\n",
        "\n",
        "from transformers import (\n",
        "    AlbertConfig,\n",
        "    AlbertForQuestionAnswering,\n",
        "    AlbertTokenizer,\n",
        "    squad_convert_examples_to_features\n",
        ")\n",
        "\n",
        "from transformers.data.processors.squad import SquadResult, SquadV2Processor, SquadExample\n",
        "\n",
        "from transformers.data.metrics.squad_metrics import compute_predictions_logits\n",
        "\n",
        "# READER NOTE: Set this flag to use own model, or use pretrained model in the Hugging Face repository\n",
        "use_own_model = False\n",
        "\n",
        "if use_own_model:\n",
        "    model_name_or_path = \"/content/model_output\"\n",
        "else:\n",
        "    model_name_or_path = \"ktrapeznikov/albert-xlarge-v2-squad-v2\"\n",
        "\n",
        "output_dir = \"\"\n",
        "\n",
        "# Config\n",
        "n_best_size = 1\n",
        "max_answer_length = 30\n",
        "do_lower_case = True\n",
        "null_score_diff_threshold = 0.0\n",
        "\n",
        "def to_list(tensor):\n",
        "    return tensor.detach().cpu().tolist()\n",
        "\n",
        "# Setup model\n",
        "config_class, model_class, tokenizer_class = (\n",
        "    AlbertConfig, AlbertForQuestionAnswering, AlbertTokenizer)\n",
        "config = config_class.from_pretrained(model_name_or_path)\n",
        "tokenizer = tokenizer_class.from_pretrained(\n",
        "    model_name_or_path, do_lower_case=True)\n",
        "model = model_class.from_pretrained(model_name_or_path, config=config)\n",
        "\n",
        "device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n",
        "\n",
        "model.to(device)\n",
        "\n",
        "processor = SquadV2Processor()\n",
        "\n",
        "def run_prediction(question_texts, context_text):\n",
        "    \"\"\"Setup function to compute predictions\"\"\"\n",
        "    examples = []\n",
        "\n",
        "    for i, question_text in enumerate(question_texts):\n",
        "        example = SquadExample(\n",
        "            qas_id=str(i),\n",
        "            question_text=question_text,\n",
        "            context_text=context_text,\n",
        "            answer_text=None,\n",
        "            start_position_character=None,\n",
        "            title=\"Predict\",\n",
        "            is_impossible=False,\n",
        "            answers=None,\n",
        "        )\n",
        "\n",
        "        examples.append(example)\n",
        "\n",
        "    features, dataset = squad_convert_examples_to_features(\n",
        "        examples=examples,\n",
        "        tokenizer=tokenizer,\n",
        "        max_seq_length=384,\n",
        "        doc_stride=128,\n",
        "        max_query_length=64,\n",
        "        is_training=False,\n",
        "        return_dataset=\"pt\",\n",
        "        threads=1,\n",
        "    )\n",
        "\n",
        "    eval_sampler = SequentialSampler(dataset)\n",
        "    eval_dataloader = DataLoader(dataset, sampler=eval_sampler, batch_size=10)\n",
        "\n",
        "    all_results = []\n",
        "\n",
        "    for batch in eval_dataloader:\n",
        "        model.eval()\n",
        "        batch = tuple(t.to(device) for t in batch)\n",
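        "\n",
        "        # Each batch is a tuple of tensors: batch[0] input_ids, batch[1]\n",
        "        # attention_mask, batch[2] token_type_ids, and batch[3] the feature\n",
        "        # index used below to look up the matching entry in `features`.\n",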
| "\n", 343 | " with torch.no_grad():\n", 344 | " inputs = {\n", 345 | " \"input_ids\": batch[0],\n", 346 | " \"attention_mask\": batch[1],\n", 347 | " \"token_type_ids\": batch[2],\n", 348 | " }\n", 349 | "\n", 350 | " example_indices = batch[3]\n", 351 | "\n", 352 | " outputs = model(**inputs)\n", 353 | "\n", 354 | " for i, example_index in enumerate(example_indices):\n", 355 | " eval_feature = features[example_index.item()]\n", 356 | " unique_id = int(eval_feature.unique_id)\n", 357 | "\n", 358 | " output = [to_list(output[i]) for output in outputs]\n", 359 | "\n", 360 | " start_logits, end_logits = output\n", 361 | " result = SquadResult(unique_id, start_logits, end_logits)\n", 362 | " all_results.append(result)\n", 363 | "\n", 364 | " output_prediction_file = \"predictions.json\"\n", 365 | " output_nbest_file = \"nbest_predictions.json\"\n", 366 | " output_null_log_odds_file = \"null_predictions.json\"\n", 367 | "\n", 368 | " predictions = compute_predictions_logits(\n", 369 | " examples,\n", 370 | " features,\n", 371 | " all_results,\n", 372 | " n_best_size,\n", 373 | " max_answer_length,\n", 374 | " do_lower_case,\n", 375 | " output_prediction_file,\n", 376 | " output_nbest_file,\n", 377 | " output_null_log_odds_file,\n", 378 | " False, # verbose_logging\n", 379 | " True, # version_2_with_negative\n", 380 | " null_score_diff_threshold,\n", 381 | " tokenizer,\n", 382 | " )\n", 383 | "\n", 384 | " return predictions" 385 | ], 386 | "execution_count": 0, 387 | "outputs": [] 388 | }, 389 | { 390 | "cell_type": "markdown", 391 | "metadata": { 392 | "id": "nIQOB8vhpcKs", 393 | "colab_type": "text" 394 | }, 395 | "source": [ 396 | "## 4.0 Run predictions\n", 397 | "\n", 398 | "Now for the fun part... testing out your model on different inputs. Pretty rudimentary example here. But the possibilities are endless with this function." 399 | ] 400 | }, 401 | { 402 | "cell_type": "code", 403 | "metadata": { 404 | "id": "F-sUrcA5nXTH", 405 | "colab_type": "code", 406 | "cellView": "code", 407 | "colab": {} 408 | }, 409 | "source": [ 410 | "context = \"New Zealand (Māori: Aotearoa) is a sovereign island country in the southwestern Pacific Ocean. It has a total land area of 268,000 square kilometres (103,500 sq mi), and a population of 4.9 million. New Zealand's capital city is Wellington, and its most populous city is Auckland.\"\n", 411 | "questions = [\"How many people live in New Zealand?\", \n", 412 | " \"What's the largest city?\"]\n", 413 | "\n", 414 | "# Run method\n", 415 | "predictions = run_prediction(questions, context)\n", 416 | "\n", 417 | "# Print results\n", 418 | "for key in predictions.keys():\n", 419 | " print(predictions[key])" 420 | ], 421 | "execution_count": 0, 422 | "outputs": [] 423 | }, 424 | { 425 | "cell_type": "markdown", 426 | "metadata": { 427 | "id": "rkivu8FOqp_8", 428 | "colab_type": "text" 429 | }, 430 | "source": [ 431 | "## 5.0 Next Steps\n", 432 | "\n", 433 | "In this tutorial, you learnt how to fine-tune an ALBERT model for the task of question answering, using the SQuAD dataset. Then, you learnt how you can make predictions using the model. \n", 434 | "\n", 435 | "We retrofitted `compute_predictions_logits` to make the prediction for the purpose of simplicity and minimising dependencies in the tutorial. Take a peak inside that module to see how it works. 
        "\n",
        "Feel free to open an issue in the [GitHub repository](https://github.com/spark-ming/albert-qa-demo/) for this notebook, or tweet me @techno246 if you have any questions!\n",
        "\n"
      ]
    }
  ]
}
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Reading Comprehension with ALBERT (and similar)

Author: [@techno246](https://twitter.com/techno246)

Blog Post: https://www.spark64.com/post/machine-comprehension

## Introduction

Reading comprehension, otherwise known as question answering, is one of the tasks that NLP tries to solve. The goal of this task is to answer an arbitrary question given a context. For instance, given the following context:

> New Zealand (Māori: Aotearoa) is a sovereign island country in the southwestern Pacific Ocean. It has a total land area of 268,000 square kilometres (103,500 sq mi), and a population of 4.9 million. New Zealand's capital city is Wellington, and its most populous city is Auckland.

We ask the question

> How many people live in New Zealand?

We expect the QA system to respond with something like this:

> 4.9 million

Since 2017, transformer models have been shown to outperform existing approaches for this task. Many pretrained transformer models exist, including BERT, GPT-2, and XLNet. One of the newcomers to the group is ALBERT (A Lite BERT), which was published in September 2019. The research group claims that it outperforms BERT with far fewer parameters (and shorter training and inference times).

This tutorial demonstrates how you can fine-tune ALBERT for the task of QnA and use it for inference. For this tutorial, we will use the transformers library built by Hugging Face, which is an extremely nice implementation of the transformer models (including ALBERT) in both TensorFlow and PyTorch. You can just use a fine-tuned model from their [model repository](https://huggingface.co/models) (which I encourage in general, to save money and reduce emissions). However, for educational purposes, I will also show you how to finetune it yourself so you can adapt it for your own data.

Note that the goal of this is not to build an optimised, production-ready system, but to demonstrate the concept with as little code as possible. Therefore a lot of code will be retrofitted for this purpose.

Get started below:

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/spark-ming/albert-qa-demo/blob/master/Question_Answering_with_ALBERT.ipynb)

--------------------------------------------------------------------------------