├── img
│   └── LLaMAs.jpeg
├── pyproject.toml
├── LICENSE
├── instructions
│   └── README.md
├── README.md
├── .gitignore
└── code
    ├── GPT4All-inference.ipynb
    ├── generate_data.py
    ├── OpenAI-API.ipynb
    ├── GPT4-API.ipynb
    ├── Finetune-T5-multiGPU.py
    ├── GPT.ipynb
    ├── scrap
    │   └── GPT4All-J.ipynb
    ├── T5.ipynb
    └── tokens.ipynb

/img/LLaMAs.jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jonkrohn/NLP-with-LLMs/HEAD/img/LLaMAs.jpeg
--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------
1 | [tool.poetry]
2 | name = "nlp-with-llms"
3 | version = "0.1.0"
4 | description = ""
5 | authors = ["shaan "]
6 | readme = "README.md"
7 | packages = [{include = "nlp_with_llms"}]
8 | 
9 | [tool.poetry.dependencies]
10 | python = "^3.9.16"
11 | nvidia-ml-py3 = "^7.352.0"
12 | pytorch-lightning = "^2.0.1.post0"
13 | transformers = "^4.28.0"
14 | torchvision = "^0.15.1"
15 | rouge-score = "^0.1.2"
16 | tensorboardx = "^2.6"
17 | accelerate = "^0.18.0"
18 | peft = "^0.2.0"
19 | deepspeed = "^0.9.1"
20 | 
21 | 
22 | [tool.poetry.group.dev.dependencies]
23 | black = {extras = ["jupyter"], version = "^23.3.0"}
24 | ruff = "^0.0.263"
25 | 
26 | [build-system]
27 | requires = ["poetry-core"]
28 | build-backend = "poetry.core.masonry.api"
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 | 
3 | Copyright (c) 2023 Shaan Khosla, Sinan Ozdemir, & Jon Krohn
4 | 
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 
--------------------------------------------------------------------------------
/instructions/README.md:
--------------------------------------------------------------------------------
1 | # Sample Instructions for LLM Fine-Tuning on Multiple GPUs
2 | 
3 | 1. Use [Paperspace](https://www.paperspace.com/) for cloud training:
4 |     * Select **Core** virtual servers
5 |     * **Create A Machine** with the following configuration:
6 |         * **OS**: ML-in-a-Box
7 |         * **Machine Type**: select the cheapest Multi-GPU option
8 |         * **Region**: closest to you
9 |         * **Authentication**: select `Password`
10 |     * In **Advanced Options**, give your machine a name and select `Static` Public IP
11 |     * Wait for the machine to boot
12 | 2. In a command-line interface, `ssh` into the machine using the address and corresponding password provided by Paperspace
13 |     * Run `git clone https://github.com/jonkrohn/NLP-with-LLMs.git`
14 |     * If using [Poetry](https://python-poetry.org/) for the first time, run `curl -sSL https://install.python-poetry.org | python3 -`
15 |     * Change into the repo's directory with `cd NLP-with-LLMs/`
16 |     * If the repo's dependencies haven't already been installed with Poetry, run `poetry install`
17 |     * Fine-tune the T5 LLM with `poetry run python code/Finetune-T5-multiGPU.py`
18 | 3. In a separate command-line window (one that's also SSH'ed into your Paperspace instance), you can confirm multiple-GPU usage with `nvidia-smi`
19 | 4. When you are satisfied with your model, you can push it to Hugging Face:
20 |     * Uncomment these lines in `Finetune-T5-multiGPU.py`:
21 |         * `training_model.model.push_to_hub("digit_conversion")`
22 |         * `training_model.tokenizer.push_to_hub("digit_conversion")`
23 |     * Run `poetry run huggingface-cli login`
24 |     * Re-run the fine-tuning script so that the trained model and tokenizer are pushed to the Hub
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Natural Language Processing with Large Language Models
2 | 
3 | ### Purpose
4 | 
5 | [Jon Krohn](https://www.jonkrohn.com/) created this repo to accompany his half-day training on **NLP with GPT-4 and other LLMs: From Training to Deployment with Hugging Face and PyTorch Lightning**, which was first offered at the [Open Data Science Conference (ODSC) East](https://odsc.com/boston/) in Boston on May 10th, 2023.
6 | 
7 | ### Code
8 | 
9 | Code can be found in the aptly named [`code`](https://github.com/jonkrohn/NLP-with-LLMs/tree/main/code) directory:
10 | * Jupyter notebooks can be executed directly in [Google Colab](https://colab.research.google.com/)
11 | * `.py` files are for running at the command line (see [instructions](https://github.com/jonkrohn/NLP-with-LLMs/tree/main/instructions))
12 | 
13 | N.B.: The code is intended to be accompanied by live instruction, so it will not necessarily be self-explanatory.
14 | 
15 | ### Repo Art
16 | 
17 | <img src="img/LLaMAs.jpeg">

18 | 20 | 21 | The "repo art" above was generated by prompting Midjourney v5 with this artistic take on LLMs (that was output by GPT-4): 22 | 23 | *Painting of a harmonious blend of [alpacas](https://crfm.stanford.edu/2023/03/13/alpaca.html) and [vicuñas](https://vicuna.lmsys.org/) in rich shades of caramel and soft gray, amidst a lush, futuristic landscape. The animals are surrounded by a web of glowing, pulsating neural network connections in hues of electric blue and neon green, symbolizing the cutting-edge and cost-effective AI training techniques. In the background, a dynamic matrix of binary code cascades down, further emphasizing the technological prowess of the scene.* 24 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | build/ 12 | develop-eggs/ 13 | dist/ 14 | downloads/ 15 | eggs/ 16 | .eggs/ 17 | lib/ 18 | lib64/ 19 | parts/ 20 | sdist/ 21 | var/ 22 | wheels/ 23 | share/python-wheels/ 24 | *.egg-info/ 25 | .installed.cfg 26 | *.egg 27 | MANIFEST 28 | 29 | # PyInstaller 30 | # Usually these files are written by a python script from a template 31 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 32 | *.manifest 33 | *.spec 34 | 35 | # Installer logs 36 | pip-log.txt 37 | pip-delete-this-directory.txt 38 | 39 | # Unit test / coverage reports 40 | htmlcov/ 41 | .tox/ 42 | .nox/ 43 | .coverage 44 | .coverage.* 45 | .cache 46 | nosetests.xml 47 | coverage.xml 48 | *.cover 49 | *.py,cover 50 | .hypothesis/ 51 | .pytest_cache/ 52 | cover/ 53 | 54 | # Translations 55 | *.mo 56 | *.pot 57 | 58 | # Django stuff: 59 | *.log 60 | local_settings.py 61 | db.sqlite3 62 | db.sqlite3-journal 63 | 64 | # Flask stuff: 65 | instance/ 66 | .webassets-cache 67 | 68 | # Scrapy stuff: 69 | .scrapy 70 | 71 | # Sphinx documentation 72 | docs/_build/ 73 | 74 | # PyBuilder 75 | .pybuilder/ 76 | target/ 77 | 78 | # Jupyter Notebook 79 | .ipynb_checkpoints 80 | 81 | # IPython 82 | profile_default/ 83 | ipython_config.py 84 | 85 | # pyenv 86 | # For a library or package, you might want to ignore these files since the code is 87 | # intended to run in multiple environments; otherwise, check them in: 88 | # .python-version 89 | 90 | # pipenv 91 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. 92 | # However, in case of collaboration, if having platform-specific dependencies or dependencies 93 | # having no cross-platform support, pipenv may install dependencies that don't work, or not 94 | # install all needed dependencies. 95 | #Pipfile.lock 96 | 97 | # poetry 98 | # Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control. 99 | # This is especially recommended for binary packages to ensure reproducibility, and is more 100 | # commonly ignored for libraries. 101 | # https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control 102 | #poetry.lock 103 | 104 | # pdm 105 | # Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control. 106 | #pdm.lock 107 | # pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it 108 | # in version control. 
109 | # https://pdm.fming.dev/#use-with-ide 110 | .pdm.toml 111 | 112 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm 113 | __pypackages__/ 114 | 115 | # Celery stuff 116 | celerybeat-schedule 117 | celerybeat.pid 118 | 119 | # SageMath parsed files 120 | *.sage.py 121 | 122 | # Environments 123 | .env 124 | .venv 125 | env/ 126 | venv/ 127 | ENV/ 128 | env.bak/ 129 | venv.bak/ 130 | 131 | # Spyder project settings 132 | .spyderproject 133 | .spyproject 134 | 135 | # Rope project settings 136 | .ropeproject 137 | 138 | # mkdocs documentation 139 | /site 140 | 141 | # mypy 142 | .mypy_cache/ 143 | .dmypy.json 144 | dmypy.json 145 | 146 | # Pyre type checker 147 | .pyre/ 148 | 149 | # pytype static type analyzer 150 | .pytype/ 151 | 152 | # Cython debug symbols 153 | cython_debug/ 154 | 155 | # PyCharm 156 | # JetBrains specific template is maintained in a separate JetBrains.gitignore that can 157 | # be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore 158 | # and can be added to the global gitignore or merged into this file. For a more nuclear 159 | # option (not recommended) you can uncomment the following to ignore the entire idea folder. 160 | #.idea/ 161 | -------------------------------------------------------------------------------- /code/GPT4All-inference.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [], 7 | "authorship_tag": "ABX9TyPktxGZhE4MIOCn0YzH1ztC", 8 | "include_colab_link": true 9 | }, 10 | "kernelspec": { 11 | "name": "python3", 12 | "display_name": "Python 3" 13 | }, 14 | "language_info": { 15 | "name": "python" 16 | } 17 | }, 18 | "cells": [ 19 | { 20 | "cell_type": "markdown", 21 | "metadata": { 22 | "id": "view-in-github", 23 | "colab_type": "text" 24 | }, 25 | "source": [ 26 | "\"Open" 27 | ] 28 | }, 29 | { 30 | "cell_type": "markdown", 31 | "source": [ 32 | "# GPT4All CPU Interface\n", 33 | "\n", 34 | "In this notebook, we jump as quickly as we can into a command-line interaction with a GPT-4-like model that is on our own device." 
35 | ], 36 | "metadata": { 37 | "id": "Powa-rm3ApS0" 38 | } 39 | }, 40 | { 41 | "cell_type": "code", 42 | "execution_count": 1, 43 | "metadata": { 44 | "id": "vVpLf3_fX3qK" 45 | }, 46 | "outputs": [], 47 | "source": [ 48 | "%%capture\n", 49 | "!pip install nomic" 50 | ] 51 | }, 52 | { 53 | "cell_type": "code", 54 | "source": [ 55 | "from nomic.gpt4all import GPT4All" 56 | ], 57 | "metadata": { 58 | "id": "eegz7O7Nuvj0" 59 | }, 60 | "execution_count": 2, 61 | "outputs": [] 62 | }, 63 | { 64 | "cell_type": "code", 65 | "source": [ 66 | "m = GPT4All()\n", 67 | "m.open()" 68 | ], 69 | "metadata": { 70 | "id": "XZ3Rz0grAneG", 71 | "outputId": "ce4cfa9f-eb98-43d0-86f4-7f5931e4b944", 72 | "colab": { 73 | "base_uri": "https://localhost:8080/" 74 | } 75 | }, 76 | "execution_count": 3, 77 | "outputs": [ 78 | { 79 | "output_type": "stream", 80 | "name": "stderr", 81 | "text": [ 82 | "\u001b[32m2023-05-04 07:48:01.346\u001b[0m | \u001b[1mINFO \u001b[0m | \u001b[36mnomic.gpt4all.gpt4all\u001b[0m:\u001b[36m__init__\u001b[0m:\u001b[36m77\u001b[0m - \u001b[1mDownloading executable...\u001b[0m\n", 83 | "51KB [00:00, 709.18KB/s] \n", 84 | "\u001b[32m2023-05-04 07:48:01.606\u001b[0m | \u001b[1mINFO \u001b[0m | \u001b[36mnomic.gpt4all.gpt4all\u001b[0m:\u001b[36m__init__\u001b[0m:\u001b[36m80\u001b[0m - \u001b[1mDownloading model...\u001b[0m\n" 85 | ] 86 | }, 87 | { 88 | "output_type": "stream", 89 | "name": "stdout", 90 | "text": [ 91 | "File downloaded successfully to /root/.nomic/gpt4all\n" 92 | ] 93 | }, 94 | { 95 | "output_type": "stream", 96 | "name": "stderr", 97 | "text": [ 98 | "514250KB [01:37, 5274.58KB/s] \n" 99 | ] 100 | }, 101 | { 102 | "output_type": "stream", 103 | "name": "stdout", 104 | "text": [ 105 | "File downloaded successfully to /root/.nomic/gpt4all-lora-quantized.bin\n" 106 | ] 107 | } 108 | ] 109 | }, 110 | { 111 | "cell_type": "code", 112 | "source": [ 113 | "m.prompt('Please answer this question: What is a Large Language Model?')" 114 | ], 115 | "metadata": { 116 | "id": "H_Fscz0qBAAl", 117 | "outputId": "9c0408e2-27ad-4de0-c15a-2894e4d6364d", 118 | "colab": { 119 | "base_uri": "https://localhost:8080/", 120 | "height": 124 121 | } 122 | }, 123 | "execution_count": 7, 124 | "outputs": [ 125 | { 126 | "output_type": "execute_result", 127 | "data": { 128 | "text/plain": [ 129 | "'1 A large language model (LLM) refers to an artificial intelligence algorithm that learns from vast amounts of data and can be trained on multiple domains, including natural language processing tasks like machine translation or text classification/retrieval. LLMs are typically based on deep learning techniques such as recurrent neural networks (RNN), which allow them to learn complex patterns in unstructured datasets without being explicitly programmed for each task they perform.'" 130 | ], 131 | "application/vnd.google.colaboratory.intrinsic+json": { 132 | "type": "string" 133 | } 134 | }, 135 | "metadata": {}, 136 | "execution_count": 7 137 | } 138 | ] 139 | } 140 | ] 141 | } -------------------------------------------------------------------------------- /code/generate_data.py: -------------------------------------------------------------------------------- 1 | ## This code was created by Shaan Khosla. 
See original here: https://github.com/shaankhosla/NLP_with_LLMs/blob/main/generate_data.py 2 | ## He adapted large portions from https://docs.mosaicml.com/projects/streaming/en/stable/examples/synthetic_nlp.html 3 | 4 | from typing import Any, Dict, List, Tuple 5 | 6 | import numpy as np 7 | from tqdm import tqdm 8 | import json, os 9 | 10 | 11 | ones = ( 12 | "zero one two three four five six seven eight nine ten eleven twelve thirteen fourteen " 13 | + "fifteen sixteen seventeen eighteen nineteen" 14 | ).split() 15 | 16 | tens = "twenty thirty forty fifty sixty seventy eighty ninety".split() 17 | 18 | 19 | def say(i: int) -> List[str]: 20 | """Get the word form of a number. 21 | 22 | Args: 23 | i (int): The number. 24 | 25 | Returns: 26 | List[str]: The number in word form. 27 | """ 28 | if i < 0: 29 | return ["negative"] + say(-i) 30 | elif i <= 19: 31 | return [ones[i]] 32 | elif i < 100: 33 | return [tens[i // 10 - 2]] + ([ones[i % 10]] if i % 10 else []) 34 | elif i < 1_000: 35 | return [ones[i // 100], "hundred"] + (say(i % 100) if i % 100 else []) 36 | elif i < 1_000_000: 37 | return say(i // 1_000) + ["thousand"] + (say(i % 1_000) if i % 1_000 else []) 38 | elif i < 1_000_000_000: 39 | return ( 40 | say(i // 1_000_000) 41 | + ["million"] 42 | + (say(i % 1_000_000) if i % 1_000_000 else []) 43 | ) 44 | else: 45 | assert False 46 | 47 | 48 | def get_random_number() -> int: 49 | """Pick a random number the way humans would. 50 | 51 | Picked numbers are positively skewed, exponentially distributed (good for curriculum learning). 52 | 53 | Returns: 54 | int: The number. 55 | """ 56 | sign = (np.random.random() < 0.8) * 2 - 1 57 | mag = 10 ** np.random.uniform(1, 4) - 10 58 | return sign * int(mag**2) 59 | 60 | 61 | def get_numbers(num_train: int, num_val: int) -> Tuple[List[int], List[int]]: 62 | """Get two non-overlapping splits of unique random numbers. 63 | 64 | Because the distribution is exponential, we are unlikely to run out of numbers. 65 | 66 | Args: 67 | num_train (int): Number of training samples. 68 | num_val (int): Number of validation samples. 69 | 70 | Returns: 71 | Tuple[List[int], List[int]]: The two generated splits. 72 | """ 73 | total = num_train + num_val 74 | numbers = set() 75 | bar = tqdm(total=total, leave=False) 76 | while len(numbers) < total: 77 | was = len(numbers) 78 | numbers.add(get_random_number()) 79 | bar.update(len(numbers) - was) 80 | numbers = list(numbers) 81 | np.random.shuffle(numbers) 82 | return numbers[:num_train], numbers[num_train:] 83 | 84 | 85 | def generate_samples(numbers: List[int]) -> List[Dict[str, Any]]: 86 | """Generate samples from a list of numbers. 87 | 88 | Args: 89 | numbers (List[int]): The numbers. 90 | 91 | Returns: 92 | List[Dict[str, Any]]: The corresponding samples. 93 | """ 94 | samples = [] 95 | for num in numbers: 96 | words = " ".join(say(num)) 97 | sample = {"number": num, "words": words} 98 | samples.append(sample) 99 | return samples 100 | 101 | 102 | def get_dataset( 103 | num_train: int, num_val: int 104 | ) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]: 105 | """Generate a number-saying dataset of the given size. 106 | 107 | Args: 108 | num_train (int): Number of training samples. 109 | num_val (int): Number of validation samples. 110 | 111 | Returns: 112 | Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]: The two generated splits. 
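
    Example (illustrative; the specific numbers drawn vary from run to run):
        >>> train, val = get_dataset(num_train=2, num_val=1)
        >>> len(train), len(val)
        (2, 1)
        >>> sorted(train[0].keys())
        ['number', 'words']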
113 | """ 114 | train_nums, val_nums = get_numbers(num_train, num_val) 115 | train_samples = generate_samples(train_nums) 116 | val_samples = generate_samples(val_nums) 117 | return train_samples, val_samples 118 | 119 | 120 | def create_folder_structure(): 121 | if not os.path.isdir("./data/"): 122 | os.mkdir("./data/") 123 | if not os.path.isdir("./data/train/"): 124 | os.mkdir("./data/train/") 125 | if not os.path.isdir("./data/val/"): 126 | os.mkdir("./data/val/") 127 | 128 | for f in os.listdir("./data/train/"): 129 | os.remove(os.path.join("./data/train", f)) 130 | for f in os.listdir("./data/val/"): 131 | os.remove(os.path.join("./data/val", f)) 132 | 133 | 134 | def main(num_train: int, num_val: int): 135 | print(f"Generating synthetic dataset ({num_train} train, {num_val} val)...") 136 | train_samples, val_samples = get_dataset(num_train, num_val) 137 | 138 | create_folder_structure() 139 | 140 | for i in range(len(train_samples)): 141 | with open(f"./data/train/{i}.json", "w") as outfile: 142 | json.dump(train_samples[i], outfile) 143 | 144 | for j in range(len(val_samples)): 145 | with open(f"./data/val/{j}.json", "w") as outfile: 146 | json.dump(val_samples[j], outfile) 147 | 148 | 149 | if __name__ == "__main__": 150 | # Number of training and validation samples 151 | num_train_samples = 10_000 # 10k samples 152 | num_val_samples = 2000 # 2k samples 153 | 154 | # Create the samples. 155 | main(num_train_samples, num_val_samples) 156 | -------------------------------------------------------------------------------- /code/OpenAI-API.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [], 7 | "authorship_tag": "ABX9TyNWVQ13kV+SPVbKCoDo+xO4", 8 | "include_colab_link": true 9 | }, 10 | "kernelspec": { 11 | "name": "python3", 12 | "display_name": "Python 3" 13 | }, 14 | "language_info": { 15 | "name": "python" 16 | } 17 | }, 18 | "cells": [ 19 | { 20 | "cell_type": "markdown", 21 | "metadata": { 22 | "id": "view-in-github", 23 | "colab_type": "text" 24 | }, 25 | "source": [ 26 | "\"Open" 27 | ] 28 | }, 29 | { 30 | "cell_type": "markdown", 31 | "source": [ 32 | "# OpenAI API\n", 33 | "\n", 34 | "In this notebook (using code from [this blog post](https://medium.com/codingthesmartway-com-blog/unlocking-the-power-of-gpt-4-api-a-beginners-guide-for-developers-a4baef2b5a81)), we chat with GPT-4-Turbo via the OpenAI API.\n", 35 | "\n", 36 | "Create your API key [here](https://platform.openai.com/account/api-keys) if you don't already have one." 37 | ], 38 | "metadata": { 39 | "id": "prVML98XnLhF" 40 | } 41 | }, 42 | { 43 | "cell_type": "markdown", 44 | "source": [ 45 | "### Load dependencies" 46 | ], 47 | "metadata": { 48 | "id": "blkPf_jKwrcv" 49 | } 50 | }, 51 | { 52 | "cell_type": "code", 53 | "source": [ 54 | "import requests\n", 55 | "import json" 56 | ], 57 | "metadata": { 58 | "id": "gyPuepHVjdrQ" 59 | }, 60 | "execution_count": null, 61 | "outputs": [] 62 | }, 63 | { 64 | "cell_type": "markdown", 65 | "source": [ 66 | "### Secret key\n", 67 | "\n", 68 | "You only need to use this section if you don't want to put your API key in your code." 
69 |    ],
70 |    "metadata": {
71 |     "id": "uVKnKeAhwvk8"
72 |    }
73 |   },
74 |   {
75 |    "cell_type": "code",
76 |    "source": [
77 |     "import getpass"
78 |    ],
79 |    "metadata": {
80 |     "id": "S2J5uIK_uMKq"
81 |    },
82 |    "execution_count": null,
83 |    "outputs": []
84 |   },
85 |   {
86 |    "cell_type": "code",
87 |    "source": [
88 |     "from getpass import getpass"
89 |    ],
90 |    "metadata": {
91 |     "id": "vHWXQcJmuOO3"
92 |    },
93 |    "execution_count": null,
94 |    "outputs": []
95 |   },
96 |   {
97 |    "cell_type": "code",
98 |    "source": [
99 |     "secret_key = getpass('Enter OpenAI API key:')"
100 |    ],
101 |    "metadata": {
102 |     "colab": {
103 |      "base_uri": "https://localhost:8080/"
104 |     },
105 |     "id": "KQRNzxQzuTUK",
106 |     "outputId": "5f514ec1-b6de-4a68-b3c8-4de138365351"
107 |    },
108 |    "execution_count": 4,
109 |    "outputs": [
110 |     {
111 |      "name": "stdout",
112 |      "output_type": "stream",
113 |      "text": [
114 |       "Enter OpenAI API key:··········\n"
115 |      ]
116 |     }
117 |    ]
118 |   },
119 |   {
120 |    "cell_type": "markdown",
121 |    "source": [
122 |     "### Create chat function"
123 |    ],
124 |    "metadata": {
125 |     "id": "PPS35grmxB-9"
126 |    }
127 |   },
128 |   {
129 |    "cell_type": "code",
130 |    "source": [
131 |     "API_ENDPOINT = \"https://api.openai.com/v1/chat/completions\"\n",
132 |     "API_KEY = secret_key\n",
133 |     "## Alternatively, you can hard code your API key:\n",
134 |     "# API_KEY = \"\""
135 |    ],
136 |    "metadata": {
137 |     "id": "DS5AmN9KiGGL"
138 |    },
139 |    "execution_count": 5,
140 |    "outputs": []
141 |   },
142 |   {
143 |    "cell_type": "code",
144 |    "source": [
145 |     "def generate_chat_completion(messages,\n",
146 |     "                             model=\"gpt-4-turbo\",\n",
147 |     "                             temperature=1, # controls randomness; higher = more random; range = 0-2\n",
148 |     "                             max_tokens=None):\n",
149 |     "\n",
150 |     "    headers = {\n",
151 |     "        \"Content-Type\": \"application/json\",\n",
152 |     "        \"Authorization\": f\"Bearer {API_KEY}\",\n",
153 |     "    }\n",
154 |     "\n",
155 |     "    data = {\n",
156 |     "        \"model\": model,\n",
157 |     "        \"messages\": messages,\n",
158 |     "        \"temperature\": temperature,\n",
159 |     "    }\n",
160 |     "\n",
161 |     "    if max_tokens is not None:\n",
162 |     "        data[\"max_tokens\"] = max_tokens\n",
163 |     "\n",
164 |     "    response = requests.post(API_ENDPOINT, headers=headers, data=json.dumps(data))\n",
165 |     "\n",
166 |     "    if response.status_code == 200: # 200 = request OK!\n",
167 |     "        return response.json()[\"choices\"][0][\"message\"][\"content\"]\n",
168 |     "    else:\n",
169 |     "        raise Exception(f\"Error {response.status_code}: {response.text}\")"
170 |    ],
171 |    "metadata": {
172 |     "id": "nQ9z1ZgTjhyT"
173 |    },
174 |    "execution_count": 7,
175 |    "outputs": []
176 |   },
177 |   {
178 |    "cell_type": "markdown",
179 |    "source": [
180 |     "### Generate chat completion"
181 |    ],
182 |    "metadata": {
183 |     "id": "3h9W5a3jxLZb"
184 |    }
185 |   },
186 |   {
187 |    "cell_type": "code",
188 |    "source": [
189 |     "messages = [\n",
190 |     "    {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"}, # optional but helps set behavior\n",
191 |     "    {\"role\": \"user\", \"content\": \"Write a sentence about Jon Krohn where every word begins with the next letter of the alphabet, starting with the letter A.\"}\n",
192 |     "]"
193 |    ],
194 |    "metadata": {
195 |     "id": "X6qKtSpIjpk9"
196 |    },
197 |    "execution_count": 8,
198 |    "outputs": []
199 |   },
200 |   {
201 |    "cell_type": "code",
202 |    "source": [
203 |     "generate_chat_completion(messages)"
204 |    ],
205 |    "metadata": {
206 |     "colab": {
207 |      "base_uri": "https://localhost:8080/",
208 |      "height": 36
209 |     },
210 |     "id": "MWEWxNWSjuo7",
211 |     "outputId":
"58d6ebf9-210a-429b-8393-d8a5012b059c" 212 | }, 213 | "execution_count": 9, 214 | "outputs": [ 215 | { 216 | "output_type": "execute_result", 217 | "data": { 218 | "text/plain": [ 219 | "'\"Jon Krohn adeptly builds complex deep-learning ecosystems for global healthcare innovation journeys.\"'" 220 | ], 221 | "application/vnd.google.colaboratory.intrinsic+json": { 222 | "type": "string" 223 | } 224 | }, 225 | "metadata": {}, 226 | "execution_count": 9 227 | } 228 | ] 229 | } 230 | ] 231 | } -------------------------------------------------------------------------------- /code/GPT4-API.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [], 7 | "authorship_tag": "ABX9TyM98anHa6rngZK0yW6LwFao", 8 | "include_colab_link": true 9 | }, 10 | "kernelspec": { 11 | "name": "python3", 12 | "display_name": "Python 3" 13 | }, 14 | "language_info": { 15 | "name": "python" 16 | } 17 | }, 18 | "cells": [ 19 | { 20 | "cell_type": "markdown", 21 | "metadata": { 22 | "id": "view-in-github", 23 | "colab_type": "text" 24 | }, 25 | "source": [ 26 | "\"Open" 27 | ] 28 | }, 29 | { 30 | "cell_type": "markdown", 31 | "source": [ 32 | "# GPT-4 API\n", 33 | "\n", 34 | "In this notebook (using code from [this blog post](https://medium.com/codingthesmartway-com-blog/unlocking-the-power-of-gpt-4-api-a-beginners-guide-for-developers-a4baef2b5a81)), we chat with GPT-4 via the OpenAI API.\n", 35 | "\n", 36 | "You may need to:\n", 37 | "* [Join the GPT-4 waitlist](https://openai.com/waitlist/gpt-4-api)\n", 38 | "* [Create your API key](https://platform.openai.com/account/api-keys)" 39 | ], 40 | "metadata": { 41 | "id": "prVML98XnLhF" 42 | } 43 | }, 44 | { 45 | "cell_type": "markdown", 46 | "source": [ 47 | "### Load dependencies" 48 | ], 49 | "metadata": { 50 | "id": "blkPf_jKwrcv" 51 | } 52 | }, 53 | { 54 | "cell_type": "code", 55 | "source": [ 56 | "import requests\n", 57 | "import json" 58 | ], 59 | "metadata": { 60 | "id": "gyPuepHVjdrQ" 61 | }, 62 | "execution_count": null, 63 | "outputs": [] 64 | }, 65 | { 66 | "cell_type": "markdown", 67 | "source": [ 68 | "### Secret key\n", 69 | "\n", 70 | "You only need to use this section if you don't want to put your API key in your code." 
71 |    ],
72 |    "metadata": {
73 |     "id": "uVKnKeAhwvk8"
74 |    }
75 |   },
76 |   {
77 |    "cell_type": "code",
78 |    "source": [
79 |     "import getpass"
80 |    ],
81 |    "metadata": {
82 |     "id": "S2J5uIK_uMKq"
83 |    },
84 |    "execution_count": null,
85 |    "outputs": []
86 |   },
87 |   {
88 |    "cell_type": "code",
89 |    "source": [
90 |     "from getpass import getpass"
91 |    ],
92 |    "metadata": {
93 |     "id": "vHWXQcJmuOO3"
94 |    },
95 |    "execution_count": null,
96 |    "outputs": []
97 |   },
98 |   {
99 |    "cell_type": "code",
100 |    "source": [
101 |     "secret_key = getpass('Enter OpenAI API key:')"
102 |    ],
103 |    "metadata": {
104 |     "colab": {
105 |      "base_uri": "https://localhost:8080/"
106 |     },
107 |     "id": "KQRNzxQzuTUK",
108 |     "outputId": "f9de833f-16a6-4f05-b0ab-51daac947403"
109 |    },
110 |    "execution_count": null,
111 |    "outputs": [
112 |     {
113 |      "name": "stdout",
114 |      "output_type": "stream",
115 |      "text": [
116 |       "Enter OpenAI API key:··········\n"
117 |      ]
118 |     }
119 |    ]
120 |   },
121 |   {
122 |    "cell_type": "markdown",
123 |    "source": [
124 |     "### Create chat function"
125 |    ],
126 |    "metadata": {
127 |     "id": "PPS35grmxB-9"
128 |    }
129 |   },
130 |   {
131 |    "cell_type": "code",
132 |    "source": [
133 |     "API_ENDPOINT = \"https://api.openai.com/v1/chat/completions\"\n",
134 |     "API_KEY = secret_key\n",
135 |     "## Alternatively, you can hard code your API key:\n",
136 |     "# API_KEY = \"\""
137 |    ],
138 |    "metadata": {
139 |     "id": "DS5AmN9KiGGL"
140 |    },
141 |    "execution_count": null,
142 |    "outputs": []
143 |   },
144 |   {
145 |    "cell_type": "code",
146 |    "source": [
147 |     "def generate_chat_completion(messages, \n",
148 |     "                             model=\"gpt-4\", # use \"gpt-3.5-turbo\" if no GPT-4 access\n",
149 |     "                             temperature=1, # controls randomness; higher = more random; range = 0-2\n",
150 |     "                             max_tokens=None):\n",
151 |     "\n",
152 |     "    headers = {\n",
153 |     "        \"Content-Type\": \"application/json\",\n",
154 |     "        \"Authorization\": f\"Bearer {API_KEY}\",\n",
155 |     "    }\n",
156 |     "\n",
157 |     "    data = {\n",
158 |     "        \"model\": model,\n",
159 |     "        \"messages\": messages,\n",
160 |     "        \"temperature\": temperature,\n",
161 |     "    }\n",
162 |     "\n",
163 |     "    if max_tokens is not None:\n",
164 |     "        data[\"max_tokens\"] = max_tokens\n",
165 |     "\n",
166 |     "    response = requests.post(API_ENDPOINT, headers=headers, data=json.dumps(data))\n",
167 |     "\n",
168 |     "    if response.status_code == 200: # 200 = request OK!\n",
169 |     "        return response.json()[\"choices\"][0][\"message\"][\"content\"]\n",
170 |     "    else:\n",
171 |     "        raise Exception(f\"Error {response.status_code}: {response.text}\")"
172 |    ],
173 |    "metadata": {
174 |     "id": "nQ9z1ZgTjhyT"
175 |    },
176 |    "execution_count": null,
177 |    "outputs": []
178 |   },
179 |   {
180 |    "cell_type": "markdown",
181 |    "source": [
182 |     "### Generate chat completion"
183 |    ],
184 |    "metadata": {
185 |     "id": "3h9W5a3jxLZb"
186 |    }
187 |   },
188 |   {
189 |    "cell_type": "code",
190 |    "source": [
191 |     "messages = [\n",
192 |     "    {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"}, # optional but helps set behavior\n",
193 |     "    {\"role\": \"user\", \"content\": \"Write a sentence about Jon Krohn where every word begins with the next letter of the alphabet, starting with the letter A.\"}\n",
194 |     "]"
195 |    ],
196 |    "metadata": {
197 |     "id": "X6qKtSpIjpk9"
198 |    },
199 |    "execution_count": null,
200 |    "outputs": []
201 |   },
202 |   {
203 |    "cell_type": "code",
204 |    "source": [
205 |     "generate_chat_completion(messages)"
206 |    ],
207 |    "metadata": {
208 |     "colab": {
209 |      "base_uri": "https://localhost:8080/",
210 |      "height": 71
211 |     },
212
| "id": "MWEWxNWSjuo7", 213 | "outputId": "f0329c7c-c6a5-40bf-a0e2-4a0bd1c5bf69" 214 | }, 215 | "execution_count": null, 216 | "outputs": [ 217 | { 218 | "output_type": "execute_result", 219 | "data": { 220 | "text/plain": [ 221 | "'Achieving basic comprehension, Dr. Krohn educates folks genuinely, helping individuals just keenly learn machine neuroscience - obviously, presenting quality robotics studies, teaching unique visions while x-raying yonder zettabytes.'" 222 | ], 223 | "application/vnd.google.colaboratory.intrinsic+json": { 224 | "type": "string" 225 | } 226 | }, 227 | "metadata": {}, 228 | "execution_count": 9 229 | } 230 | ] 231 | } 232 | ] 233 | } -------------------------------------------------------------------------------- /code/Finetune-T5-multiGPU.py: -------------------------------------------------------------------------------- 1 | # This code was created by Shaan Khosla. See original here: https://github.com/shaankhosla/NLP_with_LLMs/blob/main/train.py 2 | 3 | import pytorch_lightning as pl 4 | from transformers import T5ForConditionalGeneration 5 | from transformers import AutoTokenizer 6 | from torch.utils.data import DataLoader 7 | from transformers import get_linear_schedule_with_warmup 8 | from torch.optim import AdamW 9 | import os 10 | from peft import get_peft_model, LoraConfig, TaskType 11 | import generate_data 12 | import torch 13 | import argparse 14 | from pytorch_lightning.loggers import TensorBoardLogger 15 | import os, json 16 | from torch.utils.data import Dataset 17 | from pytorch_lightning.utilities.deepspeed import ( 18 | convert_zero_checkpoint_to_fp32_state_dict, 19 | ) 20 | from torch.utils.data import default_collate 21 | 22 | 23 | os.environ["TOKENIZERS_PARALLELISM"] = "true" 24 | 25 | 26 | class T5Finetuner(pl.LightningModule): 27 | def __init__(self, args, train_data, val_data): 28 | super().__init__() 29 | self.save_hyperparameters() 30 | self.args = args 31 | self.model = T5ForConditionalGeneration.from_pretrained( 32 | self.args.model_name, 33 | cache_dir=args.cache, 34 | ) 35 | self.tokenizer = AutoTokenizer.from_pretrained( 36 | args.model_name, cache_dir=args.cache, use_fast=True 37 | ) 38 | self.cache_dir = args.cache 39 | 40 | self.train_data, self.val_data = train_data, val_data 41 | 42 | self.validation_step_outputs = [] 43 | self.get_peft() 44 | 45 | def get_peft(self): 46 | self.model.enable_input_require_grads() 47 | peft_config = LoraConfig( 48 | task_type=TaskType.SEQ_2_SEQ_LM, 49 | inference_mode=False, 50 | r=8, 51 | lora_alpha=32, 52 | lora_dropout=0.1, 53 | ) 54 | self.model = get_peft_model(self.model, peft_config) 55 | self.model.print_trainable_parameters() 56 | 57 | def forward(self, batch, batch_idx): 58 | return self.model(**batch) 59 | 60 | def training_step(self, batch, batch_idx): 61 | # accumulation: https://lightning.ai/docs/fabric/latest/advanced/gradient_accumulation.html 62 | loss = self(batch, batch_idx)[0] 63 | return {"loss": loss, "log": {"train_loss": loss}} 64 | 65 | def validation_step(self, batch, batch_idx): 66 | loss = self(batch, batch_idx)[0] 67 | self.validation_step_outputs.append(loss) 68 | return {"loss": loss} 69 | 70 | def on_validation_epoch_end(self): 71 | torch.stack(self.validation_step_outputs).mean() 72 | self.validation_step_outputs.clear() 73 | 74 | def train_dataloader(self): 75 | return DataLoader( 76 | self.train_data, 77 | batch_size=self.args.batch_size, 78 | num_workers=os.cpu_count(), 79 | pin_memory=True, 80 | collate_fn=default_collate, 81 | prefetch_factor=50, 82 | ) 83 | 
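    # The validation loader below mirrors the training loader's settings: one
    # worker per CPU core, pinned host memory for faster CPU-to-GPU copies, and
    # a generous prefetch_factor so workers stay well ahead of the GPUs.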
84 |     def val_dataloader(self):
85 |         return DataLoader(
86 |             self.val_data,
87 |             batch_size=self.args.batch_size,
88 |             num_workers=os.cpu_count(),
89 |             pin_memory=True,
90 |             collate_fn=default_collate,
91 |             prefetch_factor=50,
92 |         )
93 | 
94 |     def configure_optimizers(self):
95 |         optimizer = AdamW(
96 |             self.trainer.model.parameters(), lr=self.args.lr, weight_decay=0.01
97 |         )
98 |         scheduler = get_linear_schedule_with_warmup(
99 |             optimizer,
100 |             num_warmup_steps=0,
101 |             num_training_steps=int(
102 |                 self.args.epochs * len(self.train_data) / self.args.batch_size
103 |             ),
104 |         )
105 |         return {"optimizer": optimizer, "lr_scheduler": scheduler}
106 | 
107 | 
108 | class StreamingDataset(Dataset):
109 |     def __init__(self, path, model_name, cache_dir):
110 |         self.path = path
111 |         self.tokenizer = AutoTokenizer.from_pretrained(
112 |             model_name, cache_dir=cache_dir, use_fast=True
113 |         )
114 | 
115 |     def __len__(self):
116 |         return len(os.listdir(self.path))
117 | 
118 |     def encode_text(self, text_input, text_output):
119 |         inputs = self.tokenizer(
120 |             text_input,
121 |             max_length=16,
122 |             truncation=True,
123 |             padding="max_length",
124 |             return_tensors="pt",
125 |         )
126 |         labels = self.tokenizer(
127 |             text_output,
128 |             max_length=16,
129 |             truncation=True,
130 |             padding="max_length",
131 |             return_tensors="pt",
132 |         ).input_ids[0]
133 |         input_ids = inputs["input_ids"][0]
134 |         attention_mask = inputs["attention_mask"][0]
135 |         labels = torch.tensor([label if label != 0 else -100 for label in labels])
136 |         return {
137 |             "input_ids": input_ids,
138 |             "attention_mask": attention_mask,
139 |             "labels": labels,
140 |         }
141 | 
142 |     def __getitem__(self, idx):
143 |         file_path = os.path.join(self.path, str(idx) + ".json")
144 |         with open(file_path, "r") as infile:
145 |             data = json.load(infile)
146 |         number, words = str(data["number"]), data["words"]
147 |         return self.encode_text(number, words)
148 | 
149 | 
150 | def start_training(args):
151 |     generate_data.main(args.train_size, args.val_size)
152 |     train_data = StreamingDataset(os.path.join(args.data, "train"), args.model_name, args.cache)
153 |     val_data = StreamingDataset(os.path.join(args.data, "val"), args.model_name, args.cache)
154 | 
155 |     summarizer = T5Finetuner(args, train_data, val_data)
156 |     # gradient accumulation: https://lightning.ai/docs/pytorch/stable/advanced/training_tricks.html
157 |     # checkpointing: https://lightning.ai/docs/pytorch/stable/advanced/model_parallel.html#activation-checkpointing
158 |     trainer = pl.Trainer(
159 |         max_epochs=args.epochs,
160 |         accelerator="gpu",
161 |         devices="auto",
162 |         precision="16-mixed",
163 |         accumulate_grad_batches=4,
164 |         strategy="deepspeed_stage_3",  # https://lightning.ai/docs/pytorch/latest/extensions/strategy.html
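        # ZeRO stage 3 ("deepspeed_stage_3") shards optimizer states, gradients,
        # and model parameters across all available GPUs, so no single GPU has
        # to hold the full model or optimizer state.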
165 |         check_val_every_n_epoch=1,
166 |         logger=TensorBoardLogger(
167 |             os.path.join(args.output, "logs"), name=args.model_name
168 |         ),
169 |         log_every_n_steps=1,
170 |     )
171 |     trainer.fit(summarizer)
172 | 
173 |     ckpt_path = f"./output/logs/{args.model_name}/version_{trainer.logger.version}/checkpoints/epoch={trainer.current_epoch-1}-step={trainer.global_step}.ckpt"
174 | 
175 |     convert_zero_checkpoint_to_fp32_state_dict(ckpt_path, "lightning_model.pt")
176 |     training_model = T5Finetuner.load_from_checkpoint("lightning_model.pt")
177 |     training_model.model.save_pretrained("digit_conversion")
178 | 
179 |     # first run `huggingface-cli login` or `poetry run huggingface-cli login`
180 |     # training_model.model.push_to_hub("digit_conversion")
181 |     # training_model.tokenizer.push_to_hub("digit_conversion")
182 | 
183 | 
184 | if __name__ == "__main__":
185 |     parser = argparse.ArgumentParser()
186 |     parser.add_argument("-d", "--data", default="./data/")
187 |     parser.add_argument("-c", "--cache", default="./cache/")
188 |     parser.add_argument("-o", "--output", default="./output/")
189 |     parser.add_argument("-t", "--train_size", type=int, default=20_000)
190 |     parser.add_argument("-v", "--val_size", type=int, default=2_000)
191 |     parser.add_argument("-m", "--model_name", default="t5-base")
192 |     parser.add_argument("-l", "--lr", type=float, default=1e-05)
193 |     parser.add_argument("-e", "--epochs", type=int, default=10)
194 |     parser.add_argument("-b", "--batch_size", type=int, default=16)
195 |     args = parser.parse_args()
196 |     start_training(args)
--------------------------------------------------------------------------------
/code/GPT.ipynb:
--------------------------------------------------------------------------------
1 | {
2 |  "nbformat": 4,
3 |  "nbformat_minor": 0,
4 |  "metadata": {
5 |   "colab": {
6 |    "provenance": [],
7 |    "authorship_tag": "ABX9TyMe8TfdiLEzivRvXtS3IiEh",
8 |    "include_colab_link": true
9 |   },
10 |   "kernelspec": {
11 |    "name": "python3",
12 |    "display_name": "Python 3"
13 |   },
14 |   "language_info": {
15 |    "name": "python"
16 |   }
17 |  },
18 |  "cells": [
19 |   {
20 |    "cell_type": "markdown",
21 |    "metadata": {
22 |     "id": "view-in-github",
23 |     "colab_type": "text"
24 |    },
25 |    "source": [
26 |     "<a href=\"https://colab.research.google.com/github/jonkrohn/NLP-with-LLMs/blob/main/code/GPT.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
27 |    ]
28 |   },
29 |   {
30 |    "cell_type": "markdown",
31 |    "source": [
32 |     "# GPT\n",
33 |     "\n",
34 |     "In this notebook (based on [Sinan Ozdemir's](https://github.com/sinanuozdemir/oreilly-gpt-hands-on-nlg/blob/main/notebooks/Introduction_to_GPT.ipynb)), we:\n",
35 |     "\n",
36 |     "1. Use `transformers` pipeline objects to generate text very easily (using a GPT model)\n",
37 |     "2. Explore tokens"
38 |    ],
39 |    "metadata": {
40 |     "id": "YN46LPl-l54T"
41 |    }
42 |   },
43 |   {
44 |    "cell_type": "markdown",
45 |    "source": [
46 |     "### Load dependencies"
47 |    ],
48 |    "metadata": {
49 |     "id": "x4D097ejSDzh"
50 |    }
51 |   },
52 |   {
53 |    "cell_type": "code",
54 |    "source": [
55 |     "%%capture\n",
56 |     "! pip install transformers==4.28.0"
57 |    ],
58 |    "metadata": {
59 |     "id": "67O5gEnnB4V3"
60 |    },
61 |    "execution_count": null,
62 |    "outputs": []
63 |   },
64 |   {
65 |    "cell_type": "code",
66 |    "execution_count": null,
67 |    "metadata": {
68 |     "id": "dSEmQMy09mG4"
69 |    },
70 |    "outputs": [],
71 |    "source": [
72 |     "from transformers import pipeline, GPT2Tokenizer"
73 |    ]
74 |   },
75 |   {
76 |    "cell_type": "markdown",
77 |    "source": [
78 |     "### Hello, Pipeline! 
\n", 79 | "\n", 80 | "Let's use the `pipeline` object to generate text.\n", 81 | "\n", 82 | "Other examples of tasks we can carry out with pipelines include:\n", 83 | "* `\"sentiment-analysis\"`\n", 84 | "* `\"ner\"` (named entity recognition)\n", 85 | "* `\"summarization\"`\n", 86 | "* `\"translation_en_to_fr\"`\n", 87 | "* `\"feature-extraction\"`" 88 | ], 89 | "metadata": { 90 | "id": "9USHAIspR-lC" 91 | } 92 | }, 93 | { 94 | "cell_type": "code", 95 | "source": [ 96 | "generator = pipeline('text-generation', model='gpt2')\n", 97 | "\n", 98 | "generator(\"The capital of Germany is Berlin. The capital of China is Beijing. The capital of France is\",\n", 99 | " max_new_tokens=2,)" 100 | ], 101 | "metadata": { 102 | "colab": { 103 | "base_uri": "https://localhost:8080/" 104 | }, 105 | "id": "aEB3MyurB1Ug", 106 | "outputId": "a76dfd9e-de01-4cd1-80be-5cb02254a679" 107 | }, 108 | "execution_count": null, 109 | "outputs": [ 110 | { 111 | "output_type": "stream", 112 | "name": "stderr", 113 | "text": [ 114 | "Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.\n" 115 | ] 116 | }, 117 | { 118 | "output_type": "execute_result", 119 | "data": { 120 | "text/plain": [ 121 | "[{'generated_text': 'The capital of Germany is Berlin. The capital of China is Beijing. The capital of France is Paris.'}]" 122 | ] 123 | }, 124 | "metadata": {}, 125 | "execution_count": 25 126 | } 127 | ] 128 | }, 129 | { 130 | "cell_type": "markdown", 131 | "source": [ 132 | "### Exploring tokens" 133 | ], 134 | "metadata": { 135 | "id": "YqEJxtLiSZz7" 136 | } 137 | }, 138 | { 139 | "cell_type": "code", 140 | "source": [ 141 | "tokenizer = GPT2Tokenizer.from_pretrained('gpt2') # load up a tokenizer" 142 | ], 143 | "metadata": { 144 | "id": "NgU8F2s6CfJz" 145 | }, 146 | "execution_count": null, 147 | "outputs": [] 148 | }, 149 | { 150 | "cell_type": "code", 151 | "source": [ 152 | "'love' in tokenizer.get_vocab()" 153 | ], 154 | "metadata": { 155 | "id": "ixSB2OeESmfd", 156 | "outputId": "d6be03a6-641c-48a3-98c6-edc49a733b5a", 157 | "colab": { 158 | "base_uri": "https://localhost:8080/" 159 | } 160 | }, 161 | "execution_count": null, 162 | "outputs": [ 163 | { 164 | "output_type": "execute_result", 165 | "data": { 166 | "text/plain": [ 167 | "True" 168 | ] 169 | }, 170 | "metadata": {}, 171 | "execution_count": 27 172 | } 173 | ] 174 | }, 175 | { 176 | "cell_type": "code", 177 | "source": [ 178 | "'Sinan' in tokenizer.get_vocab()" 179 | ], 180 | "metadata": { 181 | "id": "kQGMZ-9wSjvg", 182 | "outputId": "a6e6725b-3a6d-4c4e-fa9d-8b4ae124b7c9", 183 | "colab": { 184 | "base_uri": "https://localhost:8080/" 185 | } 186 | }, 187 | "execution_count": null, 188 | "outputs": [ 189 | { 190 | "output_type": "execute_result", 191 | "data": { 192 | "text/plain": [ 193 | "False" 194 | ] 195 | }, 196 | "metadata": {}, 197 | "execution_count": 28 198 | } 199 | ] 200 | }, 201 | { 202 | "cell_type": "markdown", 203 | "source": [ 204 | "Encode a string:" 205 | ], 206 | "metadata": { 207 | "id": "Ok6nimr0XWKw" 208 | } 209 | }, 210 | { 211 | "cell_type": "code", 212 | "source": [ 213 | "tokenizer.encode('Sinan loves a beautiful day')" 214 | ], 215 | "metadata": { 216 | "colab": { 217 | "base_uri": "https://localhost:8080/" 218 | }, 219 | "id": "qgHBvaBWnNFI", 220 | "outputId": "c4ffc510-fb43-4610-952e-cdd5dccb162d" 221 | }, 222 | "execution_count": null, 223 | "outputs": [ 224 | { 225 | "output_type": "execute_result", 226 | "data": { 227 | "text/plain": [ 228 | "[46200, 272, 10408, 257, 4950, 1110]" 229 | ] 230 | }, 231 | 
"metadata": {}, 232 | "execution_count": 29 233 | } 234 | ] 235 | }, 236 | { 237 | "cell_type": "markdown", 238 | "source": [ 239 | "...then convert the ids into tokens: " 240 | ], 241 | "metadata": { 242 | "id": "0wVJdiacXcLv" 243 | } 244 | }, 245 | { 246 | "cell_type": "code", 247 | "source": [ 248 | "tokenizer.convert_ids_to_tokens(tokenizer.encode('Sinan loves a beautiful day'))" 249 | ], 250 | "metadata": { 251 | "colab": { 252 | "base_uri": "https://localhost:8080/" 253 | }, 254 | "id": "CLWaENSZnWVQ", 255 | "outputId": "a4f0aedf-9c5f-40ca-e0b7-1496293451a7" 256 | }, 257 | "execution_count": null, 258 | "outputs": [ 259 | { 260 | "output_type": "execute_result", 261 | "data": { 262 | "text/plain": [ 263 | "['Sin', 'an', 'Ġloves', 'Ġa', 'Ġbeautiful', 'Ġday']" 264 | ] 265 | }, 266 | "metadata": {}, 267 | "execution_count": 30 268 | } 269 | ] 270 | }, 271 | { 272 | "cell_type": "markdown", 273 | "source": [ 274 | "(The `Ġ` character denotes a space before the token.)" 275 | ], 276 | "metadata": { 277 | "id": "oZYln4urXPkF" 278 | } 279 | } 280 | ] 281 | } -------------------------------------------------------------------------------- /code/scrap/GPT4All-J.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [], 7 | "authorship_tag": "ABX9TyPk/K2KHlAKYK8eVikP80eO", 8 | "include_colab_link": true 9 | }, 10 | "kernelspec": { 11 | "name": "python3", 12 | "display_name": "Python 3" 13 | }, 14 | "language_info": { 15 | "name": "python" 16 | } 17 | }, 18 | "cells": [ 19 | { 20 | "cell_type": "markdown", 21 | "metadata": { 22 | "id": "view-in-github", 23 | "colab_type": "text" 24 | }, 25 | "source": [ 26 | "\"Open" 27 | ] 28 | }, 29 | { 30 | "cell_type": "code", 31 | "execution_count": 1, 32 | "metadata": { 33 | "colab": { 34 | "base_uri": "https://localhost:8080/" 35 | }, 36 | "id": "_d4hjN1CDlrc", 37 | "outputId": "e9973f1e-6bcc-467c-d639-7f57cb69b5e1" 38 | }, 39 | "outputs": [ 40 | { 41 | "output_type": "stream", 42 | "name": "stdout", 43 | "text": [ 44 | "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n", 45 | "Collecting pygpt4all\n", 46 | " Downloading pygpt4all-1.0.1.tar.gz (3.8 kB)\n", 47 | " Preparing metadata (setup.py) ... 
\u001b[?25l\u001b[?25hdone\n", 48 | "Collecting pyllamacpp==1.0.6\n", 49 | " Downloading pyllamacpp-1.0.6-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (269 kB)\n", 50 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m269.4/269.4 kB\u001b[0m \u001b[31m6.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", 51 | "\u001b[?25hCollecting pygptj\n", 52 | " Downloading pygptj-1.0.8-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (234 kB)\n", 53 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m234.1/234.1 kB\u001b[0m \u001b[31m9.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", 54 | "\u001b[?25hCollecting sentencepiece\n", 55 | " Downloading sentencepiece-0.1.98-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)\n", 56 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.3/1.3 MB\u001b[0m \u001b[31m20.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", 57 | "\u001b[?25hCollecting streamlit\n", 58 | " Downloading streamlit-1.21.0-py2.py3-none-any.whl (9.7 MB)\n", 59 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m9.7/9.7 MB\u001b[0m \u001b[31m46.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", 60 | "\u001b[?25hRequirement already satisfied: torch in /usr/local/lib/python3.9/dist-packages (from pyllamacpp==1.0.6->pygpt4all) (2.0.0+cu118)\n", 61 | "Collecting streamlit-ace\n", 62 | " Downloading streamlit_ace-0.1.1-py3-none-any.whl (3.6 MB)\n", 63 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m3.6/3.6 MB\u001b[0m \u001b[31m56.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", 64 | "\u001b[?25hRequirement already satisfied: toml in /usr/local/lib/python3.9/dist-packages (from streamlit->pyllamacpp==1.0.6->pygpt4all) (0.10.2)\n", 65 | "Collecting blinker>=1.0.0\n", 66 | " Downloading blinker-1.6.2-py3-none-any.whl (13 kB)\n", 67 | "Requirement already satisfied: typing-extensions>=3.10.0.0 in /usr/local/lib/python3.9/dist-packages (from streamlit->pyllamacpp==1.0.6->pygpt4all) (4.5.0)\n", 68 | "Requirement already satisfied: click>=7.0 in /usr/local/lib/python3.9/dist-packages (from streamlit->pyllamacpp==1.0.6->pygpt4all) (8.1.3)\n", 69 | "Collecting pympler>=0.9\n", 70 | " Downloading Pympler-1.0.1-py3-none-any.whl (164 kB)\n", 71 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m164.8/164.8 kB\u001b[0m \u001b[31m10.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", 72 | "\u001b[?25hRequirement already satisfied: numpy in /usr/local/lib/python3.9/dist-packages (from streamlit->pyllamacpp==1.0.6->pygpt4all) (1.22.4)\n", 73 | "Requirement already satisfied: importlib-metadata>=1.4 in /usr/local/lib/python3.9/dist-packages (from streamlit->pyllamacpp==1.0.6->pygpt4all) (6.6.0)\n", 74 | "Collecting validators>=0.2\n", 75 | " Downloading validators-0.20.0.tar.gz (30 kB)\n", 76 | " Preparing metadata (setup.py) ... 
\u001b[?25l\u001b[?25hdone\n", 77 | "Collecting pydeck>=0.1.dev5\n", 78 | " Downloading pydeck-0.8.1b0-py2.py3-none-any.whl (4.8 MB)\n", 79 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m4.8/4.8 MB\u001b[0m \u001b[31m49.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", 80 | "\u001b[?25hRequirement already satisfied: pillow>=6.2.0 in /usr/local/lib/python3.9/dist-packages (from streamlit->pyllamacpp==1.0.6->pygpt4all) (8.4.0)\n", 81 | "Requirement already satisfied: altair<5,>=3.2.0 in /usr/local/lib/python3.9/dist-packages (from streamlit->pyllamacpp==1.0.6->pygpt4all) (4.2.2)\n", 82 | "Requirement already satisfied: pandas<2,>=0.25 in /usr/local/lib/python3.9/dist-packages (from streamlit->pyllamacpp==1.0.6->pygpt4all) (1.5.3)\n", 83 | "Requirement already satisfied: rich>=10.11.0 in /usr/local/lib/python3.9/dist-packages (from streamlit->pyllamacpp==1.0.6->pygpt4all) (13.3.4)\n", 84 | "Requirement already satisfied: tornado>=6.0.3 in /usr/local/lib/python3.9/dist-packages (from streamlit->pyllamacpp==1.0.6->pygpt4all) (6.2)\n", 85 | "Requirement already satisfied: tzlocal>=1.1 in /usr/local/lib/python3.9/dist-packages (from streamlit->pyllamacpp==1.0.6->pygpt4all) (4.3)\n", 86 | "Requirement already satisfied: python-dateutil in /usr/local/lib/python3.9/dist-packages (from streamlit->pyllamacpp==1.0.6->pygpt4all) (2.8.2)\n", 87 | "Requirement already satisfied: requests>=2.4 in /usr/local/lib/python3.9/dist-packages (from streamlit->pyllamacpp==1.0.6->pygpt4all) (2.27.1)\n", 88 | "Collecting watchdog\n", 89 | " Downloading watchdog-3.0.0-py3-none-manylinux2014_x86_64.whl (82 kB)\n", 90 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m82.1/82.1 kB\u001b[0m \u001b[31m4.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", 91 | "\u001b[?25hRequirement already satisfied: protobuf<4,>=3.12 in /usr/local/lib/python3.9/dist-packages (from streamlit->pyllamacpp==1.0.6->pygpt4all) (3.20.3)\n", 92 | "Requirement already satisfied: cachetools>=4.0 in /usr/local/lib/python3.9/dist-packages (from streamlit->pyllamacpp==1.0.6->pygpt4all) (5.3.0)\n", 93 | "Collecting gitpython!=3.1.19\n", 94 | " Downloading GitPython-3.1.31-py3-none-any.whl (184 kB)\n", 95 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m184.3/184.3 kB\u001b[0m \u001b[31m9.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", 96 | "\u001b[?25hRequirement already satisfied: packaging>=14.1 in /usr/local/lib/python3.9/dist-packages (from streamlit->pyllamacpp==1.0.6->pygpt4all) (23.1)\n", 97 | "Requirement already satisfied: pyarrow>=4.0 in /usr/local/lib/python3.9/dist-packages (from streamlit->pyllamacpp==1.0.6->pygpt4all) (9.0.0)\n", 98 | "Requirement already satisfied: jinja2 in /usr/local/lib/python3.9/dist-packages (from torch->pyllamacpp==1.0.6->pygpt4all) (3.1.2)\n", 99 | "Requirement already satisfied: filelock in /usr/local/lib/python3.9/dist-packages (from torch->pyllamacpp==1.0.6->pygpt4all) (3.12.0)\n", 100 | "Requirement already satisfied: sympy in /usr/local/lib/python3.9/dist-packages (from torch->pyllamacpp==1.0.6->pygpt4all) (1.11.1)\n", 101 | "Requirement already satisfied: networkx in /usr/local/lib/python3.9/dist-packages (from torch->pyllamacpp==1.0.6->pygpt4all) (3.1)\n", 102 | "Requirement already satisfied: triton==2.0.0 in /usr/local/lib/python3.9/dist-packages (from torch->pyllamacpp==1.0.6->pygpt4all) (2.0.0)\n", 103 | "Requirement already satisfied: lit in /usr/local/lib/python3.9/dist-packages (from 
triton==2.0.0->torch->pyllamacpp==1.0.6->pygpt4all) (16.0.2)\n", 104 | "Requirement already satisfied: cmake in /usr/local/lib/python3.9/dist-packages (from triton==2.0.0->torch->pyllamacpp==1.0.6->pygpt4all) (3.25.2)\n", 105 | "Requirement already satisfied: jsonschema>=3.0 in /usr/local/lib/python3.9/dist-packages (from altair<5,>=3.2.0->streamlit->pyllamacpp==1.0.6->pygpt4all) (4.3.3)\n", 106 | "Requirement already satisfied: entrypoints in /usr/local/lib/python3.9/dist-packages (from altair<5,>=3.2.0->streamlit->pyllamacpp==1.0.6->pygpt4all) (0.4)\n", 107 | "Requirement already satisfied: toolz in /usr/local/lib/python3.9/dist-packages (from altair<5,>=3.2.0->streamlit->pyllamacpp==1.0.6->pygpt4all) (0.12.0)\n", 108 | "Collecting gitdb<5,>=4.0.1\n", 109 | " Downloading gitdb-4.0.10-py3-none-any.whl (62 kB)\n", 110 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m62.7/62.7 kB\u001b[0m \u001b[31m5.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", 111 | "\u001b[?25hRequirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.9/dist-packages (from importlib-metadata>=1.4->streamlit->pyllamacpp==1.0.6->pygpt4all) (3.15.0)\n", 112 | "Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.9/dist-packages (from pandas<2,>=0.25->streamlit->pyllamacpp==1.0.6->pygpt4all) (2022.7.1)\n", 113 | "Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.9/dist-packages (from jinja2->torch->pyllamacpp==1.0.6->pygpt4all) (2.1.2)\n", 114 | "Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.9/dist-packages (from python-dateutil->streamlit->pyllamacpp==1.0.6->pygpt4all) (1.16.0)\n", 115 | "Requirement already satisfied: charset-normalizer~=2.0.0 in /usr/local/lib/python3.9/dist-packages (from requests>=2.4->streamlit->pyllamacpp==1.0.6->pygpt4all) (2.0.12)\n", 116 | "Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.9/dist-packages (from requests>=2.4->streamlit->pyllamacpp==1.0.6->pygpt4all) (3.4)\n", 117 | "Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.9/dist-packages (from requests>=2.4->streamlit->pyllamacpp==1.0.6->pygpt4all) (1.26.15)\n", 118 | "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.9/dist-packages (from requests>=2.4->streamlit->pyllamacpp==1.0.6->pygpt4all) (2022.12.7)\n", 119 | "Requirement already satisfied: pygments<3.0.0,>=2.13.0 in /usr/local/lib/python3.9/dist-packages (from rich>=10.11.0->streamlit->pyllamacpp==1.0.6->pygpt4all) (2.14.0)\n", 120 | "Requirement already satisfied: markdown-it-py<3.0.0,>=2.2.0 in /usr/local/lib/python3.9/dist-packages (from rich>=10.11.0->streamlit->pyllamacpp==1.0.6->pygpt4all) (2.2.0)\n", 121 | "Requirement already satisfied: pytz-deprecation-shim in /usr/local/lib/python3.9/dist-packages (from tzlocal>=1.1->streamlit->pyllamacpp==1.0.6->pygpt4all) (0.1.0.post0)\n", 122 | "Requirement already satisfied: decorator>=3.4.0 in /usr/local/lib/python3.9/dist-packages (from validators>=0.2->streamlit->pyllamacpp==1.0.6->pygpt4all) (4.4.2)\n", 123 | "Requirement already satisfied: mpmath>=0.19 in /usr/local/lib/python3.9/dist-packages (from sympy->torch->pyllamacpp==1.0.6->pygpt4all) (1.3.0)\n", 124 | "Collecting smmap<6,>=3.0.1\n", 125 | " Downloading smmap-5.0.0-py3-none-any.whl (24 kB)\n", 126 | "Requirement already satisfied: pyrsistent!=0.17.0,!=0.17.1,!=0.17.2,>=0.14.0 in /usr/local/lib/python3.9/dist-packages (from 
jsonschema>=3.0->altair<5,>=3.2.0->streamlit->pyllamacpp==1.0.6->pygpt4all) (0.19.3)\n", 127 | "Requirement already satisfied: attrs>=17.4.0 in /usr/local/lib/python3.9/dist-packages (from jsonschema>=3.0->altair<5,>=3.2.0->streamlit->pyllamacpp==1.0.6->pygpt4all) (23.1.0)\n", 128 | "Requirement already satisfied: mdurl~=0.1 in /usr/local/lib/python3.9/dist-packages (from markdown-it-py<3.0.0,>=2.2.0->rich>=10.11.0->streamlit->pyllamacpp==1.0.6->pygpt4all) (0.1.2)\n", 129 | "Requirement already satisfied: tzdata in /usr/local/lib/python3.9/dist-packages (from pytz-deprecation-shim->tzlocal>=1.1->streamlit->pyllamacpp==1.0.6->pygpt4all) (2023.3)\n", 130 | "Building wheels for collected packages: pygpt4all, validators\n", 131 | " Building wheel for pygpt4all (setup.py) ... \u001b[?25l\u001b[?25hdone\n", 132 | " Created wheel for pygpt4all: filename=pygpt4all-1.0.1-py3-none-any.whl size=5244 sha256=6012507aa34b0bdf8ea979676126f3f939143daef126fa0f00f2241d8e2fa3d3\n", 133 | " Stored in directory: /root/.cache/pip/wheels/a7/75/c7/9809d7bd96333779e79f452bd2beb1f7ace6095c8b2d6cd149\n", 134 | " Building wheel for validators (setup.py) ... \u001b[?25l\u001b[?25hdone\n", 135 | " Created wheel for validators: filename=validators-0.20.0-py3-none-any.whl size=19579 sha256=25a7734c95c6c4e8d17dd3c38ef34bcd1b57ca1c7d627e6e44c26cbe90bccb55\n", 136 | " Stored in directory: /root/.cache/pip/wheels/2d/f0/a8/1094fca7a7e5d0d12ff56e0c64675d72aa5cc81a5fc200e849\n", 137 | "Successfully built pygpt4all validators\n", 138 | "Installing collected packages: sentencepiece, watchdog, validators, smmap, pympler, pygptj, blinker, pydeck, gitdb, gitpython, streamlit, streamlit-ace, pyllamacpp, pygpt4all\n", 139 | "Successfully installed blinker-1.6.2 gitdb-4.0.10 gitpython-3.1.31 pydeck-0.8.1b0 pygpt4all-1.0.1 pygptj-1.0.8 pyllamacpp-1.0.6 pympler-1.0.1 sentencepiece-0.1.98 smmap-5.0.0 streamlit-1.21.0 streamlit-ace-0.1.1 validators-0.20.0 watchdog-3.0.0\n" 140 | ] 141 | } 142 | ], 143 | "source": [ 144 | "! pip install pygpt4all" 145 | ] 146 | }, 147 | { 148 | "cell_type": "code", 149 | "source": [ 150 | "from pygpt4all.models.gpt4all_j import GPT4All_J" 151 | ], 152 | "metadata": { 153 | "id": "RthSMuHwDpE3" 154 | }, 155 | "execution_count": 2, 156 | "outputs": [] 157 | }, 158 | { 159 | "cell_type": "code", 160 | "source": [ 161 | "! wget https://gpt4all.io/models/ggml-gpt4all-j-v1.3-groovy.bin" 162 | ], 163 | "metadata": { 164 | "colab": { 165 | "base_uri": "https://localhost:8080/" 166 | }, 167 | "id": "GF2h6a3lEBaa", 168 | "outputId": "1aeb78d2-e231-44a3-cd23-82c5fd7f1a00" 169 | }, 170 | "execution_count": 3, 171 | "outputs": [ 172 | { 173 | "output_type": "stream", 174 | "name": "stdout", 175 | "text": [ 176 | "--2023-04-27 00:15:16-- https://gpt4all.io/models/ggml-gpt4all-j-v1.3-groovy.bin\n", 177 | "Resolving gpt4all.io (gpt4all.io)... 104.21.50.99, 172.67.204.28, 2606:4700:3033::6815:3263, ...\n", 178 | "Connecting to gpt4all.io (gpt4all.io)|104.21.50.99|:443... connected.\n", 179 | "HTTP request sent, awaiting response... 
200 OK\n", 180 | "Length: 3785248281 (3.5G)\n", 181 | "Saving to: ‘ggml-gpt4all-j-v1.3-groovy.bin’\n", 182 | "\n", 183 | "ggml-gpt4all-j-v1.3 100%[===================>] 3.52G 38.4MB/s in 63s \n", 184 | "\n", 185 | "2023-04-27 00:16:19 (57.2 MB/s) - ‘ggml-gpt4all-j-v1.3-groovy.bin’ saved [3785248281/3785248281]\n", 186 | "\n" 187 | ] 188 | } 189 | ] 190 | }, 191 | { 192 | "cell_type": "code", 193 | "source": [ 194 | "def new_text_callback(text):\n", 195 | " print(text, end=\"\")" 196 | ], 197 | "metadata": { 198 | "id": "WpIe6mt7D0yO" 199 | }, 200 | "execution_count": 6, 201 | "outputs": [] 202 | }, 203 | { 204 | "cell_type": "code", 205 | "source": [ 206 | "model = GPT4All_J('./ggml-gpt4all-j-v1.3-groovy.bin')" 207 | ], 208 | "metadata": { 209 | "id": "RMK_ULSbEnEM" 210 | }, 211 | "execution_count": 10, 212 | "outputs": [] 213 | }, 214 | { 215 | "cell_type": "code", 216 | "source": [ 217 | "model.generate(\"Once upon a time\", n_predict=128, new_text_callback=new_text_callback)" 218 | ], 219 | "metadata": { 220 | "colab": { 221 | "base_uri": "https://localhost:8080/", 222 | "height": 353 223 | }, 224 | "id": "FhuA9ZKLEzFG", 225 | "outputId": "28440803-d3b7-4d2b-8dd4-aa5eeb1cc070" 226 | }, 227 | "execution_count": 12, 228 | "outputs": [ 229 | { 230 | "output_type": "stream", 231 | "name": "stdout", 232 | "text": [ 233 | "Once upon ti wason time when there was no such thing" 234 | ] 235 | }, 236 | { 237 | "output_type": "error", 238 | "ename": "KeyboardInterrupt", 239 | "evalue": "ignored", 240 | "traceback": [ 241 | "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", 242 | "\u001b[0;31mKeyboardInterrupt\u001b[0m Traceback (most recent call last)", 243 | "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mmodel\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mgenerate\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"Once upon a time\"\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn_predict\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m128\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mnew_text_callback\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mnew_text_callback\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", 244 | "\u001b[0;32m/usr/local/lib/python3.9/dist-packages/pygptj/model.py\u001b[0m in \u001b[0;36mgenerate\u001b[0;34m(self, prompt, new_text_callback, n_predict, seed, n_threads, top_k, top_p, temp, n_batch)\u001b[0m\n\u001b[1;32m 126\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 127\u001b[0m \u001b[0;31m# run the prediction\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 128\u001b[0;31m \u001b[0mpp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mgptj_generate\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mgpt_params\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_model\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_vocab\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_call_new_text_callback\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 129\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mres\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 130\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", 245 | "\u001b[0;32m/usr/local/lib/python3.9/dist-packages/pygptj/model.py\u001b[0m in 
\u001b[0;36m_call_new_text_callback\u001b[0;34m(self, text_bytes)\u001b[0m\n\u001b[1;32m 72\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 73\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 74\u001b[0;31m \u001b[0;32mdef\u001b[0m \u001b[0m_call_new_text_callback\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtext_bytes\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m->\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 75\u001b[0m \"\"\"\n\u001b[1;32m 76\u001b[0m \u001b[0mInternal\u001b[0m \u001b[0mnew_segment_callback\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mit\u001b[0m \u001b[0mjust\u001b[0m \u001b[0mcalls\u001b[0m \u001b[0mthe\u001b[0m \u001b[0muser\u001b[0m\u001b[0;31m'\u001b[0m\u001b[0ms\u001b[0m \u001b[0mcallback\u001b[0m \u001b[0;32mwith\u001b[0m \u001b[0mthe\u001b[0m\u001b[0;31m \u001b[0m\u001b[0;31m`\u001b[0m\u001b[0mSegment\u001b[0m\u001b[0;31m`\u001b[0m \u001b[0mobject\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 246 | "\u001b[0;31mKeyboardInterrupt\u001b[0m: " 247 | ] 248 | } 249 | ] 250 | }, 251 | { 252 | "cell_type": "code", 253 | "source": [ 254 | "# way too slow -- the above took three minutes on four Colab threads" 255 | ], 256 | "metadata": { 257 | "id": "iGeVNzx1FB6E" 258 | }, 259 | "execution_count": null, 260 | "outputs": [] 261 | } 262 | ] 263 | } -------------------------------------------------------------------------------- /code/T5.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [], 7 | "authorship_tag": "ABX9TyMZsxa4Jf9eBA7ahu+X+EAQ", 8 | "include_colab_link": true 9 | }, 10 | "kernelspec": { 11 | "name": "python3", 12 | "display_name": "Python 3" 13 | }, 14 | "language_info": { 15 | "name": "python" 16 | }, 17 | "widgets": { 18 | "application/vnd.jupyter.widget-state+json": { 19 | "c6fa0fd0ebf94e27bb09e371f32332b2": { 20 | "model_module": "@jupyter-widgets/controls", 21 | "model_name": "HBoxModel", 22 | "model_module_version": "1.5.0", 23 | "state": { 24 | "_dom_classes": [], 25 | "_model_module": "@jupyter-widgets/controls", 26 | "_model_module_version": "1.5.0", 27 | "_model_name": "HBoxModel", 28 | "_view_count": null, 29 | "_view_module": "@jupyter-widgets/controls", 30 | "_view_module_version": "1.5.0", 31 | "_view_name": "HBoxView", 32 | "box_style": "", 33 | "children": [ 34 | "IPY_MODEL_c3dd5af567d34df89775c8b6f3024902", 35 | "IPY_MODEL_64ce73d0751040d6a9cfe5db8c3c89a1", 36 | "IPY_MODEL_7be68aa3f5754912a53f5e66190287b0" 37 | ], 38 | "layout": "IPY_MODEL_24323e98ab234d21ac01692340fd8759" 39 | } 40 | }, 41 | "c3dd5af567d34df89775c8b6f3024902": { 42 | "model_module": "@jupyter-widgets/controls", 43 | "model_name": "HTMLModel", 44 | "model_module_version": "1.5.0", 45 | "state": { 46 | "_dom_classes": [], 47 | "_model_module": "@jupyter-widgets/controls", 48 | "_model_module_version": "1.5.0", 49 | "_model_name": "HTMLModel", 50 | "_view_count": null, 51 | "_view_module": "@jupyter-widgets/controls", 52 | "_view_module_version": "1.5.0", 53 | "_view_name": "HTMLView", 54 | "description": "", 55 | "description_tooltip": null, 56 | "layout": "IPY_MODEL_6f72b74df608453eb79833f3f6792188", 57 | "placeholder": "​", 58 | "style": "IPY_MODEL_9aa58f409e7848428d4fd52eb6d9f19c", 59 | "value": "Downloading (…)ve/main/spiece.model: 100%" 60 | } 61 | }, 62 | 
"64ce73d0751040d6a9cfe5db8c3c89a1": { 63 | "model_module": "@jupyter-widgets/controls", 64 | "model_name": "FloatProgressModel", 65 | "model_module_version": "1.5.0", 66 | "state": { 67 | "_dom_classes": [], 68 | "_model_module": "@jupyter-widgets/controls", 69 | "_model_module_version": "1.5.0", 70 | "_model_name": "FloatProgressModel", 71 | "_view_count": null, 72 | "_view_module": "@jupyter-widgets/controls", 73 | "_view_module_version": "1.5.0", 74 | "_view_name": "ProgressView", 75 | "bar_style": "success", 76 | "description": "", 77 | "description_tooltip": null, 78 | "layout": "IPY_MODEL_655524edf6754f6cb4d923929287a4f5", 79 | "max": 791656, 80 | "min": 0, 81 | "orientation": "horizontal", 82 | "style": "IPY_MODEL_566f3c18232d413fb3646b955049a23a", 83 | "value": 791656 84 | } 85 | }, 86 | "7be68aa3f5754912a53f5e66190287b0": { 87 | "model_module": "@jupyter-widgets/controls", 88 | "model_name": "HTMLModel", 89 | "model_module_version": "1.5.0", 90 | "state": { 91 | "_dom_classes": [], 92 | "_model_module": "@jupyter-widgets/controls", 93 | "_model_module_version": "1.5.0", 94 | "_model_name": "HTMLModel", 95 | "_view_count": null, 96 | "_view_module": "@jupyter-widgets/controls", 97 | "_view_module_version": "1.5.0", 98 | "_view_name": "HTMLView", 99 | "description": "", 100 | "description_tooltip": null, 101 | "layout": "IPY_MODEL_6e30cc0c9dda4e5cb4426228adcee4b6", 102 | "placeholder": "​", 103 | "style": "IPY_MODEL_b00f427a5abf4d77a0f792a3b37254b8", 104 | "value": " 792k/792k [00:00<00:00, 5.22MB/s]" 105 | } 106 | }, 107 | "24323e98ab234d21ac01692340fd8759": { 108 | "model_module": "@jupyter-widgets/base", 109 | "model_name": "LayoutModel", 110 | "model_module_version": "1.2.0", 111 | "state": { 112 | "_model_module": "@jupyter-widgets/base", 113 | "_model_module_version": "1.2.0", 114 | "_model_name": "LayoutModel", 115 | "_view_count": null, 116 | "_view_module": "@jupyter-widgets/base", 117 | "_view_module_version": "1.2.0", 118 | "_view_name": "LayoutView", 119 | "align_content": null, 120 | "align_items": null, 121 | "align_self": null, 122 | "border": null, 123 | "bottom": null, 124 | "display": null, 125 | "flex": null, 126 | "flex_flow": null, 127 | "grid_area": null, 128 | "grid_auto_columns": null, 129 | "grid_auto_flow": null, 130 | "grid_auto_rows": null, 131 | "grid_column": null, 132 | "grid_gap": null, 133 | "grid_row": null, 134 | "grid_template_areas": null, 135 | "grid_template_columns": null, 136 | "grid_template_rows": null, 137 | "height": null, 138 | "justify_content": null, 139 | "justify_items": null, 140 | "left": null, 141 | "margin": null, 142 | "max_height": null, 143 | "max_width": null, 144 | "min_height": null, 145 | "min_width": null, 146 | "object_fit": null, 147 | "object_position": null, 148 | "order": null, 149 | "overflow": null, 150 | "overflow_x": null, 151 | "overflow_y": null, 152 | "padding": null, 153 | "right": null, 154 | "top": null, 155 | "visibility": null, 156 | "width": null 157 | } 158 | }, 159 | "6f72b74df608453eb79833f3f6792188": { 160 | "model_module": "@jupyter-widgets/base", 161 | "model_name": "LayoutModel", 162 | "model_module_version": "1.2.0", 163 | "state": { 164 | "_model_module": "@jupyter-widgets/base", 165 | "_model_module_version": "1.2.0", 166 | "_model_name": "LayoutModel", 167 | "_view_count": null, 168 | "_view_module": "@jupyter-widgets/base", 169 | "_view_module_version": "1.2.0", 170 | "_view_name": "LayoutView", 171 | "align_content": null, 172 | "align_items": null, 173 | "align_self": null, 174 | "border": 
null, 175 | "bottom": null, 176 | "display": null, 177 | "flex": null, 178 | "flex_flow": null, 179 | "grid_area": null, 180 | "grid_auto_columns": null, 181 | "grid_auto_flow": null, 182 | "grid_auto_rows": null, 183 | "grid_column": null, 184 | "grid_gap": null, 185 | "grid_row": null, 186 | "grid_template_areas": null, 187 | "grid_template_columns": null, 188 | "grid_template_rows": null, 189 | "height": null, 190 | "justify_content": null, 191 | "justify_items": null, 192 | "left": null, 193 | "margin": null, 194 | "max_height": null, 195 | "max_width": null, 196 | "min_height": null, 197 | "min_width": null, 198 | "object_fit": null, 199 | "object_position": null, 200 | "order": null, 201 | "overflow": null, 202 | "overflow_x": null, 203 | "overflow_y": null, 204 | "padding": null, 205 | "right": null, 206 | "top": null, 207 | "visibility": null, 208 | "width": null 209 | } 210 | }, 211 | "9aa58f409e7848428d4fd52eb6d9f19c": { 212 | "model_module": "@jupyter-widgets/controls", 213 | "model_name": "DescriptionStyleModel", 214 | "model_module_version": "1.5.0", 215 | "state": { 216 | "_model_module": "@jupyter-widgets/controls", 217 | "_model_module_version": "1.5.0", 218 | "_model_name": "DescriptionStyleModel", 219 | "_view_count": null, 220 | "_view_module": "@jupyter-widgets/base", 221 | "_view_module_version": "1.2.0", 222 | "_view_name": "StyleView", 223 | "description_width": "" 224 | } 225 | }, 226 | "655524edf6754f6cb4d923929287a4f5": { 227 | "model_module": "@jupyter-widgets/base", 228 | "model_name": "LayoutModel", 229 | "model_module_version": "1.2.0", 230 | "state": { 231 | "_model_module": "@jupyter-widgets/base", 232 | "_model_module_version": "1.2.0", 233 | "_model_name": "LayoutModel", 234 | "_view_count": null, 235 | "_view_module": "@jupyter-widgets/base", 236 | "_view_module_version": "1.2.0", 237 | "_view_name": "LayoutView", 238 | "align_content": null, 239 | "align_items": null, 240 | "align_self": null, 241 | "border": null, 242 | "bottom": null, 243 | "display": null, 244 | "flex": null, 245 | "flex_flow": null, 246 | "grid_area": null, 247 | "grid_auto_columns": null, 248 | "grid_auto_flow": null, 249 | "grid_auto_rows": null, 250 | "grid_column": null, 251 | "grid_gap": null, 252 | "grid_row": null, 253 | "grid_template_areas": null, 254 | "grid_template_columns": null, 255 | "grid_template_rows": null, 256 | "height": null, 257 | "justify_content": null, 258 | "justify_items": null, 259 | "left": null, 260 | "margin": null, 261 | "max_height": null, 262 | "max_width": null, 263 | "min_height": null, 264 | "min_width": null, 265 | "object_fit": null, 266 | "object_position": null, 267 | "order": null, 268 | "overflow": null, 269 | "overflow_x": null, 270 | "overflow_y": null, 271 | "padding": null, 272 | "right": null, 273 | "top": null, 274 | "visibility": null, 275 | "width": null 276 | } 277 | }, 278 | "566f3c18232d413fb3646b955049a23a": { 279 | "model_module": "@jupyter-widgets/controls", 280 | "model_name": "ProgressStyleModel", 281 | "model_module_version": "1.5.0", 282 | "state": { 283 | "_model_module": "@jupyter-widgets/controls", 284 | "_model_module_version": "1.5.0", 285 | "_model_name": "ProgressStyleModel", 286 | "_view_count": null, 287 | "_view_module": "@jupyter-widgets/base", 288 | "_view_module_version": "1.2.0", 289 | "_view_name": "StyleView", 290 | "bar_color": null, 291 | "description_width": "" 292 | } 293 | }, 294 | "6e30cc0c9dda4e5cb4426228adcee4b6": { 295 | "model_module": "@jupyter-widgets/base", 296 | "model_name": "LayoutModel", 
297 | "model_module_version": "1.2.0", 298 | "state": { 299 | "_model_module": "@jupyter-widgets/base", 300 | "_model_module_version": "1.2.0", 301 | "_model_name": "LayoutModel", 302 | "_view_count": null, 303 | "_view_module": "@jupyter-widgets/base", 304 | "_view_module_version": "1.2.0", 305 | "_view_name": "LayoutView", 306 | "align_content": null, 307 | "align_items": null, 308 | "align_self": null, 309 | "border": null, 310 | "bottom": null, 311 | "display": null, 312 | "flex": null, 313 | "flex_flow": null, 314 | "grid_area": null, 315 | "grid_auto_columns": null, 316 | "grid_auto_flow": null, 317 | "grid_auto_rows": null, 318 | "grid_column": null, 319 | "grid_gap": null, 320 | "grid_row": null, 321 | "grid_template_areas": null, 322 | "grid_template_columns": null, 323 | "grid_template_rows": null, 324 | "height": null, 325 | "justify_content": null, 326 | "justify_items": null, 327 | "left": null, 328 | "margin": null, 329 | "max_height": null, 330 | "max_width": null, 331 | "min_height": null, 332 | "min_width": null, 333 | "object_fit": null, 334 | "object_position": null, 335 | "order": null, 336 | "overflow": null, 337 | "overflow_x": null, 338 | "overflow_y": null, 339 | "padding": null, 340 | "right": null, 341 | "top": null, 342 | "visibility": null, 343 | "width": null 344 | } 345 | }, 346 | "b00f427a5abf4d77a0f792a3b37254b8": { 347 | "model_module": "@jupyter-widgets/controls", 348 | "model_name": "DescriptionStyleModel", 349 | "model_module_version": "1.5.0", 350 | "state": { 351 | "_model_module": "@jupyter-widgets/controls", 352 | "_model_module_version": "1.5.0", 353 | "_model_name": "DescriptionStyleModel", 354 | "_view_count": null, 355 | "_view_module": "@jupyter-widgets/base", 356 | "_view_module_version": "1.2.0", 357 | "_view_name": "StyleView", 358 | "description_width": "" 359 | } 360 | }, 361 | "a0ec395acb654f19b2ec1f84e0538f88": { 362 | "model_module": "@jupyter-widgets/controls", 363 | "model_name": "HBoxModel", 364 | "model_module_version": "1.5.0", 365 | "state": { 366 | "_dom_classes": [], 367 | "_model_module": "@jupyter-widgets/controls", 368 | "_model_module_version": "1.5.0", 369 | "_model_name": "HBoxModel", 370 | "_view_count": null, 371 | "_view_module": "@jupyter-widgets/controls", 372 | "_view_module_version": "1.5.0", 373 | "_view_name": "HBoxView", 374 | "box_style": "", 375 | "children": [ 376 | "IPY_MODEL_63194346ee4342d1a1cadf6113aa59b6", 377 | "IPY_MODEL_f92f34fd76e042079d66a99203ec8629", 378 | "IPY_MODEL_9c49b384ba784e4685d71af70dce7a24" 379 | ], 380 | "layout": "IPY_MODEL_a9bd050817ea4b5191ea0b2c48eb1743" 381 | } 382 | }, 383 | "63194346ee4342d1a1cadf6113aa59b6": { 384 | "model_module": "@jupyter-widgets/controls", 385 | "model_name": "HTMLModel", 386 | "model_module_version": "1.5.0", 387 | "state": { 388 | "_dom_classes": [], 389 | "_model_module": "@jupyter-widgets/controls", 390 | "_model_module_version": "1.5.0", 391 | "_model_name": "HTMLModel", 392 | "_view_count": null, 393 | "_view_module": "@jupyter-widgets/controls", 394 | "_view_module_version": "1.5.0", 395 | "_view_name": "HTMLView", 396 | "description": "", 397 | "description_tooltip": null, 398 | "layout": "IPY_MODEL_f53e4483e951450f9962bc34b9b2a723", 399 | "placeholder": "​", 400 | "style": "IPY_MODEL_7f222e156792442c97def1786bc4f13e", 401 | "value": "Downloading (…)lve/main/config.json: 100%" 402 | } 403 | }, 404 | "f92f34fd76e042079d66a99203ec8629": { 405 | "model_module": "@jupyter-widgets/controls", 406 | "model_name": "FloatProgressModel", 407 | 
"model_module_version": "1.5.0", 408 | "state": { 409 | "_dom_classes": [], 410 | "_model_module": "@jupyter-widgets/controls", 411 | "_model_module_version": "1.5.0", 412 | "_model_name": "FloatProgressModel", 413 | "_view_count": null, 414 | "_view_module": "@jupyter-widgets/controls", 415 | "_view_module_version": "1.5.0", 416 | "_view_name": "ProgressView", 417 | "bar_style": "success", 418 | "description": "", 419 | "description_tooltip": null, 420 | "layout": "IPY_MODEL_1e0f2dafb0c3456bbafb561875558dad", 421 | "max": 1208, 422 | "min": 0, 423 | "orientation": "horizontal", 424 | "style": "IPY_MODEL_2b4a558e81e349c695e060f1bdf11657", 425 | "value": 1208 426 | } 427 | }, 428 | "9c49b384ba784e4685d71af70dce7a24": { 429 | "model_module": "@jupyter-widgets/controls", 430 | "model_name": "HTMLModel", 431 | "model_module_version": "1.5.0", 432 | "state": { 433 | "_dom_classes": [], 434 | "_model_module": "@jupyter-widgets/controls", 435 | "_model_module_version": "1.5.0", 436 | "_model_name": "HTMLModel", 437 | "_view_count": null, 438 | "_view_module": "@jupyter-widgets/controls", 439 | "_view_module_version": "1.5.0", 440 | "_view_name": "HTMLView", 441 | "description": "", 442 | "description_tooltip": null, 443 | "layout": "IPY_MODEL_7bb2b713829c406bb134f2f659bc94c8", 444 | "placeholder": "​", 445 | "style": "IPY_MODEL_391e2b2a00af4dd4ac7cdb71f7a04621", 446 | "value": " 1.21k/1.21k [00:00<00:00, 31.5kB/s]" 447 | } 448 | }, 449 | "a9bd050817ea4b5191ea0b2c48eb1743": { 450 | "model_module": "@jupyter-widgets/base", 451 | "model_name": "LayoutModel", 452 | "model_module_version": "1.2.0", 453 | "state": { 454 | "_model_module": "@jupyter-widgets/base", 455 | "_model_module_version": "1.2.0", 456 | "_model_name": "LayoutModel", 457 | "_view_count": null, 458 | "_view_module": "@jupyter-widgets/base", 459 | "_view_module_version": "1.2.0", 460 | "_view_name": "LayoutView", 461 | "align_content": null, 462 | "align_items": null, 463 | "align_self": null, 464 | "border": null, 465 | "bottom": null, 466 | "display": null, 467 | "flex": null, 468 | "flex_flow": null, 469 | "grid_area": null, 470 | "grid_auto_columns": null, 471 | "grid_auto_flow": null, 472 | "grid_auto_rows": null, 473 | "grid_column": null, 474 | "grid_gap": null, 475 | "grid_row": null, 476 | "grid_template_areas": null, 477 | "grid_template_columns": null, 478 | "grid_template_rows": null, 479 | "height": null, 480 | "justify_content": null, 481 | "justify_items": null, 482 | "left": null, 483 | "margin": null, 484 | "max_height": null, 485 | "max_width": null, 486 | "min_height": null, 487 | "min_width": null, 488 | "object_fit": null, 489 | "object_position": null, 490 | "order": null, 491 | "overflow": null, 492 | "overflow_x": null, 493 | "overflow_y": null, 494 | "padding": null, 495 | "right": null, 496 | "top": null, 497 | "visibility": null, 498 | "width": null 499 | } 500 | }, 501 | "f53e4483e951450f9962bc34b9b2a723": { 502 | "model_module": "@jupyter-widgets/base", 503 | "model_name": "LayoutModel", 504 | "model_module_version": "1.2.0", 505 | "state": { 506 | "_model_module": "@jupyter-widgets/base", 507 | "_model_module_version": "1.2.0", 508 | "_model_name": "LayoutModel", 509 | "_view_count": null, 510 | "_view_module": "@jupyter-widgets/base", 511 | "_view_module_version": "1.2.0", 512 | "_view_name": "LayoutView", 513 | "align_content": null, 514 | "align_items": null, 515 | "align_self": null, 516 | "border": null, 517 | "bottom": null, 518 | "display": null, 519 | "flex": null, 520 | "flex_flow": null, 521 | 
"grid_area": null, 522 | "grid_auto_columns": null, 523 | "grid_auto_flow": null, 524 | "grid_auto_rows": null, 525 | "grid_column": null, 526 | "grid_gap": null, 527 | "grid_row": null, 528 | "grid_template_areas": null, 529 | "grid_template_columns": null, 530 | "grid_template_rows": null, 531 | "height": null, 532 | "justify_content": null, 533 | "justify_items": null, 534 | "left": null, 535 | "margin": null, 536 | "max_height": null, 537 | "max_width": null, 538 | "min_height": null, 539 | "min_width": null, 540 | "object_fit": null, 541 | "object_position": null, 542 | "order": null, 543 | "overflow": null, 544 | "overflow_x": null, 545 | "overflow_y": null, 546 | "padding": null, 547 | "right": null, 548 | "top": null, 549 | "visibility": null, 550 | "width": null 551 | } 552 | }, 553 | "7f222e156792442c97def1786bc4f13e": { 554 | "model_module": "@jupyter-widgets/controls", 555 | "model_name": "DescriptionStyleModel", 556 | "model_module_version": "1.5.0", 557 | "state": { 558 | "_model_module": "@jupyter-widgets/controls", 559 | "_model_module_version": "1.5.0", 560 | "_model_name": "DescriptionStyleModel", 561 | "_view_count": null, 562 | "_view_module": "@jupyter-widgets/base", 563 | "_view_module_version": "1.2.0", 564 | "_view_name": "StyleView", 565 | "description_width": "" 566 | } 567 | }, 568 | "1e0f2dafb0c3456bbafb561875558dad": { 569 | "model_module": "@jupyter-widgets/base", 570 | "model_name": "LayoutModel", 571 | "model_module_version": "1.2.0", 572 | "state": { 573 | "_model_module": "@jupyter-widgets/base", 574 | "_model_module_version": "1.2.0", 575 | "_model_name": "LayoutModel", 576 | "_view_count": null, 577 | "_view_module": "@jupyter-widgets/base", 578 | "_view_module_version": "1.2.0", 579 | "_view_name": "LayoutView", 580 | "align_content": null, 581 | "align_items": null, 582 | "align_self": null, 583 | "border": null, 584 | "bottom": null, 585 | "display": null, 586 | "flex": null, 587 | "flex_flow": null, 588 | "grid_area": null, 589 | "grid_auto_columns": null, 590 | "grid_auto_flow": null, 591 | "grid_auto_rows": null, 592 | "grid_column": null, 593 | "grid_gap": null, 594 | "grid_row": null, 595 | "grid_template_areas": null, 596 | "grid_template_columns": null, 597 | "grid_template_rows": null, 598 | "height": null, 599 | "justify_content": null, 600 | "justify_items": null, 601 | "left": null, 602 | "margin": null, 603 | "max_height": null, 604 | "max_width": null, 605 | "min_height": null, 606 | "min_width": null, 607 | "object_fit": null, 608 | "object_position": null, 609 | "order": null, 610 | "overflow": null, 611 | "overflow_x": null, 612 | "overflow_y": null, 613 | "padding": null, 614 | "right": null, 615 | "top": null, 616 | "visibility": null, 617 | "width": null 618 | } 619 | }, 620 | "2b4a558e81e349c695e060f1bdf11657": { 621 | "model_module": "@jupyter-widgets/controls", 622 | "model_name": "ProgressStyleModel", 623 | "model_module_version": "1.5.0", 624 | "state": { 625 | "_model_module": "@jupyter-widgets/controls", 626 | "_model_module_version": "1.5.0", 627 | "_model_name": "ProgressStyleModel", 628 | "_view_count": null, 629 | "_view_module": "@jupyter-widgets/base", 630 | "_view_module_version": "1.2.0", 631 | "_view_name": "StyleView", 632 | "bar_color": null, 633 | "description_width": "" 634 | } 635 | }, 636 | "7bb2b713829c406bb134f2f659bc94c8": { 637 | "model_module": "@jupyter-widgets/base", 638 | "model_name": "LayoutModel", 639 | "model_module_version": "1.2.0", 640 | "state": { 641 | "_model_module": "@jupyter-widgets/base", 
642 | "_model_module_version": "1.2.0", 643 | "_model_name": "LayoutModel", 644 | "_view_count": null, 645 | "_view_module": "@jupyter-widgets/base", 646 | "_view_module_version": "1.2.0", 647 | "_view_name": "LayoutView", 648 | "align_content": null, 649 | "align_items": null, 650 | "align_self": null, 651 | "border": null, 652 | "bottom": null, 653 | "display": null, 654 | "flex": null, 655 | "flex_flow": null, 656 | "grid_area": null, 657 | "grid_auto_columns": null, 658 | "grid_auto_flow": null, 659 | "grid_auto_rows": null, 660 | "grid_column": null, 661 | "grid_gap": null, 662 | "grid_row": null, 663 | "grid_template_areas": null, 664 | "grid_template_columns": null, 665 | "grid_template_rows": null, 666 | "height": null, 667 | "justify_content": null, 668 | "justify_items": null, 669 | "left": null, 670 | "margin": null, 671 | "max_height": null, 672 | "max_width": null, 673 | "min_height": null, 674 | "min_width": null, 675 | "object_fit": null, 676 | "object_position": null, 677 | "order": null, 678 | "overflow": null, 679 | "overflow_x": null, 680 | "overflow_y": null, 681 | "padding": null, 682 | "right": null, 683 | "top": null, 684 | "visibility": null, 685 | "width": null 686 | } 687 | }, 688 | "391e2b2a00af4dd4ac7cdb71f7a04621": { 689 | "model_module": "@jupyter-widgets/controls", 690 | "model_name": "DescriptionStyleModel", 691 | "model_module_version": "1.5.0", 692 | "state": { 693 | "_model_module": "@jupyter-widgets/controls", 694 | "_model_module_version": "1.5.0", 695 | "_model_name": "DescriptionStyleModel", 696 | "_view_count": null, 697 | "_view_module": "@jupyter-widgets/base", 698 | "_view_module_version": "1.2.0", 699 | "_view_name": "StyleView", 700 | "description_width": "" 701 | } 702 | }, 703 | "2fc86ef6c170481bb9a6292c5bd2cd32": { 704 | "model_module": "@jupyter-widgets/controls", 705 | "model_name": "HBoxModel", 706 | "model_module_version": "1.5.0", 707 | "state": { 708 | "_dom_classes": [], 709 | "_model_module": "@jupyter-widgets/controls", 710 | "_model_module_version": "1.5.0", 711 | "_model_name": "HBoxModel", 712 | "_view_count": null, 713 | "_view_module": "@jupyter-widgets/controls", 714 | "_view_module_version": "1.5.0", 715 | "_view_name": "HBoxView", 716 | "box_style": "", 717 | "children": [ 718 | "IPY_MODEL_374c97dc59e44edf8ee0a8646e5ad38c", 719 | "IPY_MODEL_61c4405c5b824232975b480842f495d2", 720 | "IPY_MODEL_3f34d90ebed54b28bb06cb12c421e7a9" 721 | ], 722 | "layout": "IPY_MODEL_8590f693ba8b4fe6ad3d393e7a0c4483" 723 | } 724 | }, 725 | "374c97dc59e44edf8ee0a8646e5ad38c": { 726 | "model_module": "@jupyter-widgets/controls", 727 | "model_name": "HTMLModel", 728 | "model_module_version": "1.5.0", 729 | "state": { 730 | "_dom_classes": [], 731 | "_model_module": "@jupyter-widgets/controls", 732 | "_model_module_version": "1.5.0", 733 | "_model_name": "HTMLModel", 734 | "_view_count": null, 735 | "_view_module": "@jupyter-widgets/controls", 736 | "_view_module_version": "1.5.0", 737 | "_view_name": "HTMLView", 738 | "description": "", 739 | "description_tooltip": null, 740 | "layout": "IPY_MODEL_d7d9a6c8ccbf4f80a6af10985354ca09", 741 | "placeholder": "​", 742 | "style": "IPY_MODEL_b7f3f6532e524c2cb3a792f322681c0f", 743 | "value": "Downloading pytorch_model.bin: 100%" 744 | } 745 | }, 746 | "61c4405c5b824232975b480842f495d2": { 747 | "model_module": "@jupyter-widgets/controls", 748 | "model_name": "FloatProgressModel", 749 | "model_module_version": "1.5.0", 750 | "state": { 751 | "_dom_classes": [], 752 | "_model_module": "@jupyter-widgets/controls", 
753 | "_model_module_version": "1.5.0", 754 | "_model_name": "FloatProgressModel", 755 | "_view_count": null, 756 | "_view_module": "@jupyter-widgets/controls", 757 | "_view_module_version": "1.5.0", 758 | "_view_name": "ProgressView", 759 | "bar_style": "success", 760 | "description": "", 761 | "description_tooltip": null, 762 | "layout": "IPY_MODEL_be9edb4a9d574bf985b3215b79e221c3", 763 | "max": 891691430, 764 | "min": 0, 765 | "orientation": "horizontal", 766 | "style": "IPY_MODEL_da5592d90563447598baaf5117771b6e", 767 | "value": 891691430 768 | } 769 | }, 770 | "3f34d90ebed54b28bb06cb12c421e7a9": { 771 | "model_module": "@jupyter-widgets/controls", 772 | "model_name": "HTMLModel", 773 | "model_module_version": "1.5.0", 774 | "state": { 775 | "_dom_classes": [], 776 | "_model_module": "@jupyter-widgets/controls", 777 | "_model_module_version": "1.5.0", 778 | "_model_name": "HTMLModel", 779 | "_view_count": null, 780 | "_view_module": "@jupyter-widgets/controls", 781 | "_view_module_version": "1.5.0", 782 | "_view_name": "HTMLView", 783 | "description": "", 784 | "description_tooltip": null, 785 | "layout": "IPY_MODEL_fa5cd9e1db5e434eb2df92f2cd4d1af2", 786 | "placeholder": "​", 787 | "style": "IPY_MODEL_868a6813d63b4b85858043fd2a2e985a", 788 | "value": " 892M/892M [00:09<00:00, 77.2MB/s]" 789 | } 790 | }, 791 | "8590f693ba8b4fe6ad3d393e7a0c4483": { 792 | "model_module": "@jupyter-widgets/base", 793 | "model_name": "LayoutModel", 794 | "model_module_version": "1.2.0", 795 | "state": { 796 | "_model_module": "@jupyter-widgets/base", 797 | "_model_module_version": "1.2.0", 798 | "_model_name": "LayoutModel", 799 | "_view_count": null, 800 | "_view_module": "@jupyter-widgets/base", 801 | "_view_module_version": "1.2.0", 802 | "_view_name": "LayoutView", 803 | "align_content": null, 804 | "align_items": null, 805 | "align_self": null, 806 | "border": null, 807 | "bottom": null, 808 | "display": null, 809 | "flex": null, 810 | "flex_flow": null, 811 | "grid_area": null, 812 | "grid_auto_columns": null, 813 | "grid_auto_flow": null, 814 | "grid_auto_rows": null, 815 | "grid_column": null, 816 | "grid_gap": null, 817 | "grid_row": null, 818 | "grid_template_areas": null, 819 | "grid_template_columns": null, 820 | "grid_template_rows": null, 821 | "height": null, 822 | "justify_content": null, 823 | "justify_items": null, 824 | "left": null, 825 | "margin": null, 826 | "max_height": null, 827 | "max_width": null, 828 | "min_height": null, 829 | "min_width": null, 830 | "object_fit": null, 831 | "object_position": null, 832 | "order": null, 833 | "overflow": null, 834 | "overflow_x": null, 835 | "overflow_y": null, 836 | "padding": null, 837 | "right": null, 838 | "top": null, 839 | "visibility": null, 840 | "width": null 841 | } 842 | }, 843 | "d7d9a6c8ccbf4f80a6af10985354ca09": { 844 | "model_module": "@jupyter-widgets/base", 845 | "model_name": "LayoutModel", 846 | "model_module_version": "1.2.0", 847 | "state": { 848 | "_model_module": "@jupyter-widgets/base", 849 | "_model_module_version": "1.2.0", 850 | "_model_name": "LayoutModel", 851 | "_view_count": null, 852 | "_view_module": "@jupyter-widgets/base", 853 | "_view_module_version": "1.2.0", 854 | "_view_name": "LayoutView", 855 | "align_content": null, 856 | "align_items": null, 857 | "align_self": null, 858 | "border": null, 859 | "bottom": null, 860 | "display": null, 861 | "flex": null, 862 | "flex_flow": null, 863 | "grid_area": null, 864 | "grid_auto_columns": null, 865 | "grid_auto_flow": null, 866 | "grid_auto_rows": null, 867 | 
"grid_column": null, 868 | "grid_gap": null, 869 | "grid_row": null, 870 | "grid_template_areas": null, 871 | "grid_template_columns": null, 872 | "grid_template_rows": null, 873 | "height": null, 874 | "justify_content": null, 875 | "justify_items": null, 876 | "left": null, 877 | "margin": null, 878 | "max_height": null, 879 | "max_width": null, 880 | "min_height": null, 881 | "min_width": null, 882 | "object_fit": null, 883 | "object_position": null, 884 | "order": null, 885 | "overflow": null, 886 | "overflow_x": null, 887 | "overflow_y": null, 888 | "padding": null, 889 | "right": null, 890 | "top": null, 891 | "visibility": null, 892 | "width": null 893 | } 894 | }, 895 | "b7f3f6532e524c2cb3a792f322681c0f": { 896 | "model_module": "@jupyter-widgets/controls", 897 | "model_name": "DescriptionStyleModel", 898 | "model_module_version": "1.5.0", 899 | "state": { 900 | "_model_module": "@jupyter-widgets/controls", 901 | "_model_module_version": "1.5.0", 902 | "_model_name": "DescriptionStyleModel", 903 | "_view_count": null, 904 | "_view_module": "@jupyter-widgets/base", 905 | "_view_module_version": "1.2.0", 906 | "_view_name": "StyleView", 907 | "description_width": "" 908 | } 909 | }, 910 | "be9edb4a9d574bf985b3215b79e221c3": { 911 | "model_module": "@jupyter-widgets/base", 912 | "model_name": "LayoutModel", 913 | "model_module_version": "1.2.0", 914 | "state": { 915 | "_model_module": "@jupyter-widgets/base", 916 | "_model_module_version": "1.2.0", 917 | "_model_name": "LayoutModel", 918 | "_view_count": null, 919 | "_view_module": "@jupyter-widgets/base", 920 | "_view_module_version": "1.2.0", 921 | "_view_name": "LayoutView", 922 | "align_content": null, 923 | "align_items": null, 924 | "align_self": null, 925 | "border": null, 926 | "bottom": null, 927 | "display": null, 928 | "flex": null, 929 | "flex_flow": null, 930 | "grid_area": null, 931 | "grid_auto_columns": null, 932 | "grid_auto_flow": null, 933 | "grid_auto_rows": null, 934 | "grid_column": null, 935 | "grid_gap": null, 936 | "grid_row": null, 937 | "grid_template_areas": null, 938 | "grid_template_columns": null, 939 | "grid_template_rows": null, 940 | "height": null, 941 | "justify_content": null, 942 | "justify_items": null, 943 | "left": null, 944 | "margin": null, 945 | "max_height": null, 946 | "max_width": null, 947 | "min_height": null, 948 | "min_width": null, 949 | "object_fit": null, 950 | "object_position": null, 951 | "order": null, 952 | "overflow": null, 953 | "overflow_x": null, 954 | "overflow_y": null, 955 | "padding": null, 956 | "right": null, 957 | "top": null, 958 | "visibility": null, 959 | "width": null 960 | } 961 | }, 962 | "da5592d90563447598baaf5117771b6e": { 963 | "model_module": "@jupyter-widgets/controls", 964 | "model_name": "ProgressStyleModel", 965 | "model_module_version": "1.5.0", 966 | "state": { 967 | "_model_module": "@jupyter-widgets/controls", 968 | "_model_module_version": "1.5.0", 969 | "_model_name": "ProgressStyleModel", 970 | "_view_count": null, 971 | "_view_module": "@jupyter-widgets/base", 972 | "_view_module_version": "1.2.0", 973 | "_view_name": "StyleView", 974 | "bar_color": null, 975 | "description_width": "" 976 | } 977 | }, 978 | "fa5cd9e1db5e434eb2df92f2cd4d1af2": { 979 | "model_module": "@jupyter-widgets/base", 980 | "model_name": "LayoutModel", 981 | "model_module_version": "1.2.0", 982 | "state": { 983 | "_model_module": "@jupyter-widgets/base", 984 | "_model_module_version": "1.2.0", 985 | "_model_name": "LayoutModel", 986 | "_view_count": null, 987 | 
"_view_module": "@jupyter-widgets/base", 988 | "_view_module_version": "1.2.0", 989 | "_view_name": "LayoutView", 990 | "align_content": null, 991 | "align_items": null, 992 | "align_self": null, 993 | "border": null, 994 | "bottom": null, 995 | "display": null, 996 | "flex": null, 997 | "flex_flow": null, 998 | "grid_area": null, 999 | "grid_auto_columns": null, 1000 | "grid_auto_flow": null, 1001 | "grid_auto_rows": null, 1002 | "grid_column": null, 1003 | "grid_gap": null, 1004 | "grid_row": null, 1005 | "grid_template_areas": null, 1006 | "grid_template_columns": null, 1007 | "grid_template_rows": null, 1008 | "height": null, 1009 | "justify_content": null, 1010 | "justify_items": null, 1011 | "left": null, 1012 | "margin": null, 1013 | "max_height": null, 1014 | "max_width": null, 1015 | "min_height": null, 1016 | "min_width": null, 1017 | "object_fit": null, 1018 | "object_position": null, 1019 | "order": null, 1020 | "overflow": null, 1021 | "overflow_x": null, 1022 | "overflow_y": null, 1023 | "padding": null, 1024 | "right": null, 1025 | "top": null, 1026 | "visibility": null, 1027 | "width": null 1028 | } 1029 | }, 1030 | "868a6813d63b4b85858043fd2a2e985a": { 1031 | "model_module": "@jupyter-widgets/controls", 1032 | "model_name": "DescriptionStyleModel", 1033 | "model_module_version": "1.5.0", 1034 | "state": { 1035 | "_model_module": "@jupyter-widgets/controls", 1036 | "_model_module_version": "1.5.0", 1037 | "_model_name": "DescriptionStyleModel", 1038 | "_view_count": null, 1039 | "_view_module": "@jupyter-widgets/base", 1040 | "_view_module_version": "1.2.0", 1041 | "_view_name": "StyleView", 1042 | "description_width": "" 1043 | } 1044 | }, 1045 | "6b210e2da4c94769a059f63dfdc27249": { 1046 | "model_module": "@jupyter-widgets/controls", 1047 | "model_name": "HBoxModel", 1048 | "model_module_version": "1.5.0", 1049 | "state": { 1050 | "_dom_classes": [], 1051 | "_model_module": "@jupyter-widgets/controls", 1052 | "_model_module_version": "1.5.0", 1053 | "_model_name": "HBoxModel", 1054 | "_view_count": null, 1055 | "_view_module": "@jupyter-widgets/controls", 1056 | "_view_module_version": "1.5.0", 1057 | "_view_name": "HBoxView", 1058 | "box_style": "", 1059 | "children": [ 1060 | "IPY_MODEL_439f35ed42b24d25bbfc1e5cc7824213", 1061 | "IPY_MODEL_bfa485804bcf4839ba992f24f7230093", 1062 | "IPY_MODEL_dd5b089068174ceaabf34eacc3447d40" 1063 | ], 1064 | "layout": "IPY_MODEL_69012ed1dd2748c39e0fd459ff31199d" 1065 | } 1066 | }, 1067 | "439f35ed42b24d25bbfc1e5cc7824213": { 1068 | "model_module": "@jupyter-widgets/controls", 1069 | "model_name": "HTMLModel", 1070 | "model_module_version": "1.5.0", 1071 | "state": { 1072 | "_dom_classes": [], 1073 | "_model_module": "@jupyter-widgets/controls", 1074 | "_model_module_version": "1.5.0", 1075 | "_model_name": "HTMLModel", 1076 | "_view_count": null, 1077 | "_view_module": "@jupyter-widgets/controls", 1078 | "_view_module_version": "1.5.0", 1079 | "_view_name": "HTMLView", 1080 | "description": "", 1081 | "description_tooltip": null, 1082 | "layout": "IPY_MODEL_20fc92997c344b38baef39583bd940e9", 1083 | "placeholder": "​", 1084 | "style": "IPY_MODEL_f9ce960acc884432b66007add17ff77f", 1085 | "value": "Downloading (…)neration_config.json: 100%" 1086 | } 1087 | }, 1088 | "bfa485804bcf4839ba992f24f7230093": { 1089 | "model_module": "@jupyter-widgets/controls", 1090 | "model_name": "FloatProgressModel", 1091 | "model_module_version": "1.5.0", 1092 | "state": { 1093 | "_dom_classes": [], 1094 | "_model_module": "@jupyter-widgets/controls", 1095 | 
"_model_module_version": "1.5.0", 1096 | "_model_name": "FloatProgressModel", 1097 | "_view_count": null, 1098 | "_view_module": "@jupyter-widgets/controls", 1099 | "_view_module_version": "1.5.0", 1100 | "_view_name": "ProgressView", 1101 | "bar_style": "success", 1102 | "description": "", 1103 | "description_tooltip": null, 1104 | "layout": "IPY_MODEL_79659d6c3b664ec7a350e1946a5911f4", 1105 | "max": 147, 1106 | "min": 0, 1107 | "orientation": "horizontal", 1108 | "style": "IPY_MODEL_c9ac5d1a989040a2a9da8438f1580730", 1109 | "value": 147 1110 | } 1111 | }, 1112 | "dd5b089068174ceaabf34eacc3447d40": { 1113 | "model_module": "@jupyter-widgets/controls", 1114 | "model_name": "HTMLModel", 1115 | "model_module_version": "1.5.0", 1116 | "state": { 1117 | "_dom_classes": [], 1118 | "_model_module": "@jupyter-widgets/controls", 1119 | "_model_module_version": "1.5.0", 1120 | "_model_name": "HTMLModel", 1121 | "_view_count": null, 1122 | "_view_module": "@jupyter-widgets/controls", 1123 | "_view_module_version": "1.5.0", 1124 | "_view_name": "HTMLView", 1125 | "description": "", 1126 | "description_tooltip": null, 1127 | "layout": "IPY_MODEL_3786c6e7e0754801a020d64480fb2146", 1128 | "placeholder": "​", 1129 | "style": "IPY_MODEL_4b5f616ec9834a288bb381250f0a0d6c", 1130 | "value": " 147/147 [00:00<00:00, 5.85kB/s]" 1131 | } 1132 | }, 1133 | "69012ed1dd2748c39e0fd459ff31199d": { 1134 | "model_module": "@jupyter-widgets/base", 1135 | "model_name": "LayoutModel", 1136 | "model_module_version": "1.2.0", 1137 | "state": { 1138 | "_model_module": "@jupyter-widgets/base", 1139 | "_model_module_version": "1.2.0", 1140 | "_model_name": "LayoutModel", 1141 | "_view_count": null, 1142 | "_view_module": "@jupyter-widgets/base", 1143 | "_view_module_version": "1.2.0", 1144 | "_view_name": "LayoutView", 1145 | "align_content": null, 1146 | "align_items": null, 1147 | "align_self": null, 1148 | "border": null, 1149 | "bottom": null, 1150 | "display": null, 1151 | "flex": null, 1152 | "flex_flow": null, 1153 | "grid_area": null, 1154 | "grid_auto_columns": null, 1155 | "grid_auto_flow": null, 1156 | "grid_auto_rows": null, 1157 | "grid_column": null, 1158 | "grid_gap": null, 1159 | "grid_row": null, 1160 | "grid_template_areas": null, 1161 | "grid_template_columns": null, 1162 | "grid_template_rows": null, 1163 | "height": null, 1164 | "justify_content": null, 1165 | "justify_items": null, 1166 | "left": null, 1167 | "margin": null, 1168 | "max_height": null, 1169 | "max_width": null, 1170 | "min_height": null, 1171 | "min_width": null, 1172 | "object_fit": null, 1173 | "object_position": null, 1174 | "order": null, 1175 | "overflow": null, 1176 | "overflow_x": null, 1177 | "overflow_y": null, 1178 | "padding": null, 1179 | "right": null, 1180 | "top": null, 1181 | "visibility": null, 1182 | "width": null 1183 | } 1184 | }, 1185 | "20fc92997c344b38baef39583bd940e9": { 1186 | "model_module": "@jupyter-widgets/base", 1187 | "model_name": "LayoutModel", 1188 | "model_module_version": "1.2.0", 1189 | "state": { 1190 | "_model_module": "@jupyter-widgets/base", 1191 | "_model_module_version": "1.2.0", 1192 | "_model_name": "LayoutModel", 1193 | "_view_count": null, 1194 | "_view_module": "@jupyter-widgets/base", 1195 | "_view_module_version": "1.2.0", 1196 | "_view_name": "LayoutView", 1197 | "align_content": null, 1198 | "align_items": null, 1199 | "align_self": null, 1200 | "border": null, 1201 | "bottom": null, 1202 | "display": null, 1203 | "flex": null, 1204 | "flex_flow": null, 1205 | "grid_area": null, 1206 | 
"grid_auto_columns": null, 1207 | "grid_auto_flow": null, 1208 | "grid_auto_rows": null, 1209 | "grid_column": null, 1210 | "grid_gap": null, 1211 | "grid_row": null, 1212 | "grid_template_areas": null, 1213 | "grid_template_columns": null, 1214 | "grid_template_rows": null, 1215 | "height": null, 1216 | "justify_content": null, 1217 | "justify_items": null, 1218 | "left": null, 1219 | "margin": null, 1220 | "max_height": null, 1221 | "max_width": null, 1222 | "min_height": null, 1223 | "min_width": null, 1224 | "object_fit": null, 1225 | "object_position": null, 1226 | "order": null, 1227 | "overflow": null, 1228 | "overflow_x": null, 1229 | "overflow_y": null, 1230 | "padding": null, 1231 | "right": null, 1232 | "top": null, 1233 | "visibility": null, 1234 | "width": null 1235 | } 1236 | }, 1237 | "f9ce960acc884432b66007add17ff77f": { 1238 | "model_module": "@jupyter-widgets/controls", 1239 | "model_name": "DescriptionStyleModel", 1240 | "model_module_version": "1.5.0", 1241 | "state": { 1242 | "_model_module": "@jupyter-widgets/controls", 1243 | "_model_module_version": "1.5.0", 1244 | "_model_name": "DescriptionStyleModel", 1245 | "_view_count": null, 1246 | "_view_module": "@jupyter-widgets/base", 1247 | "_view_module_version": "1.2.0", 1248 | "_view_name": "StyleView", 1249 | "description_width": "" 1250 | } 1251 | }, 1252 | "79659d6c3b664ec7a350e1946a5911f4": { 1253 | "model_module": "@jupyter-widgets/base", 1254 | "model_name": "LayoutModel", 1255 | "model_module_version": "1.2.0", 1256 | "state": { 1257 | "_model_module": "@jupyter-widgets/base", 1258 | "_model_module_version": "1.2.0", 1259 | "_model_name": "LayoutModel", 1260 | "_view_count": null, 1261 | "_view_module": "@jupyter-widgets/base", 1262 | "_view_module_version": "1.2.0", 1263 | "_view_name": "LayoutView", 1264 | "align_content": null, 1265 | "align_items": null, 1266 | "align_self": null, 1267 | "border": null, 1268 | "bottom": null, 1269 | "display": null, 1270 | "flex": null, 1271 | "flex_flow": null, 1272 | "grid_area": null, 1273 | "grid_auto_columns": null, 1274 | "grid_auto_flow": null, 1275 | "grid_auto_rows": null, 1276 | "grid_column": null, 1277 | "grid_gap": null, 1278 | "grid_row": null, 1279 | "grid_template_areas": null, 1280 | "grid_template_columns": null, 1281 | "grid_template_rows": null, 1282 | "height": null, 1283 | "justify_content": null, 1284 | "justify_items": null, 1285 | "left": null, 1286 | "margin": null, 1287 | "max_height": null, 1288 | "max_width": null, 1289 | "min_height": null, 1290 | "min_width": null, 1291 | "object_fit": null, 1292 | "object_position": null, 1293 | "order": null, 1294 | "overflow": null, 1295 | "overflow_x": null, 1296 | "overflow_y": null, 1297 | "padding": null, 1298 | "right": null, 1299 | "top": null, 1300 | "visibility": null, 1301 | "width": null 1302 | } 1303 | }, 1304 | "c9ac5d1a989040a2a9da8438f1580730": { 1305 | "model_module": "@jupyter-widgets/controls", 1306 | "model_name": "ProgressStyleModel", 1307 | "model_module_version": "1.5.0", 1308 | "state": { 1309 | "_model_module": "@jupyter-widgets/controls", 1310 | "_model_module_version": "1.5.0", 1311 | "_model_name": "ProgressStyleModel", 1312 | "_view_count": null, 1313 | "_view_module": "@jupyter-widgets/base", 1314 | "_view_module_version": "1.2.0", 1315 | "_view_name": "StyleView", 1316 | "bar_color": null, 1317 | "description_width": "" 1318 | } 1319 | }, 1320 | "3786c6e7e0754801a020d64480fb2146": { 1321 | "model_module": "@jupyter-widgets/base", 1322 | "model_name": "LayoutModel", 1323 | 
"model_module_version": "1.2.0", 1324 | "state": { 1325 | "_model_module": "@jupyter-widgets/base", 1326 | "_model_module_version": "1.2.0", 1327 | "_model_name": "LayoutModel", 1328 | "_view_count": null, 1329 | "_view_module": "@jupyter-widgets/base", 1330 | "_view_module_version": "1.2.0", 1331 | "_view_name": "LayoutView", 1332 | "align_content": null, 1333 | "align_items": null, 1334 | "align_self": null, 1335 | "border": null, 1336 | "bottom": null, 1337 | "display": null, 1338 | "flex": null, 1339 | "flex_flow": null, 1340 | "grid_area": null, 1341 | "grid_auto_columns": null, 1342 | "grid_auto_flow": null, 1343 | "grid_auto_rows": null, 1344 | "grid_column": null, 1345 | "grid_gap": null, 1346 | "grid_row": null, 1347 | "grid_template_areas": null, 1348 | "grid_template_columns": null, 1349 | "grid_template_rows": null, 1350 | "height": null, 1351 | "justify_content": null, 1352 | "justify_items": null, 1353 | "left": null, 1354 | "margin": null, 1355 | "max_height": null, 1356 | "max_width": null, 1357 | "min_height": null, 1358 | "min_width": null, 1359 | "object_fit": null, 1360 | "object_position": null, 1361 | "order": null, 1362 | "overflow": null, 1363 | "overflow_x": null, 1364 | "overflow_y": null, 1365 | "padding": null, 1366 | "right": null, 1367 | "top": null, 1368 | "visibility": null, 1369 | "width": null 1370 | } 1371 | }, 1372 | "4b5f616ec9834a288bb381250f0a0d6c": { 1373 | "model_module": "@jupyter-widgets/controls", 1374 | "model_name": "DescriptionStyleModel", 1375 | "model_module_version": "1.5.0", 1376 | "state": { 1377 | "_model_module": "@jupyter-widgets/controls", 1378 | "_model_module_version": "1.5.0", 1379 | "_model_name": "DescriptionStyleModel", 1380 | "_view_count": null, 1381 | "_view_module": "@jupyter-widgets/base", 1382 | "_view_module_version": "1.2.0", 1383 | "_view_name": "StyleView", 1384 | "description_width": "" 1385 | } 1386 | } 1387 | } 1388 | } 1389 | }, 1390 | "cells": [ 1391 | { 1392 | "cell_type": "markdown", 1393 | "metadata": { 1394 | "id": "view-in-github", 1395 | "colab_type": "text" 1396 | }, 1397 | "source": [ 1398 | "\"Open" 1399 | ] 1400 | }, 1401 | { 1402 | "cell_type": "markdown", 1403 | "source": [ 1404 | "# T5 \n", 1405 | "\n", 1406 | "In this notebook (based on Sinan Ozdemir's [here](https://github.com/sinanuozdemir/oreilly-hands-on-transformers/blob/main/notebooks/t5.ipynb)), we use T5 \"out of the box\" for a broad range of NLP/generation tasks." 
1407 | ], 1408 | "metadata": { 1409 | "id": "anGf2wcakFvl" 1410 | } 1411 | }, 1412 | { 1413 | "cell_type": "markdown", 1414 | "source": [ 1415 | "### Load dependencies" 1416 | ], 1417 | "metadata": { 1418 | "id": "kkWAKZR56VvC" 1419 | } 1420 | }, 1421 | { 1422 | "cell_type": "code", 1423 | "execution_count": null, 1424 | "metadata": { 1425 | "id": "-XOUUdCcGM53" 1426 | }, 1427 | "outputs": [], 1428 | "source": [ 1429 | "%%capture\n", 1430 | "!pip install transformers==4.28.0 sentencepiece==0.1.98" 1431 | ] 1432 | }, 1433 | { 1434 | "cell_type": "code", 1435 | "source": [ 1436 | "from transformers import T5ForConditionalGeneration, T5Tokenizer" 1437 | ], 1438 | "metadata": { 1439 | "id": "HWewQPULGZst" 1440 | }, 1441 | "execution_count": null, 1442 | "outputs": [] 1443 | }, 1444 | { 1445 | "cell_type": "markdown", 1446 | "source": [ 1447 | "### Load model" 1448 | ], 1449 | "metadata": { 1450 | "id": "LGRIAGva6lRt" 1451 | } 1452 | }, 1453 | { 1454 | "cell_type": "code", 1455 | "source": [ 1456 | "tokenizer = T5Tokenizer.from_pretrained('t5-base')\n", 1457 | "model = T5ForConditionalGeneration.from_pretrained('t5-base')" 1458 | ], 1459 | "metadata": { 1460 | "id": "V26vaBtmGOrS", 1461 | "colab": { 1462 | "base_uri": "https://localhost:8080/", 1463 | "height": 384, 1464 | "referenced_widgets": [ 1465 | "c6fa0fd0ebf94e27bb09e371f32332b2", 1466 | "c3dd5af567d34df89775c8b6f3024902", 1467 | "64ce73d0751040d6a9cfe5db8c3c89a1", 1468 | "7be68aa3f5754912a53f5e66190287b0", 1469 | "24323e98ab234d21ac01692340fd8759", 1470 | "6f72b74df608453eb79833f3f6792188", 1471 | "9aa58f409e7848428d4fd52eb6d9f19c", 1472 | "655524edf6754f6cb4d923929287a4f5", 1473 | "566f3c18232d413fb3646b955049a23a", 1474 | "6e30cc0c9dda4e5cb4426228adcee4b6", 1475 | "b00f427a5abf4d77a0f792a3b37254b8", 1476 | "a0ec395acb654f19b2ec1f84e0538f88", 1477 | "63194346ee4342d1a1cadf6113aa59b6", 1478 | "f92f34fd76e042079d66a99203ec8629", 1479 | "9c49b384ba784e4685d71af70dce7a24", 1480 | "a9bd050817ea4b5191ea0b2c48eb1743", 1481 | "f53e4483e951450f9962bc34b9b2a723", 1482 | "7f222e156792442c97def1786bc4f13e", 1483 | "1e0f2dafb0c3456bbafb561875558dad", 1484 | "2b4a558e81e349c695e060f1bdf11657", 1485 | "7bb2b713829c406bb134f2f659bc94c8", 1486 | "391e2b2a00af4dd4ac7cdb71f7a04621", 1487 | "2fc86ef6c170481bb9a6292c5bd2cd32", 1488 | "374c97dc59e44edf8ee0a8646e5ad38c", 1489 | "61c4405c5b824232975b480842f495d2", 1490 | "3f34d90ebed54b28bb06cb12c421e7a9", 1491 | "8590f693ba8b4fe6ad3d393e7a0c4483", 1492 | "d7d9a6c8ccbf4f80a6af10985354ca09", 1493 | "b7f3f6532e524c2cb3a792f322681c0f", 1494 | "be9edb4a9d574bf985b3215b79e221c3", 1495 | "da5592d90563447598baaf5117771b6e", 1496 | "fa5cd9e1db5e434eb2df92f2cd4d1af2", 1497 | "868a6813d63b4b85858043fd2a2e985a", 1498 | "6b210e2da4c94769a059f63dfdc27249", 1499 | "439f35ed42b24d25bbfc1e5cc7824213", 1500 | "bfa485804bcf4839ba992f24f7230093", 1501 | "dd5b089068174ceaabf34eacc3447d40", 1502 | "69012ed1dd2748c39e0fd459ff31199d", 1503 | "20fc92997c344b38baef39583bd940e9", 1504 | "f9ce960acc884432b66007add17ff77f", 1505 | "79659d6c3b664ec7a350e1946a5911f4", 1506 | "c9ac5d1a989040a2a9da8438f1580730", 1507 | "3786c6e7e0754801a020d64480fb2146", 1508 | "4b5f616ec9834a288bb381250f0a0d6c" 1509 | ] 1510 | }, 1511 | "outputId": "0c72a9e4-3b2b-4aaf-d383-7a6abd8336b4" 1512 | }, 1513 | "execution_count": null, 1514 | "outputs": [ 1515 | { 1516 | "output_type": "display_data", 1517 | "data": { 1518 | "text/plain": [ 1519 | "Downloading (…)ve/main/spiece.model: 0%| | 0.00/792k [00:00\"Open" 1741 | ] 1742 | }, 1743 | { 1744 | "cell_type": 
"markdown", 1745 | "source": [ 1746 | "# Tokens\n", 1747 | "\n", 1748 | "In this notebook, we explore tokens." 1749 | ], 1750 | "metadata": { 1751 | "id": "YN46LPl-l54T" 1752 | } 1753 | }, 1754 | { 1755 | "cell_type": "markdown", 1756 | "source": [ 1757 | "### Load dependencies" 1758 | ], 1759 | "metadata": { 1760 | "id": "x4D097ejSDzh" 1761 | } 1762 | }, 1763 | { 1764 | "cell_type": "code", 1765 | "source": [ 1766 | "#%%capture\n", 1767 | "#! pip install transformers==4.28.0" 1768 | ], 1769 | "metadata": { 1770 | "id": "67O5gEnnB4V3" 1771 | }, 1772 | "execution_count": null, 1773 | "outputs": [] 1774 | }, 1775 | { 1776 | "cell_type": "code", 1777 | "execution_count": 1, 1778 | "metadata": { 1779 | "id": "dSEmQMy09mG4" 1780 | }, 1781 | "outputs": [], 1782 | "source": [ 1783 | "from transformers import GPT2Tokenizer" 1784 | ] 1785 | }, 1786 | { 1787 | "cell_type": "markdown", 1788 | "source": [ 1789 | "### Exploring tokens" 1790 | ], 1791 | "metadata": { 1792 | "id": "YqEJxtLiSZz7" 1793 | } 1794 | }, 1795 | { 1796 | "cell_type": "code", 1797 | "source": [ 1798 | "tokenizer = GPT2Tokenizer.from_pretrained('gpt2') # load up a tokenizer" 1799 | ], 1800 | "metadata": { 1801 | "id": "NgU8F2s6CfJz", 1802 | "outputId": "2d2ca2ae-3ba3-4ab9-d0ec-5c5f878a9fb6", 1803 | "colab": { 1804 | "base_uri": "https://localhost:8080/", 1805 | "height": 301, 1806 | "referenced_widgets": [ 1807 | "1e3d228f63f745d6904eee734f31ec58", 1808 | "123b8cbc1ea243e8be3359a66f44853d", 1809 | "f3eb9803d71840eea35550ed2e4c3f7c", 1810 | "13d0d1237fd84734a5bf0419ca1660f1", 1811 | "08eec7e539fb47e6ac46e45084a75432", 1812 | "d4b44b329c054d2e9e5cf5b810f24b5c", 1813 | "47581234de5645ffae556300089e6c8d", 1814 | "1f74e8ae0407416db86cd770ebb0478d", 1815 | "17e1a54011874c509548836a2afe22a8", 1816 | "6f071ede54874567946fa1ceea410643", 1817 | "3959a7ae33814094b130b58c479e2ddc", 1818 | "09ffd65ee27640b6aaaf3317e0c01897", 1819 | "0762cb2b7df14074afa932575b2bb693", 1820 | "4b812279075b4710b6a7c4f8610029ec", 1821 | "2e68e5a727154230bb2ca3468bb34237", 1822 | "3225ca166d904f48a821555c090ec50c", 1823 | "b8573d9eb9704003850922837374e339", 1824 | "b4dd752d18fe40708ff2c8c72575a374", 1825 | "36e32de8486949928e1a2bc1e5014559", 1826 | "9c37c4315b2f4073b19271155e17612c", 1827 | "3eba79365bb94eb3b94a79c4c6c9fdea", 1828 | "8d430eff946b48028e74c6062c719aa9", 1829 | "a56b708e7b564d75bafd3a382950dedc", 1830 | "2f1bcb8fd03d41269986c2c81ae9578f", 1831 | "ecc983bcd9ca450aaaa9dec0a4682939", 1832 | "78eb0a6eea34483daaeea051e3751427", 1833 | "bc55cff200ce42a69b6fbace8a1ca11b", 1834 | "590f17f0a4c5423e84917bfca2744c0f", 1835 | "679bc98152074622bf066bf123dd6151", 1836 | "1c6e6dbffc0a4df3b9ba8923cbc2e01d", 1837 | "53a05daa32db447c992d7dc2ce028582", 1838 | "d3ca21f80d2241d5958ae0d30ea56606", 1839 | "df4aafd427cf420caa06a99ee3f8c05d", 1840 | "b7df07e0b3e2462a91099e401edfca6c", 1841 | "551380a03dcb449caf25bc042085a5ff", 1842 | "26cda15bbfc1464ba768b9bc571721f9", 1843 | "3ea725d03a8b465e9e3d58d84c28eeae", 1844 | "25e8863c7ed64903a73e05d0bccaf28d", 1845 | "6cddf11ccca6430f8fca489304b5483f", 1846 | "0bc845fe34b94b61a74bfc97cc98a0e8", 1847 | "9259e44f49704152bb01670450fd84c0", 1848 | "f6c4f38f82fb4b34a942ebe422553577", 1849 | "3e3a1a99e70a434a8d8c2580a8998cd8", 1850 | "4e0cf2f484bb4a44a274e870a443477a", 1851 | "834d8c2329e6418ea6e42f1972837247", 1852 | "0ed94b0f4c134925a612bbcffa97ebaa", 1853 | "d6aeeb7991654bb88069da7a724d11d1", 1854 | "cecbb151f37a4a16acbadacaff966763", 1855 | "771e017c63a2412988155419a40ee20d", 1856 | "f5b5e2a327c4481f9dd8b9324f590179", 1857 | 
"3b61437d544648668dc977cfced9a732", 1858 | "7d67054d3b6e4f86a582cee38feef7bb", 1859 | "ce8f14c2c2ae49d3a0533ba60cb82861", 1860 | "98846823ceac493b8af7ad5a7bceef3a", 1861 | "d868f9eed49b4aaea89fc0bafa598b2e" 1862 | ] 1863 | } 1864 | }, 1865 | "execution_count": 2, 1866 | "outputs": [ 1867 | { 1868 | "output_type": "stream", 1869 | "name": "stderr", 1870 | "text": [ 1871 | "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_token.py:88: UserWarning: \n", 1872 | "The secret `HF_TOKEN` does not exist in your Colab secrets.\n", 1873 | "To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.\n", 1874 | "You will be able to reuse this secret in all of your notebooks.\n", 1875 | "Please note that authentication is recommended but still optional to access public models or datasets.\n", 1876 | " warnings.warn(\n" 1877 | ] 1878 | }, 1879 | { 1880 | "output_type": "display_data", 1881 | "data": { 1882 | "text/plain": [ 1883 | "tokenizer_config.json: 0%| | 0.00/26.0 [00:00