├── README.md
├── generate_colab.ipynb
├── image
│   └── top.png
└── requirements.txt
/README.md:
--------------------------------------------------------------------------------


# 🦙🌲🤏🌸 Japanese-Alpaca-LoRA 🌸
Links to the Low-Rank Adapters created by fine-tuning LLaMA on a Japanese translation of the Stanford Alpaca dataset, along with sample generation code.

### Japanese-Alpaca-LoRA-7b demo page (available for a limited time)
Note: The original demo period has ended, but the demo is available again, generously hosted on a machine provided by [@_kaiinui](https://twitter.com/_kaiinui). Many thanks for the kindness.

https://jalpaca.infertron.com/

Instruct : the instruction to the model
Input : supplementary context for the instruction
Temperature : degree of diversity in the generated response
Beams : number of candidate responses to generate
Max_tokens : maximum length of the generated response

Example input:
instruct : Please summarize the following text.
input : Deep learning is a method that learns concepts at each level of granularity, from the overall picture of a subject down to its details, relating them to one another in a hierarchical structure. According to Andrew Ng, co-founder of Coursera, the view that it is "a first step toward artificial intelligence" is correct.
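
These UI fields map directly onto the Hugging Face generation parameters used in `generate_colab.ipynb`. A minimal sketch of the mapping (values are the notebook's defaults; the `model.generate` call is shown for illustration and assumes a model loaded as in the notebook):

```python
from transformers import GenerationConfig

# Demo UI fields -> generation parameters (defaults from generate_colab.ipynb)
generation_config = GenerationConfig(
    temperature=0.1,  # Temperature: diversity of the generated response
    num_beams=4,      # Beams: number of beam-search candidates
    top_p=0.75,       # fixed in the notebook, not exposed in the demo UI
    top_k=40,         # fixed in the notebook, not exposed in the demo UI
)

# Max_tokens corresponds to max_new_tokens at generation time:
# output = model.generate(input_ids=input_ids,
#                         generation_config=generation_config,
#                         max_new_tokens=256)
```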

### Try the pretrained model using Google Colab
To run the model on Google Colab, use the notebook below (larger models such as 30B may not run unless you use an A100 on a Pro plan or higher).

Note: With the 30B model, you may get errors unless Max_tokens is set to 128.


### LoRA on Hugging Face
Japanese-Alpaca-LoRA 7B, 13B, 30B, 65B
https://huggingface.co/kunishou
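
A condensed loading sketch, following the code in `generate_colab.ipynb` (requires the packages in `requirements.txt`; the 8-bit path needs a CUDA GPU):

```python
import torch
from peft import PeftModel
from transformers import LlamaTokenizer, LlamaForCausalLM

BASE_MODEL = "decapoda-research/llama-7b-hf"
LORA_WEIGHTS = "kunishou/Japanese-Alpaca-LoRA-7b-v0"

tokenizer = LlamaTokenizer.from_pretrained(BASE_MODEL)
model = LlamaForCausalLM.from_pretrained(
    BASE_MODEL,
    load_in_8bit=True,           # needs bitsandbytes and a CUDA GPU
    torch_dtype=torch.float16,
    device_map={"": 0},
)
# Apply the Japanese LoRA adapter on top of the frozen base weights
model = PeftModel.from_pretrained(model, LORA_WEIGHTS, torch_dtype=torch.float16)
model.eval()
```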

### References
Stanford Alpaca
https://github.com/tatsu-lab/stanford_alpaca
Alpaca-LoRA
https://github.com/tloen/alpaca-lora

### Sponsor
[@_kaiinui](https://twitter.com/_kaiinui)

--------------------------------------------------------------------------------
/generate_colab.ipynb:
--------------------------------------------------------------------------------
{
  "metadata": {
    "kernelspec": {
      "language": "python",
      "display_name": "Python 3",
      "name": "python3"
    },
    "language_info": {
      "name": "python",
      "version": "3.7.12",
      "mimetype": "text/x-python",
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "pygments_lexer": "ipython3",
      "nbconvert_exporter": "python",
      "file_extension": ".py"
    },
    "colab": {
      "provenance": [],
      "machine_shape": "hm",
      "include_colab_link": true
    },
    "gpuClass": "premium",
    "accelerator": "GPU"
  },
  "nbformat_minor": 0,
  "nbformat": 4,
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/github/kunishou/Japanese-Alpaca-LoRA/blob/main/generate_colab.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "#### Note: With the 30B model, you may get errors unless Max_tokens is set to 128."
      ],
      "metadata": {
        "id": "6vhnIEzPy4HU"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "!nvidia-smi"
      ],
      "metadata": {
        "id": "REC8UsP7mr0-"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "!git clone https://github.com/kunishou/Japanese-Alpaca-LoRA.git\n",
        "%cd Japanese-Alpaca-LoRA"
      ],
      "metadata": {
        "id": "oumckS9eRAUb"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "# Install the required packages\n",
        "!pip install -r requirements.txt"
      ],
      "metadata": {
        "id": "lTSBldIT7Y96"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "import torch\n",
        "from peft import PeftModel\n",
        "import transformers\n",
        "import gradio as gr\n",
        "\n",
        "assert (\n",
        "    \"LlamaTokenizer\" in transformers._import_structure[\"models.llama\"]\n",
        "), \"LLaMA is now in HuggingFace's main branch.\\nPlease reinstall it: pip uninstall transformers && pip install git+https://github.com/huggingface/transformers.git\"\n",
        "from transformers import LlamaTokenizer, LlamaForCausalLM, GenerationConfig\n",
        "\n",
        "# Larger models may not run unless you use an A100 on a Colab Pro plan or higher\n",
        "\n",
        "BASE_MODEL = \"decapoda-research/llama-7b-hf\"\n",
        "# BASE_MODEL = \"decapoda-research/llama-13b-hf\"\n",
        "# BASE_MODEL = \"decapoda-research/llama-30b-hf\"\n",
        "# BASE_MODEL = \"decapoda-research/llama-65b-hf\"\n",
        "\n",
        "tokenizer = LlamaTokenizer.from_pretrained(BASE_MODEL)\n",
        "\n",
        "LORA_WEIGHTS = \"kunishou/Japanese-Alpaca-LoRA-7b-v0\"\n",
        "# LORA_WEIGHTS = \"kunishou/Japanese-Alpaca-LoRA-13b-v0\"\n",
        "# LORA_WEIGHTS = \"kunishou/Japanese-Alpaca-LoRA-30b-v0\"\n",
        "# LORA_WEIGHTS = \"kunishou/Japanese-Alpaca-LoRA-65b-v0\"\n",
        "\n",
        "if BASE_MODEL == \"decapoda-research/llama-7b-hf\":\n",
        "    model_param = \"7B\"\n",
        "elif BASE_MODEL == \"decapoda-research/llama-13b-hf\":\n",
        "    model_param = \"13B\"\n",
        "elif BASE_MODEL == \"decapoda-research/llama-30b-hf\":\n",
        "    model_param = \"30B\"\n",
        "else:\n",
        "    model_param = \"65B\"\n",
        "\n",
        "if torch.cuda.is_available():\n",
        "    device = \"cuda\"\n",
        "else:\n",
        "    device = \"cpu\"\n",
        "\n",
        "try:\n",
        "    if torch.backends.mps.is_available():\n",
        "        device = \"mps\"\n",
        "except AttributeError:  # older torch builds have no mps backend\n",
        "    pass\n",
        "\n",
        "if device == \"cuda\":\n",
        "    model = LlamaForCausalLM.from_pretrained(\n",
        "        BASE_MODEL,\n",
        "        load_in_8bit=True,\n",
        "        torch_dtype=torch.float16,\n",
        "        device_map={'': 0},\n",
        "    )\n",
        "    model = PeftModel.from_pretrained(model, LORA_WEIGHTS, torch_dtype=torch.float16, device_map={'': 0})\n",
        "elif device == \"mps\":\n",
        "    model = LlamaForCausalLM.from_pretrained(\n",
        "        BASE_MODEL,\n",
        "        device_map={\"\": device},\n",
        "        torch_dtype=torch.float16,\n",
        "    )\n",
        "    model = PeftModel.from_pretrained(\n",
        "        model,\n",
        "        LORA_WEIGHTS,\n",
        "        device_map={\"\": device},\n",
        "        torch_dtype=torch.float16,\n",
        "    )\n",
        "else:\n",
        "    model = LlamaForCausalLM.from_pretrained(\n",
        "        BASE_MODEL,\n",
        "        device_map={\"\": device},\n",
        "        low_cpu_mem_usage=True\n",
        "    )\n",
        "    model = PeftModel.from_pretrained(\n",
        "        model,\n",
        "        LORA_WEIGHTS,\n",
        "        device_map={\"\": device},\n",
        "    )\n",
        "\n",
        "\n",
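        "# Build the Alpaca-style prompt the adapter was trained on: an\n",
        "# instruction, optional input context, and a '### Response:' header\n",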
        "def generate_prompt(instruction, input=None):\n",
        "    if input:\n",
        "        return f\"\"\"Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n",
        "### Instruction:\n",
        "{instruction}\n",
        "### Input:\n",
        "{input}\n",
        "### Response:\"\"\"\n",
        "    else:\n",
        "        return f\"\"\"Below is an instruction that describes a task. Write a response that appropriately completes the request.\n",
        "### Instruction:\n",
        "{instruction}\n",
        "### Response:\"\"\"\n",
        "\n",
        "\n",
        "model.eval()\n",
        "if torch.__version__ >= \"2\":\n",
        "    model = torch.compile(model)\n",
        "\n",
        "\n",
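        "# Generate a response for one instruction/input pair; the sampling and\n",
        "# beam parameters are supplied by the Gradio sliders below\n",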
        "def evaluate(\n",
        "    instruction,\n",
        "    input=None,\n",
        "    temperature=0.1,\n",
        "    top_p=0.75,\n",
        "    top_k=40,\n",
        "    num_beams=4,\n",
        "    max_new_tokens=256,\n",
        "    **kwargs,\n",
        "):\n",
        "    prompt = generate_prompt(instruction, input)\n",
        "    inputs = tokenizer(prompt, return_tensors=\"pt\")\n",
        "    input_ids = inputs[\"input_ids\"].to(device)\n",
        "    generation_config = GenerationConfig(\n",
        "        temperature=temperature,\n",
        "        top_p=top_p,\n",
        "        top_k=top_k,\n",
        "        num_beams=num_beams,\n",
        "        no_repeat_ngram_size=3,\n",
        "        **kwargs,\n",
        "    )\n",
        "\n",
        "    with torch.no_grad():\n",
        "        generation_output = model.generate(\n",
        "            input_ids=input_ids,\n",
        "            generation_config=generation_config,\n",
        "            return_dict_in_generate=True,\n",
        "            output_scores=True,\n",
        "            max_new_tokens=max_new_tokens,\n",
        "        )\n",
        "    s = generation_output.sequences[0]\n",
        "    output = tokenizer.decode(s)\n",
        "    return output.split(\"### Response:\")[1].strip()\n",
        "\n",
        "\n",
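        "# Wire evaluate() into a Gradio form whose sliders mirror the\n",
        "# Temperature / Beams / Max_tokens fields described in the README\n",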
        "gr.Interface(\n",
        "    fn=evaluate,\n",
        "    inputs=[\n",
        "        gr.components.Textbox(\n",
        "            lines=2, label=\"Instruction\", placeholder=\"Tell me about alpacas.\"\n",
        "        ),\n",
        "        gr.components.Textbox(lines=2, label=\"Input\", placeholder=\"none\"),\n",
        "        gr.components.Slider(minimum=0, maximum=1, value=0.1, label=\"Temperature\"),\n",
        "        # gr.components.Slider(minimum=0, maximum=1, value=0.75, label=\"Top p\"),\n",
        "        # gr.components.Slider(minimum=0, maximum=100, step=1, value=40, label=\"Top k\"),\n",
        "        gr.components.Slider(minimum=1, maximum=4, step=1, value=4, label=\"Beams\"),\n",
        "        gr.components.Slider(\n",
        "            minimum=1, maximum=512, step=1, value=256, label=\"Max tokens\"\n",
        "        ),\n",
        "    ],\n",
        "    outputs=[\n",
        "        gr.components.Textbox(\n",
        "            lines=8,\n",
        "            label=\"Output\",\n",
        "        )\n",
        "    ],\n",
        "    title=f\"🦙🌲🌸 Japanese-Alpaca-LoRA-{model_param} 🌸\",\n",
        "    description=f\"Japanese-Alpaca-LoRA is a {model_param}-parameter LLaMA model finetuned to follow instructions. It is trained on a Japanese translation of the [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca) dataset and makes use of the Huggingface LLaMA implementation. For more information, please visit [the project's website](https://github.com/tloen/alpaca-lora).\",\n",
        ").launch(inline=False, share=True)"
      ],
      "metadata": {
        "id": "bAl6_6PrizMb"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [],
      "metadata": {
        "id": "W8hRrkPcO3JR"
      },
      "execution_count": null,
      "outputs": []
    }
  ]
}
--------------------------------------------------------------------------------
/image/top.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kunishou/Japanese-Alpaca-LoRA/5bc99f9f83184531fa65fbe04a933af1398fa3b9/image/top.png
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
datasets
loralib
sentencepiece
git+https://github.com/huggingface/transformers.git
accelerate
bitsandbytes
git+https://github.com/huggingface/peft.git
gradio
appdirs
--------------------------------------------------------------------------------