├── README.md ├── 机器学习面试常见问题100道.pdf └── 李沐DeepLearning ├── GRU门控循环单元.ipynb ├── LSTM长短期记忆网络.ipynb ├── README.md ├── RNN循环神经网络.ipynb ├── RNN循环神经网络的简洁实现.ipynb ├── seq2seq序列到序列.ipynb ├── 图片 ├── 文本预处理 │ ├── result1.png │ └── result2.png └── 语言模型 │ ├── fig1.png │ ├── fig2.png │ ├── fig3.png │ ├── fig4.png │ └── fig5.png ├── 数据处理.ipynb ├── 文本预处理 ├── README.md └── 文本预处理.ipynb ├── 时序模型.ipynb ├── 机器翻译数据集.ipynb ├── 线性回归.ipynb ├── 自动求导.ipynb └── 语言模型 ├── README.md └── 语言模型.ipynb /README.md: -------------------------------------------------------------------------------- 1 | # Machine_Learning 2 | 3 | This repository contains my machine learning materials 4 | -------------------------------------------------------------------------------- /机器学习面试常见问题100道.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/pod2c/Machine_Learning/355a729d156888d5fc8b1653b3e181b638e859e8/机器学习面试常见问题100道.pdf -------------------------------------------------------------------------------- /李沐DeepLearning/README.md: -------------------------------------------------------------------------------- 1 | ## 本文件夹包含了B站李沐老师的深度学习部分课程的代码(主要是NLP部分) 2 | -------------------------------------------------------------------------------- /李沐DeepLearning/RNN循环神经网络的简洁实现.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "id": "8b907494", 7 | "metadata": {}, 8 | "outputs": [], 9 | "source": [ 10 | "import torch\n", 11 | "from torch import nn\n", 12 | "from torch.nn import functional as F\n", 13 | "from d2l import torch as d2l\n", 14 | "\n", 15 | "%matplotlib inline" 16 | ] 17 | }, 18 | { 19 | "cell_type": "code", 20 | "execution_count": 2, 21 | "id": "53e41841", 22 | "metadata": {}, 23 | "outputs": [], 24 | "source": [ 25 | "batch_size, num_steps = 32, 35\n", 26 | "train_iter, vocab = d2l.load_data_time_machine(batch_size, num_steps)" 27 | ] 28 | }, 29 | { 30 | "cell_type": "markdown", 31 | "id": "58b4c441", 32 | "metadata": {}, 33 | "source": [ 34 | "首先需要构造一个具有256个隐藏单元的单隐藏层" 35 | ] 36 | }, 37 | { 38 | "cell_type": "code", 39 | "execution_count": 3, 40 | "id": "7c7cc596", 41 | "metadata": {}, 42 | "outputs": [], 43 | "source": [ 44 | "num_hiddens = 256\n", 45 | "rnn_layer = nn.RNN(len(vocab), num_hiddens)" 46 | ] 47 | }, 48 | { 49 | "cell_type": "markdown", 50 | "id": "76b89ea0", 51 | "metadata": {}, 52 | "source": [ 53 | "使用张量来初始化隐状态,隐状态的形状为(隐藏层数,批量大小,隐藏单元数)" 54 | ] 55 | }, 56 | { 57 | "cell_type": "code", 58 | "execution_count": 4, 59 | "id": "0b1b6f52", 60 | "metadata": {}, 61 | "outputs": [ 62 | { 63 | "data": { 64 | "text/plain": [ 65 | "torch.Size([1, 32, 256])" 66 | ] 67 | }, 68 | "execution_count": 4, 69 | "metadata": {}, 70 | "output_type": "execute_result" 71 | } 72 | ], 73 | "source": [ 74 | "state = torch.zeros(1, batch_size, num_hiddens)\n", 75 | "state.shape" 76 | ] 77 | }, 78 | { 79 | "cell_type": "markdown", 80 | "id": "46a9a508", 81 | "metadata": {}, 82 | "source": [ 83 | "现在,通过一个输入和一个隐状态,就能算出往后的隐状态,以及使用隐状态来计算输出。" 84 | ] 85 | }, 86 | { 87 | "cell_type": "code", 88 | "execution_count": 6, 89 | "id": "9cef140a", 90 | "metadata": {}, 91 | "outputs": [ 92 | { 93 | "data": { 94 | "text/plain": [ 95 | "(torch.Size([35, 32, 256]), torch.Size([1, 32, 256]))" 96 | ] 97 | }, 98 | "execution_count": 6, 99 | "metadata": {}, 100 | "output_type": "execute_result" 101 | } 102 | ], 103 | "source": [ 104 | "X = torch.rand(size=(num_steps, batch_size, 
len(vocab)))\n",
"# Y_hiddens为单个隐藏层的输出\n",
"Y_hiddens, new_state = rnn_layer(X, state)\n",
"Y_hiddens.shape, new_state.shape"
] },
{ "cell_type": "markdown", "id": "8947c76a", "metadata": {}, "source": [ "接着为完整的RNN模型构造一个类" ] },
{ "cell_type": "code", "execution_count": 7, "id": "d983a7fb", "metadata": {}, "outputs": [], "source": [
"class RNNModel(nn.Module):\n",
"    def __init__(self, rnn_layer, vocab_size, **kwargs):\n",
"        super(RNNModel, self).__init__(**kwargs)\n",
"        self.rnn = rnn_layer\n",
"        self.vocab_size = vocab_size\n",
"        self.num_hiddens = self.rnn.hidden_size\n",
"        if not self.rnn.bidirectional:\n",
"            self.num_direction = 1\n",
"            self.linear = nn.Linear(self.num_hiddens, self.vocab_size)\n",
"        else:\n",
"            self.num_direction = 2\n",
"            self.linear = nn.Linear(self.num_hiddens*2, self.vocab_size)\n",
"\n",
"    def forward(self, inputs, state):\n",
"        X = F.one_hot(inputs.T.long(), self.vocab_size).type(torch.float32)\n",
"        # Y为单个隐藏层的输出\n",
"        Y, state = self.rnn(X, state)\n",
"        # 先将Y展平成(时间步长*批量大小, 隐藏单元数),再由全连接层映射为(时间步长*批量大小, 词表大小)\n",
"        outputs = self.linear(Y.reshape(-1, Y.shape[-1]))\n",
"        return outputs, state\n",
"\n",
"    def begin_state(self, device, batch_size=1):\n",
"        if not isinstance(self.rnn, nn.LSTM):\n",
"            # nn.GRU以张量为隐状态\n",
"            return torch.zeros((self.num_direction*self.rnn.num_layers, batch_size, self.num_hiddens), device=device)\n",
"        else:\n",
"            # nn.LSTM以元组为隐状态\n",
"            return (torch.zeros((self.num_direction*self.rnn.num_layers, batch_size, self.num_hiddens), device=device),\n",
"                    torch.zeros((self.num_direction*self.rnn.num_layers, batch_size, self.num_hiddens), device=device))"
] },
{ "cell_type": "markdown", "id": "0c552f24", "metadata": {}, "source": [ "先使用具有随机权重的模型进行预测" ] },
{ "cell_type": "code", "execution_count": 8, "id": "53c409cd", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'time travelleruttttttttt'" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [
"device = d2l.try_gpu()\n",
"net = RNNModel(rnn_layer, vocab_size=len(vocab))\n",
"net = net.to(device)\n",
"d2l.predict_ch8('time traveller', 10, net, vocab, device)"
] },
{ "cell_type": "markdown", "id": "2dd71cec", "metadata": {}, "source": [ "接着,调用训练函数训练模型,训练结束后再做一次预测" ] },
{ "cell_type": "code", "execution_count": 9, "id": "ecc58cd7", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [
"perplexity 1.3, 289806.6 tokens/sec on cuda:0\n",
"time traveller hald was an youmpand the time travellerit would b\n",
"traveller fof no war fot mathematicians have it isspoken of\n"
] },
{ "data": { "image/svg+xml": [ "(matplotlib SVG 矢量图数据已省略:训练困惑度 perplexity 随 epoch 下降的曲线,Matplotlib v3.5.1 生成)" ], "text/plain": [
"(训练曲线图,SVG 数据已省略)
" 1076 | ] 1077 | }, 1078 | "metadata": {}, 1079 | "output_type": "display_data" 1080 | } 1081 | ], 1082 | "source": [ 1083 | "num_epochs, lr = 500, 1\n", 1084 | "d2l.train_ch8(net, train_iter, vocab, lr, num_epochs, device)" 1085 | ] 1086 | }, 1087 | { 1088 | "cell_type": "code", 1089 | "execution_count": null, 1090 | "id": "e2880e48", 1091 | "metadata": {}, 1092 | "outputs": [], 1093 | "source": [] 1094 | } 1095 | ], 1096 | "metadata": { 1097 | "kernelspec": { 1098 | "display_name": "Python [conda env:torch] *", 1099 | "language": "python", 1100 | "name": "conda-env-torch-py" 1101 | }, 1102 | "language_info": { 1103 | "codemirror_mode": { 1104 | "name": "ipython", 1105 | "version": 3 1106 | }, 1107 | "file_extension": ".py", 1108 | "mimetype": "text/x-python", 1109 | "name": "python", 1110 | "nbconvert_exporter": "python", 1111 | "pygments_lexer": "ipython3", 1112 | "version": "3.8.16" 1113 | } 1114 | }, 1115 | "nbformat": 4, 1116 | "nbformat_minor": 5 1117 | } 1118 | -------------------------------------------------------------------------------- /李沐DeepLearning/seq2seq序列到序列.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "id": "59f21fee", 7 | "metadata": {}, 8 | "outputs": [], 9 | "source": [ 10 | "import collections\n", 11 | "import math\n", 12 | "import torch\n", 13 | "from torch import nn\n", 14 | "from d2l import torch as d2l" 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "id": "328d5ed0", 20 | "metadata": {}, 21 | "source": [ 22 | "首先,先定义编码器" 23 | ] 24 | }, 25 | { 26 | "cell_type": "code", 27 | "execution_count": 2, 28 | "id": "101b4534", 29 | "metadata": {}, 30 | "outputs": [], 31 | "source": [ 32 | "class seq2seqEncoder(d2l.Encoder): # 继承自d2l.Encoder这个父类\n", 33 | " def __init__(self, vocab_size, embed_size, num_hiddens, num_layers, dropout=0, **kwargs):\n", 34 | " super(seq2seqEncoder, self).__init__(**kwargs) # 调用父类,使得seq2seq编码器可以继承父类的属性和方法\n", 35 | " self.embedding = nn.Embedding(vocab_size, embed_size)\n", 36 | " self.rnn = nn.GRU(embed_size, num_hiddens, num_layers, dropout=dropout)\n", 37 | " \n", 38 | " def forward(self, X, *args):\n", 39 | " # X的形状为(batch_size, num_steps, embed_size)\n", 40 | " X = self.embedding(X)\n", 41 | " X = X.permute(1, 0, 2) # 改变输入的形状,时间步长放到前面\n", 42 | " output, state = self.rnn(X)\n", 43 | " # output的形状:(num_steps, batch_size, num_hidens),state的形状:(num_layers, batch_size, num_hiddens)\n", 44 | " return output, state" 45 | ] 46 | }, 47 | { 48 | "cell_type": "markdown", 49 | "id": "bb053509", 50 | "metadata": {}, 51 | "source": [ 52 | "实例化编码器测试输出" 53 | ] 54 | }, 55 | { 56 | "cell_type": "code", 57 | "execution_count": 3, 58 | "id": "95dd9d26", 59 | "metadata": {}, 60 | "outputs": [ 61 | { 62 | "data": { 63 | "text/plain": [ 64 | "torch.Size([7, 4, 16])" 65 | ] 66 | }, 67 | "execution_count": 3, 68 | "metadata": {}, 69 | "output_type": "execute_result" 70 | } 71 | ], 72 | "source": [ 73 | "encoder = seq2seqEncoder(vocab_size=10, embed_size=8, num_hiddens=16,\n", 74 | " num_layers=2)\n", 75 | "encoder.eval()\n", 76 | "X = torch.zeros((4, 7), dtype=torch.long)\n", 77 | "output, state = encoder(X)\n", 78 | "output.shape" 79 | ] 80 | }, 81 | { 82 | "cell_type": "code", 83 | "execution_count": 4, 84 | "id": "633d4cc6", 85 | "metadata": {}, 86 | "outputs": [ 87 | { 88 | "data": { 89 | "text/plain": [ 90 | "torch.Size([2, 4, 16])" 91 | ] 92 | }, 93 | "execution_count": 4, 94 | "metadata": {}, 95 | "output_type": 
"execute_result" 96 | } 97 | ], 98 | "source": [ 99 | "state.shape" 100 | ] 101 | }, 102 | { 103 | "cell_type": "markdown", 104 | "id": "73fd4a53", 105 | "metadata": {}, 106 | "source": [ 107 | "接着定义解码器" 108 | ] 109 | }, 110 | { 111 | "cell_type": "code", 112 | "execution_count": 5, 113 | "id": "a1aa038d", 114 | "metadata": {}, 115 | "outputs": [], 116 | "source": [ 117 | "class seq2seqDecoder(d2l.Decoder):\n", 118 | " def __init__(self, vocab_size, embed_size, num_hiddens, num_layers, dropout=0, **kwargs):\n", 119 | " super(seq2seqDecoder, self).__init__(**kwargs)\n", 120 | " self.embedding = nn.Embedding(vocab_size, embed_size)\n", 121 | " # 输入为embed_size和num_hiddens的和是因为需要将输入X和上下文进行合并\n", 122 | " self.rnn = nn.GRU(embed_size+num_hiddens, num_hiddens, num_layers, dropout=dropout)\n", 123 | " self.dense = nn.Linear(num_hiddens, vocab_size)\n", 124 | " \n", 125 | " def init_state(self, enc_outputs, *args):\n", 126 | " # 初始化隐状态,取编码器输出的隐状态作为解码器的初始隐状态\n", 127 | " return enc_outputs[1]\n", 128 | " \n", 129 | " def forward(self, X, state):\n", 130 | " # 将X的形状变为(num_steps, batch_size, embed_size)\n", 131 | " X = self.embedding(X).permute(1, 0, 2)\n", 132 | " # 上下文变量取自编码器最后一个时刻输出隐状态的最后一层,并且使其和输入X具有相同的时间步长\n", 133 | " content = state[-1].repeat(X.shape[0], 1, 1)\n", 134 | " # 将输入X和上下文合并\n", 135 | " X_content = torch.cat((X, content), 2)\n", 136 | " output, state = self.rnn(X_content, state)\n", 137 | " output = self.dense(output).permute(1, 0, 2)\n", 138 | " # 输出output形状为(batch_size, num_steps, vocab_size),隐状态state的形状为(num_layers, batch_size, num_hiddens)\n", 139 | " return output, state" 140 | ] 141 | }, 142 | { 143 | "cell_type": "markdown", 144 | "id": "b64ed077", 145 | "metadata": {}, 146 | "source": [ 147 | "实例化解码器测试输出" 148 | ] 149 | }, 150 | { 151 | "cell_type": "code", 152 | "execution_count": 7, 153 | "id": "faf7ef8d", 154 | "metadata": {}, 155 | "outputs": [ 156 | { 157 | "data": { 158 | "text/plain": [ 159 | "(torch.Size([4, 7, 10]), torch.Size([2, 4, 16]))" 160 | ] 161 | }, 162 | "execution_count": 7, 163 | "metadata": {}, 164 | "output_type": "execute_result" 165 | } 166 | ], 167 | "source": [ 168 | "decoder = seq2seqDecoder(vocab_size=10, embed_size=8, num_hiddens=16,\n", 169 | " num_layers=2)\n", 170 | "decoder.eval()\n", 171 | "state = decoder.init_state(encoder(X))\n", 172 | "output, state = decoder(X, state)\n", 173 | "output.shape, state.shape" 174 | ] 175 | }, 176 | { 177 | "cell_type": "markdown", 178 | "id": "5e8833f1", 179 | "metadata": {}, 180 | "source": [ 181 | "零值化屏蔽不相关的项" 182 | ] 183 | }, 184 | { 185 | "cell_type": "code", 186 | "execution_count": 8, 187 | "id": "db020d33", 188 | "metadata": {}, 189 | "outputs": [], 190 | "source": [ 191 | "def sequence_mask(X, valid_len, value=0):\n", 192 | " # 时间步长设为最大序列长度\n", 193 | " maxlen = X.size(1)\n", 194 | " # 判断有效长度生成掩码\n", 195 | " mask = torch.arange((maxlen), dtype=torch.float32, device=X.device)[None, :] < valid_len[:, None]\n", 196 | " # 对掩码取反屏蔽对应的项\n", 197 | " X[~mask] = value\n", 198 | " return X" 199 | ] 200 | }, 201 | { 202 | "cell_type": "code", 203 | "execution_count": 9, 204 | "id": "7a69c960", 205 | "metadata": {}, 206 | "outputs": [], 207 | "source": [ 208 | "class MaskedSoftmaxCELoss(nn.CrossEntropyLoss):\n", 209 | " def forward(self, pred, label, valid_len):\n", 210 | " # 按照标签的形状设置一组单位向量作为权重\n", 211 | " weights = torch.ones_like(label)\n", 212 | " # 将这组权重零值化屏蔽不相关的项\n", 213 | " weights = sequence_mask(weights, valid_len)\n", 214 | " self.reduction = 'none'\n", 215 | " # 计算原始的交叉熵损失函数\n", 216 | " unweighted_loss = 
super(MaskedSoftmaxCELoss, self).forward(pred.permute(0, 2, 1), label)\n",
"        # 将损失函数与权重相乘以计算最终的有效损失函数\n",
"        weight_loss = (unweighted_loss * weights).mean(dim=1)\n",
"        return weight_loss"
] },
{ "cell_type": "markdown", "id": "e44e510b", "metadata": {}, "source": [ "测试输出" ] },
{ "cell_type": "markdown", "id": "ece57673", "metadata": {}, "source": [ "训练" ] },
{ "cell_type": "code", "execution_count": 16, "id": "8b8d9db0", "metadata": {}, "outputs": [], "source": [
"def train(net, data_iter, lr, num_epochs, tgt_vocab, device):\n",
"    # 使用Xavier均匀分布初始化权重\n",
"    def xavier_init_weights(m):\n",
"        if type(m) == nn.Linear:\n",
"            nn.init.xavier_uniform_(m.weight)\n",
"        if type(m) == nn.GRU:\n",
"            for param in m._flat_weights_names:\n",
"                if \"weight\" in param:\n",
"                    nn.init.xavier_uniform_(m._parameters[param])\n",
"\n",
"    net.apply(xavier_init_weights)\n",
"    net.to(device)\n",
"    optimizer = torch.optim.Adam(net.parameters(), lr=lr)\n",
"    Loss = MaskedSoftmaxCELoss()\n",
"    net.train()\n",
"    animator = d2l.Animator(xlabel='epoch', ylabel='loss', xlim=[10, num_epochs])\n",
"\n",
"    for epoch in range(num_epochs):\n",
"        timer = d2l.Timer()\n",
"        metrics = d2l.Accumulator(2)\n",
"\n",
"        for batch in data_iter:\n",
"            optimizer.zero_grad()\n",
"            X, X_valid_len, Y, Y_valid_len = [x.to(device) for x in batch]\n",
"            # 提取特定的序列起始词元<bos>\n",
"            bos = torch.tensor([tgt_vocab['<bos>']] * Y.shape[0], device=device).reshape(-1,1)\n",
"            # 强制教学,将序列起始词元与原始的输出序列(将序列结束词元剔除)合并作为解码器的输入\n",
"            dec_input = torch.cat([bos, Y[:, :-1]], dim=1)\n",
"            Y_hat, _ = net(X, dec_input, X_valid_len)\n",
"            loss = Loss(Y_hat, Y, Y_valid_len)\n",
"            loss.sum().backward()\n",
"            d2l.grad_clipping(net, 1)\n",
"            num_tokens = Y_valid_len.sum()\n",
"            optimizer.step()\n",
"            with torch.no_grad():\n",
"                metrics.add(loss.sum(), num_tokens)\n",
"        if (epoch+1) % 10 == 0:\n",
"            animator.add(epoch+1, (metrics[0]/metrics[1]))\n",
"\n",
"    print(f'loss {metrics[0]/metrics[1]:.3f},{metrics[1]/timer.stop():.1f}'f'token/sec on {str(device)}')"
] },
{ "cell_type": "markdown", "id": "262b3293", "metadata": {}, "source": [ "使用机器翻译数据集测试训练函数的输出" ] },
{ "cell_type": "code", "execution_count": 17, "id": "8d2d5497", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [
"loss 0.019,38254.4token/sec on cuda:0\n"
] },
{ "data": { "image/svg+xml": [ "(matplotlib SVG 矢量图数据已省略:训练损失 loss 随 epoch 下降的曲线,Matplotlib v3.5.1 生成)" ], "text/plain": [ "(训练损失曲线图,SVG 数据已省略)" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [
"embed_size, num_hiddens, num_layers, dropout = 32, 32, 2, 0.1\n",
"batch_size, num_steps = 64, 10\n",
"lr, num_epochs, device = 0.005, 300, d2l.try_gpu()\n",
"\n",
"train_iter, src_vocab, tgt_vocab = d2l.load_data_nmt(batch_size, num_steps)\n",
"encoder = seq2seqEncoder(len(src_vocab), embed_size, num_hiddens, num_layers, dropout)\n",
"decoder = seq2seqDecoder(len(tgt_vocab), embed_size, num_hiddens, num_layers, dropout)\n",
"net = d2l.EncoderDecoder(encoder, decoder)\n",
"train(net, train_iter, lr, num_epochs, tgt_vocab, device)"
] },
{ "cell_type": "code", "execution_count": 19, "id": "49b0d998", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [
"torch.Size([2, 3])\n",
"torch.Size([1, 2, 3])\n",
"torch.Size([2, 1, 3])\n"
] } ], "source": [
"a1 = torch.tensor([[1,2,3],\n",
"                   [4,5,6]])\n",
"print(a1.shape)\n",
"\n",
"b1 = torch.unsqueeze(a1, dim=0)\n",
"\n",
"print(b1.shape)\n",
"\n",
"c1 = torch.unsqueeze(a1, dim=1)\n",
"\n",
"print(c1.shape)"
] },
{ "cell_type": "markdown", "id": "a698a589", "metadata": {}, "source": [ "预测" ] },
{ "cell_type": "code", "execution_count": 44, "id": "27fcd8d3", "metadata": {}, "outputs": [], "source": [
"def predict(net, src_sentence, src_vocab, tgt_vocab, num_steps, device, save_attention_weights=False):\n",
"    net.eval()\n",
"    src_tokens = src_vocab[src_sentence.lower().split(' ')] + [src_vocab['<eos>']]\n",
"    enc_valid_len = torch.tensor([len(src_tokens)], device=device)\n",
"    src_tokens = d2l.truncate_pad(src_tokens, num_steps, src_vocab['<pad>'])\n",
"    # 添加批量轴,在源词元前面加一个batch的维度\n",
"    enc_X = torch.unsqueeze(torch.tensor(src_tokens, dtype=torch.long, device=device), dim=0)\n",
"    enc_output = net.encoder(enc_X, enc_valid_len)\n",
"    dec_state = net.decoder.init_state(enc_output, enc_valid_len)\n",
"    # 添加批量轴,在开始词元<bos>前面添加一个batch的维度\n",
"    dec_X = torch.unsqueeze(torch.tensor([tgt_vocab['<bos>']], dtype=torch.long, device=device), dim=0)\n",
"\n",
"    output_seq, save_attention_seq = [], []\n",
"    for _ in range(num_steps):\n",
"        Y, dec_state = net.decoder(dec_X, dec_state)\n",
"        # 选取可能性最高的作为下一个时刻的解码器输入\n",
"        dec_X = Y.argmax(dim=2)\n",
"        # 把batch的维度去掉\n",
"        pred = dec_X.squeeze(dim=0).type(torch.int32).item()\n",
"        # 保存注意力权重\n",
"        if save_attention_weights:\n",
"            save_attention_seq.append(net.decoder.attention_weights)\n",
"        # 当检测到预测为结束词元<eos>,结束预测\n",
"        if pred == tgt_vocab['<eos>']:\n",
"            break\n",
"        output_seq.append(pred)\n",
"    return ' '.join(tgt_vocab.to_tokens(output_seq)), save_attention_seq"
] },
{ "cell_type": "markdown", "id": "299f1b55", "metadata": {}, "source": [ "通过BLEU评估预测序列的质量" ] },
label_seq.split(' ')\n", 1046 | " len_pred, len_label = len(pred_tokens), len(label_tokens)\n", 1047 | " score = math.exp(min(0, 1-len_label/len_pred))\n", 1048 | " \n", 1049 | " for n in range(1, k+1):\n", 1050 | " num_matches, label_subs = 0, collections.defaultdict(int)\n", 1051 | " for i in range(len_label - n + 1):\n", 1052 | " label_subs[' '.join(label_tokens[i: i + n])] += 1\n", 1053 | " for i in range(len_pred - n + 1):\n", 1054 | " if label_subs[' '.join(pred_tokens[i: i + n])] > 0:\n", 1055 | " num_matches += 1\n", 1056 | " label_subs[' '.join(pred_tokens[i: i + n])] -= 1\n", 1057 | " score *= math.pow(num_matches / (len_pred - n + 1), math.pow(0.5, n))\n", 1058 | " return score " 1059 | ] 1060 | }, 1061 | { 1062 | "cell_type": "code", 1063 | "execution_count": 45, 1064 | "id": "9f7fc0a9", 1065 | "metadata": {}, 1066 | "outputs": [ 1067 | { 1068 | "name": "stdout", 1069 | "output_type": "stream", 1070 | "text": [ 1071 | "go . => va !, bleu 0.000\n", 1072 | "i lost . => j'ai ., bleu 0.000\n", 1073 | "he's calm . => il est tombé ?, bleu 0.537\n", 1074 | "i'm home . => je suis certain ., bleu 0.512\n" 1075 | ] 1076 | } 1077 | ], 1078 | "source": [ 1079 | "engs = ['go .', \"i lost .\", 'he\\'s calm .', 'i\\'m home .']\n", 1080 | "fras = ['va !', 'j\\'ai perdu .', 'il est calme .', 'je suis chez moi .']\n", 1081 | "for eng, fra in zip(engs, fras):\n", 1082 | " translation, attention_weight_seq = predict(\n", 1083 | " net, eng, src_vocab, tgt_vocab, num_steps, device)\n", 1084 | " print(f'{eng} => {translation}, bleu {BLEU(translation, fra, k=2):.3f}')" 1085 | ] 1086 | }, 1087 | { 1088 | "cell_type": "code", 1089 | "execution_count": null, 1090 | "id": "58f0d057", 1091 | "metadata": {}, 1092 | "outputs": [], 1093 | "source": [] 1094 | } 1095 | ], 1096 | "metadata": { 1097 | "kernelspec": { 1098 | "display_name": "Python [conda env:.conda-torch] *", 1099 | "language": "python", 1100 | "name": "conda-env-.conda-torch-py" 1101 | }, 1102 | "language_info": { 1103 | "codemirror_mode": { 1104 | "name": "ipython", 1105 | "version": 3 1106 | }, 1107 | "file_extension": ".py", 1108 | "mimetype": "text/x-python", 1109 | "name": "python", 1110 | "nbconvert_exporter": "python", 1111 | "pygments_lexer": "ipython3", 1112 | "version": "3.8.16" 1113 | } 1114 | }, 1115 | "nbformat": 4, 1116 | "nbformat_minor": 5 1117 | } 1118 | -------------------------------------------------------------------------------- /李沐DeepLearning/图片/文本预处理/result1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/pod2c/Machine_Learning/355a729d156888d5fc8b1653b3e181b638e859e8/李沐DeepLearning/图片/文本预处理/result1.png -------------------------------------------------------------------------------- /李沐DeepLearning/图片/文本预处理/result2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/pod2c/Machine_Learning/355a729d156888d5fc8b1653b3e181b638e859e8/李沐DeepLearning/图片/文本预处理/result2.png -------------------------------------------------------------------------------- /李沐DeepLearning/图片/语言模型/fig1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/pod2c/Machine_Learning/355a729d156888d5fc8b1653b3e181b638e859e8/李沐DeepLearning/图片/语言模型/fig1.png -------------------------------------------------------------------------------- /李沐DeepLearning/图片/语言模型/fig2.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/pod2c/Machine_Learning/355a729d156888d5fc8b1653b3e181b638e859e8/李沐DeepLearning/图片/语言模型/fig2.png -------------------------------------------------------------------------------- /李沐DeepLearning/图片/语言模型/fig3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/pod2c/Machine_Learning/355a729d156888d5fc8b1653b3e181b638e859e8/李沐DeepLearning/图片/语言模型/fig3.png -------------------------------------------------------------------------------- /李沐DeepLearning/图片/语言模型/fig4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/pod2c/Machine_Learning/355a729d156888d5fc8b1653b3e181b638e859e8/李沐DeepLearning/图片/语言模型/fig4.png -------------------------------------------------------------------------------- /李沐DeepLearning/图片/语言模型/fig5.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/pod2c/Machine_Learning/355a729d156888d5fc8b1653b3e181b638e859e8/李沐DeepLearning/图片/语言模型/fig5.png -------------------------------------------------------------------------------- /李沐DeepLearning/数据处理.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 2, 6 | "id": "0567362c", 7 | "metadata": {}, 8 | "outputs": [], 9 | "source": [ 10 | "import torch" 11 | ] 12 | }, 13 | { 14 | "cell_type": "markdown", 15 | "id": "012b8394", 16 | "metadata": {}, 17 | "source": [ 18 | "张量表示一个数值组成的数组,这个数组可能拥有多个维度" 19 | ] 20 | }, 21 | { 22 | "cell_type": "code", 23 | "execution_count": 3, 24 | "id": "68788921", 25 | "metadata": {}, 26 | "outputs": [ 27 | { 28 | "data": { 29 | "text/plain": [ 30 | "tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])" 31 | ] 32 | }, 33 | "execution_count": 3, 34 | "metadata": {}, 35 | "output_type": "execute_result" 36 | } 37 | ], 38 | "source": [ 39 | "x=torch.arange(10)\n", 40 | "x" 41 | ] 42 | }, 43 | { 44 | "cell_type": "markdown", 45 | "id": "c916d49f", 46 | "metadata": {}, 47 | "source": [ 48 | " 张量的shape属性能够访问张量的形状和张量内的元素总数" 49 | ] 50 | }, 51 | { 52 | "cell_type": "code", 53 | "execution_count": 4, 54 | "id": "1f30a700", 55 | "metadata": {}, 56 | "outputs": [ 57 | { 58 | "data": { 59 | "text/plain": [ 60 | "torch.Size([10])" 61 | ] 62 | }, 63 | "execution_count": 4, 64 | "metadata": {}, 65 | "output_type": "execute_result" 66 | } 67 | ], 68 | "source": [ 69 | "x.shape" 70 | ] 71 | }, 72 | { 73 | "cell_type": "code", 74 | "execution_count": 5, 75 | "id": "ee144095", 76 | "metadata": {}, 77 | "outputs": [ 78 | { 79 | "data": { 80 | "text/plain": [ 81 | "10" 82 | ] 83 | }, 84 | "execution_count": 5, 85 | "metadata": {}, 86 | "output_type": "execute_result" 87 | } 88 | ], 89 | "source": [ 90 | "x.numel() ##表示张量内的元素数量,是一个标量" 91 | ] 92 | }, 93 | { 94 | "cell_type": "markdown", 95 | "id": "4a66e5f8", 96 | "metadata": {}, 97 | "source": [ 98 | "要改变一个张量的形状而不改变张量内的元素的数量和数值需要调用reshape函数" 99 | ] 100 | }, 101 | { 102 | "cell_type": "code", 103 | "execution_count": 6, 104 | "id": "80bdc2b0", 105 | "metadata": {}, 106 | "outputs": [ 107 | { 108 | "data": { 109 | "text/plain": [ 110 | "tensor([[0, 1],\n", 111 | " [2, 3],\n", 112 | " [4, 5],\n", 113 | " [6, 7],\n", 114 | " [8, 9]])" 115 | ] 116 | }, 117 | "execution_count": 6, 118 | "metadata": {}, 119 | "output_type": "execute_result" 120 | } 121 | ], 122 | "source": [ 123 | "x.reshape(5,2) ##将张量改为5行2列" 124 | ] 125 | }, 126 | { 127 | "cell_type": 
"markdown", 128 | "id": "5eae3ff6", 129 | "metadata": {}, 130 | "source": [ 131 | "可以生成一些全是1、全是0或是从特定分布中随机采样的张量" 132 | ] 133 | }, 134 | { 135 | "cell_type": "code", 136 | "execution_count": 7, 137 | "id": "1dab6a82", 138 | "metadata": {}, 139 | "outputs": [ 140 | { 141 | "data": { 142 | "text/plain": [ 143 | "tensor([[[0., 0., 0., 0., 0.],\n", 144 | " [0., 0., 0., 0., 0.],\n", 145 | " [0., 0., 0., 0., 0.],\n", 146 | " [0., 0., 0., 0., 0.]],\n", 147 | "\n", 148 | " [[0., 0., 0., 0., 0.],\n", 149 | " [0., 0., 0., 0., 0.],\n", 150 | " [0., 0., 0., 0., 0.],\n", 151 | " [0., 0., 0., 0., 0.]],\n", 152 | "\n", 153 | " [[0., 0., 0., 0., 0.],\n", 154 | " [0., 0., 0., 0., 0.],\n", 155 | " [0., 0., 0., 0., 0.],\n", 156 | " [0., 0., 0., 0., 0.]]])" 157 | ] 158 | }, 159 | "execution_count": 7, 160 | "metadata": {}, 161 | "output_type": "execute_result" 162 | } 163 | ], 164 | "source": [ 165 | "torch.zeros((3,4,5))" 166 | ] 167 | }, 168 | { 169 | "cell_type": "code", 170 | "execution_count": 8, 171 | "id": "ec04ec80", 172 | "metadata": {}, 173 | "outputs": [ 174 | { 175 | "data": { 176 | "text/plain": [ 177 | "tensor([[[1., 1., 1., 1., 1.],\n", 178 | " [1., 1., 1., 1., 1.],\n", 179 | " [1., 1., 1., 1., 1.],\n", 180 | " [1., 1., 1., 1., 1.]],\n", 181 | "\n", 182 | " [[1., 1., 1., 1., 1.],\n", 183 | " [1., 1., 1., 1., 1.],\n", 184 | " [1., 1., 1., 1., 1.],\n", 185 | " [1., 1., 1., 1., 1.]],\n", 186 | "\n", 187 | " [[1., 1., 1., 1., 1.],\n", 188 | " [1., 1., 1., 1., 1.],\n", 189 | " [1., 1., 1., 1., 1.],\n", 190 | " [1., 1., 1., 1., 1.]]])" 191 | ] 192 | }, 193 | "execution_count": 8, 194 | "metadata": {}, 195 | "output_type": "execute_result" 196 | } 197 | ], 198 | "source": [ 199 | "torch.ones((3,4,5))" 200 | ] 201 | }, 202 | { 203 | "cell_type": "markdown", 204 | "id": "554cecd1", 205 | "metadata": {}, 206 | "source": [ 207 | "常见的标准运算都能升级成按元素计算" 208 | ] 209 | }, 210 | { 211 | "cell_type": "code", 212 | "execution_count": 9, 213 | "id": "91f8c585", 214 | "metadata": {}, 215 | "outputs": [ 216 | { 217 | "data": { 218 | "text/plain": [ 219 | "(tensor([ 3, 4, 6, 10]),\n", 220 | " tensor([-1, 0, 2, 6]),\n", 221 | " tensor([ 2, 4, 8, 16]),\n", 222 | " tensor([0.5000, 1.0000, 2.0000, 4.0000]),\n", 223 | " tensor([ 1, 4, 16, 64]))" 224 | ] 225 | }, 226 | "execution_count": 9, 227 | "metadata": {}, 228 | "output_type": "execute_result" 229 | } 230 | ], 231 | "source": [ 232 | "x = torch.tensor([1,2,4,8])\n", 233 | "y = torch.tensor([2,2,2,2])\n", 234 | "x+y,x-y,x*y,x/y,x**y" 235 | ] 236 | }, 237 | { 238 | "cell_type": "code", 239 | "execution_count": 10, 240 | "id": "830b6367", 241 | "metadata": {}, 242 | "outputs": [ 243 | { 244 | "data": { 245 | "text/plain": [ 246 | "tensor([2.7183e+00, 7.3891e+00, 5.4598e+01, 2.9810e+03])" 247 | ] 248 | }, 249 | "execution_count": 10, 250 | "metadata": {}, 251 | "output_type": "execute_result" 252 | } 253 | ], 254 | "source": [ 255 | "##指数计算\n", 256 | "torch.exp(x)" 257 | ] 258 | }, 259 | { 260 | "cell_type": "markdown", 261 | "id": "3b071b5c", 262 | "metadata": {}, 263 | "source": [ 264 | "连接多个张量" 265 | ] 266 | }, 267 | { 268 | "cell_type": "code", 269 | "execution_count": 11, 270 | "id": "9972dcff", 271 | "metadata": {}, 272 | "outputs": [ 273 | { 274 | "data": { 275 | "text/plain": [ 276 | "(tensor([[ 0., 1., 2., 3.],\n", 277 | " [ 4., 5., 6., 7.],\n", 278 | " [ 8., 9., 10., 11.],\n", 279 | " [ 2., 1., 4., 3.],\n", 280 | " [ 1., 2., 3., 4.],\n", 281 | " [ 4., 1., 2., 3.]]),\n", 282 | " tensor([[ 0., 1., 2., 3., 2., 1., 4., 3.],\n", 283 | " [ 4., 5., 6., 7., 1., 2., 3., 
4.],\n",
"         [ 8.,  9., 10., 11.,  4.,  1.,  2.,  3.]]))"
] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [
"X = torch.arange(12, dtype=torch.float32).reshape(3,4)\n",
"Y = torch.tensor([[2,1,4,3],[1,2,3,4],[4,1,2,3]])\n",
"torch.cat((X,Y), dim=0), torch.cat((X,Y), dim=1)"
] },
{ "cell_type": "markdown", "id": "ac32f85d", "metadata": {}, "source": [ "使用逻辑运算符构建二元张量" ] },
{ "cell_type": "code", "execution_count": 12, "id": "5245e6ea", "metadata": {}, "outputs": [ { "data": { "text/plain": [
"tensor([[False,  True, False,  True],\n",
"        [False, False, False, False],\n",
"        [False, False, False, False]])"
] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X == Y" ] },
{ "cell_type": "markdown", "id": "7f6d5356", "metadata": {}, "source": [ "对一个张量中的所有元素求和会产生只有一个元素的张量" ] },
{ "cell_type": "code", "execution_count": 13, "id": "1ab11c93", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor(66.)" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X.sum()" ] },
{ "cell_type": "markdown", "id": "418ddf70", "metadata": {}, "source": [ "面对形状不同的张量运算,可以使用广播机制来执行按元素操作:要求两个张量在每个维度上要么长度相等,要么其中一个为1(长度为1的维度会被复制扩展),结果张量的shape在每个维度上取两者的最大值。" ] },
{ "cell_type": "code", "execution_count": 14, "id": "844d747c", "metadata": {}, "outputs": [ { "data": { "text/plain": [
"(tensor([[0],\n",
"         [1],\n",
"         [2]]),\n",
" tensor([[0, 1]]))"
] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [
"a = torch.arange(3).reshape(3,1)\n",
"b = torch.arange(2).reshape(1,2)\n",
"a,b"
] },
{ "cell_type": "code", "execution_count": 18, "id": "541c1321", "metadata": {}, "outputs": [ { "data": { "text/plain": [
"tensor([[0, 1],\n",
"        [1, 2],\n",
"        [2, 3]])"
] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [
"c = a + b\n",
"c"
] },
{ "cell_type": "code", "execution_count": 17, "id": "9f8af5c9", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "torch.Size([3, 2])" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "c.shape" ] },
{ "cell_type": "markdown", "id": "3cf73063", "metadata": {}, "source": [ "元素的访问:用[-1]访问最后一个元素,用[1:3]访问第二个和第三个元素(切片区间左闭右开,对应索引1和2)" ] },
{ "cell_type": "code", "execution_count": 19, "id": "059fb8ca", "metadata": {}, "outputs": [ { "data": { "text/plain": [
"(tensor([ 8.,  9., 10., 11.]),\n",
" tensor([[ 4.,  5.,  6.,  7.],\n",
"         [ 8.,  9., 10., 11.]]))"
] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X[-1],X[1:3]" ] },
{ "cell_type": "markdown", "id": "fe4bde30", "metadata": {}, "source": [ "通过索引改变张量内的元素数值" ] },
{ "cell_type": "code", "execution_count": 22, "id": "6fb5820b", "metadata": {}, "outputs": [ { "data": { "text/plain": [
"tensor([[ 0.,  1.,  2.,  3.],\n",
"        [ 4.,  5.,  7.,  7.],\n",
"        [ 8.,  9., 10., 11.]])"
] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [
"X[1,2]=7\n",
"X"
] },
{ "cell_type": "markdown", "id": "4a2684db", "metadata": {}, "source": [ "为多个元素赋值,只需要索引所有元素,然后赋值" ] },
{ "cell_type": "code", "execution_count": 23, "id": "e565db0e", "metadata": {}, "outputs": [ { "data": { "text/plain": [
"tensor([[12., 12., 12., 12.],\n",
"        [12., 12., 12., 12.],\n",
"        [ 8.,  9., 10., 11.]])"
] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [
"X[0:2,:]=12\n",
"X"
] },
{ "cell_type": "markdown", "id": "64bc3066", "metadata": {}, "source": [ "pytorch张量转为NumPy张量" ] },
{ "cell_type": "code", "execution_count": 26, "id": "80c6af70", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(numpy.ndarray, torch.Tensor)" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [
"A=X.numpy()\n",
"B=torch.tensor(A)\n",
"A,B\n",
"type(A),type(B)"
] },
{ "cell_type": "markdown", "id": "73f9fb59", "metadata": {}, "source": [ "将张量转为python标量" ] },
{ "cell_type": "code", "execution_count": 28, "id": "06c2c8eb", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(tensor(3.5000), 3.5, 3.5, 3)" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [
"C = torch.tensor(3.5)\n",
"C,C.item(),float(C),int(C)"
] },
{ "cell_type": "markdown", "id": "3ab68bfc", "metadata": {}, "source": [ "创建一个csv数据集" ] },
{ "cell_type": "code", "execution_count": 34, "id": "99a3d717", "metadata": {}, "outputs": [], "source": [
"import os\n",
"\n",
"os.makedirs(os.path.join('..','data'),exist_ok=True)\n",
"data_file = os.path.join('..','data','house.csv')\n",
"\n",
"with open(data_file,'w') as f:\n",
"    f.write('NumRooms,Name,Prince\\n')\n",
"    f.write('NA,Alice,256000\\n')\n",
"    f.write('3,NA,NA\\n')\n",
"    f.write('5,Dave,362110\\n')\n",
"    f.write('NA,Alex,NA\\n')"
] },
"data": { 638 | "text/html": [ 639 | "
\n", 640 | "\n", 653 | "\n", 654 | " \n", 655 | " \n", 656 | " \n", 657 | " \n", 658 | " \n", 659 | " \n", 660 | " \n", 661 | " \n", 662 | " \n", 663 | " \n", 664 | " \n", 665 | " \n", 666 | " \n", 667 | " \n", 668 | " \n", 669 | " \n", 670 | " \n", 671 | " \n", 672 | " \n", 673 | " \n", 674 | " \n", 675 | " \n", 676 | " \n", 677 | " \n", 678 | " \n", 679 | " \n", 680 | " \n", 681 | " \n", 682 | " \n", 683 | " \n", 684 | " \n", 685 | " \n", 686 | " \n", 687 | " \n", 688 | "
NumRoomsNamePrince
0NaNAlice256000.0
13.0NaNNaN
25.0Dave362110.0
3NaNAlexNaN
\n", 689 | "
" 690 | ], 691 | "text/plain": [ 692 | " NumRooms Name Prince\n", 693 | "0 NaN Alice 256000.0\n", 694 | "1 3.0 NaN NaN\n", 695 | "2 5.0 Dave 362110.0\n", 696 | "3 NaN Alex NaN" 697 | ] 698 | }, 699 | "execution_count": 37, 700 | "metadata": {}, 701 | "output_type": "execute_result" 702 | } 703 | ], 704 | "source": [ 705 | "import pandas as pd\n", 706 | "\n", 707 | "data = pd.read_csv(data_file)\n", 708 | "data" 709 | ] 710 | }, 711 | { 712 | "cell_type": "markdown", 713 | "id": "67adfc62", 714 | "metadata": {}, 715 | "source": [ 716 | "通过插值和删除来处理缺失值" 717 | ] 718 | }, 719 | { 720 | "cell_type": "code", 721 | "execution_count": 39, 722 | "id": "50c1c64a", 723 | "metadata": {}, 724 | "outputs": [ 725 | { 726 | "name": "stderr", 727 | "output_type": "stream", 728 | "text": [ 729 | "C:\\Users\\pod2g\\AppData\\Local\\Temp\\ipykernel_7636\\3505320478.py:2: FutureWarning: The default value of numeric_only in DataFrame.mean is deprecated. In a future version, it will default to False. In addition, specifying 'numeric_only=None' is deprecated. Select only valid columns or specify the value of numeric_only to silence this warning.\n", 730 | " inputs = inputs.fillna(inputs.mean())\n" 731 | ] 732 | }, 733 | { 734 | "data": { 735 | "text/html": [ 736 | "
\n", 737 | "\n", 750 | "\n", 751 | " \n", 752 | " \n", 753 | " \n", 754 | " \n", 755 | " \n", 756 | " \n", 757 | " \n", 758 | " \n", 759 | " \n", 760 | " \n", 761 | " \n", 762 | " \n", 763 | " \n", 764 | " \n", 765 | " \n", 766 | " \n", 767 | " \n", 768 | " \n", 769 | " \n", 770 | " \n", 771 | " \n", 772 | " \n", 773 | " \n", 774 | " \n", 775 | " \n", 776 | " \n", 777 | " \n", 778 | " \n", 779 | " \n", 780 | "
NumRoomsName
04.0Alice
13.0NaN
25.0Dave
34.0Alex
\n", 781 | "
" 782 | ], 783 | "text/plain": [ 784 | " NumRooms Name\n", 785 | "0 4.0 Alice\n", 786 | "1 3.0 NaN\n", 787 | "2 5.0 Dave\n", 788 | "3 4.0 Alex" 789 | ] 790 | }, 791 | "execution_count": 39, 792 | "metadata": {}, 793 | "output_type": "execute_result" 794 | } 795 | ], 796 | "source": [ 797 | "inputs, outputs = data.iloc[:,0:2], data.iloc[:,2]\n", 798 | "inputs = inputs.fillna(inputs.mean())\n", 799 | "inputs" 800 | ] 801 | }, 802 | { 803 | "cell_type": "markdown", 804 | "id": "2a51984c", 805 | "metadata": {}, 806 | "source": [ 807 | "对于数据集中的离散值或是类别值,可以将Na视为一个类别" 808 | ] 809 | }, 810 | { 811 | "cell_type": "code", 812 | "execution_count": 40, 813 | "id": "40ea93c9", 814 | "metadata": {}, 815 | "outputs": [ 816 | { 817 | "data": { 818 | "text/html": [ 819 | "
\n", 820 | "\n", 833 | "\n", 834 | " \n", 835 | " \n", 836 | " \n", 837 | " \n", 838 | " \n", 839 | " \n", 840 | " \n", 841 | " \n", 842 | " \n", 843 | " \n", 844 | " \n", 845 | " \n", 846 | " \n", 847 | " \n", 848 | " \n", 849 | " \n", 850 | " \n", 851 | " \n", 852 | " \n", 853 | " \n", 854 | " \n", 855 | " \n", 856 | " \n", 857 | " \n", 858 | " \n", 859 | " \n", 860 | " \n", 861 | " \n", 862 | " \n", 863 | " \n", 864 | " \n", 865 | " \n", 866 | " \n", 867 | " \n", 868 | " \n", 869 | " \n", 870 | " \n", 871 | " \n", 872 | " \n", 873 | " \n", 874 | " \n", 875 | " \n", 876 | " \n", 877 | " \n", 878 | "
NumRoomsName_AlexName_AliceName_DaveName_nan
04.00100
13.00001
25.00010
34.01000
\n", 879 | "
" 880 | ], 881 | "text/plain": [ 882 | " NumRooms Name_Alex Name_Alice Name_Dave Name_nan\n", 883 | "0 4.0 0 1 0 0\n", 884 | "1 3.0 0 0 0 1\n", 885 | "2 5.0 0 0 1 0\n", 886 | "3 4.0 1 0 0 0" 887 | ] 888 | }, 889 | "execution_count": 40, 890 | "metadata": {}, 891 | "output_type": "execute_result" 892 | } 893 | ], 894 | "source": [ 895 | "inputs = pd.get_dummies(inputs, dummy_na=True)\n", 896 | "inputs" 897 | ] 898 | }, 899 | { 900 | "cell_type": "markdown", 901 | "id": "35b6f411", 902 | "metadata": {}, 903 | "source": [ 904 | "在将数据集中的数据都转化为数值后,可以将这些数值转化为tensor张量" 905 | ] 906 | }, 907 | { 908 | "cell_type": "code", 909 | "execution_count": 41, 910 | "id": "ebda561b", 911 | "metadata": {}, 912 | "outputs": [ 913 | { 914 | "data": { 915 | "text/plain": [ 916 | "(tensor([[4., 0., 1., 0., 0.],\n", 917 | " [3., 0., 0., 0., 1.],\n", 918 | " [5., 0., 0., 1., 0.],\n", 919 | " [4., 1., 0., 0., 0.]], dtype=torch.float64),\n", 920 | " tensor([256000., nan, 362110., nan], dtype=torch.float64))" 921 | ] 922 | }, 923 | "execution_count": 41, 924 | "metadata": {}, 925 | "output_type": "execute_result" 926 | } 927 | ], 928 | "source": [ 929 | "X, y = torch.tensor(inputs.values),torch.tensor(outputs.values)\n", 930 | "X, y" 931 | ] 932 | }, 933 | { 934 | "cell_type": "code", 935 | "execution_count": null, 936 | "id": "1861ef0b", 937 | "metadata": {}, 938 | "outputs": [], 939 | "source": [] 940 | } 941 | ], 942 | "metadata": { 943 | "kernelspec": { 944 | "display_name": "Python [conda env:torch] *", 945 | "language": "python", 946 | "name": "conda-env-torch-py" 947 | }, 948 | "language_info": { 949 | "codemirror_mode": { 950 | "name": "ipython", 951 | "version": 3 952 | }, 953 | "file_extension": ".py", 954 | "mimetype": "text/x-python", 955 | "name": "python", 956 | "nbconvert_exporter": "python", 957 | "pygments_lexer": "ipython3", 958 | "version": "3.8.16" 959 | } 960 | }, 961 | "nbformat": 4, 962 | "nbformat_minor": 5 963 | } 964 | -------------------------------------------------------------------------------- /李沐DeepLearning/文本预处理/README.md: -------------------------------------------------------------------------------- 1 | 最近在B站上跟着李沐老师学NLP,在这里把文本预处理的代码做一个小总结。 2 | 3 | ### 一. 导入文本 4 | ```python 5 | d2l.DATA_HUB['time_machine'] = (d2l.DATA_URL + 'timemachine.txt','090b5e7e70c295757f55df93cb0a180b9691891a') 6 | 7 | def read_book(): 8 | with open(d2l.download('time_machine'), 'r') as f: 9 | lines = f.readlines() 10 | return [re.sub('[^A-Za-z]+', ' ', line).strip().lower() for line in lines] 11 | 12 | lines = read_book() 13 | print(lines[0]) 14 | print(lines[10]) 15 | ``` 16 | 17 | 这里使用了Dive to Learning提供的d2l库,方便导入文本数据。 18 | 19 | 在导入文本数据后,构造一个函数来读取数据中的文本行,此处为了简化数据集,将文本中除了英文字母以外的符号全部变成空格,并且将大写字母转为小写字母。 20 | 21 | ### 二. 词元化 22 | 23 | 词元化是将一个个文本行(lines)作为输入,将文本行中的词汇拆开来变成一个个词元。词元是文本的基本单位。 24 | ```python 25 | def tokenize(lines, token='word'): 26 | if (token == 'word'): 27 | return [line.split() for line in lines] 28 | elif (token == 'char'): 29 | return [list(line) for line in lines] 30 | else: 31 | print ('Error Token Type:' + token) 32 | 33 | tokens = tokenize(lines) 34 | for i in range(22): 35 | print(tokens[i]) 36 | ``` 37 | 这里构造一个tokenize函数,其输入为一个包含若干个文本行数据的列表以及一个token用作分辨词元类型。在此函数中,若token为单词(word),则使用split函数将文本行中的单词逐个拆分,然后返回一个包含若干个单词的列表;若token为字符(char),则使用list函数将文本行中的字母逐个拆分,然后返回包含若干个字母的列表;最后若输入的token无法识别则返回Error。 38 | 39 | ### 三. 
### 三. 构建词汇表

词元的数据类型为字符串,而深度学习模型要求的输入为数字,单纯用词元不符合模型的输入要求,需要将词元映射到从0开始的数字索引当中。首先将所有的文本数据合并,接着对每个唯一词元统计出现频率,统计结果被称为语料库(corpus),然后按照出现频率为每个唯一词元分配一个数字索引。很少出现的词元将被删除以降低复杂性。不存在于语料库中的词元以及已被删除的词元,都将被映射到同一个未知词元`<unk>`中。通常地,还可以人为地增加一个列表,用于保存那些被保留的词元,例如序列开始词元表示一个句子的开始,序列结束词元表示一个句子的结束。
```python
class Vocab:
    def __init__(self, tokens=None, mini_freq=0, reserved_token=None):
        """文本词汇表"""
        if(tokens is None):
            tokens = [ ]
        if(reserved_token is None):
            reserved_token = [ ]
        counter = corpus_counter(tokens) #计算词元频率构造语料库
        self.token_freq = sorted(counter.items(), key=lambda x:x[1], reverse=True) #将词元按照出现频率从高到低排列

        self.unk, uniq_tokens = 0, ['<unk>'] + reserved_token #构造一个存放唯一词元的列表,0号索引固定留给未知词元<unk>
        #对于语料库中出现频率满足设定的最小频率、且尚未放入列表的词元,逐个将其放入列表中。
        uniq_tokens += [token for token, freq in self.token_freq if freq >= mini_freq and token not in uniq_tokens]
        self.token_to_idx = dict() #给定词元返回数字索引
        self.idx_to_token = [ ] #给定数字索引返回词元
        #将数字索引和列表中的词元一一对应
        for token in uniq_tokens:
            self.idx_to_token.append(token)
            self.token_to_idx[token] = len(self.idx_to_token) - 1

    def __len__(self):
        """返回储存词元列表的长度"""
        return len(self.idx_to_token)

    def __getitem__(self, tokens):
        """输入一个词元,返回一个数字索引"""
        if not isinstance(tokens, (list, tuple)):
            return self.token_to_idx.get(tokens, self.unk)
        return [self.__getitem__(token) for token in tokens]

    def to_token(self, indices):
        """输入一个数字索引,返回一个词元"""
        if not isinstance(indices, (list, tuple)):
            return self.idx_to_token[indices]
        return [self.to_token(idx) for idx in indices]

def corpus_counter(tokens):
    """统计词频"""
    if (len(tokens)==0 or isinstance(tokens[0], list)):
        #先将二维的词元列表展平成一维,再统计每个词元的出现频率
        tokens = [token for line in tokens for token in line]
    return collections.Counter(tokens)
```
在这部分,需要构造一个Vocab类用来处理词元和索引之间的关系,以及一个corpus_counter函数来统计词元的出现频率以构造语料库。首先,构造corpus_counter函数来建立语料库:对于文本数据当中的每一个词元,使用collections中的Counter()进行统计,最后返回一个词元到出现频率的计数器。

接着构造Vocab类,此类中包含三类函数。第一类函数__init__()用来定义和初始化变量:

1. 输入的变量应该是一个由多个词元组成的tokens列表,并且设置一个mini_freq作为词元出现的最小频率(用于过滤出现频率太低的词元),以及一个用来储存保留词元的reserved_token。
2. 先使用corpus_counter函数构造一个语料库来储存词元出现频率,接着对词元按出现频率从高到低排序。
3. 然后声明一个unk变量,初始值为0,作为未知词元(不在语料库中的词元和已删除的词元)对应的索引。之后声明一个列表uniq_tokens来储存所有唯一词元(包括未知词元`<unk>`和保留词元reserved_token)。
4. 接着,对于语料库中出现频率满足设定最小频率、且尚未放入列表的词元,逐个将其放入uniq_tokens中。
5. 下一步声明两个变量用于词元和数字索引之间的转化。
6. 最后一步,把词元逐个放入idx_to_token中,用于给定数字索引时返回对应词元;同时将词元对应的数字索引放入token_to_idx中,用于给定词元时返回数字索引。

第二类函数__len__()用来返回储存词元列表的长度。

第三类函数包含两个函数__getitem__()和to_token():__getitem__()给定一个词元返回对应的数字索引;to_token()给定一个数字索引返回对应的词元(注意idx_to_token是列表,直接按下标取值即可)。

### 四. 加入真实数据集

这一步,使用之前导入的时光机器数据来构造词汇表,并且打印部分高频词元。
```python
vocab = Vocab(tokens)
print(list(vocab.token_to_idx.items())[:10])
```
运行结果:

(运行结果截图见仓库中的 图片/文本预处理/result1.png)

### 五. 将文本行转化为数字索引列表
```python
for i in [0, 10]:
    print('word:',tokens[i])
    print('index:',vocab[tokens[i]])
```

运行结果:

(运行结果截图见仓库中的 图片/文本预处理/result2.png)

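为了直观理解Vocab的用法,下面补充一个极简的往返示例(toy_tokens是虚构的演示数据,与时光机器数据集无关):
```python
toy_tokens = [['the', 'time', 'machine'], ['the', 'time']]
toy_vocab = Vocab(toy_tokens)
print(toy_vocab['the'])            # 1:出现频率最高的词元索引最小,0号固定是<unk>
print(toy_vocab.to_token([1, 2]))  # ['the', 'time']:to_token与__getitem__互为逆操作
print(toy_vocab['robot'])          # 0:不在语料库中的词元映射为<unk>
```
### 六. 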
整合所有功能 125 | 现在将之前的所有功能合并到一个函数load_corpus_time_machine()当中,此函数最终返回一个词元的索引列表corpus和一个词汇表vocabu。 126 | ```python 127 | def load_corpus_time_machine(max_token=-1): 128 | lines = read_book() #导入文本数据 129 | tokens = tokenize(lines, 'char') #拆分文本数据转为词元 130 | vocabu = Vocab(tokens) #构造词汇表 131 | corpus = [vocabu[token] for line in tokens for token in line] #得到词元索引列表 132 | 133 | if (max_token > 0): 134 | corpus = corpus[:max_token] #按设置好的数量提取需要用来训练的词元 135 | return vocabu, corpus #返回词汇表以及数字索引列表 136 | 137 | vocabu, corpus = load_corpus_time_machine() 138 | len(vocabu), len(corpus) 139 | ``` 140 | 需要注意的是: 141 | 142 | 1. 为了简化训练,这里使用字符(而不是单词)实现文本词元化; 143 | 2. 时光机器数据集中的每个文本行不一定是一个句子或一个段落,还可能是一个单词,因此返回的corpus仅处理为单个列表,而不是使用多词元列表构成的一个列表。 144 | -------------------------------------------------------------------------------- /李沐DeepLearning/文本预处理/文本预处理.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "id": "75971f1d", 7 | "metadata": {}, 8 | "outputs": [], 9 | "source": [ 10 | "from d2l import torch as d2l\n", 11 | "import collections\n", 12 | "import re" 13 | ] 14 | }, 15 | { 16 | "cell_type": "markdown", 17 | "id": "ee3e072f", 18 | "metadata": {}, 19 | "source": [ 20 | "导入一本书的数据集并且转化为一系列的文本" 21 | ] 22 | }, 23 | { 24 | "cell_type": "code", 25 | "execution_count": 2, 26 | "id": "d2026236", 27 | "metadata": {}, 28 | "outputs": [ 29 | { 30 | "name": "stdout", 31 | "output_type": "stream", 32 | "text": [ 33 | "the time machine by h g wells\n", 34 | "twinkled and his usually pale face was flushed and animated the\n" 35 | ] 36 | } 37 | ], 38 | "source": [ 39 | "d2l.DATA_HUB['time_machine'] = (d2l.DATA_URL + 'timemachine.txt','090b5e7e70c295757f55df93cb0a180b9691891a')\n", 40 | "\n", 41 | "def read_book():\n", 42 | " with open(d2l.download('time_machine'), 'r') as f:\n", 43 | " lines = f.readlines()\n", 44 | " return [re.sub('[^A-Za-z]+', ' ', line).strip().lower() for line in lines]\n", 45 | "\n", 46 | "lines = read_book()\n", 47 | "print(lines[0])\n", 48 | "print(lines[10])" 49 | ] 50 | }, 51 | { 52 | "cell_type": "markdown", 53 | "id": "98137cd1", 54 | "metadata": {}, 55 | "source": [ 56 | "词元化:tokenize函数将文本行列表(lines)作为输入,此列表中的元素为一个个文本序列,tokenize函数将每个文本序列拆开成为一个个词元(token),词元是文本的基本单位,最后函数会返回一个由词元构成的列表(list)。" 57 | ] 58 | }, 59 | { 60 | "cell_type": "code", 61 | "execution_count": 3, 62 | "id": "5ab85a7a", 63 | "metadata": {}, 64 | "outputs": [ 65 | { 66 | "name": "stdout", 67 | "output_type": "stream", 68 | "text": [ 69 | "['the', 'time', 'machine', 'by', 'h', 'g', 'wells']\n", 70 | "[]\n", 71 | "[]\n", 72 | "[]\n", 73 | "[]\n", 74 | "['i']\n", 75 | "[]\n", 76 | "[]\n", 77 | "['the', 'time', 'traveller', 'for', 'so', 'it', 'will', 'be', 'convenient', 'to', 'speak', 'of', 'him']\n", 78 | "['was', 'expounding', 'a', 'recondite', 'matter', 'to', 'us', 'his', 'grey', 'eyes', 'shone', 'and']\n", 79 | "['twinkled', 'and', 'his', 'usually', 'pale', 'face', 'was', 'flushed', 'and', 'animated', 'the']\n", 80 | "['fire', 'burned', 'brightly', 'and', 'the', 'soft', 'radiance', 'of', 'the', 'incandescent']\n", 81 | "['lights', 'in', 'the', 'lilies', 'of', 'silver', 'caught', 'the', 'bubbles', 'that', 'flashed', 'and']\n", 82 | "['passed', 'in', 'our', 'glasses', 'our', 'chairs', 'being', 'his', 'patents', 'embraced', 'and']\n", 83 | "['caressed', 'us', 'rather', 'than', 'submitted', 'to', 'be', 'sat', 'upon', 'and', 'there', 'was', 'that']\n", 84 | "['luxurious', 'after', 'dinner', 'atmosphere', 'when', 
85 |       "['free', 'of', 'the', 'trammels', 'of', 'precision', 'and', 'he', 'put', 'it', 'to', 'us', 'in', 'this']\n",
86 |       "['way', 'marking', 'the', 'points', 'with', 'a', 'lean', 'forefinger', 'as', 'we', 'sat', 'and', 'lazily']\n",
87 |       "['admired', 'his', 'earnestness', 'over', 'this', 'new', 'paradox', 'as', 'we', 'thought', 'it']\n",
88 |       "['and', 'his', 'fecundity']\n",
89 |       "[]\n",
90 |       "['you', 'must', 'follow', 'me', 'carefully', 'i', 'shall', 'have', 'to', 'controvert', 'one', 'or', 'two']\n"
91 |      ]
92 |     }
93 |    ],
94 |    "source": [
95 |     "def tokenize(lines, token='word'):\n",
96 |     "    if (token == 'word'):\n",
97 |     "        return [line.split() for line in lines]\n",
98 |     "    elif (token == 'char'):\n",
99 |     "        return [list(line) for line in lines]\n",
100 |     "    else:\n",
101 |     "        print ('Error Token Type:' + token)\n",
102 |     "\n",
103 |     "tokens = tokenize(lines)\n",
104 |     "for i in range(22):\n",
105 |     "    print(tokens[i])"
106 |    ]
107 |   },
108 |   {
109 |    "cell_type": "markdown",
110 |    "id": "f501572a",
111 |    "metadata": {},
112 |    "source": [
113 |     "构建词汇表类:词元的类型为字符串,而模型需要的输入为数字,因此单纯的词元并不适合输入模型进行训练,需要将词元映射到从0开始的数字索引当中。首先需要先将所有文本合并到一起,接着对每个唯一的词元的出现频率进行统计,统计结果被称为语料库(corpus),然后为每个唯一词元的出现频率分配一个数字索引。很少出现的词元将被删除以降低复杂性。并且对于不存在语料库中的词元或者已经删除的词元都将被映射到一个未知词元中。通常地,可以人为地增加一个列表,用于保存那些被保留的词元,例如序列开始词元表示一个句子的开始,序列结束词元表示一个句子的结束。"
114 |    ]
115 |   },
116 |   {
117 |    "cell_type": "code",
118 |    "execution_count": 18,
119 |    "id": "ab738886",
120 |    "metadata": {},
121 |    "outputs": [],
122 |    "source": [
123 |     "class Vocab:\n",
124 |     "    def __init__(self, tokens=None, mini_freq=0, reserved_token=None):\n",
125 |     "        \"\"\"文本词汇表\"\"\"\n",
126 |     "        if tokens is None:\n",
127 |     "            tokens = []\n",
128 |     "        if reserved_token is None:\n",
129 |     "            reserved_token = []\n",
130 |     "        counter = corpus_counter(tokens) #计算词元频率构造语料库\n",
131 |     "        self.token_freq = sorted(counter.items(), key=lambda x:x[1], reverse=True) #将词元按照出现频率从高到低排列\n",
132 |     "        \n",
133 |     "        self.unk, uniq_tokens = 0, ['<unk>'] + reserved_token #索引0保留给未知词元,列表中先放入未知词元和保留词元\n",
134 |     "        #对于语料库中出现频率满足设定的最小频率的词元以及不在列表中的词元,逐个将这些满足条件的词元放入列表中。\n",
135 |     "        uniq_tokens += [token for token, freq in self.token_freq if freq >= mini_freq and token not in uniq_tokens] \n",
136 |     "        self.token_to_idx = dict() #给定词元返回数字索引\n",
137 |     "        self.idx_to_token = [] #给定数字索引返回词元\n",
138 |     "        #将数字索引和词元一一对应\n",
139 |     "        for token in uniq_tokens:\n",
140 |     "            self.idx_to_token.append(token)\n",
141 |     "            self.token_to_idx[token] = len(self.idx_to_token) - 1\n",
142 |     "    \n",
143 |     "    def __len__(self):\n",
144 |     "        \"\"\"返回储存词元字典的长度\"\"\"\n",
145 |     "        return len(self.idx_to_token) \n",
146 |     "    \n",
147 |     "    def __getitem__(self, tokens):\n",
148 |     "        \"\"\"输入一个词元,返回一个数字索引\"\"\"\n",
149 |     "        if not isinstance(tokens, (list, tuple)):\n",
150 |     "            return self.token_to_idx.get(tokens, self.unk)\n",
151 |     "        return [self.__getitem__(token) for token in tokens]\n",
152 |     "    \n",
153 |     "    def to_token(self, indices):\n",
154 |     "        \"\"\"输入一个数字索引,返回一个词元\"\"\"\n",
155 |     "        if not isinstance(indices, (list, tuple)):\n",
156 |     "            return self.idx_to_token[indices]\n",
157 |     "        return [self.to_token(idx) for idx in indices]\n",
158 |     "    \n",
159 |     "def corpus_counter(tokens):\n",
160 |     "    \"\"\"统计词频\"\"\"\n",
161 |     "    if (len(tokens)==0 or isinstance(tokens[0], list)): \n",
162 |     "        #将多行词元展平成一个词元列表以统计词元的出现频率\n",
163 |     "        tokens = [token for line in tokens for token in line]\n",
164 |     "    return collections.Counter(tokens)"
165 |    ]
166 |   },
167 |   {
168 |    "cell_type": "code",
169 |    "execution_count": 14,
170 | 
"id": "d71fa711", 171 | "metadata": {}, 172 | "outputs": [ 173 | { 174 | "name": "stdout", 175 | "output_type": "stream", 176 | "text": [ 177 | "[('', 0), ('the', 1), ('i', 2), ('and', 3), ('of', 4), ('a', 5), ('to', 6), ('was', 7), ('in', 8), ('that', 9)]\n" 178 | ] 179 | } 180 | ], 181 | "source": [ 182 | "vocab = Vocab(tokens)\n", 183 | "print(list(vocab.token_to_idx.items())[:10])" 184 | ] 185 | }, 186 | { 187 | "cell_type": "code", 188 | "execution_count": 16, 189 | "id": "0fa274dc", 190 | "metadata": {}, 191 | "outputs": [ 192 | { 193 | "data": { 194 | "text/plain": [ 195 | "<__main__.Vocab at 0x17b93959b50>" 196 | ] 197 | }, 198 | "execution_count": 16, 199 | "metadata": {}, 200 | "output_type": "execute_result" 201 | } 202 | ], 203 | "source": [ 204 | "vocab" 205 | ] 206 | }, 207 | { 208 | "cell_type": "markdown", 209 | "id": "31356f99", 210 | "metadata": {}, 211 | "source": [ 212 | "将文本行转为数字索引列表" 213 | ] 214 | }, 215 | { 216 | "cell_type": "code", 217 | "execution_count": 15, 218 | "id": "ad02d78c", 219 | "metadata": {}, 220 | "outputs": [ 221 | { 222 | "name": "stdout", 223 | "output_type": "stream", 224 | "text": [ 225 | "word: ['the', 'time', 'machine', 'by', 'h', 'g', 'wells']\n", 226 | "index: [1, 19, 50, 40, 2183, 2184, 400]\n", 227 | "word: ['twinkled', 'and', 'his', 'usually', 'pale', 'face', 'was', 'flushed', 'and', 'animated', 'the']\n", 228 | "index: [2186, 3, 25, 1044, 362, 113, 7, 1421, 3, 1045, 1]\n" 229 | ] 230 | } 231 | ], 232 | "source": [ 233 | "for i in [0, 10]:\n", 234 | " print('word:',tokens[i])\n", 235 | " print('index:',vocab[tokens[i]])" 236 | ] 237 | }, 238 | { 239 | "cell_type": "code", 240 | "execution_count": 20, 241 | "id": "5d8f1f8b", 242 | "metadata": {}, 243 | "outputs": [ 244 | { 245 | "data": { 246 | "text/plain": [ 247 | "(28, 170580)" 248 | ] 249 | }, 250 | "execution_count": 20, 251 | "metadata": {}, 252 | "output_type": "execute_result" 253 | } 254 | ], 255 | "source": [ 256 | "def load_corpus_time_machine(max_token=-1):\n", 257 | " lines = read_book()\n", 258 | " tokens = tokenize(lines, 'char')\n", 259 | " vocabu = Vocab(tokens)\n", 260 | " corpus = [vocabu[token] for line in tokens for token in line]\n", 261 | " \n", 262 | " if (max_token > 0):\n", 263 | " corpus = corpus[:max_token]\n", 264 | " return vocabu, corpus\n", 265 | "\n", 266 | "vocabu, corpus = load_corpus_time_machine()\n", 267 | "len(vocabu), len(corpus)" 268 | ] 269 | }, 270 | { 271 | "cell_type": "code", 272 | "execution_count": null, 273 | "id": "c896d67f", 274 | "metadata": {}, 275 | "outputs": [], 276 | "source": [] 277 | } 278 | ], 279 | "metadata": { 280 | "kernelspec": { 281 | "display_name": "Python [conda env:torch] *", 282 | "language": "python", 283 | "name": "conda-env-torch-py" 284 | }, 285 | "language_info": { 286 | "codemirror_mode": { 287 | "name": "ipython", 288 | "version": 3 289 | }, 290 | "file_extension": ".py", 291 | "mimetype": "text/x-python", 292 | "name": "python", 293 | "nbconvert_exporter": "python", 294 | "pygments_lexer": "ipython3", 295 | "version": "3.8.16" 296 | } 297 | }, 298 | "nbformat": 4, 299 | "nbformat_minor": 5 300 | } 301 | -------------------------------------------------------------------------------- /李沐DeepLearning/机器翻译数据集.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "id": "527b12d7", 7 | "metadata": {}, 8 | "outputs": [], 9 | "source": [ 10 | "import torch\n", 11 | "import os\n", 12 | "from d2l import 
torch as d2l" 13 | ] 14 | }, 15 | { 16 | "cell_type": "markdown", 17 | "id": "cb6703e5", 18 | "metadata": {}, 19 | "source": [ 20 | "下载数据集,数据集中的每一行都是制表符分隔的文本序列对, 序列对由英文文本序列和翻译后的法语文本序列组成。 请注意,每个文本序列可以是一个句子, 也可以是包含多个句子的一个段落。 在这个将英语翻译成法语的机器翻译问题中, 英语是源语言(source language), 法语是目标语言(target language)。" 21 | ] 22 | }, 23 | { 24 | "cell_type": "code", 25 | "execution_count": 2, 26 | "id": "23f592b7", 27 | "metadata": {}, 28 | "outputs": [ 29 | { 30 | "name": "stdout", 31 | "output_type": "stream", 32 | "text": [ 33 | "Go.\tVa !\n", 34 | "Hi.\tSalut !\n", 35 | "Run!\tCours !\n", 36 | "Run!\tCourez !\n", 37 | "Who?\tQui ?\n", 38 | "Wow!\tÇa alors !\n", 39 | "\n" 40 | ] 41 | } 42 | ], 43 | "source": [ 44 | "d2l.DATA_HUB['fra-eng'] = (d2l.DATA_URL + 'fra-eng.zip', '94646ad1522d915e7b0f9296181140edcf86a4f5')\n", 45 | "\n", 46 | "def read_data():\n", 47 | " data_dir = d2l.download_extract('fra-eng')\n", 48 | " with open(os.path.join(data_dir, 'fra.txt'), 'r', encoding='utf-8') as f:\n", 49 | " return f.read()\n", 50 | "\n", 51 | "raw_txt = read_data()\n", 52 | "print(raw_txt[:75])" 53 | ] 54 | }, 55 | { 56 | "cell_type": "markdown", 57 | "id": "38ecbb95", 58 | "metadata": {}, 59 | "source": [ 60 | "对数据集进行预处理" 61 | ] 62 | }, 63 | { 64 | "cell_type": "code", 65 | "execution_count": 3, 66 | "id": "87ae94ce", 67 | "metadata": {}, 68 | "outputs": [ 69 | { 70 | "name": "stdout", 71 | "output_type": "stream", 72 | "text": [ 73 | "go .\tva !\n", 74 | "hi .\tsalut !\n", 75 | "run !\tcours !\n", 76 | "run !\tcourez !\n", 77 | "who ?\tqui ?\n", 78 | "wow !\tça alors !\n" 79 | ] 80 | } 81 | ], 82 | "source": [ 83 | "def preprocess(text):\n", 84 | " # 将标点符号分离出来\n", 85 | " def no_space(char, prev_char):\n", 86 | " return char in set (',.!?') and prev_char != ' '\n", 87 | " \n", 88 | " # 将不连续空格替换成空格,将大写换成小写\n", 89 | " text = text.replace('\\u202f', ' ').replace('\\xa0', ' ').lower()\n", 90 | " \n", 91 | " # 在标点符号前面添加空格\n", 92 | " out = [' '+ char if i>0 and no_space(char, text[i-1]) else char for i, char in enumerate(text)]\n", 93 | " return ''.join(out)\n", 94 | "\n", 95 | "text = preprocess(raw_txt)\n", 96 | "print(text[:80])" 97 | ] 98 | }, 99 | { 100 | "cell_type": "markdown", 101 | "id": "fec7663a", 102 | "metadata": {}, 103 | "source": [ 104 | "词元化" 105 | ] 106 | }, 107 | { 108 | "cell_type": "code", 109 | "execution_count": 4, 110 | "id": "ce82cedc", 111 | "metadata": {}, 112 | "outputs": [ 113 | { 114 | "data": { 115 | "text/plain": [ 116 | "([['go', '.'],\n", 117 | " ['hi', '.'],\n", 118 | " ['run', '!'],\n", 119 | " ['run', '!'],\n", 120 | " ['who', '?'],\n", 121 | " ['wow', '!']],\n", 122 | " [['va', '!'],\n", 123 | " ['salut', '!'],\n", 124 | " ['cours', '!'],\n", 125 | " ['courez', '!'],\n", 126 | " ['qui', '?'],\n", 127 | " ['ça', 'alors', '!']])" 128 | ] 129 | }, 130 | "execution_count": 4, 131 | "metadata": {}, 132 | "output_type": "execute_result" 133 | } 134 | ], 135 | "source": [ 136 | "def tokenize(text, num_examples=None):\n", 137 | " # 构造源词元表和目标词元表\n", 138 | " source, target = [], []\n", 139 | " # 先按照行进行分离\n", 140 | " for i, line in enumerate(text.split('\\n')):\n", 141 | " # 判断是否超出范围\n", 142 | " if num_examples and i>num_examples:\n", 143 | " break\n", 144 | " # 按分隔间距进行分词\n", 145 | " parts = line.split('\\t')\n", 146 | " # 若长度为2则说明里面仅包含一个英语词和一个法语词,按对压入源词元表和目标词元表\n", 147 | " if len(parts) == 2:\n", 148 | " source.append(parts[0].split(' '))\n", 149 | " target.append(parts[1].split(' '))\n", 150 | " return source, target\n", 151 | "\n", 152 | "src, tar = tokenize(text)\n", 153 | "src[:6], tar[:6]" 
154 |      ]
155 |     },
156 |     {
157 |      "cell_type": "markdown",
158 |      "id": "326564a7",
159 |      "metadata": {},
160 |      "source": [
161 |       "画出每个文本序列含有多少词元"
162 |      ]
163 |     },
164 |     {
165 |      "cell_type": "code",
166 |      "execution_count": 7,
167 |      "id": "dd50052b",
168 |      "metadata": {},
169 |      "outputs": [
170 |       {
171 |        "data": {
172 |         "image/svg+xml": [
[lines 173-1148: inline SVG markup of the Matplotlib output (the tokens-per-sequence histogram) omitted]
1149 |         ],
1150 |         "text/plain": [
1151 |          "<Figure size 350x250 with 1 Axes>"
1152 |         ]
1153 |        },
1154 |        "metadata": {},
1155 |        "output_type": "display_data"
1156 |       }
1157 |      ],
1158 |      "source": [
1159 |       "def show_tokens_per_seq(legend, xlabel, ylabel, xlist, ylist):\n",
1160 |       "    d2l.set_figsize()\n",
1161 |       "    _, _, patches = d2l.plt.hist([[len(l) for l in xlist], [len(l) for l in ylist]])\n",
1162 |       "    d2l.plt.xlabel(xlabel)\n",
1163 |       "    d2l.plt.ylabel(ylabel)\n",
1164 |       "    for patch in patches[1].patches:\n",
1165 |       "        patch.set_hatch('/')\n",
1166 |       "    d2l.plt.legend(legend)\n",
1167 |       "\n",
1168 |       "show_tokens_per_seq(['source', 'target'], 'tokens per sequence', 'count', src, tar)"
1169 |      ]
1170 |     },
1171 |     {
1172 |      "cell_type": "markdown",
1173 |      "id": "45b138e9",
1174 |      "metadata": {},
1175 |      "source": [
1176 |       "构造一个源语言的词汇表"
1177 |      ]
1178 |     },
1179 |     {
1180 |      "cell_type": "code",
1181 |      "execution_count": 41,
1182 |      "id": "b0d35122",
1183 |      "metadata": {},
1184 |      "outputs": [
1185 |       {
1186 |        "data": {
1187 |         "text/plain": [
1188 |          "10012"
1189 |         ]
1190 |        },
1191 |        "execution_count": 41,
1192 |        "metadata": {},
1193 |        "output_type": "execute_result"
1194 |       }
1195 |      ],
1196 |      "source": [
1197 |       "src_vocab = d2l.Vocab(src, min_freq=2, reserved_tokens=['<pad>', '<bos>', '<eos>'])\n",
1198 |       "len(src_vocab)"
1199 |      ]
1200 |     },
1201 |     {
1202 |      "cell_type": "markdown",
1203 |      "id": "77911d0c",
1204 |      "metadata": {},
1205 |      "source": [
1206 |       "构造一个函数能在序列中制造小批量数据:给定一个时间步长num_steps,一个小批量数据内的每个序列的长度都应该为num_steps。若一个序列的长度大于num_steps,就应该截取前num_steps个词元组成序列;若一个序列的长度小于num_steps,则应该在其尾部补充填充词元直至满足num_steps的长度。"
1207 |      ]
1208 |     },
1209 |     {
1210 |      "cell_type": "code",
1211 |      "execution_count": 37,
1212 |      "id": "ccac8888",
1213 |      "metadata": {},
1214 |      "outputs": [
1215 |       {
1216 |        "data": {
1217 |         "text/plain": [
1218 |          "[47, 4, 1, 1, 1, 1, 1, 1, 1, 1]"
1219 |         ]
1220 |        },
1221 |        "execution_count": 37,
1222 |        "metadata": {},
1223 |        "output_type": "execute_result"
1224 |       }
1225 |      ],
1226 |      "source": [
1227 |       "def truncate_pad(line, num_steps, padding_token):\n",
1228 |       "    if len(line) > num_steps:\n",
1229 |       "        line = line[:num_steps]\n",
1230 |       "    else:\n",
1231 |       "        line = line + [padding_token] * (num_steps - len(line))\n",
1232 |       "    return line\n",
1233 |       "\n",
1234 |       "truncate_pad(src_vocab[src[0]], 10, src_vocab['<pad>'])"
1235 |      ]
1236 |     },
1237 |     {
1238 |      "cell_type": "code",
1239 |      "execution_count": 43,
1240 |      "id": "d8850cc2",
1241 |      "metadata": {},
1242 |      "outputs": [],
1243 |      "source": [
1244 |       "def array_build(lines, vocab, num_steps):\n",
1245 |       "    lines = [vocab[l] for l in lines]\n",
1246 |       "    lines = [l + [vocab['<eos>']] for l in lines]\n",
1247 |       "    array = torch.tensor([truncate_pad(l, num_steps, vocab['<pad>']) for l in lines])\n",
1248 |       "    valid_len = (array != vocab['<pad>']).type(torch.int32).sum(1)\n",
1249 |       "    return array, valid_len"
1250 |      ]
1251 |     },
1252 |     {
1253 |      "cell_type": "code",
1254 |      "execution_count": 44,
1255 |      "id": "2c25c5c3",
1256 |      "metadata": {},
1257 |      "outputs": [],
1258 |      "source": [
1259 |       "def load_data(batch_size, num_steps, num_examples=600):\n",
1260 |       "    # 文本序列预处理\n",
1261 |       "    text = preprocess(read_data())\n",
1262 |       "    # 词元化\n",
1263 |       "    source, target = tokenize(text, num_examples)\n",
1264 |       "    src_vocab = d2l.Vocab(source, min_freq=2, reserved_tokens=['<pad>', '<bos>', '<eos>'])\n",
1265 |       "    tar_vocab = d2l.Vocab(target, min_freq=2, reserved_tokens=['<pad>', '<bos>', '<eos>'])\n",
1266 |       "    src_array, src_valid_len = array_build(source, src_vocab, num_steps)\n",
1267 |       "    tar_array, tar_valid_len = array_build(target, tar_vocab, num_steps)\n",
1268 |       "    data_array = (src_array, src_valid_len,
tar_array, tar_valid_len)\n", 1269 | " data_iter = d2l.load_array(data_array, batch_size)\n", 1270 | " return src_vocab, tar_vocab, data_iter" 1271 | ] 1272 | }, 1273 | { 1274 | "cell_type": "code", 1275 | "execution_count": 45, 1276 | "id": "c3dcd1a0", 1277 | "metadata": { 1278 | "scrolled": true 1279 | }, 1280 | "outputs": [ 1281 | { 1282 | "name": "stdout", 1283 | "output_type": "stream", 1284 | "text": [ 1285 | "X: tensor([[10, 73, 4, 3, 1, 1, 1, 1],\n", 1286 | " [14, 27, 4, 3, 1, 1, 1, 1]], dtype=torch.int32)\n", 1287 | "X的有效长度: tensor([4, 4])\n", 1288 | "Y: tensor([[ 8, 0, 4, 3, 1, 1, 1, 1],\n", 1289 | " [26, 58, 5, 3, 1, 1, 1, 1]], dtype=torch.int32)\n", 1290 | "Y的有效长度: tensor([4, 4])\n" 1291 | ] 1292 | } 1293 | ], 1294 | "source": [ 1295 | "src_vocab, tar_vocab, train_iter = load_data(batch_size=2, num_steps=8)\n", 1296 | "for X, X_valid_len, Y, Y_valid_len in train_iter:\n", 1297 | " print('X:', X.type(torch.int32))\n", 1298 | " print('X的有效长度:', X_valid_len)\n", 1299 | " print('Y:', Y.type(torch.int32))\n", 1300 | " print('Y的有效长度:', Y_valid_len)\n", 1301 | " break" 1302 | ] 1303 | }, 1304 | { 1305 | "cell_type": "code", 1306 | "execution_count": null, 1307 | "id": "80d01070", 1308 | "metadata": {}, 1309 | "outputs": [], 1310 | "source": [] 1311 | } 1312 | ], 1313 | "metadata": { 1314 | "kernelspec": { 1315 | "display_name": "Python [conda env:torch] *", 1316 | "language": "python", 1317 | "name": "conda-env-torch-py" 1318 | }, 1319 | "language_info": { 1320 | "codemirror_mode": { 1321 | "name": "ipython", 1322 | "version": 3 1323 | }, 1324 | "file_extension": ".py", 1325 | "mimetype": "text/x-python", 1326 | "name": "python", 1327 | "nbconvert_exporter": "python", 1328 | "pygments_lexer": "ipython3", 1329 | "version": "3.8.16" 1330 | } 1331 | }, 1332 | "nbformat": 4, 1333 | "nbformat_minor": 5 1334 | } 1335 | -------------------------------------------------------------------------------- /李沐DeepLearning/线性回归.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 2, 6 | "id": "232a5455", 7 | "metadata": {}, 8 | "outputs": [], 9 | "source": [ 10 | "%matplotlib inline\n", 11 | "import torch\n", 12 | "import random" 13 | ] 14 | }, 15 | { 16 | "cell_type": "markdown", 17 | "id": "81fe2d86", 18 | "metadata": {}, 19 | "source": [ 20 | "首先构造参数为w和b以及带有一个噪声项$\\epsilon$的人造数据集**y** = **X** **w** + b + $\\epsilon$" 21 | ] 22 | }, 23 | { 24 | "cell_type": "code", 25 | "execution_count": 7, 26 | "id": "6fe5f7df", 27 | "metadata": {}, 28 | "outputs": [], 29 | "source": [ 30 | "def synthentic_data(w, b, num_samples):\n", 31 | " X = torch.normal(0, 1, (num_samples, len(w))) ##X为均值为0方差为1的随机数,数量和参数w的数量一致\n", 32 | " y = torch.matmul(X, w) + b \n", 33 | " y += torch.normal(0, 0.01, y.shape)\n", 34 | " return X, y.reshape(-1, 1) ##将x和y的形状都变为列向量\n", 35 | " \n", 36 | "w_true = torch.tensor([2, -3.4])\n", 37 | "b_true = 4.2\n", 38 | "\n", 39 | "features, labels = synthentic_data(w_true, b_true, 1000)" 40 | ] 41 | }, 42 | { 43 | "cell_type": "code", 44 | "execution_count": 9, 45 | "id": "1e7af7c4", 46 | "metadata": {}, 47 | "outputs": [ 48 | { 49 | "name": "stdout", 50 | "output_type": "stream", 51 | "text": [ 52 | "features: tensor([-0.8105, -0.3568]) \n", 53 | "label: tensor([3.8019])\n" 54 | ] 55 | } 56 | ], 57 | "source": [ 58 | "print('features:', features[0], '\\nlabel:', labels[0])" 59 | ] 60 | }, 61 | { 62 | "cell_type": "markdown", 63 | "id": "5cb47cac", 64 | "metadata": {}, 65 | "source": [ 
66 | "构造一个函数能够每次随机小批量地从数据集中采样" 67 | ] 68 | }, 69 | { 70 | "cell_type": "code", 71 | "execution_count": 12, 72 | "id": "56908aa6", 73 | "metadata": {}, 74 | "outputs": [ 75 | { 76 | "name": "stdout", 77 | "output_type": "stream", 78 | "text": [ 79 | "tensor([[ 0.4418, -0.2143],\n", 80 | " [-1.8064, 0.6571],\n", 81 | " [ 0.7412, 0.1856],\n", 82 | " [ 0.6559, 0.8517],\n", 83 | " [-0.6508, -0.8636],\n", 84 | " [-0.2165, 0.0563],\n", 85 | " [-0.1913, -0.6041],\n", 86 | " [-2.3457, 1.4957],\n", 87 | " [-1.1084, -1.6875],\n", 88 | " [ 0.9823, 0.6570]]) \n", 89 | " tensor([[ 5.8165],\n", 90 | " [-1.6394],\n", 91 | " [ 5.0659],\n", 92 | " [ 2.6219],\n", 93 | " [ 5.8414],\n", 94 | " [ 3.5669],\n", 95 | " [ 5.8631],\n", 96 | " [-5.5850],\n", 97 | " [ 7.7229],\n", 98 | " [ 3.9221]])\n" 99 | ] 100 | } 101 | ], 102 | "source": [ 103 | "def data_batch(batch_size, feature, label):\n", 104 | " num_example = len(feature)\n", 105 | " induice = list(range(num_example))\n", 106 | " random.shuffle(induice)\n", 107 | " \n", 108 | " for i in range(0, num_example, batch_size):\n", 109 | " batch_induice = torch.tensor(induice[i:min(i + batch_size,num_example)])\n", 110 | " yield feature[batch_induice], label[batch_induice]\n", 111 | " \n", 112 | "batch_size = 10\n", 113 | "for X, y in data_batch(batch_size, features, labels):\n", 114 | " print (X, '\\n', y)\n", 115 | " break" 116 | ] 117 | }, 118 | { 119 | "cell_type": "markdown", 120 | "id": "7d4bdc1c", 121 | "metadata": {}, 122 | "source": [ 123 | "初始化模型参数w和b" 124 | ] 125 | }, 126 | { 127 | "cell_type": "code", 128 | "execution_count": 16, 129 | "id": "f191d148", 130 | "metadata": {}, 131 | "outputs": [], 132 | "source": [ 133 | "w = torch.normal(0, 0.01, size=(2,1), requires_grad=True)\n", 134 | "b = torch.zeros(1, requires_grad=True)" 135 | ] 136 | }, 137 | { 138 | "cell_type": "markdown", 139 | "id": "bdeb2478", 140 | "metadata": {}, 141 | "source": [ 142 | "构造线性模型" 143 | ] 144 | }, 145 | { 146 | "cell_type": "code", 147 | "execution_count": 17, 148 | "id": "242d2b86", 149 | "metadata": {}, 150 | "outputs": [], 151 | "source": [ 152 | "def linear_reg(X, w, b):\n", 153 | " return torch.matmul(X, w) + b" 154 | ] 155 | }, 156 | { 157 | "cell_type": "markdown", 158 | "id": "ec33debf", 159 | "metadata": {}, 160 | "source": [ 161 | "定义损失函数" 162 | ] 163 | }, 164 | { 165 | "cell_type": "code", 166 | "execution_count": 21, 167 | "id": "afd4dbad", 168 | "metadata": {}, 169 | "outputs": [], 170 | "source": [ 171 | "##损失函数为MSE\n", 172 | "def loss_func(y, y_hat):\n", 173 | " return (y_hat - y.reshape(y_hat.shape)) ** 2 / 2" 174 | ] 175 | }, 176 | { 177 | "cell_type": "markdown", 178 | "id": "d07ee339", 179 | "metadata": {}, 180 | "source": [ 181 | "定义优化器" 182 | ] 183 | }, 184 | { 185 | "cell_type": "code", 186 | "execution_count": 19, 187 | "id": "4fb8dbb2", 188 | "metadata": {}, 189 | "outputs": [], 190 | "source": [ 191 | "##优化器为随机梯度下降SGD\n", 192 | "def SGD(params, learning_rate, batch_size):\n", 193 | " with torch.no_grad():\n", 194 | " for param in params:\n", 195 | " param -= learning_rate * param.grad / batch_size\n", 196 | " param.grad.zero_()" 197 | ] 198 | }, 199 | { 200 | "cell_type": "code", 201 | "execution_count": 22, 202 | "id": "84c5931e", 203 | "metadata": {}, 204 | "outputs": [ 205 | { 206 | "name": "stdout", 207 | "output_type": "stream", 208 | "text": [ 209 | "epoch 1, loss 0.055980\n", 210 | "epoch 2, loss 0.000252\n", 211 | "epoch 3, loss 0.000050\n", 212 | "epoch 4, loss 0.000049\n", 213 | "epoch 5, loss 0.000049\n", 214 | "epoch 6, loss 0.000049\n", 
215 | "epoch 7, loss 0.000049\n", 216 | "epoch 8, loss 0.000049\n", 217 | "epoch 9, loss 0.000049\n", 218 | "epoch 10, loss 0.000049\n" 219 | ] 220 | } 221 | ], 222 | "source": [ 223 | "learning_rate = 0.03\n", 224 | "net = linear_reg\n", 225 | "loss = loss_func\n", 226 | "epochs = 10\n", 227 | "\n", 228 | "for epoch in range(epochs):\n", 229 | " for X,y in data_batch(batch_size, features, labels):\n", 230 | " loss = loss_func(net(X, w, b), y)\n", 231 | " loss.sum().backward()\n", 232 | " SGD([w,b], learning_rate, batch_size)\n", 233 | " \n", 234 | " with torch.no_grad():\n", 235 | " train_loss = loss_func(net(features,w,b), labels)\n", 236 | " print (f'epoch { epoch + 1 }, loss {float(train_loss.mean()):f}')" 237 | ] 238 | }, 239 | { 240 | "cell_type": "markdown", 241 | "id": "b74f8bfc", 242 | "metadata": {}, 243 | "source": [ 244 | "使用pytorch设定好的内置函数实现" 245 | ] 246 | }, 247 | { 248 | "cell_type": "code", 249 | "execution_count": 27, 250 | "id": "e3152bb2", 251 | "metadata": {}, 252 | "outputs": [], 253 | "source": [ 254 | "from torch.utils import data\n", 255 | "import numpy as np" 256 | ] 257 | }, 258 | { 259 | "cell_type": "markdown", 260 | "id": "1d886f0b", 261 | "metadata": {}, 262 | "source": [ 263 | "构造自定义数据集" 264 | ] 265 | }, 266 | { 267 | "cell_type": "code", 268 | "execution_count": 32, 269 | "id": "02227ccb", 270 | "metadata": {}, 271 | "outputs": [ 272 | { 273 | "data": { 274 | "text/plain": [ 275 | "[tensor([[-1.6186, 1.1914],\n", 276 | " [-0.7390, 0.7205],\n", 277 | " [ 0.9826, 0.6103],\n", 278 | " [ 0.8132, -0.0249],\n", 279 | " [-0.4938, 0.8550],\n", 280 | " [-0.0217, 0.4927],\n", 281 | " [ 0.8233, 0.3651],\n", 282 | " [ 0.3465, -0.4650],\n", 283 | " [ 0.0432, -0.1148],\n", 284 | " [ 0.4177, 0.7377]]),\n", 285 | " tensor([[-3.0947],\n", 286 | " [ 0.2740],\n", 287 | " [ 4.0838],\n", 288 | " [ 5.9232],\n", 289 | " [ 0.3082],\n", 290 | " [ 2.4810],\n", 291 | " [ 4.6042],\n", 292 | " [ 6.4671],\n", 293 | " [ 4.6845],\n", 294 | " [ 2.5223]])]" 295 | ] 296 | }, 297 | "execution_count": 32, 298 | "metadata": {}, 299 | "output_type": "execute_result" 300 | } 301 | ], 302 | "source": [ 303 | "def Dataset(Data, batch_size, is_train=True):\n", 304 | " dataset = data.TensorDataset(*Data)\n", 305 | " dataloader = data.DataLoader(dataset, batch_size, shuffle=is_train)\n", 306 | " return dataloader\n", 307 | "\n", 308 | "batch_size = 10\n", 309 | "DataSet = Dataset((features, labels), batch_size)\n", 310 | "\n", 311 | "next(iter(DataSet))" 312 | ] 313 | }, 314 | { 315 | "cell_type": "markdown", 316 | "id": "d8c7f994", 317 | "metadata": {}, 318 | "source": [ 319 | "定义网络" 320 | ] 321 | }, 322 | { 323 | "cell_type": "code", 324 | "execution_count": 37, 325 | "id": "414e3749", 326 | "metadata": {}, 327 | "outputs": [], 328 | "source": [ 329 | "import torch.nn as nn\n", 330 | "\n", 331 | "reg = nn.Sequential(nn.Linear(2,1))" 332 | ] 333 | }, 334 | { 335 | "cell_type": "markdown", 336 | "id": "047d73cc", 337 | "metadata": {}, 338 | "source": [ 339 | "初始化权重" 340 | ] 341 | }, 342 | { 343 | "cell_type": "code", 344 | "execution_count": 35, 345 | "id": "b33bc9d8", 346 | "metadata": {}, 347 | "outputs": [ 348 | { 349 | "data": { 350 | "text/plain": [ 351 | "tensor([0.])" 352 | ] 353 | }, 354 | "execution_count": 35, 355 | "metadata": {}, 356 | "output_type": "execute_result" 357 | } 358 | ], 359 | "source": [ 360 | "net[0].weight.data.normal_(0, 0.01)\n", 361 | "net[0].bias.data.fill_(0)" 362 | ] 363 | }, 364 | { 365 | "cell_type": "markdown", 366 | "id": "ddf346e7", 367 | "metadata": {}, 368 | 
"source": [ 369 | "定义损失函数" 370 | ] 371 | }, 372 | { 373 | "cell_type": "code", 374 | "execution_count": 36, 375 | "id": "571b96fd", 376 | "metadata": {}, 377 | "outputs": [], 378 | "source": [ 379 | "Loss = nn.MSELoss()" 380 | ] 381 | }, 382 | { 383 | "cell_type": "markdown", 384 | "id": "211aa4e7", 385 | "metadata": {}, 386 | "source": [ 387 | "构造优化器" 388 | ] 389 | }, 390 | { 391 | "cell_type": "code", 392 | "execution_count": 38, 393 | "id": "79c30d42", 394 | "metadata": {}, 395 | "outputs": [], 396 | "source": [ 397 | "optimizer = torch.optim.SGD(net.parameters(), lr=0.03)" 398 | ] 399 | }, 400 | { 401 | "cell_type": "code", 402 | "execution_count": 40, 403 | "id": "9485620b", 404 | "metadata": {}, 405 | "outputs": [ 406 | { 407 | "name": "stdout", 408 | "output_type": "stream", 409 | "text": [ 410 | " epoch 1, Loss 0.000420\n", 411 | " epoch 2, Loss 0.000099\n", 412 | " epoch 3, Loss 0.000099\n" 413 | ] 414 | } 415 | ], 416 | "source": [ 417 | "num_epochs = 3\n", 418 | "\n", 419 | "for epoch in range(num_epochs):\n", 420 | " for X, y in DataSet:\n", 421 | " train_loss = Loss(net(X), y)\n", 422 | " optimizer.zero_grad()\n", 423 | " train_loss.backward()\n", 424 | " optimizer.step()\n", 425 | " train_loss = Loss(net(features), labels)\n", 426 | " print(f' epoch {epoch + 1}, Loss {train_loss:f}')" 427 | ] 428 | }, 429 | { 430 | "cell_type": "code", 431 | "execution_count": null, 432 | "id": "5da10112", 433 | "metadata": {}, 434 | "outputs": [], 435 | "source": [] 436 | } 437 | ], 438 | "metadata": { 439 | "kernelspec": { 440 | "display_name": "Python [conda env:torch] *", 441 | "language": "python", 442 | "name": "conda-env-torch-py" 443 | }, 444 | "language_info": { 445 | "codemirror_mode": { 446 | "name": "ipython", 447 | "version": 3 448 | }, 449 | "file_extension": ".py", 450 | "mimetype": "text/x-python", 451 | "name": "python", 452 | "nbconvert_exporter": "python", 453 | "pygments_lexer": "ipython3", 454 | "version": "3.8.16" 455 | } 456 | }, 457 | "nbformat": 4, 458 | "nbformat_minor": 5 459 | } 460 | -------------------------------------------------------------------------------- /李沐DeepLearning/自动求导.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "id": "d01576cb", 7 | "metadata": {}, 8 | "outputs": [], 9 | "source": [ 10 | "import torch" 11 | ] 12 | }, 13 | { 14 | "cell_type": "markdown", 15 | "id": "eed543e0", 16 | "metadata": {}, 17 | "source": [ 18 | "假设现在要对y=2**x** T **x** 中的**x**向量求导" 19 | ] 20 | }, 21 | { 22 | "cell_type": "markdown", 23 | "id": "cfbbc43d", 24 | "metadata": {}, 25 | "source": [ 26 | "先声明一个**x**向量" 27 | ] 28 | }, 29 | { 30 | "cell_type": "code", 31 | "execution_count": 3, 32 | "id": "f2e8857d", 33 | "metadata": {}, 34 | "outputs": [ 35 | { 36 | "data": { 37 | "text/plain": [ 38 | "tensor([0., 1., 2., 3.])" 39 | ] 40 | }, 41 | "execution_count": 3, 42 | "metadata": {}, 43 | "output_type": "execute_result" 44 | } 45 | ], 46 | "source": [ 47 | "x = torch.arange(4.0)\n", 48 | "x" 49 | ] 50 | }, 51 | { 52 | "cell_type": "markdown", 53 | "id": "c9c6a311", 54 | "metadata": {}, 55 | "source": [ 56 | "在计算梯度(求导)之前,需要一个地方来储存梯度" 57 | ] 58 | }, 59 | { 60 | "cell_type": "code", 61 | "execution_count": 6, 62 | "id": "ca5b3ad0", 63 | "metadata": {}, 64 | "outputs": [ 65 | { 66 | "data": { 67 | "text/plain": [ 68 | "tensor([0., 1., 2., 3.], requires_grad=True)" 69 | ] 70 | }, 71 | "execution_count": 6, 72 | "metadata": {}, 73 | "output_type": 
"execute_result" 74 | } 75 | ], 76 | "source": [ 77 | "x.requires_grad_(True)\n", 78 | "x" 79 | ] 80 | }, 81 | { 82 | "cell_type": "markdown", 83 | "id": "3dd30e14", 84 | "metadata": {}, 85 | "source": [ 86 | "计算y" 87 | ] 88 | }, 89 | { 90 | "cell_type": "code", 91 | "execution_count": 7, 92 | "id": "ac661ae9", 93 | "metadata": {}, 94 | "outputs": [ 95 | { 96 | "data": { 97 | "text/plain": [ 98 | "tensor(28., grad_fn=)" 99 | ] 100 | }, 101 | "execution_count": 7, 102 | "metadata": {}, 103 | "output_type": "execute_result" 104 | } 105 | ], 106 | "source": [ 107 | "y = 2 * torch.dot(x,x)\n", 108 | "y" 109 | ] 110 | }, 111 | { 112 | "cell_type": "markdown", 113 | "id": "c8b267e4", 114 | "metadata": {}, 115 | "source": [ 116 | "调用反向传播函数来自动计算y关于x每个分量的梯度" 117 | ] 118 | }, 119 | { 120 | "cell_type": "code", 121 | "execution_count": 8, 122 | "id": "be1c4f8b", 123 | "metadata": {}, 124 | "outputs": [ 125 | { 126 | "data": { 127 | "text/plain": [ 128 | "tensor([ 0., 4., 8., 12.])" 129 | ] 130 | }, 131 | "execution_count": 8, 132 | "metadata": {}, 133 | "output_type": "execute_result" 134 | } 135 | ], 136 | "source": [ 137 | "y.backward()\n", 138 | "x.grad" 139 | ] 140 | }, 141 | { 142 | "cell_type": "markdown", 143 | "id": "368da92c", 144 | "metadata": {}, 145 | "source": [ 146 | "pytorch自动求导时会累积梯度,因此在计算另一个x的函数时,需要对之前的梯度清零。" 147 | ] 148 | }, 149 | { 150 | "cell_type": "code", 151 | "execution_count": 9, 152 | "id": "97f73a30", 153 | "metadata": {}, 154 | "outputs": [ 155 | { 156 | "data": { 157 | "text/plain": [ 158 | "tensor([1., 1., 1., 1.])" 159 | ] 160 | }, 161 | "execution_count": 9, 162 | "metadata": {}, 163 | "output_type": "execute_result" 164 | } 165 | ], 166 | "source": [ 167 | "x.grad.zero_()\n", 168 | "y = x.sum()\n", 169 | "y.backward()\n", 170 | "x.grad" 171 | ] 172 | }, 173 | { 174 | "cell_type": "code", 175 | "execution_count": null, 176 | "id": "6d5eb437", 177 | "metadata": {}, 178 | "outputs": [], 179 | "source": [] 180 | } 181 | ], 182 | "metadata": { 183 | "kernelspec": { 184 | "display_name": "Python [conda env:torch] *", 185 | "language": "python", 186 | "name": "conda-env-torch-py" 187 | }, 188 | "language_info": { 189 | "codemirror_mode": { 190 | "name": "ipython", 191 | "version": 3 192 | }, 193 | "file_extension": ".py", 194 | "mimetype": "text/x-python", 195 | "name": "python", 196 | "nbconvert_exporter": "python", 197 | "pygments_lexer": "ipython3", 198 | "version": "3.8.16" 199 | } 200 | }, 201 | "nbformat": 4, 202 | "nbformat_minor": 5 203 | } 204 | -------------------------------------------------------------------------------- /李沐DeepLearning/语言模型/README.md: -------------------------------------------------------------------------------- 1 | ### 一. 语言模型的概念 2 | 在自然语言处理(NLP)中,语言模型是一种用来对语言进行建模的统计模型。其主要目的是计算给定一段文本序列的概率值或对下一个词或字符的预测值。 3 | 4 | 语言模型通常基于概率模型来构建,它考虑了语言的各种特征,例如语法、语义和上下文。具体来说,语言模型可以根据一定的训练数据学习到一个概率分布,该分布可以描述一个给定的文本序列中每个单词出现的概率,或者是下一个单词的预测概率。这些概率可以用来评估一个给定的文本序列是否合理,或者给出一个可能的下一个单词或短语。 5 | 6 | ### 二. 
Learning a Language Model
7 | Start from the basic rules of probability. Given a text sequence of length $T$ whose tokens are $x_1,x_2,...,x_T$ in order, $x_t$ $(1\le t \le T)$ can be regarded as the observation (or the label) of the text sequence at time step $t$. The goal of a language model is then to estimate the joint probability of the text sequence:
8 | 
9 | $P(x_1,x_2,...,x_T)=\prod_{t=1}^{T}P(x_t|x_{t-1},...,x_2,x_1)$ (1)
10 | 
11 | To train a language model, we need the probability of a single word, together with the conditional probability of a word given the several words that precede it. These probabilities are, in essence, the parameters of the language model.
12 | 
13 | For reference, the definition of conditional probability:
14 | 
15 | Conditional probability is the probability that one event occurs given that another event has already occurred. In general, the probability of an event B given that event A has occurred is written $P(B|A)$.
16 | 
17 | For example, the probability of a text sequence containing four words is:
18 | 
19 | $P(deep,learning,is,fun)=P(deep)P(learning|deep)P(is|deep,learning)P(fun|deep,learning,is)$
20 | 
21 | Next, the probability of all sequences in the text that start with "deep", written $\hat{P}(deep)$, can be estimated as:
22 | 
23 | $\hat{P}(deep)=\frac{n(deep)}{n(total)}$
24 | 
25 | Extending this to a pair of consecutive words, $\hat{P}(learning|deep)$ is estimated as:
26 | 
27 | $\hat{P}(learning|deep)=\frac{n(deep,learning)}{n(deep)}$
28 | 
29 | where $n(x)$ and $n(x,x')$ are the number of occurrences of a single word and of a pair of consecutive words, respectively. Because a pair such as "deep learning" occurs far less often in a text than either word alone, estimating the probability of such pairs accurately is quite hard, and for combinations of three or more words it is harder still.
30 | 
31 | One remedy is to add a small constant to every count for combinations of two or more words; this is known as Laplace smoothing. However, it can easily render the model ineffective: all counts have to be stored, which is computationally expensive; the method ignores the meaning of the words, relying entirely on counts, so it cannot adjust to context; and many multi-word combinations never appear in the training text, so a model that only tallies what it has already seen cannot correctly predict unseen long combinations.
32 | 
33 | The Markov model introduced below therefore addresses the problem of computing probabilities for long word sequences.
34 | 
35 | ### 三. Markov Models and n-grams
36 | A Markov model is a probabilistic model based on a Markov process. It describes a random process in which each state transition follows a probability distribution conditioned on the previous state: the current state depends only on the state immediately before it and is independent of everything earlier.
37 | 
38 | Markov models come in different orders: in a first-order Markov model the current state depends only on the previous state, while in a second-order model it depends on the previous two states.
39 | 
40 | Applied to language modeling: if $P(x_{t+1}|x_t,...,x_1)=P(x_{t+1}|x_t)$, i.e. the observation of the sequence at time step $t+1$ depends only on the observation at time step $t$, then the distribution over the sequence satisfies the first-order Markov property. The higher the order, the longer the dependencies that are captured. This yields a family of approximations commonly used for sequence modeling:
41 | 
42 | $P(x_1,x_2,x_3,x_4)=P(x_1)P(x_2)P(x_3)P(x_4)$
43 | 
44 | $P(x_1,x_2,x_3,x_4)=P(x_1)P(x_2|x_1)P(x_3|x_2)P(x_4|x_3)$ (2)
45 | 
46 | $P(x_1,x_2,x_3,x_4)=P(x_1)P(x_2|x_1)P(x_3|x_1,x_2)P(x_4|x_2,x_3)$
47 | 
48 | Probability formulas involving one, two, and three variables are usually called unigram, bigram, and trigram models, respectively.
49 | 
50 | The language model is implemented in code below.
51 | 
52 | ### 四. Natural Language Statistics
53 | ```python
54 | tokens = d2l.tokenize(d2l.read_time_machine())  # extract tokens
55 | corpus = [token for line in tokens for token in line]  # build the corpus
56 | vocab = d2l.Vocab(corpus)  # build the vocabulary
57 | 
58 | print(vocab.token_freqs[:10])
59 | ```
60 | The d2l library from Dive into Deep Learning is used here for convenient reading of the text data and construction of the vocabulary.
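61 | To see what vocab.token_freqs holds without relying on d2l, the same statistics can be reproduced with the standard library. This is a minimal sketch; it assumes the corpus list built above, and the names counter and token_freqs are local to the sketch:
62 | ```python
63 | import collections
64 | 
65 | counter = collections.Counter(corpus)  # corpus is the flat token list from above
66 | token_freqs = sorted(counter.items(), key=lambda x: x[1], reverse=True)
67 | print(token_freqs[:10])  # the head of the list is dominated by very common words
68 | ```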
69 | 
70 | With the vocabulary in hand, the word-frequency plot can be drawn:
71 | ```python
72 | freq = [freq for token, freq in vocab.token_freqs]
73 | d2l.plot(freq, xlabel='token: x', ylabel='freq: y', xscale='log', yscale='log')
74 | ```
75 | ![fig1](../图片/语言模型/fig1.png)
76 | 
77 | Figure 1: unigram word-frequency plot
78 | 
79 | Tokens with a very high frequency are often classified as stop words: words that are routinely ignored in text processing because they usually contribute little to the meaning of a text. Common stop words include pronouns, prepositions, conjunctions, and articles; in English, high-frequency words such as "the", "and", and "a" are also treated as stop words.
80 | 
81 | Next come the bigram and trigram frequencies, i.e. how often combinations of two consecutive tokens and of three consecutive tokens appear in the dataset.
82 | ```python
83 | # bigrams
84 | bi_tokens = [pair for pair in zip(corpus[:-1], corpus[1:])]
85 | bi_vocab = d2l.Vocab(bi_tokens)
86 | bi_vocab.token_freqs[:10]
87 | # trigrams
88 | tri_tokens = [triple for triple in zip(corpus[:-2], corpus[1:-1], corpus[2:])]
89 | tri_vocab = d2l.Vocab(tri_tokens)
90 | tri_vocab.token_freqs[:10]
91 | ```
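92 | Before plotting, it is worth quantifying how sparse these higher-order statistics are. The following sketch (assuming bi_vocab and tri_vocab from above; bi_once and tri_once are names local to the sketch) counts how many n-grams occur exactly once:
93 | ```python
94 | # n-grams that appear exactly once make up the long tail of the distribution
95 | bi_once = sum(1 for _, freq in bi_vocab.token_freqs if freq == 1)
96 | tri_once = sum(1 for _, freq in tri_vocab.token_freqs if freq == 1)
97 | print(bi_once, 'of', len(bi_vocab.token_freqs), 'bigrams occur once')
98 | print(tri_once, 'of', len(tri_vocab.token_freqs), 'trigrams occur once')
99 | ```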
100 | The frequency plot for all three n-gram models:
101 | ```python
102 | bi_freq = [freq for token, freq in bi_vocab.token_freqs]
103 | tri_freq = [freq for token, freq in tri_vocab.token_freqs]
104 | d2l.plot([freq, bi_freq, tri_freq], xlabel='token: x', ylabel='freq: y', xscale='log', yscale='log', legend=['uni_freq','bi_freq','tri_freq'])
105 | ```
106 | ![fig2](../图片/语言模型/fig2.png)
107 | 
108 | Figure 2: unigram, bigram, and trigram frequency plot
109 | 
110 | The frequency plot shows that word frequency decays in a well-defined way: once the first few words are set aside, the remaining words fall roughly along a straight line on log-log axes. This means that word frequency follows Zipf's law, i.e. the frequency $n_i$ of the $i$-th most frequent word satisfies:
111 | 
112 | $n_i\propto \frac{1}{i^{\alpha}}$ (3)
113 | 
114 | which is equivalent to
115 | 
116 | $\log n_i=-\alpha \log i + c$ (4)
117 | 
118 | where $\alpha$ is the exponent that characterizes the distribution and $c$ is a constant.
119 | 
120 | This tells us that modeling words through counting statistics and smoothing is not viable: such a model would greatly overestimate the frequency of the tail, that is, of the infrequent words.
121 | 
122 | ### 五. Reading Long-Sequence Data
123 | When training a model on long sequences, the data has to be split into short subsequences that the model can read conveniently; the model processes one minibatch of sequences of a predefined length at a time. The question to solve now is how to randomly generate the features and labels of such a minibatch from the raw text sequence.
124 | 
125 | Since a text sequence can be split at arbitrary positions, we can define a number of time steps $n$ and use it to cut the text sequence into subsequences of exactly $n$ time steps each; an arbitrary offset can be chosen to indicate the position where the splitting starts.
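126 | 
127 | A tiny sketch of this idea (the toy sequence corpus_demo and the value of $n$ are chosen purely for illustration): each offset yields a different set of length-$n$ subsequences.
128 | ```python
129 | corpus_demo = list(range(20))  # stand-in for a tokenized corpus
130 | n = 5                          # number of time steps per subsequence
131 | 
132 | for offset in range(n):
133 |     rest = corpus_demo[offset:]
134 |     pieces = [rest[i:i + n] for i in range(0, len(rest) - n + 1, n)]
135 |     print('offset', offset, ':', pieces)
136 | ```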
137 | 
138 | For example, with $n=5$:
139 | 
140 | ![fig3](../图片/语言模型/fig3.png)
141 | 
142 | Figure 3: the resulting subsequences (source: Dive into Deep Learning)
143 | 
144 | As Figure 3 shows, different offsets produce different subsequences. To guarantee randomness, a random offset is chosen as the starting position. Random sampling and sequential partitioning are implemented below to split the text sequence.
145 | 
146 | ### 六. Random Sampling
147 | In random sampling, each subsequence is a short sequence captured at an arbitrary position in the original long sequence. Two subsequences that end up adjacent after sampling are not necessarily adjacent in the original long sequence. For a language model, the features are the tokens observed so far, and the labels are the original sequence shifted by one token.
148 | ```python
149 | import random
150 | import torch
151 | 
152 | def seq_data_iter_random(corpus, batch_size, num_steps):  # num_steps is the length of each subsequence
153 |     # start from a random offset (anywhere in [0, num_steps-1]) to randomize the partition
154 |     corpus = corpus[random.randint(0, num_steps-1):]
155 |     # number of complete subsequences (the -1 leaves room for the shifted labels)
156 |     num_subseqs = (len(corpus)-1) // num_steps
157 |     # starting indices of the length-num_steps subsequences
158 |     indices = list(range(0, num_subseqs * num_steps, num_steps))
159 |     random.shuffle(indices)
160 | 
161 |     def data(pos):
162 |         # return the subsequence of length num_steps starting at pos
163 |         return corpus[pos: pos + num_steps]
164 | 
165 |     num_batches = num_subseqs // batch_size
166 |     for i in range(0, num_batches * batch_size, batch_size):
167 |         # the starting indices for this minibatch
168 |         iter_indices_per_batch = indices[i: i + batch_size]
169 |         X = [data(j) for j in iter_indices_per_batch]
170 |         Y = [data(j + 1) for j in iter_indices_per_batch]
171 |         # yield one iteration's features and labels
172 |         yield torch.tensor(X), torch.tensor(Y)
173 | ```
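174 | As a quick sanity check (a sketch; the fixed seed is only there to make the check reproducible, and starts is a name local to the sketch), one can verify that the subsequences drawn in one pass start at non-overlapping positions:
175 | ```python
176 | random.seed(0)  # fixed seed so this check is reproducible
177 | starts = []
178 | for X, Y in seq_data_iter_random(list(range(100)), batch_size=2, num_steps=5):
179 |     starts.extend(X[:, 0].tolist())  # first token of each sampled subsequence
180 | # start positions sit whole multiples of num_steps apart, so subsequences never overlap
181 | print(sorted(starts))
182 | ```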
183 | A random sequence of arbitrary length is now generated to check the effect of random sampling.
184 | 
185 | Let the sequence length be 34, the number of time steps 5, and batch_size 2:
186 | ```python
187 | seq = list(range(34))
188 | for X, Y in seq_data_iter_random(seq, 2, 5):
189 |     print("X:", X, "\nY:", Y)
190 | ```
191 | Output:
192 | 
193 | ![fig4](../图片/语言模型/fig4.png)
194 | 
195 | Figure 4: random sampling
196 | 
197 | Figure 4 shows that 3 minibatches of subsequences were generated; the features in each batch are randomly sampled, and two adjacent subsequences are generally not adjacent in the original long sequence.
198 | 
199 | ### 七. Sequential Partitioning
200 | With random sampling, two adjacent subsequences are not adjacent in the original sequence. To obtain subsequences that are also adjacent in the original sequence, sequential partitioning is needed. This method preserves the order of the split subsequences across minibatch iterations, hence the name.
201 | ```python
202 | def seq_data_iter_sequential(corpus, batch_size, num_steps):
203 |     # pick a random offset as the starting position
204 |     offset = random.randint(0, num_steps)
205 |     # number of tokens kept after the offset, trimmed to a multiple of batch_size
206 |     num_tokens = ((len(corpus) - offset - 1) // batch_size) * batch_size
207 |     # store the offset long sequence, plus the same sequence shifted by one token for the labels
208 |     Xs = torch.tensor(corpus[offset: offset + num_tokens])
209 |     Ys = torch.tensor(corpus[offset + 1: offset + 1 + num_tokens])
210 |     Xs, Ys = Xs.reshape(batch_size, -1), Ys.reshape(batch_size, -1)
211 |     # total number of minibatches produced
212 |     num_batches = Xs.shape[1] // num_steps
213 |     for i in range(0, num_batches * num_steps, num_steps):
214 |         # slice consecutive length-num_steps subsequences out of the stored sequence
215 |         X = Xs[:, i:i + num_steps]
216 |         Y = Ys[:, i:i + num_steps]
217 |         yield X, Y
218 | ```
219 | Unlike random sampling, when sequential partitioning extracts the next pair of subsequences, each row continues exactly where the previous minibatch left off: the next subsequence starts from the token that directly follows it in the original sequence. This guarantees that subsequences extracted in consecutive iterations are adjacent in the original sequence as well.
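220 | 
221 | This adjacency is easy to verify programmatically. The following sketch (the sequence length, batch_size, and num_steps are illustrative values) checks that, within each row, a minibatch starts right after the previous one ends:
222 | ```python
223 | prev_X = None
224 | for X, Y in seq_data_iter_sequential(list(range(34)), batch_size=2, num_steps=5):
225 |     if prev_X is not None:
226 |         # each row of X picks up right after the same row of the previous batch
227 |         assert torch.equal(X[:, 0], prev_X[:, -1] + 1)
228 |     prev_X = X
229 | print('consecutive minibatches are adjacent in the original sequence')
230 | ```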
231 | 
232 | Testing on the same kind of long sequence generated earlier:
233 | 
234 | ![fig5](../图片/语言模型/fig5.png)
235 | 
236 | Figure 5: sequential partitioning
237 | 
238 | As Figure 5 shows, three minibatches of subsequences are again generated, and any two adjacent subsequences are also adjacent in the original long sequence.
239 | 
240 | Next, the random sampling function and the sequential partitioning function are wrapped into a single class to make a data iterator:
241 | ```python
242 | class SeqDataLoader:
243 |     def __init__(self, batch_size, num_steps, use_random_iter, max_tokens):
244 |         if use_random_iter:
245 |             self.data_iter_fn = d2l.seq_data_iter_random
246 |         else:
247 |             self.data_iter_fn = d2l.seq_data_iter_sequential
248 |         self.corpus, self.vocab = d2l.load_corpus_time_machine(max_tokens)
249 |         self.batch_size, self.num_steps = batch_size, num_steps
250 | 
251 |     def __iter__(self):
252 |         return self.data_iter_fn(self.corpus, self.batch_size, self.num_steps)
253 | ```
254 | Finally, a load_data_time_machine() function is defined so that the data iterator (the sampler) and the vocabulary can be returned together:
255 | ```python
256 | def load_data_time_machine(batch_size, num_steps, use_random_iter=False, max_tokens=10000):
257 |     data_iter = SeqDataLoader(batch_size, num_steps, use_random_iter, max_tokens)
258 |     return data_iter, data_iter.vocab
259 | ```
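260 | 
261 | As a closing sanity check, here is a minimal usage sketch of the finished loader (the batch_size and num_steps values are illustrative):
262 | ```python
263 | data_iter, vocab = load_data_time_machine(batch_size=32, num_steps=35)
264 | for X, Y in data_iter:
265 |     print('X:', X.shape, 'Y:', Y.shape)  # both are (batch_size, num_steps)
266 |     break
267 | print('vocab size:', len(vocab))
268 | ```
--------------------------------------------------------------------------------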