├── .gitattributes
├── images
│   └── tilt_arch.png
├── requirements.txt
├── mkdocs.yml
├── setup.cfg
├── setup.py
├── LICENSE
├── how_did_i_prepare_the_stuffs
│   ├── README.md
│   ├── tilt_part_3_1_aligning_all_the_parts_to_make_tilt.ipynb
│   └── tilt_part_2_3_sample_preparing_funsd_for_t5_dataset.ipynb
├── .gitignore
├── README.md
└── src
    ├── visual_backbone.py
    ├── dataset.py
    └── t5.py
/.gitattributes:
--------------------------------------------------------------------------------
1 | # Auto detect text files and perform LF normalization
2 | * text=auto
3 |
--------------------------------------------------------------------------------
/images/tilt_arch.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/uakarsh/TiLT-Implementation/HEAD/images/tilt_arch.png
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | transformers
2 | datasets
3 | sentencepiece
4 | pytorch_lightning
5 | seqeval
6 | evaluate
7 |
--------------------------------------------------------------------------------
/mkdocs.yml:
--------------------------------------------------------------------------------
1 | site_name: TiLT
2 | site_url: https://uakarsh.github.io/tilt
3 | copyright: MIT
4 | theme:
5 | name: "material"
6 | palette:
7 | primary: "red"
8 | accent: "red"
9 |
10 | repo_name: uakarsh/TiLT-Implementation
11 | repo_url: https://github.com/uakarsh/TiLT-Implementation
12 |
13 | nav:
14 | - Home: pad_tokens_start_idx.md
--------------------------------------------------------------------------------
/setup.cfg:
--------------------------------------------------------------------------------
1 | [flake8]
2 | ignore = W503, E203, B305
3 | max-line-length = 88
4 |
5 | [mypy]
6 | disallow_untyped_defs = True
7 | ignore_missing_imports = True
8 |
9 | [tool:isort]
10 | profile = black
11 | known_first_party = tilt,tests
12 |
13 | [tool:pytest]
14 | testpaths = tests
15 | addopts =
16 | -rxXs
17 | --cov=tilt
18 | --cov=tests
19 | --cov-report=term-missing
20 | --cov-fail-under=80
21 | --cov-config=.coveragerc
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
1 | from setuptools import setup, find_packages
2 |
3 | setup(
4 | name = 'tilt_transformers',
5 | packages = find_packages(where="src"),
6 |   package_dir = {"": "src"},
7 | version = '0.1.0',
8 | license='MIT',
9 |   description = 'Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer: PyTorch implementation',
10 |   author = 'Akarsh Upadhyay',
11 | author_email = 'akarshupadhyayabc@gmail.com',
12 | url = 'https://github.com/uakarsh/TiLT-Implementation',
13 | keywords = [
14 | 'artificial intelligence',
15 | 'attention mechanism',
16 | 'document understanding',
17 | ],
18 | install_requires=[
19 | 'torch>=1.6',
20 | 'torchvision',
21 | 'transformers',
22 | 'sentencepiece',
23 | ],
24 | classifiers=[
25 | 'Development Status :: 4 - Beta',
26 | 'Intended Audience :: Developers',
27 | 'Topic :: Scientific/Engineering :: Artificial Intelligence',
28 | 'License :: OSI Approved :: MIT License',
29 | 'Programming Language :: Python :: 3.7',
30 | ],
31 |
32 | )
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2020 Phil Wang
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
--------------------------------------------------------------------------------
/how_did_i_prepare_the_stuffs/README.md:
--------------------------------------------------------------------------------
1 | ### A note about how the data preparation was done.
2 | 
3 | * The notebooks from 1 to 3.2 use a generative approach to solve the problem. This followed from the idea that T5 transformers are based on the generative (text-to-text) formulation.
4 | 
5 | * However, in the conclusion section of the TiLT paper, the authors mention that they used the extractive approach, which means predicting logits rather than generating answer text, and I had missed that part until now (see the sketch at the end of this note). I have added the code for preparing the FUNSD dataset for the extractive setting, and the same procedure would be followed for the CORD dataset.
6 | 
7 | * I also have the code for DocVQA ready (an extractive task, which involves predicting the start and end logits of the answer within the context), and I will add it soon.
8 | 
9 | * It will take me a while to prepare the modeling code for the extractive approach (just as I was about to finish the generative approach, I revisited the paper and saw that the authors had used the extractive approach).
10 | 
11 | * The situation was honestly confusing: I was prepared to use the extractive approach as well, but when I looked at T5's text-to-text formulation, it pulled me toward the generative one. Although all the generative code is ready, I will take a pause and work on the extractive approach for now. Let's see how this goes.
12 | 
13 | * By the way, if time permits, I will soon add the fine-tuning code for FUNSD, CORD as well as DocVQA, since I have worked on them and know how to fine-tune the model on them.
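14 | 
15 | For concreteness, here is a tiny sketch of how the two target formats differ (my illustration; the tag values are made up and not from this repo):
16 | 
17 | ```python
18 | from transformers import AutoTokenizer
19 | 
20 | tokenizer = AutoTokenizer.from_pretrained("t5-base")
21 | 
22 | # Generative: the decoder produces the answer as a token sequence
23 | generative_target = tokenizer("header", return_tensors="pt").input_ids
24 | 
25 | # Extractive (token classification): one numeric tag per input token,
26 | # with -100 at positions the loss should ignore (e.g. eos and padding)
27 | extractive_target = [3, 3, 0, 1, -100, -100]
28 | ```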
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | # Byte-compiled / optimized / DLL files
2 | __pycache__/
3 | *.py[cod]
4 | *$py.class
5 |
6 | # C extensions
7 | *.so
8 |
9 | # Distribution / packaging
10 | .Python
11 | build/
12 | develop-eggs/
13 | dist/
14 | downloads/
15 | eggs/
16 | .eggs/
17 | lib/
18 | lib64/
19 | parts/
20 | sdist/
21 | var/
22 | wheels/
23 | pip-wheel-metadata/
24 | share/python-wheels/
25 | *.egg-info/
26 | .installed.cfg
27 | *.egg
28 | MANIFEST
29 |
30 | # PyInstaller
31 | # Usually these files are written by a python script from a template
32 | # before PyInstaller builds the exe, so as to inject date/other infos into it.
33 | *.manifest
34 | *.spec
35 |
36 | # Installer logs
37 | pip-log.txt
38 | pip-delete-this-directory.txt
39 |
40 | # Unit test / coverage reports
41 | htmlcov/
42 | .tox/
43 | .nox/
44 | .coverage
45 | .coverage.*
46 | .cache
47 | nosetests.xml
48 | coverage.xml
49 | *.cover
50 | *.py,cover
51 | .hypothesis/
52 | .pytest_cache/
53 |
54 | # Translations
55 | *.mo
56 | *.pot
57 |
58 | # Django stuff:
59 | *.log
60 | local_settings.py
61 | db.sqlite3
62 | db.sqlite3-journal
63 |
64 | # Flask stuff:
65 | instance/
66 | .webassets-cache
67 |
68 | # Scrapy stuff:
69 | .scrapy
70 |
71 | # Sphinx documentation
72 | docs/_build/
73 |
74 | # PyBuilder
75 | target/
76 |
77 | # Jupyter Notebook
78 | .ipynb_checkpoints
79 |
80 | # IPython
81 | profile_default/
82 | ipython_config.py
83 |
84 | # pyenv
85 | .python-version
86 |
87 | # pipenv
88 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
89 | # However, in case of collaboration, if having platform-specific dependencies or dependencies
90 | # having no cross-platform support, pipenv may install dependencies that don't work, or not
91 | # install all needed dependencies.
92 | #Pipfile.lock
93 |
94 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow
95 | __pypackages__/
96 |
97 | # Celery stuff
98 | celerybeat-schedule
99 | celerybeat.pid
100 |
101 | # SageMath parsed files
102 | *.sage.py
103 |
104 | # Environments
105 | .env
106 | .venv
107 | env/
108 | venv/
109 | ENV/
110 | env.bak/
111 | venv.bak/
112 |
113 | # Spyder project settings
114 | .spyderproject
115 | .spyproject
116 |
117 | # Rope project settings
118 | .ropeproject
119 |
120 | # mkdocs documentation
121 | /site
122 |
123 | # mypy
124 | .mypy_cache/
125 | .dmypy.json
126 | dmypy.json
127 |
128 | # Pyre type checker
129 | .pyre/
130 | .idea/
131 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | ## Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer: PyTorch Implementation
2 |
3 | 
4 |
5 | This repository contains an implementation of the paper: [Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer](https://arxiv.org/pdf/2102.09550v3.pdf). Note that the authors have not released an official implementation of the paper.
6 |
7 | Abstract: We address the challenging problem of Natural Language Comprehension beyond plain-text documents by introducing the TILT neural network architecture which simultaneously learns layout information, visual features, and textual semantics. Contrary to previous approaches, we rely on a decoder capable of unifying a variety of problems involving natural language. The layout is represented as an attention bias and complemented with contextualized visual information, while the core of our model is a pretrained encoder-decoder Transformer. Our novel approach achieves state-of-the-art results in extracting information from documents and answering questions which demand layout understanding (DocVQA, CORD, SROIE). At the same time, we simplify the process by employing an end-to-end model.
8 |
9 |
10 | ## Requirements
11 | * See the requirements.txt file
12 |
13 |
14 | ## Dataset
15 | * I will be including the [FUNSD Dataset](https://guillaumejaume.github.io/FUNSD/) as well as the [CORD Dataset](https://github.com/clovaai/cord) soon. The entire approach is still being implemented, and due to some mistakes on my part, it will take me a while to prepare the full pipeline. A usage sketch for wrapping the data follows below.
16 |
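17 | Once the data is available, the `FUNSDDs` class in `src/dataset.py` can wrap it. A minimal sketch (the hub dataset id `nielsr/funsd` and its column names are assumptions, not part of this repo; the keys may need remapping to `image` / `tokens` / `bboxes` / `ner_tags`):
18 | 
19 | ```python
20 | from datasets import load_dataset
21 | from transformers import AutoTokenizer
22 | 
23 | from src.dataset import FUNSDDs
24 | 
25 | raw = load_dataset("nielsr/funsd", split="train")   # assumed FUNSD copy on the HF hub
26 | tokenizer = AutoTokenizer.from_pretrained("t5-base")
27 | ds = FUNSDDs(list(raw), tokenizer)   # each item: image, tokens, bboxes, ner_tags
28 | sample = ds[0]   # dict of input_ids, labels, attention_mask, bboxes, pixel_values
29 | ```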
17 |
18 | ## Pretrained Models
19 | * Due to resource constraints, I am not sure whether I will be able to include pretrained models, but I will add the fine-tuning code for FUNSD, CORD and DocVQA soon.
20 |
21 |
22 | ## Modeling:
23 | * The modeling part of the pipeline is based on [HuggingFace's T5 implementation](https://huggingface.co/docs/transformers/model_doc/t5), and the weights are initialized from it. The code is available in the `src/t5.py` file; a minimal instantiation sketch follows below.
24 |
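25 | A minimal instantiation sketch (my illustration, not a script from the repo; `load_weights` is the config flag read in `src/t5.py`):
26 | 
27 | ```python
28 | from transformers import T5Config
29 | 
30 | from src.t5 import T5ForConditionalGeneration
31 | 
32 | config = T5Config.from_pretrained("t5-base")
33 | config.load_weights = True   # src/t5.py reads this flag to copy the pretrained weights
34 | model = T5ForConditionalGeneration(config)
35 | ```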
25 |
26 | ## Examples:
27 | * For finetuning TiLT on CORD, the example along with the results is present [here](https://github.com/uakarsh/TiLT-Implementation/blob/main/experiments/cord-tilt-part-4-1-abstractive-approach-for-t.ipynb)
28 |
29 | * Similarly, for finetuning TiLT on FUNSD, the example along with the results is present [here](https://github.com/uakarsh/TiLT-Implementation/blob/main/experiments/tilt-part-4-1-abstractive-approach-for-training.ipynb)
30 |
31 |
32 | ## My Results:
33 | | Model Name | Dataset Name | Number of Parameters | Overall Precision | Overall Recall | Overall F1 Score | Overall Accuracy |
34 | |-----------------|--------------|----------------------|-------------------|----------------|------------------|------------------|
35 | | TILT | FUNSD | 225M | 57.58 | 42.25 | 48.87 | 83.60 |
36 | | TILT | CORD | 225M | 64.81 | 62.64 | 63.71 | 80.52 |
37 | | TILT(Original) | CORD | 230M | --- | --- | 95.11 | --- |
38 |
39 | Note that in the case of my results on CORD, the model has not been pre-trained (the weights are initialized from HuggingFace's implementation) and has been trained for 30 epochs, while in the original paper the authors trained for 360,000 steps, which is roughly equivalent to 360,000 / 100 = 3,600 epochs. (100 comes from 800 / 8, since 8 is the batch size mentioned in the paper and 800 is the number of training examples in the CORD dataset.)
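40 | 
41 | A quick sanity check of the arithmetic above:
42 | 
43 | ```python
44 | steps = 360_000               # training steps reported in the paper
45 | steps_per_epoch = 800 // 8    # 800 CORD training examples / batch size 8 = 100
46 | print(steps / steps_per_epoch)   # 3600.0 epochs
47 | ```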
40 |
41 | ## Citation
42 | If you find this repository useful, please cite the following paper:
43 | ```bibtex
44 | @inproceedings{powalski2021going,
45 | title={Going full-tilt boogie on document understanding with text-image-layout transformer},
46 | author={Powalski, Rafa{\l} and Borchmann, {\L}ukasz and Jurkiewicz, Dawid and Dwojak, Tomasz and Pietruszka, Micha{\l} and Pa{\l}ka, Gabriela},
47 | booktitle={Document Analysis and Recognition--ICDAR 2021: 16th International Conference, Lausanne, Switzerland, September 5--10, 2021, Proceedings, Part II 16},
48 | pages={732--747},
49 | year={2021},
50 | organization={Springer}
51 | }
52 | ```
53 |
54 | ## License
55 | This project is licensed under the MIT License - see the LICENSE file for details
56 |
--------------------------------------------------------------------------------
/src/visual_backbone.py:
--------------------------------------------------------------------------------
1 | import torch.nn as nn
2 | import torch.nn.functional as F
3 | import torch
4 | from torchvision.ops import roi_align, roi_pool
5 |
6 | # Convolution block for UNet Encoder
7 | class ConvBlock(nn.Module):
8 | """
9 |     A Convolutional Block that consists of three convolution layers, each followed by
10 | instance normalization, LeakyReLU activation and dropout.
11 | """
12 |
13 | def __init__(self, in_chans: int, out_chans: int, drop_prob: float):
14 | """
15 | Args:
16 | in_chans: Number of channels in the input.
17 | out_chans: Number of channels in the output.
18 | drop_prob: Dropout probability.
19 | """
20 | super().__init__()
21 |
22 | self.in_chans = in_chans
23 | self.out_chans = out_chans
24 | self.drop_prob = drop_prob
25 |
26 | self.layers = nn.Sequential(
27 | nn.Conv2d(in_chans, out_chans, kernel_size=3,
28 | padding=1, bias=False),
29 | nn.InstanceNorm2d(out_chans),
30 | nn.LeakyReLU(negative_slope=0.2, inplace=True),
31 | nn.Dropout2d(drop_prob),
32 | nn.Conv2d(out_chans, out_chans, kernel_size=3,
33 | padding=1, bias=False),
34 | nn.InstanceNorm2d(out_chans),
35 | nn.LeakyReLU(negative_slope=0.2, inplace=True),
36 | nn.Dropout2d(drop_prob),
37 | nn.Conv2d(out_chans, out_chans, kernel_size=3,
38 | padding=1, bias=False),
39 | nn.InstanceNorm2d(out_chans),
40 | nn.LeakyReLU(negative_slope=0.2, inplace=True),
41 | nn.Dropout2d(drop_prob),
42 | )
43 |
44 | def forward(self, image: torch.Tensor) -> torch.Tensor:
45 | """
46 | Args:
47 | image: Input 4D tensor of shape `(N, in_chans, H, W)`.
48 | Returns:
49 | Output tensor of shape `(N, out_chans, H, W)`.
50 | """
51 | return self.layers(image)
52 |
53 |
54 | # UNet Encoder
55 | class Unet_encoder(nn.Module):
56 |
57 | def __init__(self,
58 | in_channels: int = 3,
59 | channels: int = 32,
60 | num_pool_layers: int = 4,
61 | drop_prob: float = 0.0
62 | ):
63 | """
64 | Args:
65 |             in_channels: Number of channels in the input to the U-Net encoder.
66 |             channels: Number of output channels of the first convolution layer.
67 |             num_pool_layers: Number of down-sampling layers (the encoder has no up-sampling path).
68 |             drop_prob: Dropout probability.
70 | """
71 | super().__init__()
72 |
73 | self.in_channels = in_channels
74 | self.channels = channels
75 |
76 | self.num_pool_layers = num_pool_layers
77 | self.drop_prob = drop_prob
78 |
79 | self.down_sample_layers = nn.ModuleList([
80 | ConvBlock(in_channels, channels, drop_prob)
81 | ])
82 | ch = channels
83 |
84 | for _ in range(num_pool_layers - 1):
85 | self.down_sample_layers.append(ConvBlock(ch, ch*2, drop_prob))
86 | ch *= 2
87 |
88 | self.conv = ConvBlock(ch, ch*2, drop_prob)
89 |
90 | def forward(self, image: torch.Tensor) -> torch.Tensor:
91 | """
92 | Args:
93 |             image: Input 4D tensor of shape (Batch Size, in_channels, H, W)
94 |         Returns:
95 |             Output tensor of shape (Batch Size, channels * 2**num_pool_layers, H / 2**num_pool_layers, W / 2**num_pool_layers)
96 | """
97 | output = image
98 |
99 |         # Applying down-sample layers
100 |         for layer in self.down_sample_layers:
101 | output = layer(output)
102 | output = F.max_pool2d(output, kernel_size=2, stride=2, padding=0)
103 |
104 | output = self.conv(output)
105 | return output
106 |
107 |
108 | # RoI Align (earlier I had mistakenly used RoIPool where the paper calls for RoIAlign)
109 |
110 | class RoIAlign(nn.Module):
111 | def __init__(self, output_size=(3, 3), spatial_scale=0.125, sampling_ratio=2):
112 | super().__init__()
113 |
114 | """
115 | Args
116 | output_size: (h, w) of the output feature map
117 | spatial_scale: ratio of the input feature map height (or w) to the raw image height (or w).
118 | Equals the reciprocal of total stride in convolutional layers
119 |             sampling_ratio: number of input samples to take for each output sample
120 | """
121 |
122 |         self.output_size = output_size
123 |         self.spatial_scale = spatial_scale
124 |         self.sampling_ratio = sampling_ratio
125 |         # use torchvision's functional roi_align; instantiating `RoIAlign` here
126 |         # would recurse into this class itself
127 |         self.roi_align = roi_align
127 |
128 | def forward(self, image_embedding, bboxes):
129 | """
130 | Args:
131 | image_embedding: Input 4D tensor of shape (Batch size, in channels, H, W)
132 | bboxes: Input 3D Tensor of shape (Batch Size, max sequence length, 4) (4 corresponding to xmin, ymin, xmax, ymax)
133 | Returns:
134 | feature_maps_bboxes: tensor of shape (batch, max sequence length, in channels, *output_size)
135 | """
136 |
137 | feature_maps_bboxes = []
138 | for single_batch_img, single_batch_bbox in zip(image_embedding, bboxes):
139 |             feature_map_single_batch = self.roi_align(input=single_batch_img.unsqueeze(0),
140 |                                                       boxes=torch.cat([torch.zeros(single_batch_bbox.shape[0], 1).to(
141 |                                                           single_batch_bbox.device), single_batch_bbox], axis=-1).float(),
142 |                                                       output_size=self.output_size,
143 |                                                       spatial_scale=self.spatial_scale,
144 |                                                       sampling_ratio=self.sampling_ratio
145 |                                                       )
143 | feature_maps_bboxes.append(feature_map_single_batch)
144 |
145 | return torch.stack(feature_maps_bboxes, axis=0)
146 |
147 |
148 | # RoIPool
149 |
150 | class RoIPool(nn.Module):
151 |
152 | def __init__(self, output_size=(3, 3), spatial_scale=0.125):
153 | super().__init__()
154 | """Args
155 | output_size: (h, w) of the output feature map
156 | spatial_scale: ratio of the input feature map height (or w) to the raw image height (or w).
157 | Equals the reciprocal of total stride in convolutional layers
158 | """
159 |
160 | self.output_size = output_size
161 | self.spatial_scale = spatial_scale
162 | self.roi_pool = roi_pool
163 |
164 | def forward(self, image_embedding, bboxes):
165 | """
166 | Args:
167 | image_embedding: Input 4D tensor of shape (Batch size, in channels, H, W)
168 | bboxes: Input 3D Tensor of shape (Batch Size, max sequence length, 4) (4 corresponding to xmin, ymin, xmax, ymax)
169 | Returns:
170 | feature_maps_bboxes: tensor of shape (batch, max sequence length, in channels, *output_size)
171 | """
172 |
173 | feature_maps_bboxes = []
174 | for single_batch_img, single_batch_bbox in zip(image_embedding, bboxes):
175 | feature_map_single_batch = self.roi_pool(input=single_batch_img.unsqueeze(0),
176 | boxes=torch.cat([torch.zeros(single_batch_bbox.shape[0], 1).to(
177 | single_batch_bbox.device), single_batch_bbox], axis=-1).float(),
178 | output_size=self.output_size,
179 | spatial_scale=self.spatial_scale
180 | )
181 | feature_maps_bboxes.append(feature_map_single_batch)
182 |
183 | return torch.stack(feature_maps_bboxes, axis=0)
184 |
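185 | 
186 | # Quick smoke test (my addition, with assumed sizes): the encoder downsamples
187 | # by 2**num_pool_layers, and RoIPool then extracts a fixed-size feature map
188 | # per bounding box.
189 | if __name__ == "__main__":
190 |     encoder = Unet_encoder(in_channels=3, channels=32, num_pool_layers=4)
191 |     images = torch.randn(2, 3, 384, 512)
192 |     feats = encoder(images)               # (2, 512, 24, 32)
193 |     boxes = torch.rand(2, 10, 4) * 50
194 |     boxes[..., 2:] += boxes[..., :2]      # ensure xmax >= xmin and ymax >= ymin
195 |     pooled = RoIPool()(feats, boxes)      # (2, 10, 512, 3, 3)
196 |     print(feats.shape, pooled.shape)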
--------------------------------------------------------------------------------
/src/dataset.py:
--------------------------------------------------------------------------------
1 | from torch.utils.data import Dataset
2 | from torchvision.transforms import ToTensor
3 | import torch
4 |
5 |
6 | ## Dataset class for the generative formulation of the problem
7 | ## (for the FUNSD dataset; I believe the same would be used for the CORD dataset)
8 | class FUNSDDs(Dataset):
9 |
10 | def __init__(self, ds, tokenizer, max_seq_length:int = 512, pad_token_box = [0, 0, 0, 0], resize_scale = (512, 384), transform = None):
11 |
12 | """
13 | Args:
14 | ds (list): list of dict, each dict contains the following keys:
15 |                 - image (PIL.Image.Image): the image
16 | - tokens (list): list of tokens
17 | - bboxes (list): list of bboxes
18 | - ner_tags (list): list of ner_tags
19 | tokenizer (Tokenizer): the tokenizer
20 | max_seq_length (int, optional): the maximum length of the sequence. Defaults to 512.
21 | pad_token_box (list, optional): the padding token box. Defaults to [0, 0, 0, 0].
22 | resize_scale (tuple, optional): the resize scale. Defaults to (512, 384).
23 | transform (callable, optional): the transform. Defaults to None.
24 | """
25 |
26 | self.ds = ds
27 | self.tokenizer = tokenizer
28 | self.max_seq_length = max_seq_length
29 | self.pad_token_box = pad_token_box
30 | self.resize_scale = resize_scale
31 | self.transform = transform if transform is not None else ToTensor()
32 |
33 | def __len__(self):
34 | """
35 | Returns:
36 | int: the length of the dataset
37 | """
38 | return len(self.ds)
39 |
40 | def __getitem__(self, idx):
41 |
42 | """
43 | Args:
44 | idx (int): the index of the data to be returned.
45 | """
46 |
47 | encoding = self.ds[idx]
48 |
49 | resized_image = encoding['image'].copy().resize(self.resize_scale)
50 | words = encoding['tokens']
51 | bboxes = encoding['bboxes']
52 | labels = encoding['ner_tags']
53 |
54 | ## 1. Performing the image pre-processing
55 | img_tensor = self.transform(resized_image) ## (3, 384, 512)
56 |
57 | ## 2. Performing the semantic pre-processing
58 | encoding = self.tokenizer(words, is_split_into_words = True, add_special_tokens = False)
59 |
60 |         max_seq_length = self.max_seq_length   # use the configured value instead of a hard-coded 512
62 |
63 | input_ids = encoding['input_ids']
64 | attention_mask = encoding['attention_mask']
65 |
66 |         ## Align bboxes with the tokenizer output: each sub-word token inherits the bbox of its source word
67 |         ## (the bbox enters the model as an attention bias, not as an input embedding, so no further pre-processing is needed)
67 | bbox_according_to_tokenizer = [bboxes[i] for i in encoding.word_ids()]
68 | # labels_according_to_tokenizer = [self.tokenizer(str(labels[i] + 1))['input_ids'][0] for i in encoding.word_ids()]
69 | #labels_according_to_tokenizer = [self.tokenizer(str(labels[i] + 1))['input_ids'][0] for i, _ in enumerate(labels)]
70 |
71 | # Truncation of token_boxes + token_labels
72 | special_tokens_count = 1
73 | if len(input_ids) > max_seq_length - special_tokens_count:
74 | bbox_according_to_tokenizer = bbox_according_to_tokenizer[: (max_seq_length - special_tokens_count)]
75 | input_ids = input_ids[: (max_seq_length - special_tokens_count)]
76 | #labels_according_to_tokenizer = labels_according_to_tokenizer[: (max_seq_length - special_tokens_count)]
77 | attention_mask = attention_mask[: (max_seq_length - special_tokens_count)]
78 |
79 |
80 | ## Padding
81 | input_ids = input_ids + [self.tokenizer.eos_token_id]
82 | bbox_according_to_tokenizer = bbox_according_to_tokenizer + [[1000, 1000, 1000, 1000]]
83 | #labels_according_to_tokenizer = labels_according_to_tokenizer + [self.tokenizer.eos_token_id] ## For QA, the model requires an end of sentence i.e eos token
84 | attention_mask = attention_mask + [1]
85 |
86 | pad_length = max_seq_length - len(input_ids)
87 |
88 | input_ids = input_ids + [self.tokenizer.pad_token_id] * (pad_length)
89 | bbox_according_to_tokenizer = bbox_according_to_tokenizer + [self.pad_token_box] * (pad_length)
90 | #labels_according_to_tokenizer = labels_according_to_tokenizer + [self.tokenizer.pad_token_id] * (pad_length)
91 | attention_mask = attention_mask + [0] * (pad_length)
92 |
93 |         ## Converting everything to tensors
94 | input_ids = torch.tensor(input_ids)
95 | bbox_according_to_tokenizer = torch.tensor(bbox_according_to_tokenizer)
96 | #labels_according_to_tokenizer = torch.tensor(labels_according_to_tokenizer)
97 | attention_mask = torch.tensor(attention_mask)
98 |
99 | return {"input_ids" : input_ids, "labels" : labels, "attention_mask" : attention_mask, "bboxes" : bbox_according_to_tokenizer, # labels_according_to_tokenizer
100 | "pixel_values" : img_tensor}
101 |
102 |
103 | ## Dataset for the extractive (token-classification) formulation; I believe the same would be used for the CORD dataset. (The generative version above was a mistake: I had not read the last part of the paper carefully.)
104 | class ExtFUNSDDs(Dataset):
105 | def __init__(self, ds, tokenizer, max_seq_length:int = 512, pad_token_box = [0, 0, 0, 0], resize_scale = (512, 384), transform = None):
106 |
107 | """
108 | Args:
109 | ds (list): list of dict, each dict contains the following keys:
110 |                 - image (PIL.Image.Image): the image
111 | - tokens (list): list of tokens
112 | - bboxes (list): list of bboxes
113 | - ner_tags (list): list of ner_tags
114 | tokenizer (Tokenizer): the tokenizer
115 | max_seq_length (int, optional): the maximum length of the sequence. Defaults to 512.
116 | pad_token_box (list, optional): the padding token box. Defaults to [0, 0, 0, 0].
117 | resize_scale (tuple, optional): the resize scale. Defaults to (512, 384).
118 | transform (callable, optional): the transform. Defaults to None.
119 | """
120 |
121 | self.ds = ds
122 | self.tokenizer = tokenizer
123 | self.max_seq_length = max_seq_length
124 | self.pad_token_box = pad_token_box
125 | self.resize_scale = resize_scale
126 | self.transform = transform if transform is not None else ToTensor()
127 |
128 | def __len__(self):
129 | """
130 | Returns:
131 | int: the length of the dataset
132 | """
133 | return len(self.ds)
134 |
135 | def __getitem__(self, idx):
136 |
137 | """
138 | Args:
139 | idx (int): the index of the data to be returned.
140 | """
141 |
142 | encoding = self.ds[idx]
143 |
144 | resized_image = encoding['image'].copy().resize(self.resize_scale)
145 | words = encoding['tokens']
146 | bboxes = encoding['bboxes']
147 | labels = encoding['ner_tags']
148 |
149 | ## 1. Performing the image pre-processing
150 | img_tensor = self.transform(resized_image) ## (3, 384, 512)
151 |
152 | ## 2. Performing the semantic pre-processing
153 | encoding = self.tokenizer(words, is_split_into_words = True, add_special_tokens = False)
154 |
155 |         max_seq_length = self.max_seq_length   # use the configured value instead of a hard-coded 512
157 |
158 | input_ids = encoding['input_ids']
159 | attention_mask = encoding['attention_mask']
160 |
161 |         ## Align bboxes with the tokenizer output: each sub-word token inherits the bbox of its source word
162 |         ## (the bbox enters the model as an attention bias, not as an input embedding, so no further pre-processing is needed)
162 | bbox_according_to_tokenizer = [bboxes[i] for i in encoding.word_ids()]
163 | labels_according_to_tokenizer = [labels[i] for i in encoding.word_ids()] ## Labels have to be in the numerical format
164 | #labels_according_to_tokenizer = [self.tokenizer(str(labels[i] + 1))['input_ids'][0] for i, _ in enumerate(labels)]
165 |
166 | # Truncation of token_boxes + token_labels
167 | special_tokens_count = 1
168 | if len(input_ids) > max_seq_length - special_tokens_count:
169 | bbox_according_to_tokenizer = bbox_according_to_tokenizer[: (max_seq_length - special_tokens_count)]
170 | input_ids = input_ids[: (max_seq_length - special_tokens_count)]
171 | labels_according_to_tokenizer = labels_according_to_tokenizer[: (max_seq_length - special_tokens_count)]
172 | attention_mask = attention_mask[: (max_seq_length - special_tokens_count)]
173 |
174 |
175 | ## Padding
176 | input_ids = input_ids + [self.tokenizer.eos_token_id]
177 | bbox_according_to_tokenizer = bbox_according_to_tokenizer + [[1000, 1000, 1000, 1000]]
178 |         labels_according_to_tokenizer = labels_according_to_tokenizer + [-100] ## -100 so that the loss ignores the eos position
179 | attention_mask = attention_mask + [1]
180 |
181 | pad_length = max_seq_length - len(input_ids)
182 |
183 | input_ids = input_ids + [self.tokenizer.pad_token_id] * (pad_length)
184 | bbox_according_to_tokenizer = bbox_according_to_tokenizer + [self.pad_token_box] * (pad_length)
185 | labels_according_to_tokenizer = labels_according_to_tokenizer + [-100] * (pad_length)
186 | attention_mask = attention_mask + [0] * (pad_length)
187 |
188 |         ## Converting everything to tensors
189 | input_ids = torch.tensor(input_ids)
190 | bbox_according_to_tokenizer = torch.tensor(bbox_according_to_tokenizer)
191 | labels_according_to_tokenizer = torch.tensor(labels_according_to_tokenizer)
192 | attention_mask = torch.tensor(attention_mask)
193 |
194 | return {"input_ids" : input_ids, "labels" : labels_according_to_tokenizer, "attention_mask" : attention_mask, "bboxes" : bbox_according_to_tokenizer, # labels_according_to_tokenizer
195 | "pixel_values" : img_tensor}
--------------------------------------------------------------------------------
/src/t5.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import torch.nn as nn
3 | import math
4 | import copy
5 | from transformers.models import t5
6 | from transformers import AutoModel
7 |
8 |
9 | class T5LayerNorm(nn.Module):
10 | def __init__(self, hidden_size, eps=1e-6):
11 | """
12 | Construct a layernorm module in the T5 Style. No bias and no subtraction of mean.
13 | """
14 | super().__init__()
15 | self.weight = nn.Parameter(torch.ones(hidden_size))
16 | self.variance_epsilon = eps
17 |
18 | def forward(self, hidden_states):
19 |
20 | # T5 uses a layer_norm which only scales and doesn't shift, which is also known as Root Mean
21 |         # Square Layer Normalization https://arxiv.org/abs/1910.07467, thus variance is calculated
22 | # w/o mean and there is no bias. Additionally we want to make sure that the accumulation for
23 | # half-precision inputs is done in fp32
24 |
25 | variance = hidden_states.to(torch.float32).pow(
26 | 2).mean(-1, keepdim=True)
27 | hidden_states = hidden_states * \
28 | torch.rsqrt(variance + self.variance_epsilon)
29 |
30 | # convert into half-precision if necessary
31 | if self.weight.dtype in [torch.float16, torch.bfloat16]:
32 | hidden_states = hidden_states.to(self.weight.dtype)
33 |
34 | return self.weight * hidden_states
35 |
36 |
37 | class T5DenseActDense(nn.Module):
38 | def __init__(self, config):
39 | super().__init__()
40 | self.wi = nn.Linear(config.d_model, config.d_ff, bias=False)
41 | self.wo = nn.Linear(config.d_ff, config.d_model, bias=False)
42 | self.dropout = nn.Dropout(config.dropout_rate)
43 | self.act = nn.ReLU()
44 |
45 | def forward(self, hidden_states):
46 | hidden_states = self.wi(hidden_states)
47 | hidden_states = self.act(hidden_states)
48 | hidden_states = self.dropout(hidden_states)
49 | if hidden_states.dtype != self.wo.weight.dtype and self.wo.weight.dtype != torch.int8:
50 | hidden_states = hidden_states.to(self.wo.weight.dtype)
51 | hidden_states = self.wo(hidden_states)
52 | return hidden_states
53 |
54 |
55 | class T5DenseGatedActDense(nn.Module):
56 | def __init__(self, config):
57 | super().__init__()
58 | self.wi_0 = nn.Linear(config.d_model, config.d_ff, bias=False)
59 | self.wi_1 = nn.Linear(config.d_model, config.d_ff, bias=False)
60 | self.wo = nn.Linear(config.d_ff, config.d_model, bias=False)
61 | self.dropout = nn.Dropout(config.dropout_rate)
62 | self.act = nn.ReLU()
63 |
64 | def forward(self, hidden_states):
65 |         hidden_act = self.act(self.wi_0(hidden_states))
66 |         hidden_linear = self.wi_1(hidden_states)
67 |         hidden_states = hidden_act * hidden_linear
68 | hidden_states = self.dropout(hidden_states)
69 | if hidden_states.dtype != self.wo.weight.dtype and self.wo.weight.dtype != torch.int8:
70 | hidden_states = hidden_states.to(self.wo.weight.dtype)
71 | hidden_states = self.wo(hidden_states)
72 | return hidden_states
73 |
74 |
75 | class T5LayerFF(nn.Module):
76 | def __init__(self, config):
77 | super().__init__()
78 | if config.is_gated_act:
79 | self.DenseReluDense = T5DenseGatedActDense(config)
80 | else:
81 | self.DenseReluDense = T5DenseActDense(config)
82 |
83 | self.layer_norm = T5LayerNorm(
84 | config.d_model, eps=config.layer_norm_epsilon)
85 | self.dropout = nn.Dropout(config.dropout_rate)
86 |
87 | def forward(self, hidden_states):
88 | forwarded_states = self.layer_norm(hidden_states)
89 | forwarded_states = self.DenseReluDense(forwarded_states)
90 | hidden_states = hidden_states + self.dropout(forwarded_states)
91 | return hidden_states
92 |
93 |
94 | class T5Attention(nn.Module):
95 | def __init__(self, config, has_relative_attention_bias=False):
96 | super().__init__()
97 | self.is_decoder = config.is_decoder
98 | self.has_relative_attention_bias = has_relative_attention_bias
99 | self.relative_attention_num_buckets = config.relative_attention_num_buckets
100 | self.relative_attention_max_distance = config.relative_attention_max_distance
101 | self.d_model = config.d_model
102 | self.key_value_proj_dim = config.d_kv
103 | self.n_heads = config.num_heads
104 | self.dropout = config.dropout_rate
105 | self.inner_dim = self.n_heads * self.key_value_proj_dim
106 |
107 | self.q = nn.Linear(self.d_model, self.inner_dim, bias=False)
108 | self.k = nn.Linear(self.d_model, self.inner_dim, bias=False)
109 | self.v = nn.Linear(self.d_model, self.inner_dim, bias=False)
110 | self.o = nn.Linear(self.inner_dim, self.d_model, bias=False)
111 |
112 | '''
113 |         Here is where the change lies, i.e. adding the relative_horizontal_bias as well as the relative_vertical_bias
114 | '''
115 | if self.has_relative_attention_bias:
116 | self.relative_attention_bias = nn.Embedding(
117 | self.relative_attention_num_buckets, self.n_heads)
118 | self.relative_horizontal_bias = nn.Embedding(
119 | self.relative_attention_num_buckets, self.n_heads)
120 | self.relative_vertical_bias = nn.Embedding(
121 | self.relative_attention_num_buckets, self.n_heads)
122 |
123 | self.gradient_checkpointing = False
124 |
125 | @staticmethod
126 | def _relative_position_bucket(relative_position, bidirectional=True, num_buckets=32, max_distance=128):
127 | """
128 | Adapted from Mesh Tensorflow:
129 | https://github.com/tensorflow/mesh/blob/0cb87fe07da627bf0b7e60475d59f95ed6b5be3d/mesh_tensorflow/transformer/transformer_layers.py#L593
130 | Translate relative position to a bucket number for relative attention. The relative position is defined as
131 | memory_position - query_position, i.e. the distance in tokens from the attending position to the attended-to
132 | position. If bidirectional=False, then positive relative positions are invalid. We use smaller buckets for
133 | small absolute relative_position and larger buckets for larger absolute relative_positions. All relative
134 | positions >=max_distance map to the same bucket. All relative positions <=-max_distance map to the same bucket.
135 | This should allow for more graceful generalization to longer sequences than the model has been trained on
136 | Args:
137 | relative_position: an int32 Tensor
138 | bidirectional: a boolean - whether the attention is bidirectional
139 | num_buckets: an integer
140 | max_distance: an integer
141 | Returns:
142 | a Tensor with the same shape as relative_position, containing int32 values in the range [0, num_buckets)
143 | """
144 | relative_buckets = 0
145 | if bidirectional:
146 | num_buckets //= 2
147 | relative_buckets += (relative_position
148 | > 0).to(torch.long) * num_buckets
149 | relative_position = torch.abs(relative_position)
150 | else:
151 | relative_position = - \
152 | torch.min(relative_position,
153 | torch.zeros_like(relative_position))
154 | # now relative_position is in the range [0, inf)
155 |
156 | # half of the buckets are for exact increments in positions
157 | max_exact = num_buckets // 2
158 | is_small = relative_position < max_exact
159 |
160 | # The other half of the buckets are for logarithmically bigger bins in positions up to max_distance
161 | relative_position_if_large = max_exact + (
162 | torch.log(relative_position.float() / max_exact)
163 | / math.log(max_distance / max_exact)
164 | * (num_buckets - max_exact)
165 | ).to(torch.long)
166 | relative_position_if_large = torch.min(
167 | relative_position_if_large, torch.full_like(
168 | relative_position_if_large, num_buckets - 1)
169 | )
170 |
171 | relative_buckets += torch.where(is_small,
172 | relative_position, relative_position_if_large)
173 | return relative_buckets
174 |
175 | def compute_bias_1d(self, query_length, key_length, device=None):
176 | """Compute binned relative position bias"""
177 | if device is None:
178 | device = self.relative_attention_bias.weight.device
179 | context_position = torch.arange(
180 | query_length, dtype=torch.long, device=device)[:, None]
181 | memory_position = torch.arange(
182 | key_length, dtype=torch.long, device=device)[None, :]
183 | relative_position = memory_position - \
184 | context_position # shape (query_length, key_length)
185 | relative_position_bucket = self._relative_position_bucket(
186 | relative_position, # shape (query_length, key_length)
187 | bidirectional=(not self.is_decoder),
188 | num_buckets=self.relative_attention_num_buckets,
189 | max_distance=self.relative_attention_max_distance,
190 | )
191 | # shape (query_length, key_length, num_heads)
192 | values = self.relative_attention_bias(relative_position_bucket)
193 | # shape (1, num_heads, query_length, key_length)
194 | values = values.permute([2, 0, 1]).unsqueeze(0)
195 | return values
196 |
197 |     def compute_vertical_horizontal_bias(self, total_boxes: int = 512, device=None):
198 |         """Compute the binned vertical and horizontal (2-D) position bias"""
199 | 
200 |         denominator_to_divide = total_boxes // self.relative_attention_num_buckets
201 | 
202 | if device is None:
203 | device = self.relative_attention_bias.weight.device
204 | indices = torch.arange(total_boxes, dtype=torch.long, device=device)
205 | h_distances = (indices % self.relative_attention_num_buckets)[
206 | :, None] - (indices % self.relative_attention_num_buckets)[None, :]
207 | v_distances = (
208 | indices // denominator_to_divide)[:, None] - (indices // denominator_to_divide)[None, :]
209 |
210 | h_distances_bucket = self._relative_position_bucket(
211 | h_distances, # shape (query_length, key_length)
212 | bidirectional=(not self.is_decoder),
213 | num_buckets=self.relative_attention_num_buckets,
214 | max_distance=self.relative_attention_max_distance,
215 | )
216 |
217 |         ## This should instead follow the i-Code implementation: https://github.com/microsoft/i-Code/blob/d933ae53eb9dec057e605fa4c89ea701629c5b9d/i-Code-Doc/core/models/embedding/relative/relative.py#L175
218 |         ## so a change is needed here
219 | v_distances_bucket = self._relative_position_bucket(
220 | v_distances, # shape (query_length, key_length)
221 | bidirectional=(not self.is_decoder),
222 | num_buckets=self.relative_attention_num_buckets,
223 | max_distance=self.relative_attention_max_distance,
224 | )
225 |
226 | h_distances_values = self.relative_horizontal_bias(
227 | h_distances_bucket) # shape (query_length, key_length, num_heads)
228 | h_distances_values = h_distances_values.permute([2, 0, 1]).unsqueeze(
229 | 0) # shape (1, num_heads, query_length, key_length)
230 |
231 | v_distances_values = self.relative_vertical_bias(
232 | v_distances_bucket) # shape (query_length, key_length, num_heads)
233 | v_distances_values = v_distances_values.permute([2, 0, 1]).unsqueeze(
234 | 0) # shape (1, num_heads, query_length, key_length)
235 |
236 | return h_distances_values, v_distances_values
237 |
238 | def forward(self, hidden_states, mask=None, key_value_states=None, position_bias=None, past_key_value=None, layer_head_mask=None, query_length=None,
239 | use_cache=False, output_attentions=False):
240 | """
241 | Self-attention (if key_value_states is None) or attention over source sentence (provided by key_value_states).
242 | """
243 | # Input is (batch_size, seq_length, dim)
244 | # Mask is (batch_size, key_length) (non-causal) or (batch_size, key_length, key_length)
245 | # past_key_value[0] is (batch_size, n_heads, q_len - 1, dim_per_head)
246 | batch_size, seq_length = hidden_states.shape[:2]
247 |
248 | real_seq_length = seq_length
249 |
250 | if past_key_value is not None:
251 |             assert len(past_key_value) == 2, \
252 |                 f"past_key_value should have 2 past states: keys and values. Got {len(past_key_value)} past states"
253 |             real_seq_length += past_key_value[0].shape[2] if query_length is None else query_length
254 |
255 | key_length = real_seq_length if key_value_states is None else key_value_states.shape[
256 | 1]
257 |
258 | def shape(states):
259 | "projection"
260 | return states.view(batch_size, -1, self.n_heads, self.key_value_proj_dim).transpose(1, 2)
261 |
262 | def unshape(states):
263 | """reshape"""
264 | return states.transpose(1, 2).contiguous().view(batch_size, -1, self.inner_dim)
265 |
266 | def project(hidden_states, proj_layer, key_value_states, past_key_value):
267 | """project hidden states correctly to key/query states"""
268 | if key_value_states is None:
269 | # self-attn
270 | # (batch_size, n_heads, seq_length, dim_per_head)
271 | hidden_states = shape(proj_layer(hidden_states))
272 | elif past_key_value is None:
273 | # cross-attn
274 | # (batch_size, n_heads, seq_length, dim_per_head)
275 | hidden_states = shape(proj_layer(key_value_states))
276 |
277 | if past_key_value is not None:
278 | if key_value_states is None:
279 | # self-attn
280 | # (batch_size, n_heads, key_length, dim_per_head)
281 | hidden_states = torch.cat(
282 | [past_key_value, hidden_states], dim=2)
283 | elif past_key_value.shape[2] != key_value_states.shape[1]:
284 | # checking that the `sequence_length` of the `past_key_value` is the same as
285 | # the provided `key_value_states` to support prefix tuning
286 | # cross-attn
287 | # (batch_size, n_heads, seq_length, dim_per_head)
288 | hidden_states = shape(proj_layer(key_value_states))
289 | else:
290 | # cross-attn
291 | hidden_states = past_key_value
292 | return hidden_states
293 |
294 | # get query states
295 | query_states = shape(self.q(hidden_states))
296 |
297 | # get key/value states
298 | key_states = project(hidden_states, self.k, key_value_states,
299 | past_key_value[0] if past_key_value is not None else None)
300 |         value_states = project(hidden_states, self.v, key_value_states,
301 |                                past_key_value[1] if past_key_value is not None else None)
302 |
303 | # compute score
304 | # equivalent of torch.einsum("bnqd,bnkd->bnqk", query_states, key_states), compatible with onnx op>9
305 | scores = torch.matmul(query_states, key_states.transpose(3, 2))
306 |
307 | # Sequential Part
308 | if position_bias is None:
309 | if not self.has_relative_attention_bias:
310 | position_bias = torch.zeros(
311 | (1, self.n_heads, real_seq_length, key_length), device=scores.device, dtype=scores.dtype
312 | )
313 | if self.gradient_checkpointing and self.training:
314 | position_bias.requires_grad = True
315 | else:
316 | position_bias = self.compute_bias_1d(
317 | real_seq_length, key_length, device=scores.device)
318 | h_distances_values, v_distances_values = self.compute_vertical_horizontal_bias(
319 | total_boxes=real_seq_length, device=scores.device)
320 | position_bias = position_bias + h_distances_values + v_distances_values
321 |
322 | # if key and values are already calculated
323 | # we want only the last query position bias
324 | if past_key_value is not None:
325 | position_bias = position_bias[:, :, -hidden_states.size(1):, :]
326 |
327 | if mask is not None:
328 | # (batch_size, n_heads, seq_length, key_length)
329 | position_bias = position_bias + mask
330 |
331 | position_bias_masked = position_bias # No pruning right now
332 |
333 | scores += position_bias_masked
334 | attn_weights = nn.functional.softmax(scores.float(), dim=-1).type_as(
335 | scores
336 | ) # (batch_size, n_heads, seq_length, key_length)
337 | attn_weights = nn.functional.dropout(
338 | attn_weights, p=self.dropout, training=self.training
339 | ) # (batch_size, n_heads, seq_length, key_length)
340 |
341 | # Mask heads if we want to
342 | if layer_head_mask is not None:
343 | attn_weights = attn_weights * layer_head_mask
344 |
345 | # (batch_size, seq_length, dim)
346 | attn_output = unshape(torch.matmul(attn_weights, value_states))
347 | attn_output = self.o(attn_output)
348 |
349 | present_key_value_state = (key_states, value_states) if (
350 | self.is_decoder and use_cache) else None
351 | outputs = (attn_output,) + \
352 | (present_key_value_state,) + (position_bias,)
353 |
354 | if output_attentions:
355 | outputs = outputs + (attn_weights,)
356 | return outputs
357 |
358 |
359 | class T5LayerSelfAttention(nn.Module):
360 | def __init__(self, config, has_relative_attention_bias=False):
361 | super().__init__()
362 | self.SelfAttention = T5Attention(
363 | config, has_relative_attention_bias=has_relative_attention_bias)
364 | self.layer_norm = T5LayerNorm(
365 | config.d_model, eps=config.layer_norm_epsilon)
366 | self.dropout = nn.Dropout(config.dropout_rate)
367 |
368 | def forward(self, hidden_states, attention_mask=None, position_bias=None, layer_head_mask=None, past_key_value=None, use_cache=False, output_attentions=False):
369 | normed_hidden_states = self.layer_norm(hidden_states)
370 | attention_output = self.SelfAttention(normed_hidden_states, mask=attention_mask, position_bias=position_bias,
371 | layer_head_mask=layer_head_mask, past_key_value=past_key_value, use_cache=use_cache, output_attentions=output_attentions,)
372 | hidden_states = hidden_states + self.dropout(attention_output[0])
373 | # add attentions if we output them
374 | outputs = (hidden_states,) + attention_output[1:]
375 | return outputs
376 |
377 |
378 | class T5LayerCrossAttention(nn.Module):
379 | def __init__(self, config):
380 | super().__init__()
381 | self.EncDecAttention = T5Attention(
382 | config, has_relative_attention_bias=False)
383 | self.layer_norm = T5LayerNorm(
384 | config.d_model, eps=config.layer_norm_epsilon)
385 | self.dropout = nn.Dropout(config.dropout_rate)
386 |
387 | def forward(self, hidden_states, key_value_states, attention_mask=None, position_bias=None, layer_head_mask=None, past_key_value=None, use_cache=False, query_length=None, output_attentions=False, ):
388 | normed_hidden_states = self.layer_norm(hidden_states)
389 | attention_output = self.EncDecAttention(normed_hidden_states, mask=attention_mask,
390 | key_value_states=key_value_states, position_bias=position_bias,
391 | layer_head_mask=layer_head_mask,
392 | past_key_value=past_key_value,
393 | use_cache=use_cache,
394 | query_length=query_length,
395 | output_attentions=output_attentions,)
396 | layer_output = hidden_states + self.dropout(attention_output[0])
397 | # add attention if we output them
398 | outputs = (layer_output, ) + attention_output[1:]
399 | return outputs
400 |
401 |
402 | class T5Block(nn.Module):
403 | def __init__(self, config, has_relative_attention_bias=False):
404 | super().__init__()
405 | self.is_decoder = config.is_decoder
406 | self.layer = nn.ModuleList()
407 | self.layer.append(T5LayerSelfAttention(
408 | config, has_relative_attention_bias=has_relative_attention_bias))
409 | if self.is_decoder:
410 | self.layer.append(T5LayerCrossAttention(config))
411 |
412 | self.layer.append(T5LayerFF(config))
413 |
414 | def forward(self, hidden_states, attention_mask=None, position_bias=None, encoder_hidden_states=None,
415 | encoder_attention_mask=None, encoder_decoder_position_bias=None, layer_head_mask=None, cross_attn_layer_head_mask=None,
416 | past_key_value=None, use_cache=False, output_attentions=False, return_dict=True):
417 |
418 | if past_key_value is not None:
419 | expected_num_past_key_values = 2 if encoder_hidden_states is None else 4
420 |
421 | if len(past_key_value) != expected_num_past_key_values:
422 | raise ValueError(
423 | f"There should be {expected_num_past_key_values} past states. "
424 | f"{'2 (past / key) for cross attention. ' if expected_num_past_key_values == 4 else ''}"
425 | f"Got {len(past_key_value)} past key / value states"
426 | )
427 |
428 | self_attn_past_key_value = past_key_value[:2]
429 | cross_attn_past_key_value = past_key_value[2:]
430 | else:
431 | self_attn_past_key_value, cross_attn_past_key_value = None, None
432 |
433 | self_attention_outputs = self.layer[0](
434 | hidden_states,
435 | attention_mask=attention_mask,
436 | position_bias=position_bias,
437 | layer_head_mask=layer_head_mask,
438 | past_key_value=self_attn_past_key_value,
439 | use_cache=use_cache,
440 | output_attentions=output_attentions,
441 | )
442 | hidden_states, present_key_value_state = self_attention_outputs[:2]
443 | # Keep self-attention outputs and relative position weights
444 | attention_outputs = self_attention_outputs[2:]
445 |
446 | # clamp inf values to enable fp16 training
447 | if hidden_states.dtype == torch.float16 and torch.isinf(hidden_states).any():
448 | clamp_value = torch.finfo(hidden_states.dtype).max - 1000
449 | hidden_states = torch.clamp(
450 | hidden_states, min=-clamp_value, max=clamp_value)
451 |
452 | do_cross_attention = self.is_decoder and encoder_hidden_states is not None
453 | if do_cross_attention:
454 | # the actual query length is unknown for cross attention
455 | # if using past key value states. Need to inject it here
456 | if present_key_value_state is not None:
457 | query_length = present_key_value_state[0].shape[2]
458 | else:
459 | query_length = None
460 |
461 | cross_attention_outputs = self.layer[1](
462 | hidden_states,
463 | key_value_states=encoder_hidden_states,
464 | attention_mask=encoder_attention_mask,
465 | position_bias=encoder_decoder_position_bias,
466 | layer_head_mask=cross_attn_layer_head_mask,
467 | past_key_value=cross_attn_past_key_value,
468 | query_length=query_length,
469 | use_cache=use_cache,
470 | output_attentions=output_attentions,
471 | )
472 | hidden_states = cross_attention_outputs[0]
473 |
474 | # clamp inf values to enable fp16 training
475 | if hidden_states.dtype == torch.float16 and torch.isinf(hidden_states).any():
476 | clamp_value = torch.finfo(hidden_states.dtype).max - 1000
477 | hidden_states = torch.clamp(
478 | hidden_states, min=-clamp_value, max=clamp_value)
479 |
480 | # Combine self attn and cross attn key value states
481 | if present_key_value_state is not None:
482 | present_key_value_state = present_key_value_state + \
483 | cross_attention_outputs[1]
484 |
485 | # Keep cross-attention outputs and relative position weights
486 | attention_outputs = attention_outputs + cross_attention_outputs[2:]
487 |
488 | # Apply Feed Forward layer
489 | hidden_states = self.layer[-1](hidden_states)
490 |
491 | # clamp inf values to enable fp16 training
492 | if hidden_states.dtype == torch.float16 and torch.isinf(hidden_states).any():
493 | clamp_value = torch.finfo(hidden_states.dtype).max - 1000
494 | hidden_states = torch.clamp(
495 | hidden_states, min=-clamp_value, max=clamp_value)
496 |
497 | outputs = (hidden_states,)
498 |
499 | if use_cache:
500 | outputs = outputs + (present_key_value_state,) + attention_outputs
501 | else:
502 | outputs = outputs + attention_outputs
503 |
504 | # hidden-states, present_key_value_states, (self-attention position bias), (self-attention weights), (cross-attention position bias), (cross-attention weights)
505 | return outputs
506 |
507 |
508 | class T5Stack(t5.modeling_t5.T5Stack):
509 | def __init__(self, config, embed_tokens=None):
510 |         '''Only the `T5Block` changes, so the block list has to be rebuilt with our implementation'''
511 | super().__init__(config=config, embed_tokens=embed_tokens)
512 | self.block = nn.ModuleList(
513 | [T5Block(config, has_relative_attention_bias=bool(i == 0))
514 | for i in range(config.num_layers)]
515 | )
516 |
517 | def forward(
518 | self,
519 | input_ids=None,
520 | attention_mask=None,
521 | encoder_hidden_states=None,
522 | encoder_attention_mask=None,
523 | inputs_embeds=None,
524 | head_mask=None,
525 | cross_attn_head_mask=None,
526 | past_key_values=None,
527 | use_cache=None,
528 | output_attentions=None,
529 | output_hidden_states=None,
530 | return_dict=None,
531 | ):
532 |
533 | return super().forward(input_ids=input_ids, attention_mask=attention_mask, encoder_hidden_states=encoder_hidden_states, encoder_attention_mask=encoder_attention_mask,
534 | inputs_embeds=inputs_embeds, head_mask=head_mask, cross_attn_head_mask=cross_attn_head_mask, past_key_values=past_key_values,
535 | use_cache=use_cache, output_attentions=output_attentions, output_hidden_states=output_hidden_states, return_dict=return_dict)
536 |
537 |
538 | class T5Model(t5.modeling_t5.T5Model):
539 | def __init__(self, config):
540 | super().__init__(config=config)
541 |
542 | self.config = config
543 | encoder_config = copy.deepcopy(config)
544 | decoder_config = copy.deepcopy(config)
545 | decoder_config.update(dict(is_decoder=True))
546 |
547 | self.encoder = T5Stack(encoder_config, self.shared)
548 | self.decoder = T5Stack(decoder_config, self.shared)
549 |
550 | self.post_init()
551 |
552 | def forward(self, **kwargs):
553 | return super().forward(**kwargs)
554 |
555 | def load_weights(self):
556 | dummy_model = AutoModel.from_pretrained(self.config._name_or_path)
557 | self.load_state_dict(dummy_model.state_dict(), strict=False)
558 | print("Weights loaded successfully!")
559 |
560 |
561 | class T5ForConditionalGeneration(t5.modeling_t5.T5ForConditionalGeneration):
562 | def __init__(self, config):
563 |         '''
564 |         Similar to the T5ForConditionalGeneration described in the `hugging_face` repository, tweaked to add the
565 |         `relative_horizontal_bias` and `relative_vertical_bias` in the `T5Attention` class. The approach is
566 |         generative in nature, so it may also be usable for other datasets, such as question answering.
567 |         '''
568 |
569 | super().__init__(config=config)
570 |
571 | self.config = config
572 | encoder_config = copy.deepcopy(config)
573 | decoder_config = copy.deepcopy(config)
574 | # In the pretrained version, the decoder config, the `is_decoder` option is True
575 | decoder_config.update(dict(is_decoder=True))
576 |
577 | self.encoder = T5Stack(encoder_config, self.shared)
578 | self.decoder = T5Stack(decoder_config, self.shared)
579 |
580 | if config.load_weights:
581 | self.load_weights()
582 | else:
583 | self.post_init()
584 | print("Initialization done without loading the weights")
585 |
586 | def forward(self, **kwargs):
587 | '''Same as mentioned in the hugging face's implementation'''
588 | return super().forward(**kwargs)
589 |
590 | def load_weights(self):
591 | '''
592 | Loads the weights from the pretrained model
593 | '''
594 | dummy_model = AutoModel.from_pretrained(self.config._name_or_path)
595 | self.load_state_dict(dummy_model.state_dict(), strict=False)
596 | print("Weights loaded successfully!")
597 |
598 |
599 | class T5EncoderModel(t5.modeling_t5.T5EncoderModel):
600 | def __init__(self, config):
601 | '''
602 | It is similar to the T5EncoderModel described in the `hugging_face` repository, however I had to tweak it a bit,
603 | since there is an addition of the `relative_horizontal_bias` as well as `relative_vertical_bias` in the `T5Attention` class
604 | '''
605 | super().__init__(config=config)
606 | self.encoder = T5Stack(config, self.shared)
607 | self.post_init()
608 |
609 | def forward(self, **kwargs):
610 | '''Similar to the `T5EncoderModel` mentioned in the hugging face's t5 implementation'''
611 | return super().forward(**kwargs)
612 |
613 |
614 | class T5ForConditionalGenerationAbstractive(t5.modeling_t5.T5ForConditionalGeneration):
615 | def __init__(self, config):
616 |         '''
617 |         T5ForConditionalGenerationAbstractive is a T5ForConditionalGeneration model with a linear layer on top of the
618 |         decoder output, projecting the last decoder hidden states onto `config.num_classes` labels instead of the vocabulary.
619 | 
620 |         It is similar to T5ForConditionalGeneration, which follows the generative formulation (what I implemented first);
621 |         the authors, however, used the extractive approach, so the essential tweak here is the replaced `self.lm_head`.
622 |         '''
623 |
624 | super().__init__(config=config)
625 |
626 | self.config = config
627 | encoder_config = copy.deepcopy(config)
628 | decoder_config = copy.deepcopy(config)
629 | # In the pretrained version, the decoder config, the `is_decoder` option is True
630 | decoder_config.update(dict(is_decoder=True))
631 |
632 | self.encoder = T5Stack(encoder_config, self.shared)
633 | self.decoder = T5Stack(decoder_config, self.shared)
634 | self.lm_head = nn.Linear(in_features=config.d_model,
635 | out_features=config.num_classes, bias=False)
636 |
637 | if config.load_weights:
638 | self.load_weights()
639 | else:
640 | self.post_init()
641 | print("Initialization done without loading the weights")
642 |
643 | def forward(self, **kwargs):
644 | '''
645 |         Forward pass of T5ForConditionalGenerationAbstractive. It is the same as T5ForConditionalGeneration's forward,
646 |         which follows the generative-answer approach I used earlier
647 | '''
648 | return super().forward(**kwargs)
649 |
650 | def load_weights(self):
651 | '''
652 | Load the weights of the T5ForConditionalGenerationAbstractive model
653 |         It works with both the `t5-base` and `t5-large` configuration settings
654 | '''
655 | dummy_model = AutoModel.from_pretrained(self.config._name_or_path)
656 | self.load_state_dict(dummy_model.state_dict(), strict=False)
657 | print("Weights loaded successfully!")
658 |
--------------------------------------------------------------------------------
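A note on the `load_weights` pattern above: the tweaked `T5Attention` introduces `relative_horizontal_bias` and `relative_vertical_bias` tables that do not exist in the vanilla checkpoint, which is why every `load_weights` call passes `strict=False`. Below is a minimal sketch (not part of the repo) for inspecting exactly which keys a non-strict load skips; it assumes `src/` is on `sys.path` and mirrors the extra config fields the notebooks attach:

    from transformers import AutoConfig, AutoModel
    from t5 import T5ForConditionalGeneration

    config = AutoConfig.from_pretrained("t5-base")
    # load_weights=False so that __init__ calls post_init() instead of
    # pulling the checkpoint itself; the rest mirrors the notebook config.
    config.update(dict(in_channels=3, num_pool_layers=3, channels=16,
                       model_max_length=512, output_size=(3, 3),
                       spatial_scale=48 / 384, sampling_ratio=2,
                       use_cache=False, load_weights=False))

    model = T5ForConditionalGeneration(config)
    pretrained = AutoModel.from_pretrained("t5-base")

    # strict=False reports mismatches instead of raising; the missing keys
    # should be exactly the new relative-bias tables, which keep their fresh init.
    missing, unexpected = model.load_state_dict(pretrained.state_dict(), strict=False)
    print([k for k in missing if "horizontal_bias" in k or "vertical_bias" in k])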
/how_did_i_prepare_the_stuffs/tilt_part_3_1_aligning_all_the_parts_to_make_tilt.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "nbformat": 4,
3 | "nbformat_minor": 0,
4 | "metadata": {
5 | "colab": {
6 | "provenance": [],
7 | "authorship_tag": "ABX9TyNdtYJJE/gqRdJD2aJ7nI9l",
8 | "include_colab_link": true
9 | },
10 | "kernelspec": {
11 | "name": "python3",
12 | "display_name": "Python 3"
13 | },
14 | "language_info": {
15 | "name": "python"
16 | },
17 | "widgets": {
18 | "application/vnd.jupyter.widget-state+json": {
19 | "250938dfb71d45e0858cd757df64cd8b": {
20 | "model_module": "@jupyter-widgets/controls",
21 | "model_name": "HBoxModel",
22 | "model_module_version": "1.5.0",
23 | "state": {
24 | "_dom_classes": [],
25 | "_model_module": "@jupyter-widgets/controls",
26 | "_model_module_version": "1.5.0",
27 | "_model_name": "HBoxModel",
28 | "_view_count": null,
29 | "_view_module": "@jupyter-widgets/controls",
30 | "_view_module_version": "1.5.0",
31 | "_view_name": "HBoxView",
32 | "box_style": "",
33 | "children": [
34 | "IPY_MODEL_20ea30cbafb84c618979bd41e8d921d6",
35 | "IPY_MODEL_ff6f0b419450464a9daf73a41e357930",
36 | "IPY_MODEL_681f7ed3e39442cd8f987db7a131f4e1"
37 | ],
38 | "layout": "IPY_MODEL_33af6b7b9d7e4c8d90a9e4d66159f951"
39 | }
40 | },
41 | "20ea30cbafb84c618979bd41e8d921d6": {
42 | "model_module": "@jupyter-widgets/controls",
43 | "model_name": "HTMLModel",
44 | "model_module_version": "1.5.0",
45 | "state": {
46 | "_dom_classes": [],
47 | "_model_module": "@jupyter-widgets/controls",
48 | "_model_module_version": "1.5.0",
49 | "_model_name": "HTMLModel",
50 | "_view_count": null,
51 | "_view_module": "@jupyter-widgets/controls",
52 | "_view_module_version": "1.5.0",
53 | "_view_name": "HTMLView",
54 | "description": "",
55 | "description_tooltip": null,
56 | "layout": "IPY_MODEL_3959923118c84010b4b0024b28ca2734",
57 | "placeholder": "",
58 | "style": "IPY_MODEL_b3fd33e82ba7455981c0499df81cb0d5",
59 | "value": "100%"
60 | }
61 | },
62 | "ff6f0b419450464a9daf73a41e357930": {
63 | "model_module": "@jupyter-widgets/controls",
64 | "model_name": "FloatProgressModel",
65 | "model_module_version": "1.5.0",
66 | "state": {
67 | "_dom_classes": [],
68 | "_model_module": "@jupyter-widgets/controls",
69 | "_model_module_version": "1.5.0",
70 | "_model_name": "FloatProgressModel",
71 | "_view_count": null,
72 | "_view_module": "@jupyter-widgets/controls",
73 | "_view_module_version": "1.5.0",
74 | "_view_name": "ProgressView",
75 | "bar_style": "success",
76 | "description": "",
77 | "description_tooltip": null,
78 | "layout": "IPY_MODEL_59955fdf275549ca8873c9b053419fd7",
79 | "max": 2,
80 | "min": 0,
81 | "orientation": "horizontal",
82 | "style": "IPY_MODEL_5e288264d36f49c58f09b95ae3587e26",
83 | "value": 2
84 | }
85 | },
86 | "681f7ed3e39442cd8f987db7a131f4e1": {
87 | "model_module": "@jupyter-widgets/controls",
88 | "model_name": "HTMLModel",
89 | "model_module_version": "1.5.0",
90 | "state": {
91 | "_dom_classes": [],
92 | "_model_module": "@jupyter-widgets/controls",
93 | "_model_module_version": "1.5.0",
94 | "_model_name": "HTMLModel",
95 | "_view_count": null,
96 | "_view_module": "@jupyter-widgets/controls",
97 | "_view_module_version": "1.5.0",
98 | "_view_name": "HTMLView",
99 | "description": "",
100 | "description_tooltip": null,
101 | "layout": "IPY_MODEL_66bcfafd26e14fb8b333d0da2efc222f",
102 | "placeholder": "",
103 | "style": "IPY_MODEL_c3deff7939634161bca16addf85406a4",
104 | "value": " 2/2 [00:00<00:00, 21.71it/s]"
105 | }
106 | },
107 | "33af6b7b9d7e4c8d90a9e4d66159f951": {
108 | "model_module": "@jupyter-widgets/base",
109 | "model_name": "LayoutModel",
110 | "model_module_version": "1.2.0",
111 | "state": {
112 | "_model_module": "@jupyter-widgets/base",
113 | "_model_module_version": "1.2.0",
114 | "_model_name": "LayoutModel",
115 | "_view_count": null,
116 | "_view_module": "@jupyter-widgets/base",
117 | "_view_module_version": "1.2.0",
118 | "_view_name": "LayoutView",
119 | "align_content": null,
120 | "align_items": null,
121 | "align_self": null,
122 | "border": null,
123 | "bottom": null,
124 | "display": null,
125 | "flex": null,
126 | "flex_flow": null,
127 | "grid_area": null,
128 | "grid_auto_columns": null,
129 | "grid_auto_flow": null,
130 | "grid_auto_rows": null,
131 | "grid_column": null,
132 | "grid_gap": null,
133 | "grid_row": null,
134 | "grid_template_areas": null,
135 | "grid_template_columns": null,
136 | "grid_template_rows": null,
137 | "height": null,
138 | "justify_content": null,
139 | "justify_items": null,
140 | "left": null,
141 | "margin": null,
142 | "max_height": null,
143 | "max_width": null,
144 | "min_height": null,
145 | "min_width": null,
146 | "object_fit": null,
147 | "object_position": null,
148 | "order": null,
149 | "overflow": null,
150 | "overflow_x": null,
151 | "overflow_y": null,
152 | "padding": null,
153 | "right": null,
154 | "top": null,
155 | "visibility": null,
156 | "width": null
157 | }
158 | },
159 | "3959923118c84010b4b0024b28ca2734": {
160 | "model_module": "@jupyter-widgets/base",
161 | "model_name": "LayoutModel",
162 | "model_module_version": "1.2.0",
163 | "state": {
164 | "_model_module": "@jupyter-widgets/base",
165 | "_model_module_version": "1.2.0",
166 | "_model_name": "LayoutModel",
167 | "_view_count": null,
168 | "_view_module": "@jupyter-widgets/base",
169 | "_view_module_version": "1.2.0",
170 | "_view_name": "LayoutView",
171 | "align_content": null,
172 | "align_items": null,
173 | "align_self": null,
174 | "border": null,
175 | "bottom": null,
176 | "display": null,
177 | "flex": null,
178 | "flex_flow": null,
179 | "grid_area": null,
180 | "grid_auto_columns": null,
181 | "grid_auto_flow": null,
182 | "grid_auto_rows": null,
183 | "grid_column": null,
184 | "grid_gap": null,
185 | "grid_row": null,
186 | "grid_template_areas": null,
187 | "grid_template_columns": null,
188 | "grid_template_rows": null,
189 | "height": null,
190 | "justify_content": null,
191 | "justify_items": null,
192 | "left": null,
193 | "margin": null,
194 | "max_height": null,
195 | "max_width": null,
196 | "min_height": null,
197 | "min_width": null,
198 | "object_fit": null,
199 | "object_position": null,
200 | "order": null,
201 | "overflow": null,
202 | "overflow_x": null,
203 | "overflow_y": null,
204 | "padding": null,
205 | "right": null,
206 | "top": null,
207 | "visibility": null,
208 | "width": null
209 | }
210 | },
211 | "b3fd33e82ba7455981c0499df81cb0d5": {
212 | "model_module": "@jupyter-widgets/controls",
213 | "model_name": "DescriptionStyleModel",
214 | "model_module_version": "1.5.0",
215 | "state": {
216 | "_model_module": "@jupyter-widgets/controls",
217 | "_model_module_version": "1.5.0",
218 | "_model_name": "DescriptionStyleModel",
219 | "_view_count": null,
220 | "_view_module": "@jupyter-widgets/base",
221 | "_view_module_version": "1.2.0",
222 | "_view_name": "StyleView",
223 | "description_width": ""
224 | }
225 | },
226 | "59955fdf275549ca8873c9b053419fd7": {
227 | "model_module": "@jupyter-widgets/base",
228 | "model_name": "LayoutModel",
229 | "model_module_version": "1.2.0",
230 | "state": {
231 | "_model_module": "@jupyter-widgets/base",
232 | "_model_module_version": "1.2.0",
233 | "_model_name": "LayoutModel",
234 | "_view_count": null,
235 | "_view_module": "@jupyter-widgets/base",
236 | "_view_module_version": "1.2.0",
237 | "_view_name": "LayoutView",
238 | "align_content": null,
239 | "align_items": null,
240 | "align_self": null,
241 | "border": null,
242 | "bottom": null,
243 | "display": null,
244 | "flex": null,
245 | "flex_flow": null,
246 | "grid_area": null,
247 | "grid_auto_columns": null,
248 | "grid_auto_flow": null,
249 | "grid_auto_rows": null,
250 | "grid_column": null,
251 | "grid_gap": null,
252 | "grid_row": null,
253 | "grid_template_areas": null,
254 | "grid_template_columns": null,
255 | "grid_template_rows": null,
256 | "height": null,
257 | "justify_content": null,
258 | "justify_items": null,
259 | "left": null,
260 | "margin": null,
261 | "max_height": null,
262 | "max_width": null,
263 | "min_height": null,
264 | "min_width": null,
265 | "object_fit": null,
266 | "object_position": null,
267 | "order": null,
268 | "overflow": null,
269 | "overflow_x": null,
270 | "overflow_y": null,
271 | "padding": null,
272 | "right": null,
273 | "top": null,
274 | "visibility": null,
275 | "width": null
276 | }
277 | },
278 | "5e288264d36f49c58f09b95ae3587e26": {
279 | "model_module": "@jupyter-widgets/controls",
280 | "model_name": "ProgressStyleModel",
281 | "model_module_version": "1.5.0",
282 | "state": {
283 | "_model_module": "@jupyter-widgets/controls",
284 | "_model_module_version": "1.5.0",
285 | "_model_name": "ProgressStyleModel",
286 | "_view_count": null,
287 | "_view_module": "@jupyter-widgets/base",
288 | "_view_module_version": "1.2.0",
289 | "_view_name": "StyleView",
290 | "bar_color": null,
291 | "description_width": ""
292 | }
293 | },
294 | "66bcfafd26e14fb8b333d0da2efc222f": {
295 | "model_module": "@jupyter-widgets/base",
296 | "model_name": "LayoutModel",
297 | "model_module_version": "1.2.0",
298 | "state": {
299 | "_model_module": "@jupyter-widgets/base",
300 | "_model_module_version": "1.2.0",
301 | "_model_name": "LayoutModel",
302 | "_view_count": null,
303 | "_view_module": "@jupyter-widgets/base",
304 | "_view_module_version": "1.2.0",
305 | "_view_name": "LayoutView",
306 | "align_content": null,
307 | "align_items": null,
308 | "align_self": null,
309 | "border": null,
310 | "bottom": null,
311 | "display": null,
312 | "flex": null,
313 | "flex_flow": null,
314 | "grid_area": null,
315 | "grid_auto_columns": null,
316 | "grid_auto_flow": null,
317 | "grid_auto_rows": null,
318 | "grid_column": null,
319 | "grid_gap": null,
320 | "grid_row": null,
321 | "grid_template_areas": null,
322 | "grid_template_columns": null,
323 | "grid_template_rows": null,
324 | "height": null,
325 | "justify_content": null,
326 | "justify_items": null,
327 | "left": null,
328 | "margin": null,
329 | "max_height": null,
330 | "max_width": null,
331 | "min_height": null,
332 | "min_width": null,
333 | "object_fit": null,
334 | "object_position": null,
335 | "order": null,
336 | "overflow": null,
337 | "overflow_x": null,
338 | "overflow_y": null,
339 | "padding": null,
340 | "right": null,
341 | "top": null,
342 | "visibility": null,
343 | "width": null
344 | }
345 | },
346 | "c3deff7939634161bca16addf85406a4": {
347 | "model_module": "@jupyter-widgets/controls",
348 | "model_name": "DescriptionStyleModel",
349 | "model_module_version": "1.5.0",
350 | "state": {
351 | "_model_module": "@jupyter-widgets/controls",
352 | "_model_module_version": "1.5.0",
353 | "_model_name": "DescriptionStyleModel",
354 | "_view_count": null,
355 | "_view_module": "@jupyter-widgets/base",
356 | "_view_module_version": "1.2.0",
357 | "_view_name": "StyleView",
358 | "description_width": ""
359 | }
360 | }
361 | }
362 | },
363 | "accelerator": "GPU",
364 | "gpuClass": "standard"
365 | },
366 | "cells": [
367 | {
368 | "cell_type": "markdown",
369 | "metadata": {
370 | "id": "view-in-github",
371 | "colab_type": "text"
372 | },
373 | "source": [
374 |         "<a href=\"https://colab.research.google.com/github/uakarsh/TiLT-Implementation/blob/main/how_did_i_prepare_the_stuffs/tilt_part_3_1_aligning_all_the_parts_to_make_tilt.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
375 | ]
376 | },
377 | {
378 | "cell_type": "code",
379 | "execution_count": null,
380 | "metadata": {
381 | "colab": {
382 | "base_uri": "https://localhost:8080/"
383 | },
384 | "id": "15ohrRAbDZ4v",
385 | "outputId": "5d5eec1e-0227-45c0-c9f8-337e02486c1b"
386 | },
387 | "outputs": [
388 | {
389 | "output_type": "stream",
390 | "name": "stdout",
391 | "text": [
392 | "fatal: destination path 'TiLT-Implementation' already exists and is not an empty directory.\n"
393 | ]
394 | }
395 | ],
396 | "source": [
397 | "!git clone https://github.com/uakarsh/TiLT-Implementation.git"
398 | ]
399 | },
400 | {
401 | "cell_type": "code",
402 | "source": [
403 | "!pip install -r /content/TiLT-Implementation/requirements.txt"
404 | ],
405 | "metadata": {
406 | "id": "IlsCNhv3D0hx"
407 | },
408 | "execution_count": null,
409 | "outputs": []
410 | },
411 | {
412 | "cell_type": "code",
413 | "source": [
414 | "import sys\n",
415 | "sys.path.append(\"/content/TiLT-Implementation/src/\")"
416 | ],
417 | "metadata": {
418 | "id": "1W4eZnAcD3Pg"
419 | },
420 | "execution_count": null,
421 | "outputs": []
422 | },
423 | {
424 | "cell_type": "code",
425 | "source": [
426 | "from transformers import AutoTokenizer, AutoConfig\n",
427 | "from datasets import load_dataset\n",
428 | "import torch\n",
429 | "import torch.nn as nn\n",
430 | "\n",
431 | "from dataset import FUNSDDs\n",
432 | "from torchvision import transforms\n",
433 | "from tqdm.auto import tqdm\n",
434 | "\n",
435 | "## Custom imports\n",
436 | "from visual_backbone import Unet_encoder, RoIPool\n",
437 | "from t5 import T5ForConditionalGeneration, T5Stack\n",
438 | "from transformers import AutoModel"
439 | ],
440 | "metadata": {
441 | "id": "hshzmmrID39p"
442 | },
443 | "execution_count": null,
444 | "outputs": []
445 | },
446 | {
447 | "cell_type": "markdown",
448 | "source": [
449 | "## 1.1. Preparing the dataset"
450 | ],
451 | "metadata": {
452 | "id": "uPeVZmMeEbyf"
453 | }
454 | },
455 | {
456 | "cell_type": "code",
457 | "source": [
458 | "device = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n",
459 | "\n",
460 | "hf_ds = load_dataset(\"nielsr/funsd-layoutlmv3\")\n",
461 | "model_name = \"t5-base\"\n",
462 | "## Visual Embedding extractor's parameters\n",
463 | "in_channels = 3\n",
464 | "num_pool_layers = 3\n",
465 | "channels = 16\n",
466 | "sampling_ratio = 2\n",
467 | "spatial_scale = 48 / 384\n",
468 | "output_size = (3,3)\n",
469 | "load_weights = True\n",
470 | "\n",
471 | "## Tokenizer's parameter\n",
472 | "model_max_length = 512\n",
473 | "\n",
474 | "t5_config = AutoConfig.from_pretrained(model_name)\n",
475 | "## Adding new parameters\n",
476 | "t5_config.update(dict(in_channels = in_channels, num_pool_layers = num_pool_layers, channels = channels, model_max_length = model_max_length,\n",
477 | " output_size = output_size, spatial_scale = spatial_scale, sampling_ratio = sampling_ratio, use_cache = False, load_weights = load_weights))\n",
478 | "\n",
479 | "## Tokenizer\n",
480 | "tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast = True, model_max_length = model_max_length)"
481 | ],
482 | "metadata": {
483 | "colab": {
484 | "base_uri": "https://localhost:8080/",
485 | "height": 86,
486 | "referenced_widgets": [
487 | "250938dfb71d45e0858cd757df64cd8b",
488 | "20ea30cbafb84c618979bd41e8d921d6",
489 | "ff6f0b419450464a9daf73a41e357930",
490 | "681f7ed3e39442cd8f987db7a131f4e1",
491 | "33af6b7b9d7e4c8d90a9e4d66159f951",
492 | "3959923118c84010b4b0024b28ca2734",
493 | "b3fd33e82ba7455981c0499df81cb0d5",
494 | "59955fdf275549ca8873c9b053419fd7",
495 | "5e288264d36f49c58f09b95ae3587e26",
496 | "66bcfafd26e14fb8b333d0da2efc222f",
497 | "c3deff7939634161bca16addf85406a4"
498 | ]
499 | },
500 | "id": "qzcvnEyYD-KC",
501 | "outputId": "a7f9bab8-de97-400c-edbf-36ba169ab33d"
502 | },
503 | "execution_count": null,
504 | "outputs": [
505 | {
506 | "output_type": "stream",
507 | "name": "stderr",
508 | "text": [
509 | "WARNING:datasets.builder:Found cached dataset funsd-layoutlmv3 (/root/.cache/huggingface/datasets/nielsr___funsd-layoutlmv3/funsd/1.0.0/0e3f4efdfd59aa1c3b4952c517894f7b1fc4d75c12ef01bcc8626a69e41c1bb9)\n"
510 | ]
511 | },
512 | {
513 | "output_type": "display_data",
514 | "data": {
515 | "text/plain": [
516 | " 0%| | 0/2 [00:00, ?it/s]"
517 | ],
518 | "application/vnd.jupyter.widget-view+json": {
519 | "version_major": 2,
520 | "version_minor": 0,
521 | "model_id": "250938dfb71d45e0858cd757df64cd8b"
522 | }
523 | },
524 | "metadata": {}
525 | }
526 | ]
527 | },
528 | {
529 | "cell_type": "code",
530 | "source": [
531 | "def get_id2label_and_label2id():\n",
532 | " label2id = {'O': 0, 'B-HEADER': 1, 'I-HEADER': 2, 'B-QUESTION': 3, 'I-QUESTION': 4, 'B-ANSWER': 5, 'I-ANSWER': 6}\n",
533 | " id2label = {0: 'O', 1: 'B-HEADER', 2: 'I-HEADER', 3: 'B-QUESTION', 4: 'I-QUESTION', 5: 'B-ANSWER', 6: 'I-ANSWER'}\n",
534 | " return id2label, label2id\n",
535 | "\n",
536 | "def convert_id_to_label(list_of_label):\n",
537 | " return [id2label[x] for x in list_of_label]"
538 | ],
539 | "metadata": {
540 | "id": "DOzw5T71EJtJ"
541 | },
542 | "execution_count": null,
543 | "outputs": []
544 | },
545 | {
546 | "cell_type": "code",
547 | "source": [
548 | "id2label, label2id = get_id2label_and_label2id()\n",
549 | "transform = transforms.Compose([transforms.ToTensor(), \n",
550 | " transforms.Lambda(lambda x : 2 * x - 1)])"
551 | ],
552 | "metadata": {
553 | "id": "tObzBFUUEFGo"
554 | },
555 | "execution_count": null,
556 | "outputs": []
557 | },
558 | {
559 | "cell_type": "code",
560 | "source": [
561 | "train_new_tags = list(map(lambda x : convert_id_to_label(x), hf_ds['train']['ner_tags']))\n",
562 | "test_new_tags = list(map(lambda x : convert_id_to_label(x), hf_ds['test']['ner_tags']))"
563 | ],
564 | "metadata": {
565 | "id": "2x4vv43XEOKV"
566 | },
567 | "execution_count": null,
568 | "outputs": []
569 | },
570 | {
571 | "cell_type": "code",
572 | "source": [
573 | "hf_ds['train'] = hf_ds['train'].remove_columns(\"ner_tags\").add_column(\"ner_tags\", train_new_tags)\n",
574 | "hf_ds['test'] = hf_ds['test'].remove_columns(\"ner_tags\").add_column(\"ner_tags\", test_new_tags)"
575 | ],
576 | "metadata": {
577 | "id": "UcmRobRhEQ0E"
578 | },
579 | "execution_count": null,
580 | "outputs": []
581 | },
582 | {
583 | "cell_type": "code",
584 | "source": [
585 | "train_ds = FUNSDDs(hf_ds['train'],tokenizer = tokenizer, transform = transform)\n",
586 | "val_ds = FUNSDDs(hf_ds['test'],tokenizer = tokenizer, transform = transform)"
587 | ],
588 | "metadata": {
589 | "id": "cknoT_1YEQ9A"
590 | },
591 | "execution_count": null,
592 | "outputs": []
593 | },
594 | {
595 | "cell_type": "markdown",
596 | "source": [
597 | "### 1.2 Writing the `collate_fn` for custom handling of the dataloader"
598 | ],
599 | "metadata": {
600 | "id": "8bHthQUuEeVD"
601 | }
602 | },
603 | {
604 | "cell_type": "code",
605 | "source": [
606 | "class CollateFn(object):\n",
607 | " def __init__(self, tokenizer):\n",
608 | " self.tokenizer = tokenizer\n",
609 | "\n",
610 | " def __call__(self, list_of_ds):\n",
611 | " simple_keys = [\"input_ids\", \"attention_mask\", \"bboxes\", \"pixel_values\" ]\n",
612 | " actual_batch = {}\n",
613 | " for key in simple_keys:\n",
614 | " actual_batch[key] = torch.stack([x[key] for x in list_of_ds])\n",
615 | " \n",
616 | " actual_batch['labels'] = self.tokenizer.batch_encode_plus([x['labels'] for x in list_of_ds], return_tensors = 'pt', is_split_into_words = True,\n",
617 | " padding='max_length', truncation = True)['input_ids']\n",
618 | " return actual_batch"
619 | ],
620 | "metadata": {
621 | "id": "fchZF6hxESY6"
622 | },
623 | "execution_count": null,
624 | "outputs": []
625 | },
626 | {
627 | "cell_type": "code",
628 | "source": [
629 | "collate_fn = CollateFn(tokenizer)"
630 | ],
631 | "metadata": {
632 | "id": "HqlPfz8TET7g"
633 | },
634 | "execution_count": null,
635 | "outputs": []
636 | },
637 | {
638 | "cell_type": "code",
639 | "source": [
640 | "# sample_batch_encoding = collate_fn([train_ds[0], train_ds[1]])\n",
641 | "# for key in sample_batch_encoding:\n",
642 | "# sample_batch_encoding[key] = sample_batch_encoding[key].to(device)\n",
643 | "# # print(f\"Key : {key}, has shape : {sample_batch_encoding[key].shape}\")"
644 | ],
645 | "metadata": {
646 | "id": "m26xnsfXEVFJ"
647 | },
648 | "execution_count": null,
649 | "outputs": []
650 | },
651 | {
652 | "cell_type": "markdown",
653 | "source": [
654 | "## 2.1 Preparing the visual model"
655 | ],
656 | "metadata": {
657 | "id": "u5qhZ38OGQDJ"
658 | }
659 | },
660 | {
661 | "cell_type": "code",
662 | "source": [
663 | "class VisualEmbedding(nn.Module):\n",
664 | " def __init__(self, config):\n",
665 | " super().__init__()\n",
666 | " self.unet_encoder = Unet_encoder(in_channels = config.in_channels, channels = config.channels, num_pool_layers = config.num_pool_layers)\n",
667 | " self.roi_pool = RoIPool(output_size = config.output_size, spatial_scale = config.spatial_scale)\n",
668 | " self.proj = nn.Linear(in_features = 128 * 3 * 3, out_features = config.d_model)\n",
669 | " self.config = config\n",
670 | "\n",
671 | " def forward(self, pixel_values, bboxes):\n",
672 | " image_embedding = self.unet_encoder(pixel_values)\n",
673 | " feature_maps_bboxes = self.roi_pool(image_embedding, bboxes).flatten(2)\n",
674 | " projection = self.proj(feature_maps_bboxes)\n",
675 | " return projection"
676 | ],
677 | "metadata": {
678 | "id": "_cVQaYBUEZbv"
679 | },
680 | "execution_count": null,
681 | "outputs": []
682 | },
683 | {
684 | "cell_type": "code",
685 | "source": [
686 | "# visual_embedding_extractor = VisualEmbedding(t5_config).to(device)"
687 | ],
688 | "metadata": {
689 | "id": "-24G_PZ7iBKZ"
690 | },
691 | "execution_count": null,
692 | "outputs": []
693 | },
694 | {
695 | "cell_type": "code",
696 | "source": [
697 | "# visual_embedding = visual_embedding_extractor(pixel_values = sample_batch_encoding['pixel_values'], bboxes = sample_batch_encoding['bboxes'])"
698 | ],
699 | "metadata": {
700 | "id": "yjDdffEsjJiR"
701 | },
702 | "execution_count": null,
703 | "outputs": []
704 | },
705 | {
706 | "cell_type": "markdown",
707 | "source": [
708 | "## 2.2 Preparing the semantic model"
709 | ],
710 | "metadata": {
711 | "id": "jdJzjR8elcoj"
712 | }
713 | },
714 | {
715 | "cell_type": "code",
716 | "source": [
717 | "# t5_model = T5ForConditionalGeneration(t5_config).to(device)"
718 | ],
719 | "metadata": {
720 | "id": "hBUB23n6lrD0"
721 | },
722 | "execution_count": null,
723 | "outputs": []
724 | },
725 | {
726 | "cell_type": "code",
727 | "source": [
728 | "# ## Forward method\n",
729 | "\n",
730 | "# ## Semantic embedding from t5_model's embedding layer\n",
731 | "# semantic_embedding = t5_model.shared(sample_batch_encoding['input_ids'])\n",
732 | "\n",
733 | "# ## Net embedding is addition of both the embeddings\n",
734 | "# total_embedding = visual_embedding + semantic_embedding\n",
735 | "\n",
736 | "# ## This is then fed to t5_model\n",
737 | "# final_output = t5_model(attention_mask = sample_batch_encoding['attention_mask'], inputs_embeds = total_embedding,\n",
738 | "# labels = sample_batch_encoding['labels'])"
739 | ],
740 | "metadata": {
741 | "id": "wQGN4Y-FlvtS"
742 | },
743 | "execution_count": null,
744 | "outputs": []
745 | },
746 | {
747 | "cell_type": "code",
748 | "source": [
749 | "## Some rough work\n",
750 | "\n",
751 | "# pretrained_t5_model = AutoModel.from_pretrained(model_name)\n",
752 | "\n",
753 | "# for (name, param), (name_1, param_1) in zip(pretrained_t5_model.named_parameters(), t5_model.named_parameters()): \n",
754 | "# if name.startswith(\"decoder\"):\n",
755 | "# print(f\"{name} {name_1}\")\n",
756 | "\n",
757 | "# t5_model_sd = t5_model.state_dict()\n",
758 | "# t5_model_sd_keys = t5_model_sd.keys()\n",
759 | "\n",
760 | "# pretrained_t5_model_sd = pretrained_t5_model.state_dict()\n",
761 | "# pretrained_t5_model_sd_keys = pretrained_t5_model_sd.keys()\n",
762 | "\n",
763 |         "# t5_model_sd_keys = [k for k in t5_model_sd_keys if not any([\"relative_horizontal_bias\" in k, \"relative_vertical_bias\" in k])] # discard the new relative-bias tables, since they are absent from the pretrained checkpoint\n",
764 | "\n",
765 | "# t5_model.load_state_dict(pretrained_t5_model.state_dict(), strict = False)"
766 | ],
767 | "metadata": {
768 | "id": "nKeRVrzSo6X_"
769 | },
770 | "execution_count": null,
771 | "outputs": []
772 | },
773 | {
774 | "cell_type": "code",
775 | "source": [
776 | "class TiLTTransformer(nn.Module):\n",
777 | " def __init__(self, config):\n",
778 | " super().__init__()\n",
779 | " self.config = config\n",
780 | " self.visual_embedding_extractor = VisualEmbedding(config)\n",
781 | " self.t5_model = T5ForConditionalGeneration(config)\n",
782 | " \n",
783 | "\n",
784 | " def generate(self, batch):\n",
785 | " total_embedding = self.common_step(batch)\n",
786 |         "        return self.t5_model.generate(inputs_embeds = total_embedding)\n",
787 | "\n",
788 | " def common_step(self, batch):\n",
789 | " ## Visual embedding\n",
790 | " visual_embedding = self.visual_embedding_extractor(pixel_values = batch['pixel_values'], bboxes = batch['bboxes'])\n",
791 | "\n",
792 | " ## Semantic embedding from t5_model's embedding layer\n",
793 | " semantic_embedding = self.t5_model.shared(batch['input_ids'])\n",
794 | "\n",
795 | " ## Net embedding is addition of both the embeddings\n",
796 | " total_embedding = visual_embedding + semantic_embedding\n",
797 | "\n",
798 | " return total_embedding\n",
799 | "\n",
800 | " def forward(self, batch):\n",
801 | "\n",
802 | " total_embedding = self.common_step(batch)\n",
803 | "\n",
804 | " ## This is then fed to t5_model\n",
805 | " final_output = self.t5_model(attention_mask = batch['attention_mask'], inputs_embeds = total_embedding,\n",
806 | " labels = batch['labels'])\n",
807 | " \n",
808 | " return final_output"
809 | ],
810 | "metadata": {
811 | "id": "SvWFMOIMDoaR"
812 | },
813 | "execution_count": null,
814 | "outputs": []
815 | },
816 | {
817 | "cell_type": "code",
818 | "source": [
819 | "# tilt_model = TiLTTransformer(t5_config).to(device)\n",
820 | "# output = tilt_model(sample_batch_encoding)"
821 | ],
822 | "metadata": {
823 | "id": "vdJstn48Gnzw"
824 | },
825 | "execution_count": null,
826 | "outputs": []
827 | },
828 | {
829 | "cell_type": "markdown",
830 | "source": [
831 |         "## Checking the parameter counts of all the models mentioned in the paper"
832 | ],
833 | "metadata": {
834 | "id": "fTbS6zL0sLOQ"
835 | }
836 | },
837 | {
838 | "cell_type": "code",
839 | "source": [
840 | "T5_PRETRAINED_MODEL_ARCHIVE_LIST = [\n",
841 | " \"t5-base\",\n",
842 | " # \"t5-large\"\n",
843 | "]\n",
844 | "\n",
845 | "for model_name in T5_PRETRAINED_MODEL_ARCHIVE_LIST:\n",
846 | " t5_config = AutoConfig.from_pretrained(model_name)\n",
847 | " t5_config.update(dict(in_channels = in_channels, num_pool_layers = num_pool_layers, channels = channels, model_max_length = model_max_length,\n",
848 | " output_size = output_size, spatial_scale = spatial_scale, sampling_ratio = sampling_ratio, use_cache = False, load_weights = load_weights))\n",
849 | " tilt_model = TiLTTransformer(t5_config)\n",
850 | " print(f\"Model : {model_name} has {sum(p.numel() for p in tilt_model.parameters()) / 1e6:.4f} M parameters\")"
851 | ],
852 | "metadata": {
853 | "id": "IjqpGJ8BHfdq",
854 | "colab": {
855 | "base_uri": "https://localhost:8080/"
856 | },
857 | "outputId": "daa383cc-1942-4801-af19-30bbd7345d7e"
858 | },
859 | "execution_count": null,
860 | "outputs": [
861 | {
862 | "output_type": "stream",
863 | "name": "stdout",
864 | "text": [
865 | "Weights loaded successfully!\n",
866 | "Model : t5-base has 224.2795 M parameters\n"
867 | ]
868 | }
869 | ]
870 | },
871 | {
872 | "cell_type": "markdown",
873 | "source": [
874 |         "## The paper reports 230M and 780M parameters, while we get 225M and 740M. I am not sure where the gap comes from (maybe the visual backbone?), but I guess we can continue with this for now"
875 | ],
876 | "metadata": {
877 | "id": "_YJgtx7ow0-N"
878 | }
879 | },
880 | {
881 | "cell_type": "code",
882 | "source": [
883 | "# from transformers import AutoModel\n",
884 | "# for model_name in T5_PRETRAINED_MODEL_ARCHIVE_LIST:\n",
885 | "# tilt_model = AutoModel.from_pretrained(model_name)\n",
886 | "# print(f\"Model : {model_name} has {sum(p.numel() for p in tilt_model.parameters()) / 1e6:.4f} M parameters\")"
887 | ],
888 | "metadata": {
889 | "id": "XbQBPo9Dsjdt"
890 | },
891 | "execution_count": null,
892 | "outputs": []
893 | },
894 | {
895 | "cell_type": "code",
896 | "source": [],
897 | "metadata": {
898 | "id": "GH8qxtCJtTvs"
899 | },
900 | "execution_count": null,
901 | "outputs": []
902 | }
903 | ]
904 | }
--------------------------------------------------------------------------------
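To make the assembly in the notebook above concrete, here is a minimal single-training-step sketch (not a cell from the repo); it assumes the objects built in that notebook — `train_ds`, `collate_fn`, `t5_config`, `tokenizer`, and `TiLTTransformer` — are already in scope:

    import torch
    from torch.utils.data import DataLoader

    device = "cuda" if torch.cuda.is_available() else "cpu"
    loader = DataLoader(train_ds, batch_size=2, shuffle=True, collate_fn=collate_fn)

    tilt_model = TiLTTransformer(t5_config).to(device)
    optimizer = torch.optim.AdamW(tilt_model.parameters(), lr=1e-4)

    for batch in loader:
        batch = {k: v.to(device) for k, v in batch.items()}
        # visual + semantic embeddings are summed and fed to T5 as inputs_embeds;
        # Hugging Face computes the cross-entropy loss against batch['labels']
        output = tilt_model(batch)
        output.loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        break  # one illustrative step

    # decoding predictions back to label strings would then be, e.g.:
    # preds = tokenizer.batch_decode(tilt_model.generate(batch), skip_special_tokens=True)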
/how_did_i_prepare_the_stuffs/tilt_part_2_3_sample_preparing_funsd_for_t5_dataset.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "nbformat": 4,
3 | "nbformat_minor": 0,
4 | "metadata": {
5 | "colab": {
6 | "provenance": [],
7 | "authorship_tag": "ABX9TyOqBqA/c3pyy0P2dEEExwF0",
8 | "include_colab_link": true
9 | },
10 | "kernelspec": {
11 | "name": "python3",
12 | "display_name": "Python 3"
13 | },
14 | "language_info": {
15 | "name": "python"
16 | },
17 | "widgets": {
18 | "application/vnd.jupyter.widget-state+json": {
19 | "2c08e449950a4d3ea9049a4aaf9f4de7": {
20 | "model_module": "@jupyter-widgets/controls",
21 | "model_name": "HBoxModel",
22 | "model_module_version": "1.5.0",
23 | "state": {
24 | "_dom_classes": [],
25 | "_model_module": "@jupyter-widgets/controls",
26 | "_model_module_version": "1.5.0",
27 | "_model_name": "HBoxModel",
28 | "_view_count": null,
29 | "_view_module": "@jupyter-widgets/controls",
30 | "_view_module_version": "1.5.0",
31 | "_view_name": "HBoxView",
32 | "box_style": "",
33 | "children": [
34 | "IPY_MODEL_759b967f50d84ffd903fab90c0969a6d",
35 | "IPY_MODEL_f97e169933e648378b59c146da88bc3b",
36 | "IPY_MODEL_3432461a6f7446dda1e28ac7870591c6"
37 | ],
38 | "layout": "IPY_MODEL_e41f7aa014de44dca338647c329d2e02"
39 | }
40 | },
41 | "759b967f50d84ffd903fab90c0969a6d": {
42 | "model_module": "@jupyter-widgets/controls",
43 | "model_name": "HTMLModel",
44 | "model_module_version": "1.5.0",
45 | "state": {
46 | "_dom_classes": [],
47 | "_model_module": "@jupyter-widgets/controls",
48 | "_model_module_version": "1.5.0",
49 | "_model_name": "HTMLModel",
50 | "_view_count": null,
51 | "_view_module": "@jupyter-widgets/controls",
52 | "_view_module_version": "1.5.0",
53 | "_view_name": "HTMLView",
54 | "description": "",
55 | "description_tooltip": null,
56 | "layout": "IPY_MODEL_35b3ce19787a47c8b6c2a9e685550108",
57 | "placeholder": "",
58 | "style": "IPY_MODEL_ba5cff7a36bf4ae080f76ec53dfce07c",
59 | "value": "Downloading (…)lve/main/config.json: 100%"
60 | }
61 | },
62 | "f97e169933e648378b59c146da88bc3b": {
63 | "model_module": "@jupyter-widgets/controls",
64 | "model_name": "FloatProgressModel",
65 | "model_module_version": "1.5.0",
66 | "state": {
67 | "_dom_classes": [],
68 | "_model_module": "@jupyter-widgets/controls",
69 | "_model_module_version": "1.5.0",
70 | "_model_name": "FloatProgressModel",
71 | "_view_count": null,
72 | "_view_module": "@jupyter-widgets/controls",
73 | "_view_module_version": "1.5.0",
74 | "_view_name": "ProgressView",
75 | "bar_style": "success",
76 | "description": "",
77 | "description_tooltip": null,
78 | "layout": "IPY_MODEL_c203dd7e0dcc42d889b05755a018c998",
79 | "max": 1208,
80 | "min": 0,
81 | "orientation": "horizontal",
82 | "style": "IPY_MODEL_a40c5f7a73ab422ca89258fa2c98bc52",
83 | "value": 1208
84 | }
85 | },
86 | "3432461a6f7446dda1e28ac7870591c6": {
87 | "model_module": "@jupyter-widgets/controls",
88 | "model_name": "HTMLModel",
89 | "model_module_version": "1.5.0",
90 | "state": {
91 | "_dom_classes": [],
92 | "_model_module": "@jupyter-widgets/controls",
93 | "_model_module_version": "1.5.0",
94 | "_model_name": "HTMLModel",
95 | "_view_count": null,
96 | "_view_module": "@jupyter-widgets/controls",
97 | "_view_module_version": "1.5.0",
98 | "_view_name": "HTMLView",
99 | "description": "",
100 | "description_tooltip": null,
101 | "layout": "IPY_MODEL_1dc415b05d0d4b9e999fddf1dea84562",
102 | "placeholder": "",
103 | "style": "IPY_MODEL_cecac60443284810bac22a9a10819572",
104 | "value": " 1.21k/1.21k [00:00<00:00, 30.3kB/s]"
105 | }
106 | },
107 | "e41f7aa014de44dca338647c329d2e02": {
108 | "model_module": "@jupyter-widgets/base",
109 | "model_name": "LayoutModel",
110 | "model_module_version": "1.2.0",
111 | "state": {
112 | "_model_module": "@jupyter-widgets/base",
113 | "_model_module_version": "1.2.0",
114 | "_model_name": "LayoutModel",
115 | "_view_count": null,
116 | "_view_module": "@jupyter-widgets/base",
117 | "_view_module_version": "1.2.0",
118 | "_view_name": "LayoutView",
119 | "align_content": null,
120 | "align_items": null,
121 | "align_self": null,
122 | "border": null,
123 | "bottom": null,
124 | "display": null,
125 | "flex": null,
126 | "flex_flow": null,
127 | "grid_area": null,
128 | "grid_auto_columns": null,
129 | "grid_auto_flow": null,
130 | "grid_auto_rows": null,
131 | "grid_column": null,
132 | "grid_gap": null,
133 | "grid_row": null,
134 | "grid_template_areas": null,
135 | "grid_template_columns": null,
136 | "grid_template_rows": null,
137 | "height": null,
138 | "justify_content": null,
139 | "justify_items": null,
140 | "left": null,
141 | "margin": null,
142 | "max_height": null,
143 | "max_width": null,
144 | "min_height": null,
145 | "min_width": null,
146 | "object_fit": null,
147 | "object_position": null,
148 | "order": null,
149 | "overflow": null,
150 | "overflow_x": null,
151 | "overflow_y": null,
152 | "padding": null,
153 | "right": null,
154 | "top": null,
155 | "visibility": null,
156 | "width": null
157 | }
158 | },
159 | "35b3ce19787a47c8b6c2a9e685550108": {
160 | "model_module": "@jupyter-widgets/base",
161 | "model_name": "LayoutModel",
162 | "model_module_version": "1.2.0",
163 | "state": {
164 | "_model_module": "@jupyter-widgets/base",
165 | "_model_module_version": "1.2.0",
166 | "_model_name": "LayoutModel",
167 | "_view_count": null,
168 | "_view_module": "@jupyter-widgets/base",
169 | "_view_module_version": "1.2.0",
170 | "_view_name": "LayoutView",
171 | "align_content": null,
172 | "align_items": null,
173 | "align_self": null,
174 | "border": null,
175 | "bottom": null,
176 | "display": null,
177 | "flex": null,
178 | "flex_flow": null,
179 | "grid_area": null,
180 | "grid_auto_columns": null,
181 | "grid_auto_flow": null,
182 | "grid_auto_rows": null,
183 | "grid_column": null,
184 | "grid_gap": null,
185 | "grid_row": null,
186 | "grid_template_areas": null,
187 | "grid_template_columns": null,
188 | "grid_template_rows": null,
189 | "height": null,
190 | "justify_content": null,
191 | "justify_items": null,
192 | "left": null,
193 | "margin": null,
194 | "max_height": null,
195 | "max_width": null,
196 | "min_height": null,
197 | "min_width": null,
198 | "object_fit": null,
199 | "object_position": null,
200 | "order": null,
201 | "overflow": null,
202 | "overflow_x": null,
203 | "overflow_y": null,
204 | "padding": null,
205 | "right": null,
206 | "top": null,
207 | "visibility": null,
208 | "width": null
209 | }
210 | },
211 | "ba5cff7a36bf4ae080f76ec53dfce07c": {
212 | "model_module": "@jupyter-widgets/controls",
213 | "model_name": "DescriptionStyleModel",
214 | "model_module_version": "1.5.0",
215 | "state": {
216 | "_model_module": "@jupyter-widgets/controls",
217 | "_model_module_version": "1.5.0",
218 | "_model_name": "DescriptionStyleModel",
219 | "_view_count": null,
220 | "_view_module": "@jupyter-widgets/base",
221 | "_view_module_version": "1.2.0",
222 | "_view_name": "StyleView",
223 | "description_width": ""
224 | }
225 | },
226 | "c203dd7e0dcc42d889b05755a018c998": {
227 | "model_module": "@jupyter-widgets/base",
228 | "model_name": "LayoutModel",
229 | "model_module_version": "1.2.0",
230 | "state": {
231 | "_model_module": "@jupyter-widgets/base",
232 | "_model_module_version": "1.2.0",
233 | "_model_name": "LayoutModel",
234 | "_view_count": null,
235 | "_view_module": "@jupyter-widgets/base",
236 | "_view_module_version": "1.2.0",
237 | "_view_name": "LayoutView",
238 | "align_content": null,
239 | "align_items": null,
240 | "align_self": null,
241 | "border": null,
242 | "bottom": null,
243 | "display": null,
244 | "flex": null,
245 | "flex_flow": null,
246 | "grid_area": null,
247 | "grid_auto_columns": null,
248 | "grid_auto_flow": null,
249 | "grid_auto_rows": null,
250 | "grid_column": null,
251 | "grid_gap": null,
252 | "grid_row": null,
253 | "grid_template_areas": null,
254 | "grid_template_columns": null,
255 | "grid_template_rows": null,
256 | "height": null,
257 | "justify_content": null,
258 | "justify_items": null,
259 | "left": null,
260 | "margin": null,
261 | "max_height": null,
262 | "max_width": null,
263 | "min_height": null,
264 | "min_width": null,
265 | "object_fit": null,
266 | "object_position": null,
267 | "order": null,
268 | "overflow": null,
269 | "overflow_x": null,
270 | "overflow_y": null,
271 | "padding": null,
272 | "right": null,
273 | "top": null,
274 | "visibility": null,
275 | "width": null
276 | }
277 | },
278 | "a40c5f7a73ab422ca89258fa2c98bc52": {
279 | "model_module": "@jupyter-widgets/controls",
280 | "model_name": "ProgressStyleModel",
281 | "model_module_version": "1.5.0",
282 | "state": {
283 | "_model_module": "@jupyter-widgets/controls",
284 | "_model_module_version": "1.5.0",
285 | "_model_name": "ProgressStyleModel",
286 | "_view_count": null,
287 | "_view_module": "@jupyter-widgets/base",
288 | "_view_module_version": "1.2.0",
289 | "_view_name": "StyleView",
290 | "bar_color": null,
291 | "description_width": ""
292 | }
293 | },
294 | "1dc415b05d0d4b9e999fddf1dea84562": {
295 | "model_module": "@jupyter-widgets/base",
296 | "model_name": "LayoutModel",
297 | "model_module_version": "1.2.0",
298 | "state": {
299 | "_model_module": "@jupyter-widgets/base",
300 | "_model_module_version": "1.2.0",
301 | "_model_name": "LayoutModel",
302 | "_view_count": null,
303 | "_view_module": "@jupyter-widgets/base",
304 | "_view_module_version": "1.2.0",
305 | "_view_name": "LayoutView",
306 | "align_content": null,
307 | "align_items": null,
308 | "align_self": null,
309 | "border": null,
310 | "bottom": null,
311 | "display": null,
312 | "flex": null,
313 | "flex_flow": null,
314 | "grid_area": null,
315 | "grid_auto_columns": null,
316 | "grid_auto_flow": null,
317 | "grid_auto_rows": null,
318 | "grid_column": null,
319 | "grid_gap": null,
320 | "grid_row": null,
321 | "grid_template_areas": null,
322 | "grid_template_columns": null,
323 | "grid_template_rows": null,
324 | "height": null,
325 | "justify_content": null,
326 | "justify_items": null,
327 | "left": null,
328 | "margin": null,
329 | "max_height": null,
330 | "max_width": null,
331 | "min_height": null,
332 | "min_width": null,
333 | "object_fit": null,
334 | "object_position": null,
335 | "order": null,
336 | "overflow": null,
337 | "overflow_x": null,
338 | "overflow_y": null,
339 | "padding": null,
340 | "right": null,
341 | "top": null,
342 | "visibility": null,
343 | "width": null
344 | }
345 | },
346 | "cecac60443284810bac22a9a10819572": {
347 | "model_module": "@jupyter-widgets/controls",
348 | "model_name": "DescriptionStyleModel",
349 | "model_module_version": "1.5.0",
350 | "state": {
351 | "_model_module": "@jupyter-widgets/controls",
352 | "_model_module_version": "1.5.0",
353 | "_model_name": "DescriptionStyleModel",
354 | "_view_count": null,
355 | "_view_module": "@jupyter-widgets/base",
356 | "_view_module_version": "1.2.0",
357 | "_view_name": "StyleView",
358 | "description_width": ""
359 | }
360 | },
361 | "33ec7ec081d24433a72b65c2c6085aaf": {
362 | "model_module": "@jupyter-widgets/controls",
363 | "model_name": "HBoxModel",
364 | "model_module_version": "1.5.0",
365 | "state": {
366 | "_dom_classes": [],
367 | "_model_module": "@jupyter-widgets/controls",
368 | "_model_module_version": "1.5.0",
369 | "_model_name": "HBoxModel",
370 | "_view_count": null,
371 | "_view_module": "@jupyter-widgets/controls",
372 | "_view_module_version": "1.5.0",
373 | "_view_name": "HBoxView",
374 | "box_style": "",
375 | "children": [
376 | "IPY_MODEL_954d298b8a8d43628fc2add59266541e",
377 | "IPY_MODEL_75208c1d1d3247c19baca7c526eb7382",
378 | "IPY_MODEL_6e1cb705ef9d4341b6d61f4cccd6c009"
379 | ],
380 | "layout": "IPY_MODEL_9f9fa275896b446b8e6bd511190fe922"
381 | }
382 | },
383 | "954d298b8a8d43628fc2add59266541e": {
384 | "model_module": "@jupyter-widgets/controls",
385 | "model_name": "HTMLModel",
386 | "model_module_version": "1.5.0",
387 | "state": {
388 | "_dom_classes": [],
389 | "_model_module": "@jupyter-widgets/controls",
390 | "_model_module_version": "1.5.0",
391 | "_model_name": "HTMLModel",
392 | "_view_count": null,
393 | "_view_module": "@jupyter-widgets/controls",
394 | "_view_module_version": "1.5.0",
395 | "_view_name": "HTMLView",
396 | "description": "",
397 | "description_tooltip": null,
398 | "layout": "IPY_MODEL_ffa2df3f95e241fe90f6514049a32669",
399 | "placeholder": "",
400 | "style": "IPY_MODEL_993ea1ee236b4b77bce5def3cb4eba68",
401 | "value": "Downloading (…)ve/main/spiece.model: 100%"
402 | }
403 | },
404 | "75208c1d1d3247c19baca7c526eb7382": {
405 | "model_module": "@jupyter-widgets/controls",
406 | "model_name": "FloatProgressModel",
407 | "model_module_version": "1.5.0",
408 | "state": {
409 | "_dom_classes": [],
410 | "_model_module": "@jupyter-widgets/controls",
411 | "_model_module_version": "1.5.0",
412 | "_model_name": "FloatProgressModel",
413 | "_view_count": null,
414 | "_view_module": "@jupyter-widgets/controls",
415 | "_view_module_version": "1.5.0",
416 | "_view_name": "ProgressView",
417 | "bar_style": "success",
418 | "description": "",
419 | "description_tooltip": null,
420 | "layout": "IPY_MODEL_01b087a18a2346cc97aeaa712e54cef4",
421 | "max": 791656,
422 | "min": 0,
423 | "orientation": "horizontal",
424 | "style": "IPY_MODEL_c0903c5551224beea5edf4609b07514c",
425 | "value": 791656
426 | }
427 | },
428 | "6e1cb705ef9d4341b6d61f4cccd6c009": {
429 | "model_module": "@jupyter-widgets/controls",
430 | "model_name": "HTMLModel",
431 | "model_module_version": "1.5.0",
432 | "state": {
433 | "_dom_classes": [],
434 | "_model_module": "@jupyter-widgets/controls",
435 | "_model_module_version": "1.5.0",
436 | "_model_name": "HTMLModel",
437 | "_view_count": null,
438 | "_view_module": "@jupyter-widgets/controls",
439 | "_view_module_version": "1.5.0",
440 | "_view_name": "HTMLView",
441 | "description": "",
442 | "description_tooltip": null,
443 | "layout": "IPY_MODEL_3ecf6715c92042ccbe672b5e1f758de5",
444 | "placeholder": "",
445 | "style": "IPY_MODEL_3e826539414a4e798e4a6232174e1066",
446 | "value": " 792k/792k [00:00<00:00, 5.33MB/s]"
447 | }
448 | },
449 | "9f9fa275896b446b8e6bd511190fe922": {
450 | "model_module": "@jupyter-widgets/base",
451 | "model_name": "LayoutModel",
452 | "model_module_version": "1.2.0",
453 | "state": {
454 | "_model_module": "@jupyter-widgets/base",
455 | "_model_module_version": "1.2.0",
456 | "_model_name": "LayoutModel",
457 | "_view_count": null,
458 | "_view_module": "@jupyter-widgets/base",
459 | "_view_module_version": "1.2.0",
460 | "_view_name": "LayoutView",
461 | "align_content": null,
462 | "align_items": null,
463 | "align_self": null,
464 | "border": null,
465 | "bottom": null,
466 | "display": null,
467 | "flex": null,
468 | "flex_flow": null,
469 | "grid_area": null,
470 | "grid_auto_columns": null,
471 | "grid_auto_flow": null,
472 | "grid_auto_rows": null,
473 | "grid_column": null,
474 | "grid_gap": null,
475 | "grid_row": null,
476 | "grid_template_areas": null,
477 | "grid_template_columns": null,
478 | "grid_template_rows": null,
479 | "height": null,
480 | "justify_content": null,
481 | "justify_items": null,
482 | "left": null,
483 | "margin": null,
484 | "max_height": null,
485 | "max_width": null,
486 | "min_height": null,
487 | "min_width": null,
488 | "object_fit": null,
489 | "object_position": null,
490 | "order": null,
491 | "overflow": null,
492 | "overflow_x": null,
493 | "overflow_y": null,
494 | "padding": null,
495 | "right": null,
496 | "top": null,
497 | "visibility": null,
498 | "width": null
499 | }
500 | },
501 | "ffa2df3f95e241fe90f6514049a32669": {
502 | "model_module": "@jupyter-widgets/base",
503 | "model_name": "LayoutModel",
504 | "model_module_version": "1.2.0",
505 | "state": {
506 | "_model_module": "@jupyter-widgets/base",
507 | "_model_module_version": "1.2.0",
508 | "_model_name": "LayoutModel",
509 | "_view_count": null,
510 | "_view_module": "@jupyter-widgets/base",
511 | "_view_module_version": "1.2.0",
512 | "_view_name": "LayoutView",
513 | "align_content": null,
514 | "align_items": null,
515 | "align_self": null,
516 | "border": null,
517 | "bottom": null,
518 | "display": null,
519 | "flex": null,
520 | "flex_flow": null,
521 | "grid_area": null,
522 | "grid_auto_columns": null,
523 | "grid_auto_flow": null,
524 | "grid_auto_rows": null,
525 | "grid_column": null,
526 | "grid_gap": null,
527 | "grid_row": null,
528 | "grid_template_areas": null,
529 | "grid_template_columns": null,
530 | "grid_template_rows": null,
531 | "height": null,
532 | "justify_content": null,
533 | "justify_items": null,
534 | "left": null,
535 | "margin": null,
536 | "max_height": null,
537 | "max_width": null,
538 | "min_height": null,
539 | "min_width": null,
540 | "object_fit": null,
541 | "object_position": null,
542 | "order": null,
543 | "overflow": null,
544 | "overflow_x": null,
545 | "overflow_y": null,
546 | "padding": null,
547 | "right": null,
548 | "top": null,
549 | "visibility": null,
550 | "width": null
551 | }
552 | },
553 | "993ea1ee236b4b77bce5def3cb4eba68": {
554 | "model_module": "@jupyter-widgets/controls",
555 | "model_name": "DescriptionStyleModel",
556 | "model_module_version": "1.5.0",
557 | "state": {
558 | "_model_module": "@jupyter-widgets/controls",
559 | "_model_module_version": "1.5.0",
560 | "_model_name": "DescriptionStyleModel",
561 | "_view_count": null,
562 | "_view_module": "@jupyter-widgets/base",
563 | "_view_module_version": "1.2.0",
564 | "_view_name": "StyleView",
565 | "description_width": ""
566 | }
567 | },
568 | "01b087a18a2346cc97aeaa712e54cef4": {
569 | "model_module": "@jupyter-widgets/base",
570 | "model_name": "LayoutModel",
571 | "model_module_version": "1.2.0",
572 | "state": {
573 | "_model_module": "@jupyter-widgets/base",
574 | "_model_module_version": "1.2.0",
575 | "_model_name": "LayoutModel",
576 | "_view_count": null,
577 | "_view_module": "@jupyter-widgets/base",
578 | "_view_module_version": "1.2.0",
579 | "_view_name": "LayoutView",
580 | "align_content": null,
581 | "align_items": null,
582 | "align_self": null,
583 | "border": null,
584 | "bottom": null,
585 | "display": null,
586 | "flex": null,
587 | "flex_flow": null,
588 | "grid_area": null,
589 | "grid_auto_columns": null,
590 | "grid_auto_flow": null,
591 | "grid_auto_rows": null,
592 | "grid_column": null,
593 | "grid_gap": null,
594 | "grid_row": null,
595 | "grid_template_areas": null,
596 | "grid_template_columns": null,
597 | "grid_template_rows": null,
598 | "height": null,
599 | "justify_content": null,
600 | "justify_items": null,
601 | "left": null,
602 | "margin": null,
603 | "max_height": null,
604 | "max_width": null,
605 | "min_height": null,
606 | "min_width": null,
607 | "object_fit": null,
608 | "object_position": null,
609 | "order": null,
610 | "overflow": null,
611 | "overflow_x": null,
612 | "overflow_y": null,
613 | "padding": null,
614 | "right": null,
615 | "top": null,
616 | "visibility": null,
617 | "width": null
618 | }
619 | },
620 | "c0903c5551224beea5edf4609b07514c": {
621 | "model_module": "@jupyter-widgets/controls",
622 | "model_name": "ProgressStyleModel",
623 | "model_module_version": "1.5.0",
624 | "state": {
625 | "_model_module": "@jupyter-widgets/controls",
626 | "_model_module_version": "1.5.0",
627 | "_model_name": "ProgressStyleModel",
628 | "_view_count": null,
629 | "_view_module": "@jupyter-widgets/base",
630 | "_view_module_version": "1.2.0",
631 | "_view_name": "StyleView",
632 | "bar_color": null,
633 | "description_width": ""
634 | }
635 | },
636 | "3ecf6715c92042ccbe672b5e1f758de5": {
637 | "model_module": "@jupyter-widgets/base",
638 | "model_name": "LayoutModel",
639 | "model_module_version": "1.2.0",
640 | "state": {
641 | "_model_module": "@jupyter-widgets/base",
642 | "_model_module_version": "1.2.0",
643 | "_model_name": "LayoutModel",
644 | "_view_count": null,
645 | "_view_module": "@jupyter-widgets/base",
646 | "_view_module_version": "1.2.0",
647 | "_view_name": "LayoutView",
648 | "align_content": null,
649 | "align_items": null,
650 | "align_self": null,
651 | "border": null,
652 | "bottom": null,
653 | "display": null,
654 | "flex": null,
655 | "flex_flow": null,
656 | "grid_area": null,
657 | "grid_auto_columns": null,
658 | "grid_auto_flow": null,
659 | "grid_auto_rows": null,
660 | "grid_column": null,
661 | "grid_gap": null,
662 | "grid_row": null,
663 | "grid_template_areas": null,
664 | "grid_template_columns": null,
665 | "grid_template_rows": null,
666 | "height": null,
667 | "justify_content": null,
668 | "justify_items": null,
669 | "left": null,
670 | "margin": null,
671 | "max_height": null,
672 | "max_width": null,
673 | "min_height": null,
674 | "min_width": null,
675 | "object_fit": null,
676 | "object_position": null,
677 | "order": null,
678 | "overflow": null,
679 | "overflow_x": null,
680 | "overflow_y": null,
681 | "padding": null,
682 | "right": null,
683 | "top": null,
684 | "visibility": null,
685 | "width": null
686 | }
687 | },
688 | "3e826539414a4e798e4a6232174e1066": {
689 | "model_module": "@jupyter-widgets/controls",
690 | "model_name": "DescriptionStyleModel",
691 | "model_module_version": "1.5.0",
692 | "state": {
693 | "_model_module": "@jupyter-widgets/controls",
694 | "_model_module_version": "1.5.0",
695 | "_model_name": "DescriptionStyleModel",
696 | "_view_count": null,
697 | "_view_module": "@jupyter-widgets/base",
698 | "_view_module_version": "1.2.0",
699 | "_view_name": "StyleView",
700 | "description_width": ""
701 | }
702 | },
703 | "242efb0ea706421c994793f87ece24eb": {
704 | "model_module": "@jupyter-widgets/controls",
705 | "model_name": "HBoxModel",
706 | "model_module_version": "1.5.0",
707 | "state": {
708 | "_dom_classes": [],
709 | "_model_module": "@jupyter-widgets/controls",
710 | "_model_module_version": "1.5.0",
711 | "_model_name": "HBoxModel",
712 | "_view_count": null,
713 | "_view_module": "@jupyter-widgets/controls",
714 | "_view_module_version": "1.5.0",
715 | "_view_name": "HBoxView",
716 | "box_style": "",
717 | "children": [
718 | "IPY_MODEL_56e42bcb2df74e2a8e990da26760da6f",
719 | "IPY_MODEL_9aef638dd3db437ab712c99cbb57433d",
720 | "IPY_MODEL_c5371258333d471abc1a9e73b23573a0"
721 | ],
722 | "layout": "IPY_MODEL_69cb8cdc043144148dd599569da87d25"
723 | }
724 | },
725 | "56e42bcb2df74e2a8e990da26760da6f": {
726 | "model_module": "@jupyter-widgets/controls",
727 | "model_name": "HTMLModel",
728 | "model_module_version": "1.5.0",
729 | "state": {
730 | "_dom_classes": [],
731 | "_model_module": "@jupyter-widgets/controls",
732 | "_model_module_version": "1.5.0",
733 | "_model_name": "HTMLModel",
734 | "_view_count": null,
735 | "_view_module": "@jupyter-widgets/controls",
736 | "_view_module_version": "1.5.0",
737 | "_view_name": "HTMLView",
738 | "description": "",
739 | "description_tooltip": null,
740 | "layout": "IPY_MODEL_3a970557b24940d29eea9ab77b620a0b",
741 | "placeholder": "",
742 | "style": "IPY_MODEL_aece0cbdf6d04e29a3484a71ae74a971",
743 | "value": "Downloading (…)/main/tokenizer.json: 100%"
744 | }
745 | },
746 | "9aef638dd3db437ab712c99cbb57433d": {
747 | "model_module": "@jupyter-widgets/controls",
748 | "model_name": "FloatProgressModel",
749 | "model_module_version": "1.5.0",
750 | "state": {
751 | "_dom_classes": [],
752 | "_model_module": "@jupyter-widgets/controls",
753 | "_model_module_version": "1.5.0",
754 | "_model_name": "FloatProgressModel",
755 | "_view_count": null,
756 | "_view_module": "@jupyter-widgets/controls",
757 | "_view_module_version": "1.5.0",
758 | "_view_name": "ProgressView",
759 | "bar_style": "success",
760 | "description": "",
761 | "description_tooltip": null,
762 | "layout": "IPY_MODEL_9e1768f313ef4f8f9503a6e9c64bb707",
763 | "max": 1389353,
764 | "min": 0,
765 | "orientation": "horizontal",
766 | "style": "IPY_MODEL_09d9547e80e44ce5981f66190bd3bb5c",
767 | "value": 1389353
768 | }
769 | },
770 | "c5371258333d471abc1a9e73b23573a0": {
771 | "model_module": "@jupyter-widgets/controls",
772 | "model_name": "HTMLModel",
773 | "model_module_version": "1.5.0",
774 | "state": {
775 | "_dom_classes": [],
776 | "_model_module": "@jupyter-widgets/controls",
777 | "_model_module_version": "1.5.0",
778 | "_model_name": "HTMLModel",
779 | "_view_count": null,
780 | "_view_module": "@jupyter-widgets/controls",
781 | "_view_module_version": "1.5.0",
782 | "_view_name": "HTMLView",
783 | "description": "",
784 | "description_tooltip": null,
785 | "layout": "IPY_MODEL_41c267d7ae7f4de183d8ffbd60d47b36",
786 | "placeholder": "",
787 | "style": "IPY_MODEL_656c59f5ea564ff994112595236213b9",
788 | "value": " 1.39M/1.39M [00:00<00:00, 6.41MB/s]"
789 | }
790 | },
791 | "69cb8cdc043144148dd599569da87d25": {
792 | "model_module": "@jupyter-widgets/base",
793 | "model_name": "LayoutModel",
794 | "model_module_version": "1.2.0",
795 | "state": {
796 | "_model_module": "@jupyter-widgets/base",
797 | "_model_module_version": "1.2.0",
798 | "_model_name": "LayoutModel",
799 | "_view_count": null,
800 | "_view_module": "@jupyter-widgets/base",
801 | "_view_module_version": "1.2.0",
802 | "_view_name": "LayoutView",
803 | "align_content": null,
804 | "align_items": null,
805 | "align_self": null,
806 | "border": null,
807 | "bottom": null,
808 | "display": null,
809 | "flex": null,
810 | "flex_flow": null,
811 | "grid_area": null,
812 | "grid_auto_columns": null,
813 | "grid_auto_flow": null,
814 | "grid_auto_rows": null,
815 | "grid_column": null,
816 | "grid_gap": null,
817 | "grid_row": null,
818 | "grid_template_areas": null,
819 | "grid_template_columns": null,
820 | "grid_template_rows": null,
821 | "height": null,
822 | "justify_content": null,
823 | "justify_items": null,
824 | "left": null,
825 | "margin": null,
826 | "max_height": null,
827 | "max_width": null,
828 | "min_height": null,
829 | "min_width": null,
830 | "object_fit": null,
831 | "object_position": null,
832 | "order": null,
833 | "overflow": null,
834 | "overflow_x": null,
835 | "overflow_y": null,
836 | "padding": null,
837 | "right": null,
838 | "top": null,
839 | "visibility": null,
840 | "width": null
841 | }
842 | },
843 | "3a970557b24940d29eea9ab77b620a0b": {
844 | "model_module": "@jupyter-widgets/base",
845 | "model_name": "LayoutModel",
846 | "model_module_version": "1.2.0",
847 | "state": {
848 | "_model_module": "@jupyter-widgets/base",
849 | "_model_module_version": "1.2.0",
850 | "_model_name": "LayoutModel",
851 | "_view_count": null,
852 | "_view_module": "@jupyter-widgets/base",
853 | "_view_module_version": "1.2.0",
854 | "_view_name": "LayoutView",
855 | "align_content": null,
856 | "align_items": null,
857 | "align_self": null,
858 | "border": null,
859 | "bottom": null,
860 | "display": null,
861 | "flex": null,
862 | "flex_flow": null,
863 | "grid_area": null,
864 | "grid_auto_columns": null,
865 | "grid_auto_flow": null,
866 | "grid_auto_rows": null,
867 | "grid_column": null,
868 | "grid_gap": null,
869 | "grid_row": null,
870 | "grid_template_areas": null,
871 | "grid_template_columns": null,
872 | "grid_template_rows": null,
873 | "height": null,
874 | "justify_content": null,
875 | "justify_items": null,
876 | "left": null,
877 | "margin": null,
878 | "max_height": null,
879 | "max_width": null,
880 | "min_height": null,
881 | "min_width": null,
882 | "object_fit": null,
883 | "object_position": null,
884 | "order": null,
885 | "overflow": null,
886 | "overflow_x": null,
887 | "overflow_y": null,
888 | "padding": null,
889 | "right": null,
890 | "top": null,
891 | "visibility": null,
892 | "width": null
893 | }
894 | },
895 | "aece0cbdf6d04e29a3484a71ae74a971": {
896 | "model_module": "@jupyter-widgets/controls",
897 | "model_name": "DescriptionStyleModel",
898 | "model_module_version": "1.5.0",
899 | "state": {
900 | "_model_module": "@jupyter-widgets/controls",
901 | "_model_module_version": "1.5.0",
902 | "_model_name": "DescriptionStyleModel",
903 | "_view_count": null,
904 | "_view_module": "@jupyter-widgets/base",
905 | "_view_module_version": "1.2.0",
906 | "_view_name": "StyleView",
907 | "description_width": ""
908 | }
909 | },
910 | "9e1768f313ef4f8f9503a6e9c64bb707": {
911 | "model_module": "@jupyter-widgets/base",
912 | "model_name": "LayoutModel",
913 | "model_module_version": "1.2.0",
914 | "state": {
915 | "_model_module": "@jupyter-widgets/base",
916 | "_model_module_version": "1.2.0",
917 | "_model_name": "LayoutModel",
918 | "_view_count": null,
919 | "_view_module": "@jupyter-widgets/base",
920 | "_view_module_version": "1.2.0",
921 | "_view_name": "LayoutView",
922 | "align_content": null,
923 | "align_items": null,
924 | "align_self": null,
925 | "border": null,
926 | "bottom": null,
927 | "display": null,
928 | "flex": null,
929 | "flex_flow": null,
930 | "grid_area": null,
931 | "grid_auto_columns": null,
932 | "grid_auto_flow": null,
933 | "grid_auto_rows": null,
934 | "grid_column": null,
935 | "grid_gap": null,
936 | "grid_row": null,
937 | "grid_template_areas": null,
938 | "grid_template_columns": null,
939 | "grid_template_rows": null,
940 | "height": null,
941 | "justify_content": null,
942 | "justify_items": null,
943 | "left": null,
944 | "margin": null,
945 | "max_height": null,
946 | "max_width": null,
947 | "min_height": null,
948 | "min_width": null,
949 | "object_fit": null,
950 | "object_position": null,
951 | "order": null,
952 | "overflow": null,
953 | "overflow_x": null,
954 | "overflow_y": null,
955 | "padding": null,
956 | "right": null,
957 | "top": null,
958 | "visibility": null,
959 | "width": null
960 | }
961 | },
962 | "09d9547e80e44ce5981f66190bd3bb5c": {
963 | "model_module": "@jupyter-widgets/controls",
964 | "model_name": "ProgressStyleModel",
965 | "model_module_version": "1.5.0",
966 | "state": {
967 | "_model_module": "@jupyter-widgets/controls",
968 | "_model_module_version": "1.5.0",
969 | "_model_name": "ProgressStyleModel",
970 | "_view_count": null,
971 | "_view_module": "@jupyter-widgets/base",
972 | "_view_module_version": "1.2.0",
973 | "_view_name": "StyleView",
974 | "bar_color": null,
975 | "description_width": ""
976 | }
977 | },
978 | "41c267d7ae7f4de183d8ffbd60d47b36": {
979 | "model_module": "@jupyter-widgets/base",
980 | "model_name": "LayoutModel",
981 | "model_module_version": "1.2.0",
982 | "state": {
983 | "_model_module": "@jupyter-widgets/base",
984 | "_model_module_version": "1.2.0",
985 | "_model_name": "LayoutModel",
986 | "_view_count": null,
987 | "_view_module": "@jupyter-widgets/base",
988 | "_view_module_version": "1.2.0",
989 | "_view_name": "LayoutView",
990 | "align_content": null,
991 | "align_items": null,
992 | "align_self": null,
993 | "border": null,
994 | "bottom": null,
995 | "display": null,
996 | "flex": null,
997 | "flex_flow": null,
998 | "grid_area": null,
999 | "grid_auto_columns": null,
1000 | "grid_auto_flow": null,
1001 | "grid_auto_rows": null,
1002 | "grid_column": null,
1003 | "grid_gap": null,
1004 | "grid_row": null,
1005 | "grid_template_areas": null,
1006 | "grid_template_columns": null,
1007 | "grid_template_rows": null,
1008 | "height": null,
1009 | "justify_content": null,
1010 | "justify_items": null,
1011 | "left": null,
1012 | "margin": null,
1013 | "max_height": null,
1014 | "max_width": null,
1015 | "min_height": null,
1016 | "min_width": null,
1017 | "object_fit": null,
1018 | "object_position": null,
1019 | "order": null,
1020 | "overflow": null,
1021 | "overflow_x": null,
1022 | "overflow_y": null,
1023 | "padding": null,
1024 | "right": null,
1025 | "top": null,
1026 | "visibility": null,
1027 | "width": null
1028 | }
1029 | },
1030 | "656c59f5ea564ff994112595236213b9": {
1031 | "model_module": "@jupyter-widgets/controls",
1032 | "model_name": "DescriptionStyleModel",
1033 | "model_module_version": "1.5.0",
1034 | "state": {
1035 | "_model_module": "@jupyter-widgets/controls",
1036 | "_model_module_version": "1.5.0",
1037 | "_model_name": "DescriptionStyleModel",
1038 | "_view_count": null,
1039 | "_view_module": "@jupyter-widgets/base",
1040 | "_view_module_version": "1.2.0",
1041 | "_view_name": "StyleView",
1042 | "description_width": ""
1043 | }
1044 | },
1045 | "d36fe23be9334b53b2715f4a39576ec5": {
1046 | "model_module": "@jupyter-widgets/controls",
1047 | "model_name": "HBoxModel",
1048 | "model_module_version": "1.5.0",
1049 | "state": {
1050 | "_dom_classes": [],
1051 | "_model_module": "@jupyter-widgets/controls",
1052 | "_model_module_version": "1.5.0",
1053 | "_model_name": "HBoxModel",
1054 | "_view_count": null,
1055 | "_view_module": "@jupyter-widgets/controls",
1056 | "_view_module_version": "1.5.0",
1057 | "_view_name": "HBoxView",
1058 | "box_style": "",
1059 | "children": [
1060 | "IPY_MODEL_e23b54cc65e844548e19c784102725dd",
1061 | "IPY_MODEL_65d64f04508745b0ac817be4840b494c",
1062 | "IPY_MODEL_630b162223734bad92bffddb22a81534"
1063 | ],
1064 | "layout": "IPY_MODEL_ab4241cadba948d7946497b1fe33fd71"
1065 | }
1066 | },
1067 | "e23b54cc65e844548e19c784102725dd": {
1068 | "model_module": "@jupyter-widgets/controls",
1069 | "model_name": "HTMLModel",
1070 | "model_module_version": "1.5.0",
1071 | "state": {
1072 | "_dom_classes": [],
1073 | "_model_module": "@jupyter-widgets/controls",
1074 | "_model_module_version": "1.5.0",
1075 | "_model_name": "HTMLModel",
1076 | "_view_count": null,
1077 | "_view_module": "@jupyter-widgets/controls",
1078 | "_view_module_version": "1.5.0",
1079 | "_view_name": "HTMLView",
1080 | "description": "",
1081 | "description_tooltip": null,
1082 | "layout": "IPY_MODEL_8e9c9c88077649b1b92d20e1844d3136",
1083 | "placeholder": "",
1084 | "style": "IPY_MODEL_1ecd41da2f354b7794abc085827a732d",
1085 | "value": "Downloading builder script: 100%"
1086 | }
1087 | },
1088 | "65d64f04508745b0ac817be4840b494c": {
1089 | "model_module": "@jupyter-widgets/controls",
1090 | "model_name": "FloatProgressModel",
1091 | "model_module_version": "1.5.0",
1092 | "state": {
1093 | "_dom_classes": [],
1094 | "_model_module": "@jupyter-widgets/controls",
1095 | "_model_module_version": "1.5.0",
1096 | "_model_name": "FloatProgressModel",
1097 | "_view_count": null,
1098 | "_view_module": "@jupyter-widgets/controls",
1099 | "_view_module_version": "1.5.0",
1100 | "_view_name": "ProgressView",
1101 | "bar_style": "success",
1102 | "description": "",
1103 | "description_tooltip": null,
1104 | "layout": "IPY_MODEL_9532d70512f041958b6e8655c303430e",
1105 | "max": 5133,
1106 | "min": 0,
1107 | "orientation": "horizontal",
1108 | "style": "IPY_MODEL_17cb083323e74fa2a018dbb0b061e47b",
1109 | "value": 5133
1110 | }
1111 | },
1112 | "630b162223734bad92bffddb22a81534": {
1113 | "model_module": "@jupyter-widgets/controls",
1114 | "model_name": "HTMLModel",
1115 | "model_module_version": "1.5.0",
1116 | "state": {
1117 | "_dom_classes": [],
1118 | "_model_module": "@jupyter-widgets/controls",
1119 | "_model_module_version": "1.5.0",
1120 | "_model_name": "HTMLModel",
1121 | "_view_count": null,
1122 | "_view_module": "@jupyter-widgets/controls",
1123 | "_view_module_version": "1.5.0",
1124 | "_view_name": "HTMLView",
1125 | "description": "",
1126 | "description_tooltip": null,
1127 | "layout": "IPY_MODEL_d0239f9b7577416597f5ee8e0287415c",
1128 | "placeholder": "",
1129 | "style": "IPY_MODEL_0ff9aad8a1d34926a3fdf280a24126ba",
1130 | "value": " 5.13k/5.13k [00:00<00:00, 115kB/s]"
1131 | }
1132 | },
1133 | "ab4241cadba948d7946497b1fe33fd71": {
1134 | "model_module": "@jupyter-widgets/base",
1135 | "model_name": "LayoutModel",
1136 | "model_module_version": "1.2.0",
1137 | "state": {
1138 | "_model_module": "@jupyter-widgets/base",
1139 | "_model_module_version": "1.2.0",
1140 | "_model_name": "LayoutModel",
1141 | "_view_count": null,
1142 | "_view_module": "@jupyter-widgets/base",
1143 | "_view_module_version": "1.2.0",
1144 | "_view_name": "LayoutView",
1145 | "align_content": null,
1146 | "align_items": null,
1147 | "align_self": null,
1148 | "border": null,
1149 | "bottom": null,
1150 | "display": null,
1151 | "flex": null,
1152 | "flex_flow": null,
1153 | "grid_area": null,
1154 | "grid_auto_columns": null,
1155 | "grid_auto_flow": null,
1156 | "grid_auto_rows": null,
1157 | "grid_column": null,
1158 | "grid_gap": null,
1159 | "grid_row": null,
1160 | "grid_template_areas": null,
1161 | "grid_template_columns": null,
1162 | "grid_template_rows": null,
1163 | "height": null,
1164 | "justify_content": null,
1165 | "justify_items": null,
1166 | "left": null,
1167 | "margin": null,
1168 | "max_height": null,
1169 | "max_width": null,
1170 | "min_height": null,
1171 | "min_width": null,
1172 | "object_fit": null,
1173 | "object_position": null,
1174 | "order": null,
1175 | "overflow": null,
1176 | "overflow_x": null,
1177 | "overflow_y": null,
1178 | "padding": null,
1179 | "right": null,
1180 | "top": null,
1181 | "visibility": null,
1182 | "width": null
1183 | }
1184 | },
1185 | "8e9c9c88077649b1b92d20e1844d3136": {
1186 | "model_module": "@jupyter-widgets/base",
1187 | "model_name": "LayoutModel",
1188 | "model_module_version": "1.2.0",
1189 | "state": {
1190 | "_model_module": "@jupyter-widgets/base",
1191 | "_model_module_version": "1.2.0",
1192 | "_model_name": "LayoutModel",
1193 | "_view_count": null,
1194 | "_view_module": "@jupyter-widgets/base",
1195 | "_view_module_version": "1.2.0",
1196 | "_view_name": "LayoutView",
1197 | "align_content": null,
1198 | "align_items": null,
1199 | "align_self": null,
1200 | "border": null,
1201 | "bottom": null,
1202 | "display": null,
1203 | "flex": null,
1204 | "flex_flow": null,
1205 | "grid_area": null,
1206 | "grid_auto_columns": null,
1207 | "grid_auto_flow": null,
1208 | "grid_auto_rows": null,
1209 | "grid_column": null,
1210 | "grid_gap": null,
1211 | "grid_row": null,
1212 | "grid_template_areas": null,
1213 | "grid_template_columns": null,
1214 | "grid_template_rows": null,
1215 | "height": null,
1216 | "justify_content": null,
1217 | "justify_items": null,
1218 | "left": null,
1219 | "margin": null,
1220 | "max_height": null,
1221 | "max_width": null,
1222 | "min_height": null,
1223 | "min_width": null,
1224 | "object_fit": null,
1225 | "object_position": null,
1226 | "order": null,
1227 | "overflow": null,
1228 | "overflow_x": null,
1229 | "overflow_y": null,
1230 | "padding": null,
1231 | "right": null,
1232 | "top": null,
1233 | "visibility": null,
1234 | "width": null
1235 | }
1236 | },
1237 | "1ecd41da2f354b7794abc085827a732d": {
1238 | "model_module": "@jupyter-widgets/controls",
1239 | "model_name": "DescriptionStyleModel",
1240 | "model_module_version": "1.5.0",
1241 | "state": {
1242 | "_model_module": "@jupyter-widgets/controls",
1243 | "_model_module_version": "1.5.0",
1244 | "_model_name": "DescriptionStyleModel",
1245 | "_view_count": null,
1246 | "_view_module": "@jupyter-widgets/base",
1247 | "_view_module_version": "1.2.0",
1248 | "_view_name": "StyleView",
1249 | "description_width": ""
1250 | }
1251 | },
1252 | "9532d70512f041958b6e8655c303430e": {
1253 | "model_module": "@jupyter-widgets/base",
1254 | "model_name": "LayoutModel",
1255 | "model_module_version": "1.2.0",
1256 | "state": {
1257 | "_model_module": "@jupyter-widgets/base",
1258 | "_model_module_version": "1.2.0",
1259 | "_model_name": "LayoutModel",
1260 | "_view_count": null,
1261 | "_view_module": "@jupyter-widgets/base",
1262 | "_view_module_version": "1.2.0",
1263 | "_view_name": "LayoutView",
1264 | "align_content": null,
1265 | "align_items": null,
1266 | "align_self": null,
1267 | "border": null,
1268 | "bottom": null,
1269 | "display": null,
1270 | "flex": null,
1271 | "flex_flow": null,
1272 | "grid_area": null,
1273 | "grid_auto_columns": null,
1274 | "grid_auto_flow": null,
1275 | "grid_auto_rows": null,
1276 | "grid_column": null,
1277 | "grid_gap": null,
1278 | "grid_row": null,
1279 | "grid_template_areas": null,
1280 | "grid_template_columns": null,
1281 | "grid_template_rows": null,
1282 | "height": null,
1283 | "justify_content": null,
1284 | "justify_items": null,
1285 | "left": null,
1286 | "margin": null,
1287 | "max_height": null,
1288 | "max_width": null,
1289 | "min_height": null,
1290 | "min_width": null,
1291 | "object_fit": null,
1292 | "object_position": null,
1293 | "order": null,
1294 | "overflow": null,
1295 | "overflow_x": null,
1296 | "overflow_y": null,
1297 | "padding": null,
1298 | "right": null,
1299 | "top": null,
1300 | "visibility": null,
1301 | "width": null
1302 | }
1303 | },
1304 | "17cb083323e74fa2a018dbb0b061e47b": {
1305 | "model_module": "@jupyter-widgets/controls",
1306 | "model_name": "ProgressStyleModel",
1307 | "model_module_version": "1.5.0",
1308 | "state": {
1309 | "_model_module": "@jupyter-widgets/controls",
1310 | "_model_module_version": "1.5.0",
1311 | "_model_name": "ProgressStyleModel",
1312 | "_view_count": null,
1313 | "_view_module": "@jupyter-widgets/base",
1314 | "_view_module_version": "1.2.0",
1315 | "_view_name": "StyleView",
1316 | "bar_color": null,
1317 | "description_width": ""
1318 | }
1319 | },
1320 | "d0239f9b7577416597f5ee8e0287415c": {
1321 | "model_module": "@jupyter-widgets/base",
1322 | "model_name": "LayoutModel",
1323 | "model_module_version": "1.2.0",
1324 | "state": {
1325 | "_model_module": "@jupyter-widgets/base",
1326 | "_model_module_version": "1.2.0",
1327 | "_model_name": "LayoutModel",
1328 | "_view_count": null,
1329 | "_view_module": "@jupyter-widgets/base",
1330 | "_view_module_version": "1.2.0",
1331 | "_view_name": "LayoutView",
1332 | "align_content": null,
1333 | "align_items": null,
1334 | "align_self": null,
1335 | "border": null,
1336 | "bottom": null,
1337 | "display": null,
1338 | "flex": null,
1339 | "flex_flow": null,
1340 | "grid_area": null,
1341 | "grid_auto_columns": null,
1342 | "grid_auto_flow": null,
1343 | "grid_auto_rows": null,
1344 | "grid_column": null,
1345 | "grid_gap": null,
1346 | "grid_row": null,
1347 | "grid_template_areas": null,
1348 | "grid_template_columns": null,
1349 | "grid_template_rows": null,
1350 | "height": null,
1351 | "justify_content": null,
1352 | "justify_items": null,
1353 | "left": null,
1354 | "margin": null,
1355 | "max_height": null,
1356 | "max_width": null,
1357 | "min_height": null,
1358 | "min_width": null,
1359 | "object_fit": null,
1360 | "object_position": null,
1361 | "order": null,
1362 | "overflow": null,
1363 | "overflow_x": null,
1364 | "overflow_y": null,
1365 | "padding": null,
1366 | "right": null,
1367 | "top": null,
1368 | "visibility": null,
1369 | "width": null
1370 | }
1371 | },
1372 | "0ff9aad8a1d34926a3fdf280a24126ba": {
1373 | "model_module": "@jupyter-widgets/controls",
1374 | "model_name": "DescriptionStyleModel",
1375 | "model_module_version": "1.5.0",
1376 | "state": {
1377 | "_model_module": "@jupyter-widgets/controls",
1378 | "_model_module_version": "1.5.0",
1379 | "_model_name": "DescriptionStyleModel",
1380 | "_view_count": null,
1381 | "_view_module": "@jupyter-widgets/base",
1382 | "_view_module_version": "1.2.0",
1383 | "_view_name": "StyleView",
1384 | "description_width": ""
1385 | }
1386 | },
1387 | "7eed869b205c448db60a579dec2aaa5b": {
1388 | "model_module": "@jupyter-widgets/controls",
1389 | "model_name": "HBoxModel",
1390 | "model_module_version": "1.5.0",
1391 | "state": {
1392 | "_dom_classes": [],
1393 | "_model_module": "@jupyter-widgets/controls",
1394 | "_model_module_version": "1.5.0",
1395 | "_model_name": "HBoxModel",
1396 | "_view_count": null,
1397 | "_view_module": "@jupyter-widgets/controls",
1398 | "_view_module_version": "1.5.0",
1399 | "_view_name": "HBoxView",
1400 | "box_style": "",
1401 | "children": [
1402 | "IPY_MODEL_ec3803d2a96645e8a584bf2870e0ce92",
1403 | "IPY_MODEL_4f1ae710de3e4f3f8813e7fe4f643190",
1404 | "IPY_MODEL_a756c1bb1dbc4eb5925346005fcec6a4"
1405 | ],
1406 | "layout": "IPY_MODEL_79c7f170b8b44c33bca1e3705271d99f"
1407 | }
1408 | },
1409 | "ec3803d2a96645e8a584bf2870e0ce92": {
1410 | "model_module": "@jupyter-widgets/controls",
1411 | "model_name": "HTMLModel",
1412 | "model_module_version": "1.5.0",
1413 | "state": {
1414 | "_dom_classes": [],
1415 | "_model_module": "@jupyter-widgets/controls",
1416 | "_model_module_version": "1.5.0",
1417 | "_model_name": "HTMLModel",
1418 | "_view_count": null,
1419 | "_view_module": "@jupyter-widgets/controls",
1420 | "_view_module_version": "1.5.0",
1421 | "_view_name": "HTMLView",
1422 | "description": "",
1423 | "description_tooltip": null,
1424 | "layout": "IPY_MODEL_f8211cae0cbb466896f58383719d086d",
1425 | "placeholder": "",
1426 | "style": "IPY_MODEL_4ca5728b66dc4c639c13c6c569f1a2b2",
1427 | "value": "Downloading data: 100%"
1428 | }
1429 | },
1430 | "4f1ae710de3e4f3f8813e7fe4f643190": {
1431 | "model_module": "@jupyter-widgets/controls",
1432 | "model_name": "FloatProgressModel",
1433 | "model_module_version": "1.5.0",
1434 | "state": {
1435 | "_dom_classes": [],
1436 | "_model_module": "@jupyter-widgets/controls",
1437 | "_model_module_version": "1.5.0",
1438 | "_model_name": "FloatProgressModel",
1439 | "_view_count": null,
1440 | "_view_module": "@jupyter-widgets/controls",
1441 | "_view_module_version": "1.5.0",
1442 | "_view_name": "ProgressView",
1443 | "bar_style": "success",
1444 | "description": "",
1445 | "description_tooltip": null,
1446 | "layout": "IPY_MODEL_9b8abc703ee646d489afd6d9acbd6ed2",
1447 | "max": 16838830,
1448 | "min": 0,
1449 | "orientation": "horizontal",
1450 | "style": "IPY_MODEL_f0bb95077f5e44b488a3d3b83b117f6c",
1451 | "value": 16838830
1452 | }
1453 | },
1454 | "a756c1bb1dbc4eb5925346005fcec6a4": {
1455 | "model_module": "@jupyter-widgets/controls",
1456 | "model_name": "HTMLModel",
1457 | "model_module_version": "1.5.0",
1458 | "state": {
1459 | "_dom_classes": [],
1460 | "_model_module": "@jupyter-widgets/controls",
1461 | "_model_module_version": "1.5.0",
1462 | "_model_name": "HTMLModel",
1463 | "_view_count": null,
1464 | "_view_module": "@jupyter-widgets/controls",
1465 | "_view_module_version": "1.5.0",
1466 | "_view_name": "HTMLView",
1467 | "description": "",
1468 | "description_tooltip": null,
1469 | "layout": "IPY_MODEL_6405c3d09eda4eed843c2ddef31b2229",
1470 | "placeholder": "",
1471 | "style": "IPY_MODEL_6d97225a55274cbb90a3df3d82676130",
1472 | "value": " 16.8M/16.8M [00:00<00:00, 55.5MB/s]"
1473 | }
1474 | },
1475 | "79c7f170b8b44c33bca1e3705271d99f": {
1476 | "model_module": "@jupyter-widgets/base",
1477 | "model_name": "LayoutModel",
1478 | "model_module_version": "1.2.0",
1479 | "state": {
1480 | "_model_module": "@jupyter-widgets/base",
1481 | "_model_module_version": "1.2.0",
1482 | "_model_name": "LayoutModel",
1483 | "_view_count": null,
1484 | "_view_module": "@jupyter-widgets/base",
1485 | "_view_module_version": "1.2.0",
1486 | "_view_name": "LayoutView",
1487 | "align_content": null,
1488 | "align_items": null,
1489 | "align_self": null,
1490 | "border": null,
1491 | "bottom": null,
1492 | "display": null,
1493 | "flex": null,
1494 | "flex_flow": null,
1495 | "grid_area": null,
1496 | "grid_auto_columns": null,
1497 | "grid_auto_flow": null,
1498 | "grid_auto_rows": null,
1499 | "grid_column": null,
1500 | "grid_gap": null,
1501 | "grid_row": null,
1502 | "grid_template_areas": null,
1503 | "grid_template_columns": null,
1504 | "grid_template_rows": null,
1505 | "height": null,
1506 | "justify_content": null,
1507 | "justify_items": null,
1508 | "left": null,
1509 | "margin": null,
1510 | "max_height": null,
1511 | "max_width": null,
1512 | "min_height": null,
1513 | "min_width": null,
1514 | "object_fit": null,
1515 | "object_position": null,
1516 | "order": null,
1517 | "overflow": null,
1518 | "overflow_x": null,
1519 | "overflow_y": null,
1520 | "padding": null,
1521 | "right": null,
1522 | "top": null,
1523 | "visibility": null,
1524 | "width": null
1525 | }
1526 | },
1527 | "f8211cae0cbb466896f58383719d086d": {
1528 | "model_module": "@jupyter-widgets/base",
1529 | "model_name": "LayoutModel",
1530 | "model_module_version": "1.2.0",
1531 | "state": {
1532 | "_model_module": "@jupyter-widgets/base",
1533 | "_model_module_version": "1.2.0",
1534 | "_model_name": "LayoutModel",
1535 | "_view_count": null,
1536 | "_view_module": "@jupyter-widgets/base",
1537 | "_view_module_version": "1.2.0",
1538 | "_view_name": "LayoutView",
1539 | "align_content": null,
1540 | "align_items": null,
1541 | "align_self": null,
1542 | "border": null,
1543 | "bottom": null,
1544 | "display": null,
1545 | "flex": null,
1546 | "flex_flow": null,
1547 | "grid_area": null,
1548 | "grid_auto_columns": null,
1549 | "grid_auto_flow": null,
1550 | "grid_auto_rows": null,
1551 | "grid_column": null,
1552 | "grid_gap": null,
1553 | "grid_row": null,
1554 | "grid_template_areas": null,
1555 | "grid_template_columns": null,
1556 | "grid_template_rows": null,
1557 | "height": null,
1558 | "justify_content": null,
1559 | "justify_items": null,
1560 | "left": null,
1561 | "margin": null,
1562 | "max_height": null,
1563 | "max_width": null,
1564 | "min_height": null,
1565 | "min_width": null,
1566 | "object_fit": null,
1567 | "object_position": null,
1568 | "order": null,
1569 | "overflow": null,
1570 | "overflow_x": null,
1571 | "overflow_y": null,
1572 | "padding": null,
1573 | "right": null,
1574 | "top": null,
1575 | "visibility": null,
1576 | "width": null
1577 | }
1578 | },
1579 | "4ca5728b66dc4c639c13c6c569f1a2b2": {
1580 | "model_module": "@jupyter-widgets/controls",
1581 | "model_name": "DescriptionStyleModel",
1582 | "model_module_version": "1.5.0",
1583 | "state": {
1584 | "_model_module": "@jupyter-widgets/controls",
1585 | "_model_module_version": "1.5.0",
1586 | "_model_name": "DescriptionStyleModel",
1587 | "_view_count": null,
1588 | "_view_module": "@jupyter-widgets/base",
1589 | "_view_module_version": "1.2.0",
1590 | "_view_name": "StyleView",
1591 | "description_width": ""
1592 | }
1593 | },
1594 | "9b8abc703ee646d489afd6d9acbd6ed2": {
1595 | "model_module": "@jupyter-widgets/base",
1596 | "model_name": "LayoutModel",
1597 | "model_module_version": "1.2.0",
1598 | "state": {
1599 | "_model_module": "@jupyter-widgets/base",
1600 | "_model_module_version": "1.2.0",
1601 | "_model_name": "LayoutModel",
1602 | "_view_count": null,
1603 | "_view_module": "@jupyter-widgets/base",
1604 | "_view_module_version": "1.2.0",
1605 | "_view_name": "LayoutView",
1606 | "align_content": null,
1607 | "align_items": null,
1608 | "align_self": null,
1609 | "border": null,
1610 | "bottom": null,
1611 | "display": null,
1612 | "flex": null,
1613 | "flex_flow": null,
1614 | "grid_area": null,
1615 | "grid_auto_columns": null,
1616 | "grid_auto_flow": null,
1617 | "grid_auto_rows": null,
1618 | "grid_column": null,
1619 | "grid_gap": null,
1620 | "grid_row": null,
1621 | "grid_template_areas": null,
1622 | "grid_template_columns": null,
1623 | "grid_template_rows": null,
1624 | "height": null,
1625 | "justify_content": null,
1626 | "justify_items": null,
1627 | "left": null,
1628 | "margin": null,
1629 | "max_height": null,
1630 | "max_width": null,
1631 | "min_height": null,
1632 | "min_width": null,
1633 | "object_fit": null,
1634 | "object_position": null,
1635 | "order": null,
1636 | "overflow": null,
1637 | "overflow_x": null,
1638 | "overflow_y": null,
1639 | "padding": null,
1640 | "right": null,
1641 | "top": null,
1642 | "visibility": null,
1643 | "width": null
1644 | }
1645 | },
1646 | "f0bb95077f5e44b488a3d3b83b117f6c": {
1647 | "model_module": "@jupyter-widgets/controls",
1648 | "model_name": "ProgressStyleModel",
1649 | "model_module_version": "1.5.0",
1650 | "state": {
1651 | "_model_module": "@jupyter-widgets/controls",
1652 | "_model_module_version": "1.5.0",
1653 | "_model_name": "ProgressStyleModel",
1654 | "_view_count": null,
1655 | "_view_module": "@jupyter-widgets/base",
1656 | "_view_module_version": "1.2.0",
1657 | "_view_name": "StyleView",
1658 | "bar_color": null,
1659 | "description_width": ""
1660 | }
1661 | },
1662 | "6405c3d09eda4eed843c2ddef31b2229": {
1663 | "model_module": "@jupyter-widgets/base",
1664 | "model_name": "LayoutModel",
1665 | "model_module_version": "1.2.0",
1666 | "state": {
1667 | "_model_module": "@jupyter-widgets/base",
1668 | "_model_module_version": "1.2.0",
1669 | "_model_name": "LayoutModel",
1670 | "_view_count": null,
1671 | "_view_module": "@jupyter-widgets/base",
1672 | "_view_module_version": "1.2.0",
1673 | "_view_name": "LayoutView",
1674 | "align_content": null,
1675 | "align_items": null,
1676 | "align_self": null,
1677 | "border": null,
1678 | "bottom": null,
1679 | "display": null,
1680 | "flex": null,
1681 | "flex_flow": null,
1682 | "grid_area": null,
1683 | "grid_auto_columns": null,
1684 | "grid_auto_flow": null,
1685 | "grid_auto_rows": null,
1686 | "grid_column": null,
1687 | "grid_gap": null,
1688 | "grid_row": null,
1689 | "grid_template_areas": null,
1690 | "grid_template_columns": null,
1691 | "grid_template_rows": null,
1692 | "height": null,
1693 | "justify_content": null,
1694 | "justify_items": null,
1695 | "left": null,
1696 | "margin": null,
1697 | "max_height": null,
1698 | "max_width": null,
1699 | "min_height": null,
1700 | "min_width": null,
1701 | "object_fit": null,
1702 | "object_position": null,
1703 | "order": null,
1704 | "overflow": null,
1705 | "overflow_x": null,
1706 | "overflow_y": null,
1707 | "padding": null,
1708 | "right": null,
1709 | "top": null,
1710 | "visibility": null,
1711 | "width": null
1712 | }
1713 | },
1714 | "6d97225a55274cbb90a3df3d82676130": {
1715 | "model_module": "@jupyter-widgets/controls",
1716 | "model_name": "DescriptionStyleModel",
1717 | "model_module_version": "1.5.0",
1718 | "state": {
1719 | "_model_module": "@jupyter-widgets/controls",
1720 | "_model_module_version": "1.5.0",
1721 | "_model_name": "DescriptionStyleModel",
1722 | "_view_count": null,
1723 | "_view_module": "@jupyter-widgets/base",
1724 | "_view_module_version": "1.2.0",
1725 | "_view_name": "StyleView",
1726 | "description_width": ""
1727 | }
1728 | },
1729 | "d5fc06bdaf054696a2be0c511f8362de": {
1730 | "model_module": "@jupyter-widgets/controls",
1731 | "model_name": "HBoxModel",
1732 | "model_module_version": "1.5.0",
1733 | "state": {
1734 | "_dom_classes": [],
1735 | "_model_module": "@jupyter-widgets/controls",
1736 | "_model_module_version": "1.5.0",
1737 | "_model_name": "HBoxModel",
1738 | "_view_count": null,
1739 | "_view_module": "@jupyter-widgets/controls",
1740 | "_view_module_version": "1.5.0",
1741 | "_view_name": "HBoxView",
1742 | "box_style": "",
1743 | "children": [
1744 | "IPY_MODEL_7d7f0285f7c0494192e390b3b914ad2c",
1745 | "IPY_MODEL_a56dab6eebd24823b33b94e02b67aa8d",
1746 | "IPY_MODEL_5e15c6e52cb749149bc991fd0e8861b5"
1747 | ],
1748 | "layout": "IPY_MODEL_cff5ef358a4e44588d08e908b7e9cebe"
1749 | }
1750 | },
1751 | "7d7f0285f7c0494192e390b3b914ad2c": {
1752 | "model_module": "@jupyter-widgets/controls",
1753 | "model_name": "HTMLModel",
1754 | "model_module_version": "1.5.0",
1755 | "state": {
1756 | "_dom_classes": [],
1757 | "_model_module": "@jupyter-widgets/controls",
1758 | "_model_module_version": "1.5.0",
1759 | "_model_name": "HTMLModel",
1760 | "_view_count": null,
1761 | "_view_module": "@jupyter-widgets/controls",
1762 | "_view_module_version": "1.5.0",
1763 | "_view_name": "HTMLView",
1764 | "description": "",
1765 | "description_tooltip": null,
1766 | "layout": "IPY_MODEL_de8ec040fef94a2a851e4bbdfe99e366",
1767 | "placeholder": "",
1768 | "style": "IPY_MODEL_1c1718b43073404892b90cb7ff902f60",
1769 | "value": "Generating train split: "
1770 | }
1771 | },
1772 | "a56dab6eebd24823b33b94e02b67aa8d": {
1773 | "model_module": "@jupyter-widgets/controls",
1774 | "model_name": "FloatProgressModel",
1775 | "model_module_version": "1.5.0",
1776 | "state": {
1777 | "_dom_classes": [],
1778 | "_model_module": "@jupyter-widgets/controls",
1779 | "_model_module_version": "1.5.0",
1780 | "_model_name": "FloatProgressModel",
1781 | "_view_count": null,
1782 | "_view_module": "@jupyter-widgets/controls",
1783 | "_view_module_version": "1.5.0",
1784 | "_view_name": "ProgressView",
1785 | "bar_style": "info",
1786 | "description": "",
1787 | "description_tooltip": null,
1788 | "layout": "IPY_MODEL_44e2b7b085be4980b8d10c6e8a81e712",
1789 | "max": 1,
1790 | "min": 0,
1791 | "orientation": "horizontal",
1792 | "style": "IPY_MODEL_8f16bc2c6ef547fbb8c9a66b75b99d03",
1793 | "value": 1
1794 | }
1795 | },
1796 | "5e15c6e52cb749149bc991fd0e8861b5": {
1797 | "model_module": "@jupyter-widgets/controls",
1798 | "model_name": "HTMLModel",
1799 | "model_module_version": "1.5.0",
1800 | "state": {
1801 | "_dom_classes": [],
1802 | "_model_module": "@jupyter-widgets/controls",
1803 | "_model_module_version": "1.5.0",
1804 | "_model_name": "HTMLModel",
1805 | "_view_count": null,
1806 | "_view_module": "@jupyter-widgets/controls",
1807 | "_view_module_version": "1.5.0",
1808 | "_view_name": "HTMLView",
1809 | "description": "",
1810 | "description_tooltip": null,
1811 | "layout": "IPY_MODEL_5049799a9ba14926b9747775dce8ae0e",
1812 | "placeholder": "",
1813 | "style": "IPY_MODEL_154de61264fb4f399f1c699ad8ec199a",
1814 | "value": " 149/0 [00:17<00:00, 4.32 examples/s]"
1815 | }
1816 | },
1817 | "cff5ef358a4e44588d08e908b7e9cebe": {
1818 | "model_module": "@jupyter-widgets/base",
1819 | "model_name": "LayoutModel",
1820 | "model_module_version": "1.2.0",
1821 | "state": {
1822 | "_model_module": "@jupyter-widgets/base",
1823 | "_model_module_version": "1.2.0",
1824 | "_model_name": "LayoutModel",
1825 | "_view_count": null,
1826 | "_view_module": "@jupyter-widgets/base",
1827 | "_view_module_version": "1.2.0",
1828 | "_view_name": "LayoutView",
1829 | "align_content": null,
1830 | "align_items": null,
1831 | "align_self": null,
1832 | "border": null,
1833 | "bottom": null,
1834 | "display": null,
1835 | "flex": null,
1836 | "flex_flow": null,
1837 | "grid_area": null,
1838 | "grid_auto_columns": null,
1839 | "grid_auto_flow": null,
1840 | "grid_auto_rows": null,
1841 | "grid_column": null,
1842 | "grid_gap": null,
1843 | "grid_row": null,
1844 | "grid_template_areas": null,
1845 | "grid_template_columns": null,
1846 | "grid_template_rows": null,
1847 | "height": null,
1848 | "justify_content": null,
1849 | "justify_items": null,
1850 | "left": null,
1851 | "margin": null,
1852 | "max_height": null,
1853 | "max_width": null,
1854 | "min_height": null,
1855 | "min_width": null,
1856 | "object_fit": null,
1857 | "object_position": null,
1858 | "order": null,
1859 | "overflow": null,
1860 | "overflow_x": null,
1861 | "overflow_y": null,
1862 | "padding": null,
1863 | "right": null,
1864 | "top": null,
1865 | "visibility": "hidden",
1866 | "width": null
1867 | }
1868 | },
1869 | "de8ec040fef94a2a851e4bbdfe99e366": {
1870 | "model_module": "@jupyter-widgets/base",
1871 | "model_name": "LayoutModel",
1872 | "model_module_version": "1.2.0",
1873 | "state": {
1874 | "_model_module": "@jupyter-widgets/base",
1875 | "_model_module_version": "1.2.0",
1876 | "_model_name": "LayoutModel",
1877 | "_view_count": null,
1878 | "_view_module": "@jupyter-widgets/base",
1879 | "_view_module_version": "1.2.0",
1880 | "_view_name": "LayoutView",
1881 | "align_content": null,
1882 | "align_items": null,
1883 | "align_self": null,
1884 | "border": null,
1885 | "bottom": null,
1886 | "display": null,
1887 | "flex": null,
1888 | "flex_flow": null,
1889 | "grid_area": null,
1890 | "grid_auto_columns": null,
1891 | "grid_auto_flow": null,
1892 | "grid_auto_rows": null,
1893 | "grid_column": null,
1894 | "grid_gap": null,
1895 | "grid_row": null,
1896 | "grid_template_areas": null,
1897 | "grid_template_columns": null,
1898 | "grid_template_rows": null,
1899 | "height": null,
1900 | "justify_content": null,
1901 | "justify_items": null,
1902 | "left": null,
1903 | "margin": null,
1904 | "max_height": null,
1905 | "max_width": null,
1906 | "min_height": null,
1907 | "min_width": null,
1908 | "object_fit": null,
1909 | "object_position": null,
1910 | "order": null,
1911 | "overflow": null,
1912 | "overflow_x": null,
1913 | "overflow_y": null,
1914 | "padding": null,
1915 | "right": null,
1916 | "top": null,
1917 | "visibility": null,
1918 | "width": null
1919 | }
1920 | },
1921 | "1c1718b43073404892b90cb7ff902f60": {
1922 | "model_module": "@jupyter-widgets/controls",
1923 | "model_name": "DescriptionStyleModel",
1924 | "model_module_version": "1.5.0",
1925 | "state": {
1926 | "_model_module": "@jupyter-widgets/controls",
1927 | "_model_module_version": "1.5.0",
1928 | "_model_name": "DescriptionStyleModel",
1929 | "_view_count": null,
1930 | "_view_module": "@jupyter-widgets/base",
1931 | "_view_module_version": "1.2.0",
1932 | "_view_name": "StyleView",
1933 | "description_width": ""
1934 | }
1935 | },
1936 | "44e2b7b085be4980b8d10c6e8a81e712": {
1937 | "model_module": "@jupyter-widgets/base",
1938 | "model_name": "LayoutModel",
1939 | "model_module_version": "1.2.0",
1940 | "state": {
1941 | "_model_module": "@jupyter-widgets/base",
1942 | "_model_module_version": "1.2.0",
1943 | "_model_name": "LayoutModel",
1944 | "_view_count": null,
1945 | "_view_module": "@jupyter-widgets/base",
1946 | "_view_module_version": "1.2.0",
1947 | "_view_name": "LayoutView",
1948 | "align_content": null,
1949 | "align_items": null,
1950 | "align_self": null,
1951 | "border": null,
1952 | "bottom": null,
1953 | "display": null,
1954 | "flex": null,
1955 | "flex_flow": null,
1956 | "grid_area": null,
1957 | "grid_auto_columns": null,
1958 | "grid_auto_flow": null,
1959 | "grid_auto_rows": null,
1960 | "grid_column": null,
1961 | "grid_gap": null,
1962 | "grid_row": null,
1963 | "grid_template_areas": null,
1964 | "grid_template_columns": null,
1965 | "grid_template_rows": null,
1966 | "height": null,
1967 | "justify_content": null,
1968 | "justify_items": null,
1969 | "left": null,
1970 | "margin": null,
1971 | "max_height": null,
1972 | "max_width": null,
1973 | "min_height": null,
1974 | "min_width": null,
1975 | "object_fit": null,
1976 | "object_position": null,
1977 | "order": null,
1978 | "overflow": null,
1979 | "overflow_x": null,
1980 | "overflow_y": null,
1981 | "padding": null,
1982 | "right": null,
1983 | "top": null,
1984 | "visibility": null,
1985 | "width": "20px"
1986 | }
1987 | },
1988 | "8f16bc2c6ef547fbb8c9a66b75b99d03": {
1989 | "model_module": "@jupyter-widgets/controls",
1990 | "model_name": "ProgressStyleModel",
1991 | "model_module_version": "1.5.0",
1992 | "state": {
1993 | "_model_module": "@jupyter-widgets/controls",
1994 | "_model_module_version": "1.5.0",
1995 | "_model_name": "ProgressStyleModel",
1996 | "_view_count": null,
1997 | "_view_module": "@jupyter-widgets/base",
1998 | "_view_module_version": "1.2.0",
1999 | "_view_name": "StyleView",
2000 | "bar_color": null,
2001 | "description_width": ""
2002 | }
2003 | },
2004 | "5049799a9ba14926b9747775dce8ae0e": {
2005 | "model_module": "@jupyter-widgets/base",
2006 | "model_name": "LayoutModel",
2007 | "model_module_version": "1.2.0",
2008 | "state": {
2009 | "_model_module": "@jupyter-widgets/base",
2010 | "_model_module_version": "1.2.0",
2011 | "_model_name": "LayoutModel",
2012 | "_view_count": null,
2013 | "_view_module": "@jupyter-widgets/base",
2014 | "_view_module_version": "1.2.0",
2015 | "_view_name": "LayoutView",
2016 | "align_content": null,
2017 | "align_items": null,
2018 | "align_self": null,
2019 | "border": null,
2020 | "bottom": null,
2021 | "display": null,
2022 | "flex": null,
2023 | "flex_flow": null,
2024 | "grid_area": null,
2025 | "grid_auto_columns": null,
2026 | "grid_auto_flow": null,
2027 | "grid_auto_rows": null,
2028 | "grid_column": null,
2029 | "grid_gap": null,
2030 | "grid_row": null,
2031 | "grid_template_areas": null,
2032 | "grid_template_columns": null,
2033 | "grid_template_rows": null,
2034 | "height": null,
2035 | "justify_content": null,
2036 | "justify_items": null,
2037 | "left": null,
2038 | "margin": null,
2039 | "max_height": null,
2040 | "max_width": null,
2041 | "min_height": null,
2042 | "min_width": null,
2043 | "object_fit": null,
2044 | "object_position": null,
2045 | "order": null,
2046 | "overflow": null,
2047 | "overflow_x": null,
2048 | "overflow_y": null,
2049 | "padding": null,
2050 | "right": null,
2051 | "top": null,
2052 | "visibility": null,
2053 | "width": null
2054 | }
2055 | },
2056 | "154de61264fb4f399f1c699ad8ec199a": {
2057 | "model_module": "@jupyter-widgets/controls",
2058 | "model_name": "DescriptionStyleModel",
2059 | "model_module_version": "1.5.0",
2060 | "state": {
2061 | "_model_module": "@jupyter-widgets/controls",
2062 | "_model_module_version": "1.5.0",
2063 | "_model_name": "DescriptionStyleModel",
2064 | "_view_count": null,
2065 | "_view_module": "@jupyter-widgets/base",
2066 | "_view_module_version": "1.2.0",
2067 | "_view_name": "StyleView",
2068 | "description_width": ""
2069 | }
2070 | },
2071 | "a0f3abfbc70e4fdab2f183a49b3f6177": {
2072 | "model_module": "@jupyter-widgets/controls",
2073 | "model_name": "HBoxModel",
2074 | "model_module_version": "1.5.0",
2075 | "state": {
2076 | "_dom_classes": [],
2077 | "_model_module": "@jupyter-widgets/controls",
2078 | "_model_module_version": "1.5.0",
2079 | "_model_name": "HBoxModel",
2080 | "_view_count": null,
2081 | "_view_module": "@jupyter-widgets/controls",
2082 | "_view_module_version": "1.5.0",
2083 | "_view_name": "HBoxView",
2084 | "box_style": "",
2085 | "children": [
2086 | "IPY_MODEL_a862285ba9324a1c8e374ce561de8c1c",
2087 | "IPY_MODEL_6c6a59bb3cc04bd6a5904cbeb2bce030",
2088 | "IPY_MODEL_33d6f73537e249c29eb39d9590412e23"
2089 | ],
2090 | "layout": "IPY_MODEL_cf7376fbaf074a43bf3311292c85f34f"
2091 | }
2092 | },
2093 | "a862285ba9324a1c8e374ce561de8c1c": {
2094 | "model_module": "@jupyter-widgets/controls",
2095 | "model_name": "HTMLModel",
2096 | "model_module_version": "1.5.0",
2097 | "state": {
2098 | "_dom_classes": [],
2099 | "_model_module": "@jupyter-widgets/controls",
2100 | "_model_module_version": "1.5.0",
2101 | "_model_name": "HTMLModel",
2102 | "_view_count": null,
2103 | "_view_module": "@jupyter-widgets/controls",
2104 | "_view_module_version": "1.5.0",
2105 | "_view_name": "HTMLView",
2106 | "description": "",
2107 | "description_tooltip": null,
2108 | "layout": "IPY_MODEL_96747be99fb540d48aefc9862a59d8f1",
2109 | "placeholder": "",
2110 | "style": "IPY_MODEL_477e07d7a0ef4b07943cfc46fc49f479",
2111 | "value": "Generating test split: "
2112 | }
2113 | },
2114 | "6c6a59bb3cc04bd6a5904cbeb2bce030": {
2115 | "model_module": "@jupyter-widgets/controls",
2116 | "model_name": "FloatProgressModel",
2117 | "model_module_version": "1.5.0",
2118 | "state": {
2119 | "_dom_classes": [],
2120 | "_model_module": "@jupyter-widgets/controls",
2121 | "_model_module_version": "1.5.0",
2122 | "_model_name": "FloatProgressModel",
2123 | "_view_count": null,
2124 | "_view_module": "@jupyter-widgets/controls",
2125 | "_view_module_version": "1.5.0",
2126 | "_view_name": "ProgressView",
2127 | "bar_style": "info",
2128 | "description": "",
2129 | "description_tooltip": null,
2130 | "layout": "IPY_MODEL_61c9bbdebbee447eb4253ee86020e86a",
2131 | "max": 1,
2132 | "min": 0,
2133 | "orientation": "horizontal",
2134 | "style": "IPY_MODEL_1121f7d0c1d648e2a3902af77fcc91e5",
2135 | "value": 1
2136 | }
2137 | },
2138 | "33d6f73537e249c29eb39d9590412e23": {
2139 | "model_module": "@jupyter-widgets/controls",
2140 | "model_name": "HTMLModel",
2141 | "model_module_version": "1.5.0",
2142 | "state": {
2143 | "_dom_classes": [],
2144 | "_model_module": "@jupyter-widgets/controls",
2145 | "_model_module_version": "1.5.0",
2146 | "_model_name": "HTMLModel",
2147 | "_view_count": null,
2148 | "_view_module": "@jupyter-widgets/controls",
2149 | "_view_module_version": "1.5.0",
2150 | "_view_name": "HTMLView",
2151 | "description": "",
2152 | "description_tooltip": null,
2153 | "layout": "IPY_MODEL_dcfa26149ddd48ecbbb5e5b20ce623af",
2154 | "placeholder": "",
2155 | "style": "IPY_MODEL_2c68c7c8422d44e4a2f96043c1047e20",
2156 | "value": " 49/0 [00:07<00:00, 6.05 examples/s]"
2157 | }
2158 | },
2159 | "cf7376fbaf074a43bf3311292c85f34f": {
2160 | "model_module": "@jupyter-widgets/base",
2161 | "model_name": "LayoutModel",
2162 | "model_module_version": "1.2.0",
2163 | "state": {
2164 | "_model_module": "@jupyter-widgets/base",
2165 | "_model_module_version": "1.2.0",
2166 | "_model_name": "LayoutModel",
2167 | "_view_count": null,
2168 | "_view_module": "@jupyter-widgets/base",
2169 | "_view_module_version": "1.2.0",
2170 | "_view_name": "LayoutView",
2171 | "align_content": null,
2172 | "align_items": null,
2173 | "align_self": null,
2174 | "border": null,
2175 | "bottom": null,
2176 | "display": null,
2177 | "flex": null,
2178 | "flex_flow": null,
2179 | "grid_area": null,
2180 | "grid_auto_columns": null,
2181 | "grid_auto_flow": null,
2182 | "grid_auto_rows": null,
2183 | "grid_column": null,
2184 | "grid_gap": null,
2185 | "grid_row": null,
2186 | "grid_template_areas": null,
2187 | "grid_template_columns": null,
2188 | "grid_template_rows": null,
2189 | "height": null,
2190 | "justify_content": null,
2191 | "justify_items": null,
2192 | "left": null,
2193 | "margin": null,
2194 | "max_height": null,
2195 | "max_width": null,
2196 | "min_height": null,
2197 | "min_width": null,
2198 | "object_fit": null,
2199 | "object_position": null,
2200 | "order": null,
2201 | "overflow": null,
2202 | "overflow_x": null,
2203 | "overflow_y": null,
2204 | "padding": null,
2205 | "right": null,
2206 | "top": null,
2207 | "visibility": null,
2208 | "width": null
2209 | }
2210 | },
2211 | "96747be99fb540d48aefc9862a59d8f1": {
2212 | "model_module": "@jupyter-widgets/base",
2213 | "model_name": "LayoutModel",
2214 | "model_module_version": "1.2.0",
2215 | "state": {
2216 | "_model_module": "@jupyter-widgets/base",
2217 | "_model_module_version": "1.2.0",
2218 | "_model_name": "LayoutModel",
2219 | "_view_count": null,
2220 | "_view_module": "@jupyter-widgets/base",
2221 | "_view_module_version": "1.2.0",
2222 | "_view_name": "LayoutView",
2223 | "align_content": null,
2224 | "align_items": null,
2225 | "align_self": null,
2226 | "border": null,
2227 | "bottom": null,
2228 | "display": null,
2229 | "flex": null,
2230 | "flex_flow": null,
2231 | "grid_area": null,
2232 | "grid_auto_columns": null,
2233 | "grid_auto_flow": null,
2234 | "grid_auto_rows": null,
2235 | "grid_column": null,
2236 | "grid_gap": null,
2237 | "grid_row": null,
2238 | "grid_template_areas": null,
2239 | "grid_template_columns": null,
2240 | "grid_template_rows": null,
2241 | "height": null,
2242 | "justify_content": null,
2243 | "justify_items": null,
2244 | "left": null,
2245 | "margin": null,
2246 | "max_height": null,
2247 | "max_width": null,
2248 | "min_height": null,
2249 | "min_width": null,
2250 | "object_fit": null,
2251 | "object_position": null,
2252 | "order": null,
2253 | "overflow": null,
2254 | "overflow_x": null,
2255 | "overflow_y": null,
2256 | "padding": null,
2257 | "right": null,
2258 | "top": null,
2259 | "visibility": null,
2260 | "width": null
2261 | }
2262 | },
2263 | "477e07d7a0ef4b07943cfc46fc49f479": {
2264 | "model_module": "@jupyter-widgets/controls",
2265 | "model_name": "DescriptionStyleModel",
2266 | "model_module_version": "1.5.0",
2267 | "state": {
2268 | "_model_module": "@jupyter-widgets/controls",
2269 | "_model_module_version": "1.5.0",
2270 | "_model_name": "DescriptionStyleModel",
2271 | "_view_count": null,
2272 | "_view_module": "@jupyter-widgets/base",
2273 | "_view_module_version": "1.2.0",
2274 | "_view_name": "StyleView",
2275 | "description_width": ""
2276 | }
2277 | },
2278 | "61c9bbdebbee447eb4253ee86020e86a": {
2279 | "model_module": "@jupyter-widgets/base",
2280 | "model_name": "LayoutModel",
2281 | "model_module_version": "1.2.0",
2282 | "state": {
2283 | "_model_module": "@jupyter-widgets/base",
2284 | "_model_module_version": "1.2.0",
2285 | "_model_name": "LayoutModel",
2286 | "_view_count": null,
2287 | "_view_module": "@jupyter-widgets/base",
2288 | "_view_module_version": "1.2.0",
2289 | "_view_name": "LayoutView",
2290 | "align_content": null,
2291 | "align_items": null,
2292 | "align_self": null,
2293 | "border": null,
2294 | "bottom": null,
2295 | "display": null,
2296 | "flex": null,
2297 | "flex_flow": null,
2298 | "grid_area": null,
2299 | "grid_auto_columns": null,
2300 | "grid_auto_flow": null,
2301 | "grid_auto_rows": null,
2302 | "grid_column": null,
2303 | "grid_gap": null,
2304 | "grid_row": null,
2305 | "grid_template_areas": null,
2306 | "grid_template_columns": null,
2307 | "grid_template_rows": null,
2308 | "height": null,
2309 | "justify_content": null,
2310 | "justify_items": null,
2311 | "left": null,
2312 | "margin": null,
2313 | "max_height": null,
2314 | "max_width": null,
2315 | "min_height": null,
2316 | "min_width": null,
2317 | "object_fit": null,
2318 | "object_position": null,
2319 | "order": null,
2320 | "overflow": null,
2321 | "overflow_x": null,
2322 | "overflow_y": null,
2323 | "padding": null,
2324 | "right": null,
2325 | "top": null,
2326 | "visibility": null,
2327 | "width": "20px"
2328 | }
2329 | },
2330 | "1121f7d0c1d648e2a3902af77fcc91e5": {
2331 | "model_module": "@jupyter-widgets/controls",
2332 | "model_name": "ProgressStyleModel",
2333 | "model_module_version": "1.5.0",
2334 | "state": {
2335 | "_model_module": "@jupyter-widgets/controls",
2336 | "_model_module_version": "1.5.0",
2337 | "_model_name": "ProgressStyleModel",
2338 | "_view_count": null,
2339 | "_view_module": "@jupyter-widgets/base",
2340 | "_view_module_version": "1.2.0",
2341 | "_view_name": "StyleView",
2342 | "bar_color": null,
2343 | "description_width": ""
2344 | }
2345 | },
2346 | "dcfa26149ddd48ecbbb5e5b20ce623af": {
2347 | "model_module": "@jupyter-widgets/base",
2348 | "model_name": "LayoutModel",
2349 | "model_module_version": "1.2.0",
2350 | "state": {
2351 | "_model_module": "@jupyter-widgets/base",
2352 | "_model_module_version": "1.2.0",
2353 | "_model_name": "LayoutModel",
2354 | "_view_count": null,
2355 | "_view_module": "@jupyter-widgets/base",
2356 | "_view_module_version": "1.2.0",
2357 | "_view_name": "LayoutView",
2358 | "align_content": null,
2359 | "align_items": null,
2360 | "align_self": null,
2361 | "border": null,
2362 | "bottom": null,
2363 | "display": null,
2364 | "flex": null,
2365 | "flex_flow": null,
2366 | "grid_area": null,
2367 | "grid_auto_columns": null,
2368 | "grid_auto_flow": null,
2369 | "grid_auto_rows": null,
2370 | "grid_column": null,
2371 | "grid_gap": null,
2372 | "grid_row": null,
2373 | "grid_template_areas": null,
2374 | "grid_template_columns": null,
2375 | "grid_template_rows": null,
2376 | "height": null,
2377 | "justify_content": null,
2378 | "justify_items": null,
2379 | "left": null,
2380 | "margin": null,
2381 | "max_height": null,
2382 | "max_width": null,
2383 | "min_height": null,
2384 | "min_width": null,
2385 | "object_fit": null,
2386 | "object_position": null,
2387 | "order": null,
2388 | "overflow": null,
2389 | "overflow_x": null,
2390 | "overflow_y": null,
2391 | "padding": null,
2392 | "right": null,
2393 | "top": null,
2394 | "visibility": null,
2395 | "width": null
2396 | }
2397 | },
2398 | "2c68c7c8422d44e4a2f96043c1047e20": {
2399 | "model_module": "@jupyter-widgets/controls",
2400 | "model_name": "DescriptionStyleModel",
2401 | "model_module_version": "1.5.0",
2402 | "state": {
2403 | "_model_module": "@jupyter-widgets/controls",
2404 | "_model_module_version": "1.5.0",
2405 | "_model_name": "DescriptionStyleModel",
2406 | "_view_count": null,
2407 | "_view_module": "@jupyter-widgets/base",
2408 | "_view_module_version": "1.2.0",
2409 | "_view_name": "StyleView",
2410 | "description_width": ""
2411 | }
2412 | },
2413 | "994d2439a06e4edda672c78da480fad4": {
2414 | "model_module": "@jupyter-widgets/controls",
2415 | "model_name": "HBoxModel",
2416 | "model_module_version": "1.5.0",
2417 | "state": {
2418 | "_dom_classes": [],
2419 | "_model_module": "@jupyter-widgets/controls",
2420 | "_model_module_version": "1.5.0",
2421 | "_model_name": "HBoxModel",
2422 | "_view_count": null,
2423 | "_view_module": "@jupyter-widgets/controls",
2424 | "_view_module_version": "1.5.0",
2425 | "_view_name": "HBoxView",
2426 | "box_style": "",
2427 | "children": [
2428 | "IPY_MODEL_0e64966469944155937a3d7546692596",
2429 | "IPY_MODEL_005132a051824de2b566f5cb0300c4d1",
2430 | "IPY_MODEL_c046ae07aecd4887a9ce5fd007cb6d44"
2431 | ],
2432 | "layout": "IPY_MODEL_febb9fce57a0438490c08968dae91630"
2433 | }
2434 | },
2435 | "0e64966469944155937a3d7546692596": {
2436 | "model_module": "@jupyter-widgets/controls",
2437 | "model_name": "HTMLModel",
2438 | "model_module_version": "1.5.0",
2439 | "state": {
2440 | "_dom_classes": [],
2441 | "_model_module": "@jupyter-widgets/controls",
2442 | "_model_module_version": "1.5.0",
2443 | "_model_name": "HTMLModel",
2444 | "_view_count": null,
2445 | "_view_module": "@jupyter-widgets/controls",
2446 | "_view_module_version": "1.5.0",
2447 | "_view_name": "HTMLView",
2448 | "description": "",
2449 | "description_tooltip": null,
2450 | "layout": "IPY_MODEL_f48519b7deea484085d950039a7fedf8",
2451 | "placeholder": "",
2452 | "style": "IPY_MODEL_ab7d0cc2bffc466f927c30885d284f76",
2453 | "value": "100%"
2454 | }
2455 | },
2456 | "005132a051824de2b566f5cb0300c4d1": {
2457 | "model_module": "@jupyter-widgets/controls",
2458 | "model_name": "FloatProgressModel",
2459 | "model_module_version": "1.5.0",
2460 | "state": {
2461 | "_dom_classes": [],
2462 | "_model_module": "@jupyter-widgets/controls",
2463 | "_model_module_version": "1.5.0",
2464 | "_model_name": "FloatProgressModel",
2465 | "_view_count": null,
2466 | "_view_module": "@jupyter-widgets/controls",
2467 | "_view_module_version": "1.5.0",
2468 | "_view_name": "ProgressView",
2469 | "bar_style": "success",
2470 | "description": "",
2471 | "description_tooltip": null,
2472 | "layout": "IPY_MODEL_9dffac53721b41dab3034e9f8f2c1922",
2473 | "max": 2,
2474 | "min": 0,
2475 | "orientation": "horizontal",
2476 | "style": "IPY_MODEL_aacf975117774c8f987f4e86a1dd5a14",
2477 | "value": 2
2478 | }
2479 | },
2480 | "c046ae07aecd4887a9ce5fd007cb6d44": {
2481 | "model_module": "@jupyter-widgets/controls",
2482 | "model_name": "HTMLModel",
2483 | "model_module_version": "1.5.0",
2484 | "state": {
2485 | "_dom_classes": [],
2486 | "_model_module": "@jupyter-widgets/controls",
2487 | "_model_module_version": "1.5.0",
2488 | "_model_name": "HTMLModel",
2489 | "_view_count": null,
2490 | "_view_module": "@jupyter-widgets/controls",
2491 | "_view_module_version": "1.5.0",
2492 | "_view_name": "HTMLView",
2493 | "description": "",
2494 | "description_tooltip": null,
2495 | "layout": "IPY_MODEL_dee813d0981d4478a423c69bd5fefaed",
2496 | "placeholder": "",
2497 | "style": "IPY_MODEL_97b1fe94a6c14a1987066bf5d77bd75a",
2498 | "value": " 2/2 [00:00<00:00, 29.32it/s]"
2499 | }
2500 | },
2501 | "febb9fce57a0438490c08968dae91630": {
2502 | "model_module": "@jupyter-widgets/base",
2503 | "model_name": "LayoutModel",
2504 | "model_module_version": "1.2.0",
2505 | "state": {
2506 | "_model_module": "@jupyter-widgets/base",
2507 | "_model_module_version": "1.2.0",
2508 | "_model_name": "LayoutModel",
2509 | "_view_count": null,
2510 | "_view_module": "@jupyter-widgets/base",
2511 | "_view_module_version": "1.2.0",
2512 | "_view_name": "LayoutView",
2513 | "align_content": null,
2514 | "align_items": null,
2515 | "align_self": null,
2516 | "border": null,
2517 | "bottom": null,
2518 | "display": null,
2519 | "flex": null,
2520 | "flex_flow": null,
2521 | "grid_area": null,
2522 | "grid_auto_columns": null,
2523 | "grid_auto_flow": null,
2524 | "grid_auto_rows": null,
2525 | "grid_column": null,
2526 | "grid_gap": null,
2527 | "grid_row": null,
2528 | "grid_template_areas": null,
2529 | "grid_template_columns": null,
2530 | "grid_template_rows": null,
2531 | "height": null,
2532 | "justify_content": null,
2533 | "justify_items": null,
2534 | "left": null,
2535 | "margin": null,
2536 | "max_height": null,
2537 | "max_width": null,
2538 | "min_height": null,
2539 | "min_width": null,
2540 | "object_fit": null,
2541 | "object_position": null,
2542 | "order": null,
2543 | "overflow": null,
2544 | "overflow_x": null,
2545 | "overflow_y": null,
2546 | "padding": null,
2547 | "right": null,
2548 | "top": null,
2549 | "visibility": null,
2550 | "width": null
2551 | }
2552 | },
2553 | "f48519b7deea484085d950039a7fedf8": {
2554 | "model_module": "@jupyter-widgets/base",
2555 | "model_name": "LayoutModel",
2556 | "model_module_version": "1.2.0",
2557 | "state": {
2558 | "_model_module": "@jupyter-widgets/base",
2559 | "_model_module_version": "1.2.0",
2560 | "_model_name": "LayoutModel",
2561 | "_view_count": null,
2562 | "_view_module": "@jupyter-widgets/base",
2563 | "_view_module_version": "1.2.0",
2564 | "_view_name": "LayoutView",
2565 | "align_content": null,
2566 | "align_items": null,
2567 | "align_self": null,
2568 | "border": null,
2569 | "bottom": null,
2570 | "display": null,
2571 | "flex": null,
2572 | "flex_flow": null,
2573 | "grid_area": null,
2574 | "grid_auto_columns": null,
2575 | "grid_auto_flow": null,
2576 | "grid_auto_rows": null,
2577 | "grid_column": null,
2578 | "grid_gap": null,
2579 | "grid_row": null,
2580 | "grid_template_areas": null,
2581 | "grid_template_columns": null,
2582 | "grid_template_rows": null,
2583 | "height": null,
2584 | "justify_content": null,
2585 | "justify_items": null,
2586 | "left": null,
2587 | "margin": null,
2588 | "max_height": null,
2589 | "max_width": null,
2590 | "min_height": null,
2591 | "min_width": null,
2592 | "object_fit": null,
2593 | "object_position": null,
2594 | "order": null,
2595 | "overflow": null,
2596 | "overflow_x": null,
2597 | "overflow_y": null,
2598 | "padding": null,
2599 | "right": null,
2600 | "top": null,
2601 | "visibility": null,
2602 | "width": null
2603 | }
2604 | },
2605 | "ab7d0cc2bffc466f927c30885d284f76": {
2606 | "model_module": "@jupyter-widgets/controls",
2607 | "model_name": "DescriptionStyleModel",
2608 | "model_module_version": "1.5.0",
2609 | "state": {
2610 | "_model_module": "@jupyter-widgets/controls",
2611 | "_model_module_version": "1.5.0",
2612 | "_model_name": "DescriptionStyleModel",
2613 | "_view_count": null,
2614 | "_view_module": "@jupyter-widgets/base",
2615 | "_view_module_version": "1.2.0",
2616 | "_view_name": "StyleView",
2617 | "description_width": ""
2618 | }
2619 | },
2620 | "9dffac53721b41dab3034e9f8f2c1922": {
2621 | "model_module": "@jupyter-widgets/base",
2622 | "model_name": "LayoutModel",
2623 | "model_module_version": "1.2.0",
2624 | "state": {
2625 | "_model_module": "@jupyter-widgets/base",
2626 | "_model_module_version": "1.2.0",
2627 | "_model_name": "LayoutModel",
2628 | "_view_count": null,
2629 | "_view_module": "@jupyter-widgets/base",
2630 | "_view_module_version": "1.2.0",
2631 | "_view_name": "LayoutView",
2632 | "align_content": null,
2633 | "align_items": null,
2634 | "align_self": null,
2635 | "border": null,
2636 | "bottom": null,
2637 | "display": null,
2638 | "flex": null,
2639 | "flex_flow": null,
2640 | "grid_area": null,
2641 | "grid_auto_columns": null,
2642 | "grid_auto_flow": null,
2643 | "grid_auto_rows": null,
2644 | "grid_column": null,
2645 | "grid_gap": null,
2646 | "grid_row": null,
2647 | "grid_template_areas": null,
2648 | "grid_template_columns": null,
2649 | "grid_template_rows": null,
2650 | "height": null,
2651 | "justify_content": null,
2652 | "justify_items": null,
2653 | "left": null,
2654 | "margin": null,
2655 | "max_height": null,
2656 | "max_width": null,
2657 | "min_height": null,
2658 | "min_width": null,
2659 | "object_fit": null,
2660 | "object_position": null,
2661 | "order": null,
2662 | "overflow": null,
2663 | "overflow_x": null,
2664 | "overflow_y": null,
2665 | "padding": null,
2666 | "right": null,
2667 | "top": null,
2668 | "visibility": null,
2669 | "width": null
2670 | }
2671 | },
2672 | "aacf975117774c8f987f4e86a1dd5a14": {
2673 | "model_module": "@jupyter-widgets/controls",
2674 | "model_name": "ProgressStyleModel",
2675 | "model_module_version": "1.5.0",
2676 | "state": {
2677 | "_model_module": "@jupyter-widgets/controls",
2678 | "_model_module_version": "1.5.0",
2679 | "_model_name": "ProgressStyleModel",
2680 | "_view_count": null,
2681 | "_view_module": "@jupyter-widgets/base",
2682 | "_view_module_version": "1.2.0",
2683 | "_view_name": "StyleView",
2684 | "bar_color": null,
2685 | "description_width": ""
2686 | }
2687 | },
2688 | "dee813d0981d4478a423c69bd5fefaed": {
2689 | "model_module": "@jupyter-widgets/base",
2690 | "model_name": "LayoutModel",
2691 | "model_module_version": "1.2.0",
2692 | "state": {
2693 | "_model_module": "@jupyter-widgets/base",
2694 | "_model_module_version": "1.2.0",
2695 | "_model_name": "LayoutModel",
2696 | "_view_count": null,
2697 | "_view_module": "@jupyter-widgets/base",
2698 | "_view_module_version": "1.2.0",
2699 | "_view_name": "LayoutView",
2700 | "align_content": null,
2701 | "align_items": null,
2702 | "align_self": null,
2703 | "border": null,
2704 | "bottom": null,
2705 | "display": null,
2706 | "flex": null,
2707 | "flex_flow": null,
2708 | "grid_area": null,
2709 | "grid_auto_columns": null,
2710 | "grid_auto_flow": null,
2711 | "grid_auto_rows": null,
2712 | "grid_column": null,
2713 | "grid_gap": null,
2714 | "grid_row": null,
2715 | "grid_template_areas": null,
2716 | "grid_template_columns": null,
2717 | "grid_template_rows": null,
2718 | "height": null,
2719 | "justify_content": null,
2720 | "justify_items": null,
2721 | "left": null,
2722 | "margin": null,
2723 | "max_height": null,
2724 | "max_width": null,
2725 | "min_height": null,
2726 | "min_width": null,
2727 | "object_fit": null,
2728 | "object_position": null,
2729 | "order": null,
2730 | "overflow": null,
2731 | "overflow_x": null,
2732 | "overflow_y": null,
2733 | "padding": null,
2734 | "right": null,
2735 | "top": null,
2736 | "visibility": null,
2737 | "width": null
2738 | }
2739 | },
2740 | "97b1fe94a6c14a1987066bf5d77bd75a": {
2741 | "model_module": "@jupyter-widgets/controls",
2742 | "model_name": "DescriptionStyleModel",
2743 | "model_module_version": "1.5.0",
2744 | "state": {
2745 | "_model_module": "@jupyter-widgets/controls",
2746 | "_model_module_version": "1.5.0",
2747 | "_model_name": "DescriptionStyleModel",
2748 | "_view_count": null,
2749 | "_view_module": "@jupyter-widgets/base",
2750 | "_view_module_version": "1.2.0",
2751 | "_view_name": "StyleView",
2752 | "description_width": ""
2753 | }
2754 | }
2755 | }
2756 | }
2757 | },
2758 | "cells": [
2759 | {
2760 | "cell_type": "markdown",
2761 | "metadata": {
2762 | "id": "view-in-github",
2763 | "colab_type": "text"
2764 | },
2765 | "source": [
2766 | "
"
2767 | ]
2768 | },
2769 | {
2770 | "cell_type": "code",
2771 | "execution_count": null,
2772 | "metadata": {
2773 | "colab": {
2774 | "base_uri": "https://localhost:8080/"
2775 | },
2776 | "id": "IR5WxNj-dSN_",
2777 | "outputId": "ba18d456-9945-46cb-8f02-16cf0a46bfa1"
2778 | },
2779 | "outputs": [
2780 | {
2781 | "output_type": "stream",
2782 | "name": "stdout",
2783 | "text": [
2784 | "Cloning into 'TiLT-Implementation'...\n",
2785 | "remote: Enumerating objects: 77, done.\u001b[K\n",
2786 | "remote: Counting objects: 100% (77/77), done.\u001b[K\n",
2787 | "remote: Compressing objects: 100% (58/58), done.\u001b[K\n",
2788 | "remote: Total 77 (delta 31), reused 45 (delta 11), pack-reused 0\u001b[K\n",
2789 | "Unpacking objects: 100% (77/77), 2.77 MiB | 7.51 MiB/s, done.\n"
2790 | ]
2791 | }
2792 | ],
2793 | "source": [
2794 | "!git clone https://github.com/uakarsh/TiLT-Implementation.git"
2795 | ]
2796 | },
2797 | {
2798 | "cell_type": "code",
2799 | "source": [
2800 | "!pip install -r /content/TiLT-Implementation/requirements.txt"
2801 | ],
2802 | "metadata": {
2803 | "id": "oBrxq2BLiuXw"
2804 | },
2805 | "execution_count": null,
2806 | "outputs": []
2807 | },
2808 | {
2809 | "cell_type": "code",
2810 | "source": [
2811 | "import sys\n",
2812 | "sys.path.append(\"/content/TiLT-Implementation/src/\")"
2813 | ],
2814 | "metadata": {
2815 | "id": "fiGVO1aFiv-h"
2816 | },
2817 | "execution_count": null,
2818 | "outputs": []
2819 | },
2820 | {
2821 | "cell_type": "code",
2822 | "source": [
2823 | "from transformers import AutoTokenizer\n",
2824 | "from datasets import load_dataset\n",
2825 | "import torch\n",
2826 | "import torch.nn\n",
2827 | "\n",
2828 | "model_name = \"t5-base\"\n",
2829 | "tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast = True)"
2830 | ],
2831 | "metadata": {
2832 | "colab": {
2833 | "base_uri": "https://localhost:8080/",
2834 | "height": 217,
2835 | "referenced_widgets": [
2836 | "2c08e449950a4d3ea9049a4aaf9f4de7",
2837 | "759b967f50d84ffd903fab90c0969a6d",
2838 | "f97e169933e648378b59c146da88bc3b",
2839 | "3432461a6f7446dda1e28ac7870591c6",
2840 | "e41f7aa014de44dca338647c329d2e02",
2841 | "35b3ce19787a47c8b6c2a9e685550108",
2842 | "ba5cff7a36bf4ae080f76ec53dfce07c",
2843 | "c203dd7e0dcc42d889b05755a018c998",
2844 | "a40c5f7a73ab422ca89258fa2c98bc52",
2845 | "1dc415b05d0d4b9e999fddf1dea84562",
2846 | "cecac60443284810bac22a9a10819572",
2847 | "33ec7ec081d24433a72b65c2c6085aaf",
2848 | "954d298b8a8d43628fc2add59266541e",
2849 | "75208c1d1d3247c19baca7c526eb7382",
2850 | "6e1cb705ef9d4341b6d61f4cccd6c009",
2851 | "9f9fa275896b446b8e6bd511190fe922",
2852 | "ffa2df3f95e241fe90f6514049a32669",
2853 | "993ea1ee236b4b77bce5def3cb4eba68",
2854 | "01b087a18a2346cc97aeaa712e54cef4",
2855 | "c0903c5551224beea5edf4609b07514c",
2856 | "3ecf6715c92042ccbe672b5e1f758de5",
2857 | "3e826539414a4e798e4a6232174e1066",
2858 | "242efb0ea706421c994793f87ece24eb",
2859 | "56e42bcb2df74e2a8e990da26760da6f",
2860 | "9aef638dd3db437ab712c99cbb57433d",
2861 | "c5371258333d471abc1a9e73b23573a0",
2862 | "69cb8cdc043144148dd599569da87d25",
2863 | "3a970557b24940d29eea9ab77b620a0b",
2864 | "aece0cbdf6d04e29a3484a71ae74a971",
2865 | "9e1768f313ef4f8f9503a6e9c64bb707",
2866 | "09d9547e80e44ce5981f66190bd3bb5c",
2867 | "41c267d7ae7f4de183d8ffbd60d47b36",
2868 | "656c59f5ea564ff994112595236213b9"
2869 | ]
2870 | },
2871 | "id": "lR-QX-JbiwRW",
2872 | "outputId": "ccac92d0-61d7-4749-a076-31e6bf9cc3bb"
2873 | },
2874 | "execution_count": null,
2875 | "outputs": [
2876 | {
2877 | "output_type": "display_data",
2878 | "data": {
2879 | "text/plain": [
2880 | "Downloading (…)lve/main/config.json: 0%| | 0.00/1.21k [00:00, ?B/s]"
2881 | ],
2882 | "application/vnd.jupyter.widget-view+json": {
2883 | "version_major": 2,
2884 | "version_minor": 0,
2885 | "model_id": "2c08e449950a4d3ea9049a4aaf9f4de7"
2886 | }
2887 | },
2888 | "metadata": {}
2889 | },
2890 | {
2891 | "output_type": "display_data",
2892 | "data": {
2893 | "text/plain": [
2894 | "Downloading (…)ve/main/spiece.model: 0%| | 0.00/792k [00:00, ?B/s]"
2895 | ],
2896 | "application/vnd.jupyter.widget-view+json": {
2897 | "version_major": 2,
2898 | "version_minor": 0,
2899 | "model_id": "33ec7ec081d24433a72b65c2c6085aaf"
2900 | }
2901 | },
2902 | "metadata": {}
2903 | },
2904 | {
2905 | "output_type": "display_data",
2906 | "data": {
2907 | "text/plain": [
2908 | "Downloading (…)/main/tokenizer.json: 0%| | 0.00/1.39M [00:00, ?B/s]"
2909 | ],
2910 | "application/vnd.jupyter.widget-view+json": {
2911 | "version_major": 2,
2912 | "version_minor": 0,
2913 | "model_id": "242efb0ea706421c994793f87ece24eb"
2914 | }
2915 | },
2916 | "metadata": {}
2917 | },
2918 | {
2919 | "output_type": "stream",
2920 | "name": "stderr",
2921 | "text": [
2922 | "/usr/local/lib/python3.9/dist-packages/transformers/models/t5/tokenization_t5_fast.py:155: FutureWarning: This tokenizer was incorrectly instantiated with a model max length of 512 which will be corrected in Transformers v5.\n",
2923 | "For now, this behavior is kept to avoid breaking backwards compatibility when padding/encoding with `truncation is True`.\n",
2924 | "- Be aware that you SHOULD NOT rely on t5-base automatically truncating your input to 512 when padding/encoding.\n",
2925 | "- If you want to encode/pad to sequences longer than 512 you can either instantiate this tokenizer with `model_max_length` or pass `max_length` when encoding/padding.\n",
2926 | "- To avoid this warning, please instantiate this tokenizer with `model_max_length` set to your preferred value.\n",
2927 | " warnings.warn(\n"
2928 | ]
2929 | }
2930 | ]
2931 | },
2932 | {
2933 | "cell_type": "code",
2934 | "source": [
2935 | "hf_ds = load_dataset(\"nielsr/funsd-layoutlmv3\")"
2936 | ],
2937 | "metadata": {
2938 | "colab": {
2939 | "base_uri": "https://localhost:8080/",
2940 | "height": 180,
2941 | "referenced_widgets": [
2942 | "d36fe23be9334b53b2715f4a39576ec5",
2943 | "e23b54cc65e844548e19c784102725dd",
2944 | "65d64f04508745b0ac817be4840b494c",
2945 | "630b162223734bad92bffddb22a81534",
2946 | "ab4241cadba948d7946497b1fe33fd71",
2947 | "8e9c9c88077649b1b92d20e1844d3136",
2948 | "1ecd41da2f354b7794abc085827a732d",
2949 | "9532d70512f041958b6e8655c303430e",
2950 | "17cb083323e74fa2a018dbb0b061e47b",
2951 | "d0239f9b7577416597f5ee8e0287415c",
2952 | "0ff9aad8a1d34926a3fdf280a24126ba",
2953 | "7eed869b205c448db60a579dec2aaa5b",
2954 | "ec3803d2a96645e8a584bf2870e0ce92",
2955 | "4f1ae710de3e4f3f8813e7fe4f643190",
2956 | "a756c1bb1dbc4eb5925346005fcec6a4",
2957 | "79c7f170b8b44c33bca1e3705271d99f",
2958 | "f8211cae0cbb466896f58383719d086d",
2959 | "4ca5728b66dc4c639c13c6c569f1a2b2",
2960 | "9b8abc703ee646d489afd6d9acbd6ed2",
2961 | "f0bb95077f5e44b488a3d3b83b117f6c",
2962 | "6405c3d09eda4eed843c2ddef31b2229",
2963 | "6d97225a55274cbb90a3df3d82676130",
2964 | "d5fc06bdaf054696a2be0c511f8362de",
2965 | "7d7f0285f7c0494192e390b3b914ad2c",
2966 | "a56dab6eebd24823b33b94e02b67aa8d",
2967 | "5e15c6e52cb749149bc991fd0e8861b5",
2968 | "cff5ef358a4e44588d08e908b7e9cebe",
2969 | "de8ec040fef94a2a851e4bbdfe99e366",
2970 | "1c1718b43073404892b90cb7ff902f60",
2971 | "44e2b7b085be4980b8d10c6e8a81e712",
2972 | "8f16bc2c6ef547fbb8c9a66b75b99d03",
2973 | "5049799a9ba14926b9747775dce8ae0e",
2974 | "154de61264fb4f399f1c699ad8ec199a",
2975 | "a0f3abfbc70e4fdab2f183a49b3f6177",
2976 | "a862285ba9324a1c8e374ce561de8c1c",
2977 | "6c6a59bb3cc04bd6a5904cbeb2bce030",
2978 | "33d6f73537e249c29eb39d9590412e23",
2979 | "cf7376fbaf074a43bf3311292c85f34f",
2980 | "96747be99fb540d48aefc9862a59d8f1",
2981 | "477e07d7a0ef4b07943cfc46fc49f479",
2982 | "61c9bbdebbee447eb4253ee86020e86a",
2983 | "1121f7d0c1d648e2a3902af77fcc91e5",
2984 | "dcfa26149ddd48ecbbb5e5b20ce623af",
2985 | "2c68c7c8422d44e4a2f96043c1047e20",
2986 | "994d2439a06e4edda672c78da480fad4",
2987 | "0e64966469944155937a3d7546692596",
2988 | "005132a051824de2b566f5cb0300c4d1",
2989 | "c046ae07aecd4887a9ce5fd007cb6d44",
2990 | "febb9fce57a0438490c08968dae91630",
2991 | "f48519b7deea484085d950039a7fedf8",
2992 | "ab7d0cc2bffc466f927c30885d284f76",
2993 | "9dffac53721b41dab3034e9f8f2c1922",
2994 | "aacf975117774c8f987f4e86a1dd5a14",
2995 | "dee813d0981d4478a423c69bd5fefaed",
2996 | "97b1fe94a6c14a1987066bf5d77bd75a"
2997 | ]
2998 | },
2999 | "id": "ATze8lsLjJWz",
3000 | "outputId": "cd90199f-0cad-4b95-ea5b-c4cc5ea2c8a1"
3001 | },
3002 | "execution_count": null,
3003 | "outputs": [
3004 | {
3005 | "output_type": "display_data",
3006 | "data": {
3007 | "text/plain": [
3008 | "Downloading builder script: 0%| | 0.00/5.13k [00:00, ?B/s]"
3009 | ],
3010 | "application/vnd.jupyter.widget-view+json": {
3011 | "version_major": 2,
3012 | "version_minor": 0,
3013 | "model_id": "d36fe23be9334b53b2715f4a39576ec5"
3014 | }
3015 | },
3016 | "metadata": {}
3017 | },
3018 | {
3019 | "output_type": "stream",
3020 | "name": "stdout",
3021 | "text": [
3022 | "Downloading and preparing dataset funsd-layoutlmv3/funsd to /root/.cache/huggingface/datasets/nielsr___funsd-layoutlmv3/funsd/1.0.0/0e3f4efdfd59aa1c3b4952c517894f7b1fc4d75c12ef01bcc8626a69e41c1bb9...\n"
3023 | ]
3024 | },
3025 | {
3026 | "output_type": "display_data",
3027 | "data": {
3028 | "text/plain": [
3029 | "Downloading data: 0%| | 0.00/16.8M [00:00, ?B/s]"
3030 | ],
3031 | "application/vnd.jupyter.widget-view+json": {
3032 | "version_major": 2,
3033 | "version_minor": 0,
3034 | "model_id": "7eed869b205c448db60a579dec2aaa5b"
3035 | }
3036 | },
3037 | "metadata": {}
3038 | },
3039 | {
3040 | "output_type": "display_data",
3041 | "data": {
3042 | "text/plain": [
3043 | "Generating train split: 0 examples [00:00, ? examples/s]"
3044 | ],
3045 | "application/vnd.jupyter.widget-view+json": {
3046 | "version_major": 2,
3047 | "version_minor": 0,
3048 | "model_id": "d5fc06bdaf054696a2be0c511f8362de"
3049 | }
3050 | },
3051 | "metadata": {}
3052 | },
3053 | {
3054 | "output_type": "display_data",
3055 | "data": {
3056 | "text/plain": [
3057 | "Generating test split: 0 examples [00:00, ? examples/s]"
3058 | ],
3059 | "application/vnd.jupyter.widget-view+json": {
3060 | "version_major": 2,
3061 | "version_minor": 0,
3062 | "model_id": "a0f3abfbc70e4fdab2f183a49b3f6177"
3063 | }
3064 | },
3065 | "metadata": {}
3066 | },
3067 | {
3068 | "output_type": "stream",
3069 | "name": "stdout",
3070 | "text": [
3071 | "Dataset funsd-layoutlmv3 downloaded and prepared to /root/.cache/huggingface/datasets/nielsr___funsd-layoutlmv3/funsd/1.0.0/0e3f4efdfd59aa1c3b4952c517894f7b1fc4d75c12ef01bcc8626a69e41c1bb9. Subsequent calls will reuse this data.\n"
3072 | ]
3073 | },
3074 | {
3075 | "output_type": "display_data",
3076 | "data": {
3077 | "text/plain": [
3078 | " 0%| | 0/2 [00:00, ?it/s]"
3079 | ],
3080 | "application/vnd.jupyter.widget-view+json": {
3081 | "version_major": 2,
3082 | "version_minor": 0,
3083 | "model_id": "994d2439a06e4edda672c78da480fad4"
3084 | }
3085 | },
3086 | "metadata": {}
3087 | }
3088 | ]
3089 | },
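    {
      "cell_type": "code",
      "source": [
        "# Editor's sketch (not in the original run): inspect the DatasetDict that\n",
        "# load_dataset returned; this FUNSD build ships 'train' and 'test' splits.\n",
        "print(hf_ds)\n",
        "print(hf_ds['train'].features)"
      ],
      "metadata": {},
      "execution_count": null,
      "outputs": []
    },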
3090 | {
3091 | "cell_type": "code",
3092 | "source": [
3093 | "from dataset import FUNSDDs\n",
3094 | "from torchvision import transforms\n",
3095 | "from tqdm.auto import tqdm\n",
3096 | "\n",
3097 | "transform = transforms.Compose([transforms.ToTensor(), \n",
3098 | " transforms.Lambda(lambda x : 2 * x - 1)])"
3099 | ],
3100 | "metadata": {
3101 | "id": "SEHWnEqrOg61"
3102 | },
3103 | "execution_count": null,
3104 | "outputs": []
3105 | },
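    {
      "cell_type": "code",
      "source": [
        "# Editor's sketch: sanity-check the transform's output range on a dummy\n",
        "# all-black image; every pixel should map to -1.\n",
        "import numpy as np\n",
        "from PIL import Image\n",
        "\n",
        "dummy = Image.fromarray(np.zeros((8, 8, 3), dtype=np.uint8))\n",
        "t = transform(dummy)\n",
        "print(t.min().item(), t.max().item())  # -1.0 -1.0"
      ],
      "metadata": {},
      "execution_count": null,
      "outputs": []
    },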
3106 | {
3107 | "cell_type": "code",
3108 | "source": [
3109 | "def get_id2label_and_label2id():\n",
3110 | " label2id = {'O': 0, 'B-HEADER': 1, 'I-HEADER': 2, 'B-QUESTION': 3, 'I-QUESTION': 4, 'B-ANSWER': 5, 'I-ANSWER': 6}\n",
3111 | " id2label = {0: 'O', 1: 'B-HEADER', 2: 'I-HEADER', 3: 'B-QUESTION', 4: 'I-QUESTION', 5: 'B-ANSWER', 6: 'I-ANSWER'}\n",
3112 | " return id2label, label2id\n",
3113 | "\n",
3114 | "id2label, label2id = get_id2label_and_label2id()\n",
3115 | "\n",
3116 | "def convert_id_to_label(list_of_label):\n",
3117 | " return [id2label[x] for x in list_of_label]"
3118 | ],
3119 | "metadata": {
3120 | "id": "CdEJiL-7R9KK"
3121 | },
3122 | "execution_count": null,
3123 | "outputs": []
3124 | },
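    {
      "cell_type": "code",
      "source": [
        "# Editor's sketch: round-trip example of the id -> BIO-tag mapping.\n",
        "print(convert_id_to_label([0, 1, 4, 5]))  # ['O', 'B-HEADER', 'I-QUESTION', 'B-ANSWER']"
      ],
      "metadata": {},
      "execution_count": null,
      "outputs": []
    },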
3125 | {
3126 | "cell_type": "code",
3127 | "source": [
3128 | "train_new_tags = list(map(lambda x : convert_id_to_label(x), hf_ds['train']['ner_tags']))\n",
3129 | "test_new_tags = list(map(lambda x : convert_id_to_label(x), hf_ds['test']['ner_tags']))"
3130 | ],
3131 | "metadata": {
3132 | "id": "h_eXqCGaXQq1"
3133 | },
3134 | "execution_count": null,
3135 | "outputs": []
3136 | },
3137 | {
3138 | "cell_type": "code",
3139 | "source": [
3140 | "hf_ds['train'] = hf_ds['train'].remove_columns(\"ner_tags\").add_column(\"ner_tags\", train_new_tags)\n",
3141 | "hf_ds['test'] = hf_ds['test'].remove_columns(\"ner_tags\").add_column(\"ner_tags\", test_new_tags)"
3142 | ],
3143 | "metadata": {
3144 | "id": "oWRdHLJDZMq-"
3145 | },
3146 | "execution_count": null,
3147 | "outputs": []
3148 | },
3149 | {
3150 | "cell_type": "code",
3151 | "source": [
3152 | "train_ds = FUNSDDs(hf_ds['train'],tokenizer = tokenizer, transform = transform)\n",
3153 | "val_ds = FUNSDDs(hf_ds['test'],tokenizer = tokenizer, transform = transform)"
3154 | ],
3155 | "metadata": {
3156 | "id": "gFJ6C6yvXE6C"
3157 | },
3158 | "execution_count": null,
3159 | "outputs": []
3160 | },
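    {
      "cell_type": "code",
      "source": [
        "# Editor's sketch: peek at a single item, assuming the dict layout that\n",
        "# CollateFn below relies on (tensor fields plus a list of tag strings).\n",
        "item = train_ds[0]\n",
        "print({k: (v.shape if hasattr(v, 'shape') else type(v)) for k, v in item.items()})"
      ],
      "metadata": {},
      "execution_count": null,
      "outputs": []
    },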
3161 | {
3162 | "cell_type": "code",
3163 | "source": [
3164 | "class CollateFn(object):\n",
3165 | " def __init__(self, tokenizer):\n",
3166 | " self.tokenizer = tokenizer\n",
3167 | "\n",
3168 | " def __call__(self, list_of_ds):\n",
3169 | " simple_keys = [\"input_ids\", \"attention_mask\", \"bboxes\", \"pixel_values\" ]\n",
3170 | " actual_batch = {}\n",
3171 | " for key in simple_keys:\n",
3172 | " actual_batch[key] = torch.stack([x[key] for x in list_of_ds])\n",
3173 | " \n",
3174 | " actual_batch['labels'] = self.tokenizer.batch_encode_plus([x['labels'] for x in list_of_ds], return_tensors = 'pt', is_split_into_words = True,\n",
3175 | " padding='max_length', truncation = True)['input_ids']\n",
3176 | " return actual_batch\n"
3177 | ],
3178 | "metadata": {
3179 | "id": "pwT-S7RoV548"
3180 | },
3181 | "execution_count": null,
3182 | "outputs": []
3183 | },
3184 | {
3185 | "cell_type": "code",
3186 | "source": [
3187 | "collate_fn = CollateFn(tokenizer)"
3188 | ],
3189 | "metadata": {
3190 | "id": "04U-bLPrWslH"
3191 | },
3192 | "execution_count": null,
3193 | "outputs": []
3194 | },
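    {
      "cell_type": "code",
      "source": [
        "# Editor's sketch: the collate function is meant to plug into a standard\n",
        "# PyTorch DataLoader; batch_size here is illustrative.\n",
        "from torch.utils.data import DataLoader\n",
        "\n",
        "train_dl = DataLoader(train_ds, batch_size=2, shuffle=True, collate_fn=collate_fn)\n",
        "batch = next(iter(train_dl))"
      ],
      "metadata": {},
      "execution_count": null,
      "outputs": []
    },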
3195 | {
3196 | "cell_type": "code",
3197 | "source": [
3198 | "sample_batch_encoding = collate_fn([train_ds[0], train_ds[1]])"
3199 | ],
3200 | "metadata": {
3201 | "id": "BrG5VmLBWwuN"
3202 | },
3203 | "execution_count": null,
3204 | "outputs": []
3205 | },
3206 | {
3207 | "cell_type": "code",
3208 | "source": [
3209 | "for key in sample_batch_encoding:\n",
3210 | " print(f\"Key : {key}, has shape : {sample_batch_encoding[key].shape}\")"
3211 | ],
3212 | "metadata": {
3213 | "colab": {
3214 | "base_uri": "https://localhost:8080/"
3215 | },
3216 | "id": "GB_6pJEeaM2v",
3217 | "outputId": "979cf0c3-8c92-4bbb-c362-4d84c4167159"
3218 | },
3219 | "execution_count": null,
3220 | "outputs": [
3221 | {
3222 | "output_type": "stream",
3223 | "name": "stdout",
3224 | "text": [
3225 | "Key : input_ids, has shape : torch.Size([2, 512])\n",
3226 | "Key : attention_mask, has shape : torch.Size([2, 512])\n",
3227 | "Key : bboxes, has shape : torch.Size([2, 512, 4])\n",
3228 | "Key : pixel_values, has shape : torch.Size([2, 3, 384, 512])\n",
3229 | "Key : labels, has shape : torch.Size([2, 512])\n"
3230 | ]
3231 | }
3232 | ]
3233 | },
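    {
      "cell_type": "code",
      "source": [
        "# Editor's sketch: decode the encoded label ids back to text to see the\n",
        "# space-joined BIO tag sequence the T5 decoder is trained to generate.\n",
        "decoded = tokenizer.batch_decode(sample_batch_encoding['labels'], skip_special_tokens=True)\n",
        "print(decoded[0][:120])"
      ],
      "metadata": {},
      "execution_count": null,
      "outputs": []
    },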
3234 | {
3235 | "cell_type": "code",
3236 | "source": [],
3237 | "metadata": {
3238 | "id": "1W_QXkfmaz5I"
3239 | },
3240 | "execution_count": null,
3241 | "outputs": []
3242 | }
3243 | ]
3244 | }
--------------------------------------------------------------------------------