├── .gitignore
├── README.md
├── notebooks
│   └── train_emotions_classifier.ipynb
├── pyproject.toml
├── setup.cfg
├── src
│   └── liqfit
│       ├── __init__.py
│       ├── collators
│       │   ├── __init__.py
│       │   ├── base_collator.py
│       │   └── nli_collator.py
│       ├── datasets
│       │   ├── __init__.py
│       │   ├── nli_dataset.py
│       │   └── transform.py
│       ├── losses
│       │   ├── __init__.py
│       │   └── losses.py
│       ├── modeling
│       │   ├── __init__.py
│       │   ├── backbone.py
│       │   ├── heads.py
│       │   ├── model.py
│       │   └── pooling.py
│       ├── models
│       │   ├── __init__.py
│       │   ├── deberta.py
│       │   └── t5.py
│       ├── pipeline
│       │   ├── __init__.py
│       │   └── inference.py
│       └── utils
│           ├── __init__.py
│           ├── metrics.py
│           ├── standardization.py
│           └── transforms.py
└── tests
    ├── __init__.py
    ├── test_losses.py
    ├── test_models.py
    └── test_pipeline.py
/.gitignore:
--------------------------------------------------------------------------------
1 | demo.ipynb
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 |

2 | 🤗 Models | 📕 Documentation | 📖 Blog 3 |
4 | . . . 5 |

6 |
7 | # LiqFit - Flexible Few-shot Learning Library.
8 |
9 | LiqFit is an easy-to-use framework for few-shot learning of cross-encoder models. Such models are trained to distinguish whether two statements entail each other, contradict each other, or are neutral. This task setting is universal for many information extraction tasks, ranging from text classification to named entity recognition and question answering. With LiqFit, you can achieve competitive results with just 8 examples per label.
10 |
11 |
12 | Key features and benefits of LiqFit are:
13 | * 🔢 **A small number of examples is required** - LiqFit can significantly improve the accuracy of the default zero-shot classifier with just 8 examples per label;
14 | * 📝 **Can solve many different information-extraction tasks** - Natural language inference is a universal task that can serve as a setting for many other information extraction tasks, such as named entity recognition or question answering;
15 | * 🌈 **Can work for classes not present in the training set** - It's not mandatory to have all needed classes in the training set. Because of pre-finetuning on large amounts of NLI and classification tasks, the model preserves its ability to generalise to other classes;
16 | * ⚙️ **Support for a variety of cross-encoder realisations** - LiqFit supports different types of cross-encoders, including conventional, binary, and encoder-decoder architectures;
17 | * ⚖️ **Robust to unbalanced datasets** - LiqFit uses normalisation techniques that allow it to work well even on unbalanced data;
18 | * 🏷️ **Multi-label classification support** - The approach can be applied to both multi-class and multi-label classification;
19 |
20 | Limitations:
21 | * 🤔 The transformer's forward pass must be run N times per input, where N is the number of labels;
22 |
23 |
24 | ## Installation
25 |
26 | Download and install `LiqFit` by running:
27 |
28 | ```bash
29 | pip install liqfit
30 | ```
31 |
32 | For the most up-to-date version, you can build from source code by executing:
33 |
34 | ```bash
35 | pip install git+https://github.com/knowledgator/LiqFit.git
36 | ```
37 |
38 | ## How to use:
39 | Check out more real examples in the `notebooks` section.
40 |
41 | ```python
42 | from liqfit.modeling import LiqFitModel
43 | from liqfit.losses import FocalLoss
44 | from liqfit.collators import NLICollator
45 | from transformers import AutoModelForSequenceClassification, AutoTokenizer, TrainingArguments, Trainer
46 |
47 | backbone_model = AutoModelForSequenceClassification.from_pretrained('microsoft/deberta-v3-xsmall')
48 | tokenizer = AutoTokenizer.from_pretrained('microsoft/deberta-v3-xsmall')
49 | loss_func = FocalLoss(multi_target=True)
50 |
51 | model = LiqFitModel(backbone_model.config, backbone_model, loss_func=loss_func)
52 |
53 | data_collator = NLICollator(tokenizer, max_length=128, padding=True, truncation=True)
54 |
55 |
56 | training_args = TrainingArguments(
57 |     output_dir='comprehendo',
58 |     learning_rate=3e-5,
59 |     per_device_train_batch_size=3,
60 |     per_device_eval_batch_size=3,
61 |     num_train_epochs=9,
62 |     weight_decay=0.01,
63 |     evaluation_strategy="epoch",
64 |     save_steps=5000,
65 |     save_total_limit=3,
66 |     remove_unused_columns=False,
67 | )
68 |
69 | trainer = Trainer(
70 |     model=model,
71 |     args=training_args,
72 |     train_dataset=nli_train_dataset,
73 |     eval_dataset=nli_test_dataset,
74 |     tokenizer=tokenizer,
75 |     data_collator=data_collator,
76 | )
77 |
78 | trainer.train()
79 | ```
80 | Please check more examples in the `notebooks` section.
81 |
82 | ...
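The training example above assumes that `nli_train_dataset` and `nli_test_dataset` already exist. Below is a minimal sketch of how they could be prepared with `liqfit.datasets.NLIDataset`; the dataset name (`dair-ai/emotion`), the class list, the few-shot subsampling, and the variable names `classes`/`test_dataset` are illustrative assumptions, not part of the library itself.

```python
# Illustrative sketch: build NLI-style training data for LiqFit.
# Assumption: the `dair-ai/emotion` dataset from the Hugging Face Hub and its
# six emotion labels; adapt the dataset name and columns to your own task.
from datasets import load_dataset
from liqfit.datasets import NLIDataset

classes = ["sadness", "joy", "love", "anger", "fear", "surprise"]

emotion = load_dataset("dair-ai/emotion")
test_dataset = emotion["test"]

# Small random subset for few-shot training (~8 examples per label on average,
# not stratified per class).
few_shot_train = emotion["train"].shuffle(seed=42).select(range(8 * len(classes)))

# Convert the classification examples into (text, hypothesis, label) pairs using
# the zero-shot template; these objects can be passed directly to the Trainer above.
nli_train_dataset = NLIDataset.load_dataset(
    dataset=few_shot_train,
    classes=classes,
    text_column="text",
    label_column="label",
    template="This example is {}.",
)
nli_test_dataset = NLIDataset.load_dataset(
    dataset=test_dataset,
    classes=classes,
    text_column="text",
    label_column="label",
)
```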
83 |
84 | To run inference, we recommend using the `ZeroShotClassificationPipeline`:
85 |
86 | ```python
87 | from liqfit import ZeroShotClassificationPipeline
88 | from sklearn.metrics import classification_report
89 | from tqdm import tqdm
90 |
91 | classifier = ZeroShotClassificationPipeline(model=model, tokenizer=tokenizer)
92 |
93 |
94 | label2idx = {label: i for i, label in enumerate(classes)}
95 |
96 | preds = []
97 | idx = 0  # fallback prediction index reused for examples with empty text
98 | for example in tqdm(test_dataset):
99 |     if not example['text']:
100 |         preds.append(idx)
101 |         continue
102 |     pred = classifier(example['text'], classes)['labels'][0]
103 |     idx = label2idx[pred]
104 |     preds.append(idx)
105 |
106 | print(classification_report(test_dataset['label'][:len(preds)], preds, target_names=classes, digits=4))
107 | ```
108 |
109 | ## Benchmarks:
110 | | Model & examples per label | Emotion | AgNews | SST5 |
111 | |-|-|-|-|
112 | | Comprehend-it/0 | 56.60 | 79.82 | 37.9 |
113 | | Comprehend-it/8 | 63.38 | 85.9 | 46.67 |
114 | | Comprehend-it/64 | 80.7 | 88 | 47 |
115 | | SetFit/0 | 57.54 | 56.36 | 24.11 |
116 | | SetFit/8 | 56.81 | 64.93 | 33.61 |
117 | | SetFit/64 | 79.03 | 88 | 45.38 |
118 |
119 | LiqFit used the [knowledgator/comprehend_it-base model](https://huggingface.co/knowledgator/comprehend_it-base), while for [SetFit](https://github.com/huggingface/setfit) we utilized [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5).
120 |
--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------
1 | [build-system]
2 | requires = ["hatchling<=1.18.0"]
3 | build-backend = "hatchling.build"
4 |
5 | [project]
6 | name = "liqfit"
7 | version = "1.0.0"
8 |
9 | requires-python = ">=3.7"
10 |
11 | description = "Flexible Few-shot learning tool."
12 | license = "MIT" 13 | long_description = "file: README.md" 14 | 15 | classifiers = [ 16 | "Programming Language :: Python :: 3", 17 | "License :: OSI Approved :: MIT License", 18 | "Operating System :: OS Independent", 19 | ] 20 | 21 | dependencies = [ 22 | "kornia", 23 | "transformers", 24 | "accelerate", 25 | ] 26 | 27 | 28 | [options] 29 | packages = "./src/liqfit" 30 | zip_safe = "True" 31 | 32 | 33 | [tool.black] 34 | line-length = 80 35 | target-version = ['py37'] -------------------------------------------------------------------------------- /setup.cfg: -------------------------------------------------------------------------------- 1 | [flake8] 2 | per-file-ignores = __init__.py:F401 3 | -------------------------------------------------------------------------------- /src/liqfit/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Knowledgator/LiqFit/51ba2714813ae1cf110f7e600cd7f2663cdec39c/src/liqfit/__init__.py -------------------------------------------------------------------------------- /src/liqfit/collators/__init__.py: -------------------------------------------------------------------------------- 1 | from .base_collator import Collator 2 | from .nli_collator import NLICollator 3 | -------------------------------------------------------------------------------- /src/liqfit/collators/base_collator.py: -------------------------------------------------------------------------------- 1 | from collections import defaultdict 2 | import abc 3 | from typing import Union 4 | 5 | 6 | class Collator(abc.ABC): 7 | def __init__( 8 | self, 9 | tokenizer, 10 | max_length: int, 11 | padding: Union[bool, str], 12 | truncation: bool, 13 | ): 14 | self.tokenizer = tokenizer 15 | self.max_length = max_length 16 | self.padding = padding 17 | self.truncation = truncation 18 | 19 | @abc.abstractmethod 20 | def collate(self, batch): 21 | raise NotImplementedError("Should be implemented in a subclass.") 22 | 23 | def __call__(self, batch): 24 | grouped_batch = defaultdict(list) 25 | for example in batch: 26 | for k, v in example.items(): 27 | grouped_batch[k].append(v) 28 | output = self.collate(grouped_batch) 29 | return output 30 | -------------------------------------------------------------------------------- /src/liqfit/collators/nli_collator.py: -------------------------------------------------------------------------------- 1 | from typing import Callable 2 | import torch 3 | 4 | from .base_collator import Collator 5 | from typing import Union 6 | 7 | 8 | class NLICollator(Collator): 9 | def __init__( 10 | self, 11 | tokenizer: Callable, 12 | max_length: int, 13 | padding: Union[bool, str], 14 | truncation: bool, 15 | ): 16 | super().__init__( 17 | tokenizer, 18 | max_length=max_length, 19 | padding=padding, 20 | truncation=truncation, 21 | ) 22 | 23 | def _tokenize_and_align_labels(self, batch): 24 | texts = batch.get("texts", None) 25 | if texts is None: 26 | raise ValueError( 27 | "Expected to find a key with name 'texts' that " 28 | "contains a list of tuples where each tuple " 29 | "contains the hypothesis and the premise. 
" 30 | f"Received: {batch.keys()}" 31 | ) 32 | tokenized_input = self.tokenizer( 33 | texts, 34 | max_length=self.max_length, 35 | padding=self.padding, 36 | truncation=self.truncation, 37 | return_tensors="pt", 38 | ) 39 | labels = torch.tensor(batch["labels"]) 40 | tokenized_input.update({"labels": labels}) 41 | return tokenized_input 42 | 43 | def collate(self, batch): 44 | tokenized_input = self._tokenize_and_align_labels(batch) 45 | return tokenized_input 46 | -------------------------------------------------------------------------------- /src/liqfit/datasets/__init__.py: -------------------------------------------------------------------------------- 1 | from .nli_dataset import NLIDataset 2 | from .transform import transform_dataset 3 | -------------------------------------------------------------------------------- /src/liqfit/datasets/nli_dataset.py: -------------------------------------------------------------------------------- 1 | from __future__ import annotations 2 | 3 | from typing import Optional, List 4 | from datasets import Dataset, load_dataset 5 | 6 | from .transform import transform_dataset 7 | 8 | 9 | class NLIDataset: 10 | def __init__(self, hypothesis: List, premises: List, labels: List): 11 | """LiqFitDataset used for NLI training. 12 | 13 | Args: 14 | hypothesis (List): List of hypothesis texts. 15 | premises (List): List of premises texts. 16 | labels (List): List of labels for each example. 17 | """ 18 | self.hypothesis = hypothesis 19 | self.premises = premises 20 | self.labels = labels 21 | 22 | def __len__(self): 23 | equal_lengths = ( 24 | len(self.hypothesis) == len(self.premises) == len(self.labels) 25 | ) 26 | if not equal_lengths: 27 | raise ValueError( 28 | "Expected equal lengths between `self.hypothesis`" 29 | ", `self.premises` and `self.labels`. " 30 | f"Received: {len(self.hypothesis)} " 31 | f"- {len(self.premises)} - {len(self.labels)}." 32 | ) 33 | return len(self.hypothesis) 34 | 35 | def __getitem__(self, idx): 36 | return { 37 | "texts": (self.hypothesis[idx], self.premises[idx]), 38 | "labels": self.labels[idx], 39 | } 40 | 41 | @classmethod 42 | def load_dataset( 43 | cls, 44 | dataset: Optional[Dataset] = None, 45 | dataset_name: Optional[str] = None, 46 | classes: Optional[List[str]] = None, 47 | text_column: Optional[str] = "text", 48 | label_column: Optional[str] = "label", 49 | template: Optional[str] = "This example is {}.", 50 | normalize_negatives: bool = False, 51 | positives: int = 1, 52 | negatives: int = -1, 53 | multi_label: bool = False, 54 | ) -> NLIDataset: 55 | """Returns a `NLIDataset` instance. 56 | 57 | Args: 58 | dataset (Optional[Dataset], optional): Instance of Huggingface 59 | Dataset class. Defaults to None. 60 | dataset_name (Optional[str], optional): Dataset name to load from 61 | Huggingface datasets. Defaults to None. 62 | classes (Optional[List[str]], optional): List of classes. 63 | Defaults to None. 64 | text_column (Optional[str], optional): Text column name. 65 | Defaults to 'text'. 66 | label_column (Optional[str], optional): Label column name. 67 | Defaults to 'label'. 68 | template (Optional[str], optional): Template string that will be 69 | used for Zero-Shot training/prediction. Defaults to 70 | 'This example is {}.'. 71 | normalize_negatives (bool, optional): Whether to normalize amount 72 | of negative examples per each positive example of a class. 73 | Defaults to False. 74 | positives (int, optional): Number of positive examples to generate 75 | per source. Defaults to 1. 
76 | negatives (int, optional): Number of negative examples to generate 77 | per source. Defaults to -1. 78 | multi_label (bool, optional): Whether each example has multiple 79 | labels or not. Defaults to False. 80 | 81 | Raises: 82 | TypeError: if `dataset_name` is `None` while `dataset` instance is 83 | not passed. 84 | TypeError: if `label_name` is `None`. 85 | TypeError: if `text_column` is `None` while `dataset` instance is 86 | not passed. 87 | TypeError: if `label_column` is `None` while `classes` is `None`. 88 | 89 | Returns: 90 | LiqFitDataset: An instance of LiqFitDataset. 91 | """ 92 | if dataset is None: 93 | if dataset_name is None: 94 | raise TypeError( 95 | "If dataset object is not provided you need to" 96 | " specify dataset_name." 97 | ) 98 | else: 99 | dataset = load_dataset(dataset_name)["train"] 100 | 101 | if label_column not in dataset.features: 102 | raise TypeError(f"Expected to find {label_column} in the dataset.") 103 | 104 | if text_column not in dataset.features: 105 | raise TypeError(f"Expected to find {text_column} in the dataset.") 106 | 107 | if classes is None: 108 | raise ValueError( 109 | f"Expected to have a list classes. Received: {classes}." 110 | ) 111 | 112 | processed_data = transform_dataset( 113 | dataset, 114 | classes, 115 | text_column, 116 | label_column, 117 | template, 118 | normalize_negatives, 119 | positives, 120 | negatives, 121 | multi_label, 122 | ) 123 | 124 | return cls( 125 | processed_data["sources"], 126 | processed_data["targets"], 127 | processed_data["labels"], 128 | ) 129 | -------------------------------------------------------------------------------- /src/liqfit/datasets/transform.py: -------------------------------------------------------------------------------- 1 | from typing import List, Tuple, Optional 2 | from collections import defaultdict 3 | from datasets import Dataset 4 | import numpy as np 5 | import random 6 | 7 | 8 | def get_labels_stat(labels: List[str]) -> Tuple[List[str], List[float]]: 9 | """Calculates the number of occurrences and probability of each unique 10 | label in the provided list of labels. 11 | 12 | Args: 13 | labels (List[str]): List of label strings 14 | 15 | Returns: 16 | unique_labels (List[str]): Unique label values 17 | probs (List[float]): Probability of each label 18 | """ 19 | # count occurrences of each label 20 | label_counts = defaultdict(int) 21 | for label in labels: 22 | label_counts[label] += 1 23 | 24 | # calculate probabilities 25 | count = len(labels) 26 | label_probs = { 27 | label: label_count / count 28 | for label, label_count in label_counts.items() 29 | } 30 | 31 | # extract labels and probabilities 32 | unique_labels = list(label_probs.keys()) 33 | probs = list(label_probs.values()) 34 | 35 | return unique_labels, probs 36 | 37 | 38 | def transform_dataset( 39 | dataset: Dataset, 40 | classes: List[str], 41 | text_column: Optional[str] = "text", 42 | label_column: Optional[str] = "label", 43 | template: Optional[str] = "This example is {}.", 44 | normalize_negatives: bool = False, 45 | positives: int = 1, 46 | negatives: int = -1, 47 | multi_label: bool = False, 48 | ) -> Dataset: 49 | """Transform a dataset into a format suitable for training. 50 | 51 | Args: 52 | dataset (Dataset): Input dataset. 53 | classes (List[str]): List of possible class labels. 54 | template (str, optional): Template string for generating examples. 55 | normalize_negatives (bool, optional): Whether to normalize amount of 56 | negative examples per each positive example of a class. 
57 | positives (int, optional): Number of positive examples to generate per source. 58 | negatives (int, optional): Number of negative examples to generate per source. 59 | 60 | 61 | Returns: 62 | Dataset: Transformed dataset. 63 | 64 | This function transforms the input dataset into a format suitable for 65 | multi-label discriminative training. For each source text, it generates 66 | positive examples using the provided labels, and negative examples by 67 | sampling random incorrect labels. 68 | """ 69 | new_dataset = {"sources": [], "targets": [], "labels": []} 70 | 71 | texts = dataset[text_column] 72 | 73 | if label_column == "all_labels": 74 | labels = dataset["all_labels"] 75 | multi_label = True 76 | elif label_column in dataset.features: 77 | labels = dataset[label_column] 78 | if type(labels[0]) == int: 79 | labels = [classes[idx] for idx in labels] 80 | else: 81 | raise NotImplementedError( 82 | 'Dataset should contains "label" or "all_labels" columns' 83 | ) 84 | 85 | if normalize_negatives: 86 | unique_labels, probs = get_labels_stat(labels) 87 | 88 | if positives == -1: 89 | positives = len(classes) - 1 90 | if negatives == -1: 91 | negatives = len(classes) - 1 92 | 93 | for text, label in zip(texts, labels): 94 | if multi_label: 95 | curr_labels = label 96 | else: 97 | curr_labels = [label] 98 | 99 | for label in curr_labels: 100 | for i in range(positives): 101 | new_dataset["sources"].append(text) 102 | new_dataset["targets"].append(template.format(label)) 103 | new_dataset["labels"].append(1) 104 | 105 | for _ in range(len(classes) - 1): 106 | neg_class_ = label 107 | 108 | while neg_class_ in curr_labels: 109 | if normalize_negatives: 110 | neg_class_ = np.random.choice(unique_labels, p=probs) 111 | else: 112 | neg_class_ = random.sample(classes, k=1)[0] 113 | 114 | new_dataset["sources"].append(text) 115 | new_dataset["targets"].append(template.format(neg_class_)) 116 | new_dataset["labels"].append(0) 117 | 118 | return Dataset.from_dict(new_dataset) 119 | -------------------------------------------------------------------------------- /src/liqfit/losses/__init__.py: -------------------------------------------------------------------------------- 1 | from .losses import cross_entropy 2 | from .losses import binary_cross_entropy_with_logits 3 | from .losses import focal_loss_with_mask 4 | from .losses import BinaryCrossEntropyLoss, CrossEntropyLoss, FocalLoss 5 | -------------------------------------------------------------------------------- /src/liqfit/losses/losses.py: -------------------------------------------------------------------------------- 1 | from __future__ import annotations 2 | from typing import Optional 3 | import torch.nn.functional as F 4 | from kornia.losses import focal_loss 5 | import torch 6 | 7 | 8 | def binary_cross_entropy_with_logits(logits: torch.Tensor, 9 | labels: torch.Tensor, 10 | multi_target: bool = False, 11 | weight: Optional[torch.Tensor] = None, 12 | reduction: str = 'mean') -> torch.Tensor: 13 | """Wrapper function for adding support for multi_target training. 14 | 15 | Args: 16 | logits (torch.Tensor): Tensor with shape (B, T, D) where B is batch 17 | size, T is timesteps and D is embedding dimension. 18 | labels (torch.Tensor): Tensor with shape (B, T) where B is batch size, 19 | T is timesteps. 20 | multi_target (bool, optional): Whether the labels are multi target or 21 | one target for the entire sequence. Defaults to False. 
22 | weight (Optional[torch.Tensor], optional): a manual rescaling weight 23 | if provided it's repeated to match input tensor shape. 24 | Defaults to None. 25 | reduction (str, optional): Reduction type that will be applied on the 26 | loss function, supported: 'mean', 'sum' or 'none'. 27 | Defaults to 'mean'. 28 | 29 | Returns: 30 | torch.Tensor: Loss tensor. 31 | """ 32 | if multi_target: 33 | logits = logits.view(-1, logits.shape[-1]) 34 | labels = labels.view(-1) 35 | else: 36 | labels = labels.view(-1) 37 | loss = F.binary_cross_entropy_with_logits(logits, 38 | labels, 39 | weight=weight, 40 | reduction=reduction) 41 | return loss 42 | 43 | 44 | class BinaryCrossEntropyLoss(torch.nn.Module): 45 | 46 | def __init__(self, multi_target=False, weight=None, reduction='mean'): 47 | super().__init__() 48 | """Calculate binary cross-entropy loss with support for multi target training. 49 | 50 | Args: 51 | multi_target (bool, optional): Whether the labels are multi target or 52 | one target for the entire sequence. Defaults to False. 53 | weight (Optional[torch.Tensor], optional): a manual rescaling weight 54 | if provided it's repeated to match input tensor shape. 55 | Defaults to None. 56 | reduction (str, optional): Reduction type that will be applied on the 57 | loss function, supported: 'mean', 'sum' or 'none'. 58 | Defaults to 'mean'. 59 | 60 | Returns: 61 | torch.Tensor: Loss tensor. 62 | Examples: 63 | loss = BinaryCrossEntropyLoss()(logits, targets) 64 | """ 65 | self.multi_target = multi_target 66 | self.weight = weight 67 | self.reduction = reduction 68 | 69 | def forward(self, logits, target): 70 | 71 | loss = binary_cross_entropy_with_logits( 72 | logits, 73 | target, 74 | multi_target=self.multi_target, 75 | weight=self.weight, 76 | reduction=self.reduction, 77 | ) 78 | 79 | return loss 80 | 81 | 82 | def cross_entropy(logits: torch.Tensor, 83 | labels: torch.Tensor, 84 | multi_target: bool = False, 85 | weight: Optional[torch.Tensor] = None, 86 | ignore_index: int = -100, 87 | reduction: str = 'mean', 88 | label_smoothing: float = 0.0): 89 | """Wrapper function for adding support for multi_target training. 90 | 91 | Args: 92 | logits (torch.Tensor): Tensor with shape (B, T, D) where B is batch 93 | size, T is timesteps and D is embedding dimension. 94 | labels (torch.Tensor): Tensor with shape (B, T) where B is batch size, 95 | T is timesteps. 96 | multi_target (bool, optional): Whether the labels are multi target or 97 | one target for the entire sequence. Defaults to False. 98 | weight (Optional[torch.Tensor], optional): a manual rescaling weight 99 | if provided it's repeated to match input tensor shape. 100 | Defaults to None. 101 | ignore_index (int, optional): Index value that will be ignored during 102 | loss calculation. Defaults to -100. 103 | reduction (str, optional): Reduction type that will be applied on the 104 | loss function, supported: 'mean', 'sum' or 'none'. 105 | Defaults to 'mean'. 106 | label_smoothing (float, optional): A float in [0.0, 1.0]. Specifies 107 | the amount of smoothing when computing the loss, where 0.0 means 108 | no smoothing. Defaults to 0.0. 109 | 110 | Returns: 111 | torch.Tensor: Loss tensor. 
112 | """ 113 | if multi_target: 114 | logits = logits.view(-1, logits.shape[-1]) 115 | labels = labels.view(-1) 116 | else: 117 | labels = labels.view(-1) 118 | loss = F.cross_entropy(logits, 119 | labels, 120 | weight=weight, 121 | reduction=reduction, 122 | ignore_index=ignore_index, 123 | label_smoothing=label_smoothing) 124 | return loss 125 | 126 | 127 | class CrossEntropyLoss(torch.nn.Module): 128 | 129 | def __init__(self, multi_target=False, weight=None, ignore_index=-100, reduction='mean', label_smoothing=0.0): 130 | super().__init__() 131 | """Calculate cross-entropy loss while ignoring specified target labels. 132 | 133 | Args: 134 | multi_target (bool, optional): Whether the labels are multi target or 135 | one target for the entire sequence. Defaults to False. 136 | weight (Optional[torch.Tensor], optional): a manual rescaling weight 137 | if provided it's repeated to match input tensor shape. 138 | Defaults to None. 139 | ignore_index (int, optional): Index value that will be ignored during 140 | loss calculation. Defaults to -100. 141 | reduction (str, optional): Reduction type that will be applied on the 142 | loss function, supported: 'mean', 'sum' or 'none'. 143 | Defaults to 'mean'. 144 | label_smoothing (float, optional): A float in [0.0, 1.0]. Specifies 145 | the amount of smoothing when computing the loss, where 0.0 means 146 | no smoothing. Defaults to 0.0. 147 | 148 | Returns: 149 | torch.Tensor: Loss tensor. 150 | Examples: 151 | loss = CrossEntropyLoss()(logits, targets) 152 | """ 153 | self.multi_target = multi_target 154 | self.weight = weight 155 | self.ignore_index = ignore_index 156 | self.reduction = reduction 157 | self.label_smoothing = label_smoothing 158 | 159 | def forward(self, logits, target): 160 | 161 | loss = cross_entropy( 162 | logits, 163 | target, 164 | multi_target=self.multi_target, 165 | weight=self.weight, 166 | ignore_index=self.ignore_index, 167 | reduction=self.reduction, 168 | label_smoothing=self.label_smoothing 169 | ) 170 | 171 | return loss 172 | 173 | 174 | def focal_loss_with_mask( 175 | logits: torch.Tensor, 176 | target: torch.Tensor, 177 | ignore_index: int = -100, 178 | alpha: float = 0.5, 179 | gamma: float = 2.0, 180 | reduction: str | None = "mean", 181 | ) -> torch.Tensor: 182 | """Calculate focal loss while ignoring specified target labels. 183 | 184 | Args: 185 | logits (torch.Tensor): Model predictions. 186 | target (torch.Tensor): True labels. 187 | ignore_index (int): Label to ignore from loss calculation. 188 | alpha (float): Focal loss alpha parameter. 189 | gamma (float): Focal loss gamma parameter. 190 | reduction (str | None): Method to reduce loss. 191 | 192 | Returns: 193 | torch.Tensor: Loss tensor. 194 | 195 | This function calculates the focal loss between logits and targets, 196 | while ignoring any examples where the target is equal to ignore_index. 197 | 198 | Examples: 199 | 200 | loss = focal_loss_with_mask(logits, targets, ignore_index=-100) 201 | """ 202 | if not isinstance(ignore_index, int): 203 | raise ValueError('Expected `ignore_index` to be of type `int`. ' 204 | f'Received: {type(ignore_index)}') 205 | 206 | mask = target == ignore_index 207 | 208 | # To make focal_loss function work because 209 | # it cannot work with -ve numbers (e.g. -100). 
210 | if ignore_index != 0: 211 | target_without_ignore_index = target.masked_fill(mask, 0) 212 | 213 | loss = focal_loss( 214 | pred=logits, 215 | target=target_without_ignore_index, 216 | alpha=alpha, 217 | gamma=gamma, 218 | reduction="none", 219 | ) 220 | 221 | loss = loss.masked_fill(mask.view(-1, 1), torch.inf) 222 | 223 | if reduction == "mean": 224 | return loss[loss != torch.inf].mean() 225 | elif reduction == "sum": 226 | return loss[loss != torch.inf].sum() 227 | elif reduction is None: 228 | return loss 229 | else: 230 | raise ValueError( 231 | 'Expected reduction to be "sum", "mean" or `None`. ' 232 | f"Received: {reduction}." 233 | ) 234 | 235 | class FocalLoss(torch.nn.Module): 236 | def __init__( 237 | self, 238 | ignore_index: int = -100, 239 | alpha: float = 0.5, 240 | gamma: float = 2.0, 241 | reduction: str = "mean", 242 | ): 243 | """Calculate focal loss while ignoring specified target labels. 244 | Args: 245 | logits (torch.Tensor): Model predictions. 246 | target (torch.Tensor): True labels. 247 | ignore_index (int): Label to ignore from loss calculation. 248 | alpha: Weighting factor that ranges between [0, 1]`. 249 | gamma: Focusing parameter gamma >= 0`. 250 | reduction (str | None): Reduction type for loss reduction. 251 | Supported: 'mean', 'sum' or 'none'. Defaults to 'mean' 252 | 253 | Returns: 254 | torch.Tensor: Loss tensor. 255 | Examples: 256 | loss = FocalLoss()(logits, targets) 257 | """ 258 | super().__init__() 259 | self.ignore_index = ignore_index 260 | self.alpha = alpha 261 | self.gamma = gamma 262 | self.reduction = reduction 263 | 264 | def forward(self, logits: torch.Tensor, target: torch.Tensor): 265 | return focal_loss_with_mask( 266 | logits=logits, 267 | target=target, 268 | ignore_index=self.ignore_index, 269 | alpha=self.alpha, 270 | gamma=self.gamma, 271 | reduction=self.reduction, 272 | ) 273 | -------------------------------------------------------------------------------- /src/liqfit/modeling/__init__.py: -------------------------------------------------------------------------------- 1 | from .heads import LiqFitHead 2 | from .heads import LabelClassificationHead 3 | from .heads import ClassClassificationHead 4 | from .heads import ClassificationHead 5 | from .model import LiqFitModel 6 | from .backbone import LiqFitBackbone 7 | from .heads import HeadOutput 8 | -------------------------------------------------------------------------------- /src/liqfit/modeling/backbone.py: -------------------------------------------------------------------------------- 1 | from __future__ import annotations 2 | import abc 3 | 4 | import torch 5 | from torch import nn 6 | from transformers import PreTrainedModel, PretrainedConfig 7 | 8 | 9 | class LiqFitBackbone(PreTrainedModel, abc.ABC): 10 | def __init__( 11 | self, config: PretrainedConfig, backbone: nn.Module, push_backbone_only: bool = False 12 | ) -> None: 13 | """Backbone model wrapper.""" 14 | super().__init__(config=config) 15 | self.push_backbone_only = push_backbone_only 16 | self.backbone = backbone 17 | 18 | def push_to_hub( 19 | self, 20 | repo_id: str, 21 | use_temp_dir: bool | None = None, 22 | commit_message: str | None = None, 23 | private: bool | None = None, 24 | token: bool | str | None = None, 25 | max_shard_size: int | str | None = "5GB", 26 | create_pr: bool = False, 27 | safe_serialization: bool = True, 28 | revision: str = None, 29 | commit_description: str = None, 30 | **deprecated_kwargs, 31 | ) -> str: 32 | if self.push_backbone_only: 33 | output = 
self.backbone.push_to_hub( 34 | repo_id=repo_id, 35 | use_temp_dir=use_temp_dir, 36 | commit_message=commit_message, 37 | private=private, 38 | token=token, 39 | max_shard_size=max_shard_size, 40 | create_pr=create_pr, 41 | safe_serialization=safe_serialization, 42 | revision=revision, 43 | commit_description=commit_description, 44 | **deprecated_kwargs, 45 | ) 46 | else: 47 | output = super().push_to_hub( 48 | repo_id=repo_id, 49 | use_temp_dir=use_temp_dir, 50 | commit_message=commit_message, 51 | private=private, 52 | token=token, 53 | max_shard_size=max_shard_size, 54 | create_pr=create_pr, 55 | safe_serialization=safe_serialization, 56 | revision=revision, 57 | commit_description=commit_description, 58 | **deprecated_kwargs, 59 | ) 60 | return output 61 | 62 | @abc.abstractmethod 63 | def encode(self, input_ids, attention_mask=None) -> torch.Tensor: 64 | raise NotImplementedError("Should be implemented in a subclass.") 65 | 66 | -------------------------------------------------------------------------------- /src/liqfit/modeling/heads.py: -------------------------------------------------------------------------------- 1 | import abc 2 | from typing import Optional 3 | 4 | import torch 5 | from torch import nn 6 | from dataclasses import dataclass 7 | from transformers.modeling_outputs import ModelOutput 8 | 9 | from ..losses import binary_cross_entropy_with_logits, cross_entropy 10 | 11 | class LiqFitHead(nn.Module, abc.ABC): 12 | def __init__(self, *args, **kwargs) -> None: 13 | """LiqFitHead base class.""" 14 | super().__init__(*args, **kwargs) 15 | 16 | @abc.abstractmethod 17 | def compute_loss(self, logits, labels) -> torch.Tensor: 18 | raise NotImplementedError("Should be implemented in a subclass.") 19 | 20 | @staticmethod 21 | def init_weight(module): 22 | if isinstance(module, nn.Linear): 23 | nn.init.xavier_uniform_(module.weight) 24 | if module.bias is not None: 25 | nn.init.constant_(module.bias, 1e-2) 26 | 27 | @abc.abstractmethod 28 | def forward( 29 | self, embeddings: torch.Tensor, labels: Optional[torch.Tensor] = None 30 | ): 31 | pass 32 | 33 | @dataclass 34 | class HeadOutput(ModelOutput): 35 | embeddings: Optional[torch.Tensor] = None 36 | logits: Optional[torch.Tensor] = None 37 | loss: Optional[torch.Tensor] = None 38 | 39 | 40 | class LabelClassificationHead(LiqFitHead): 41 | def __init__( 42 | self, 43 | in_features: int, 44 | out_features: int, 45 | multi_target: bool, 46 | bias: bool = True, 47 | temperature: int = 1.0, 48 | eps: float = 1e-5, 49 | ): 50 | """Label Classification Head class for Binary or Multi-label tasks. 51 | 52 | Args: 53 | in_features (_type_): Number of input features. 54 | out_features (_type_): Number of output features. 55 | multi_target (_type_): Whether this class is for multi-target 56 | task or not. 57 | bias (bool, optional): Whether to add bias to the `Linear` 58 | layer or not. Defaults to True. 59 | temperature (int, optional): Temperature that will be used 60 | to calibrate the head to the task. Defaults to 1.0. 61 | eps (float, optional): Epsilon value for numirical stability. 62 | Defaults to 1e-5. 
63 | """ 64 | super().__init__() 65 | self.temperature = temperature 66 | self.eps = eps 67 | self.multi_target = multi_target 68 | self.linear = nn.Linear(in_features, out_features, bias=bias) 69 | LiqFitHead.init_weight(self.linear) 70 | 71 | def compute_loss(self, logits: torch.Tensor, labels: torch.Tensor): 72 | loss = binary_cross_entropy_with_logits( 73 | logits, labels, self.multi_target 74 | ) 75 | return loss 76 | 77 | def forward( 78 | self, embeddings: torch.Tensor, labels: Optional[torch.Tensor] = None 79 | ) -> torch.Tensor: 80 | logits = self.linear(embeddings) 81 | logits /= self.temperature + self.eps 82 | if labels is not None: 83 | loss = self.compute_loss(logits, labels) 84 | else: 85 | loss = None 86 | return HeadOutput(embeddings=embeddings, logits=logits, loss=loss) 87 | 88 | 89 | class ClassClassificationHead(LiqFitHead): 90 | def __init__( 91 | self, 92 | in_features: int, 93 | out_features: int, 94 | multi_target: bool, 95 | bias: bool = True, 96 | temperature: int = 1.0, 97 | eps: float = 1e-5, 98 | ignore_index: int = -100, 99 | ): 100 | """Class Classification Head class for Sequence/Token classification 101 | tasks. 102 | 103 | Args: 104 | in_features (int): Number of input features. 105 | out_features (int): Number of output features. 106 | multi_target (bool): Whether this class is for multi-target task 107 | or not. 108 | bias (bool, optional): Whether to add bias to the `Linear` 109 | layer or not. Defaults to True. 110 | temperature (int, optional): Temperature that will be used 111 | to calibrate the head to the task. Defaults to 1.0. 112 | eps (float, optional): Epsilon value for numirical stability. 113 | Defaults to 1e-5. 114 | ignore_index (int, optional): Index that will be ignore in 115 | case of token classification tasks. Defaults to -100. 116 | """ 117 | super().__init__() 118 | self.temperature = temperature 119 | self.eps = eps 120 | self.multi_target = multi_target 121 | self.ignore_index = ignore_index 122 | self.linear = nn.Linear(in_features, out_features, bias=bias) 123 | LiqFitHead.init_weight(self.linear) 124 | 125 | def compute_loss(self, logits: torch.Tensor, labels: torch.Tensor): 126 | return cross_entropy( 127 | logits, labels, self.multi_target, ignore_index=self.ignore_index 128 | ) 129 | 130 | def forward( 131 | self, embeddings: torch.Tensor, labels: Optional[torch.Tensor] = None 132 | ) -> torch.Tensor: 133 | logits = self.linear(embeddings) / (self.temperature + self.eps) 134 | if labels is not None: 135 | loss = self.compute_loss(logits, labels) 136 | else: 137 | loss = None 138 | return HeadOutput(embeddings=embeddings, logits=logits, loss=loss) 139 | 140 | 141 | class ClassificationHead(LiqFitHead): 142 | def __init__( 143 | self, 144 | in_features: int, 145 | out_features: int, 146 | pooler: nn.Module, 147 | loss_func: nn.Module, 148 | bias: bool = True, 149 | temperature: int = 1.0, 150 | eps: float = 1e-5, 151 | ): 152 | """Class Classification Head class for Sequence/Token classification 153 | tasks. 154 | 155 | Args: 156 | in_features (int): Number of input features. 157 | out_features (int): Number of output features. 158 | pooler (torch.nn.Module): Module that applier various pooling opperation on the outputs of a model . 159 | loss_func (torch.nn.Module): loss function object. 160 | out_features (int): Number of output features. 161 | bias (bool, optional): Whether to add bias to the `Linear` 162 | layer or not. Defaults to True. 
163 | temperature (int, optional): Temperature that will be used 164 | to calibrate the head to the task. Defaults to 1.0. 165 | eps (float, optional): Epsilon value for numirical stability. 166 | Defaults to 1e-5. 167 | ignore_index (int, optional): Index that will be ignore in 168 | case of token classification tasks. Defaults to -100. 169 | """ 170 | super().__init__() 171 | self.temperature = temperature 172 | self.eps = eps 173 | self.pooler = pooler 174 | self.loss_func = loss_func 175 | self.linear = nn.Linear(in_features, out_features, bias=bias) 176 | LiqFitHead.init_weight(self.linear) 177 | 178 | def compute_loss(self, logits: torch.Tensor, labels: torch.Tensor): 179 | return self.loss_func( 180 | logits, labels 181 | ) 182 | 183 | def forward( 184 | self, embeddings: torch.Tensor, labels: Optional[torch.Tensor] = None 185 | ) -> torch.Tensor: 186 | pooled_input = self.pooler(embeddings) 187 | logits = self.linear(pooled_input) / (self.temperature + self.eps) 188 | if labels is not None: 189 | loss = self.compute_loss(logits, labels) 190 | else: 191 | loss = None 192 | return HeadOutput(embeddings=pooled_input, logits=logits, loss=loss) 193 | -------------------------------------------------------------------------------- /src/liqfit/modeling/model.py: -------------------------------------------------------------------------------- 1 | from __future__ import annotations 2 | 3 | from typing import Optional 4 | 5 | import inspect 6 | import torch 7 | from torch import nn 8 | import torch.nn.functional as F 9 | from sklearn.linear_model import LogisticRegression 10 | from transformers import PreTrainedModel, PretrainedConfig 11 | 12 | from .backbone import LiqFitBackbone 13 | from .heads import LiqFitHead, HeadOutput 14 | from ..utils.standardization import convert_to_numpy 15 | 16 | class LiqFitModel(PreTrainedModel): 17 | def __init__( 18 | self, 19 | config: PretrainedConfig, 20 | backbone: LiqFitBackbone | nn.Module | PreTrainedModel, 21 | head: Optional[LiqFitHead | LogisticRegression] = None, 22 | loss_func: Optional[nn.Module] = None, 23 | normalize_backbone_embeddings: bool = False, 24 | labels_name: str = "labels", 25 | push_backbone_only: bool = False, 26 | ): 27 | """Model container that groups the backbone and head together 28 | and applies forward on both of them. 29 | 30 | Args: 31 | backbone (LiqFitBackbone): Backbone model. 32 | head (Optional[LiqFitHead | LogisticRegression], optional): 33 | Head that is defined for the task. Could be set to `None` 34 | if the head is already attached to the backbone. 35 | Defaults to None. 36 | loss_func (Optional[nn.Module]): class for calculation of loss functions. 37 | normalize_backbone_embeddings (bool, optional): Whether to 38 | normalize the backbone embeddings or not (Requires the 39 | backbone output to be a `torch.Tensor` not a Huggingface 40 | object). Defaults to False. 41 | labels_name (str, optional): Labels name that will be sent in the 42 | **kwargs for loss calculation. Defaults to "labels". 43 | 44 | Example 1: 45 | # make sure that the output from this model 46 | # is a torch.Tensor otherwise wrap it using LiqFitBackbone. 47 | my_backbone = AutoModel.from_pretrained(....) 48 | head = LiqFit.modeling.LabelClassificationHead(...) 49 | model = LiqFitModel(my_backbone.config, my_backbone, head) 50 | 51 | Example 2: 52 | class MyBackbone(LiqFitBackbone): 53 | def __init__(self): 54 | my_backbone = AutoModel.from_pretrained(....) 
55 | super().__init__(my_backbone.config, backbone=backbone) 56 | def encode(self, input_ids, attention_mask=None) -> torch.Tensor: 57 | output = self.backbone(input_ids, attention_mask=attention_mask) 58 | return output 59 | 60 | my_backbone = MyBackbone() 61 | head = LiqFit.modeling.LabelClassificationHead(...) 62 | model = LiqFitModel(my_backbone.config, my_backbone, head) 63 | """ 64 | 65 | super().__init__(config=config) 66 | self._is_sklearn_head = None 67 | self.backbone = backbone 68 | self._determine_and_validate_head_type(head) 69 | self.head = head 70 | self.loss_func = loss_func 71 | self.normalize_backbone_embeddings = normalize_backbone_embeddings 72 | self.labels_name = labels_name 73 | self.push_backbone_only = push_backbone_only 74 | self.expecting_labels = 'labels' in inspect.getfullargspec(self.backbone.forward).args 75 | 76 | def push_to_hub( 77 | self, 78 | repo_id: str, 79 | use_temp_dir: bool | None = None, 80 | commit_message: str | None = None, 81 | private: bool | None = None, 82 | token: bool | str | None = None, 83 | max_shard_size: int | str | None = "5GB", 84 | create_pr: bool = False, 85 | safe_serialization: bool = True, 86 | revision: str = None, 87 | commit_description: str = None, 88 | **deprecated_kwargs, 89 | ) -> str: 90 | if self.push_backbone_only: 91 | if isinstance(self.backbone, (LiqFitBackbone, PreTrainedModel)): 92 | return self.backbone.push_to_hub( 93 | repo_id, 94 | use_temp_dir, 95 | commit_message, 96 | private, 97 | token, 98 | max_shard_size, 99 | create_pr, 100 | safe_serialization, 101 | revision, 102 | commit_description, 103 | **deprecated_kwargs, 104 | ) 105 | else: 106 | output = super().push_to_hub( 107 | repo_id=repo_id, 108 | use_temp_dir=use_temp_dir, 109 | commit_message=commit_message, 110 | private=private, 111 | token=token, 112 | max_shard_size=max_shard_size, 113 | create_pr=create_pr, 114 | safe_serialization=safe_serialization, 115 | revision=revision, 116 | commit_description=commit_description, 117 | **deprecated_kwargs, 118 | ) 119 | return output 120 | 121 | def freeze_weights(self): 122 | self.requires_grad_(False) 123 | 124 | def unfreeze_weights(self): 125 | self.requires_grad_(True) 126 | 127 | def _determine_and_validate_head_type(self, head): 128 | if head is None: 129 | return 130 | 131 | self._is_sklearn_head = isinstance(head, LogisticRegression) 132 | if not self._is_sklearn_head and not isinstance(head, LiqFitHead): 133 | raise TypeError( 134 | "Expected `head` to be of type " 135 | "`LogisticRegression` or `LiqFitHead`. " 136 | f"Received: {type(head)}." 137 | ) 138 | 139 | def _backbone_forward(self, **kwargs): 140 | if isinstance(self.backbone, LiqFitBackbone): 141 | output = self.backbone.encode(**kwargs) 142 | if not isinstance(output, torch.Tensor): 143 | raise ValueError( 144 | "Expected output from backbone model to be of type " 145 | f"`torch.Tensor`. Received: {type(output)}." 
146 | ) 147 | else: 148 | output = self.backbone(**kwargs) 149 | return output 150 | 151 | def _torch_head_forward(self, embeddings, labels=None): 152 | output = self.head(embeddings, labels) 153 | return output 154 | 155 | def _sklearn_head_forward(self, embeddings): 156 | embeddings = convert_to_numpy(embeddings) 157 | output = self.head.predict(embeddings) 158 | return output 159 | 160 | def _head_forward(self, inputs, labels=None): 161 | if self._is_sklearn_head: 162 | return self._sklearn_head_forward(inputs) 163 | else: 164 | return self._torch_head_forward(inputs, labels) 165 | 166 | def forward(self, **kwargs): 167 | labels = kwargs.pop('labels', None) 168 | 169 | output = self._backbone_forward(**kwargs) 170 | 171 | if not isinstance(output, torch.Tensor): 172 | if isinstance(output, tuple): 173 | output = output[0] 174 | elif 'logits' in output: 175 | output = output['logits'] 176 | elif 'last_hidden_state' in output: 177 | output = output['last_hidden_state'] 178 | else: 179 | raise NotImplementedError('A model output should contains logits or last_hidden_state.') 180 | 181 | if self.normalize_backbone_embeddings: 182 | if isinstance(output, torch.Tensor): 183 | output = F.normalize(output, p=2.0, dim=-1) 184 | else: 185 | raise TypeError( 186 | "Normalizing the embedding requires type of " 187 | f"`torch.Tensor`. Received: {type(output)}." 188 | ) 189 | if self.head is not None: 190 | output = self._head_forward(output, labels) 191 | elif self.loss_func is not None and labels is not None: 192 | loss = self.loss_func(output, labels) 193 | output = HeadOutput(logits=output, loss=loss) 194 | return output 195 | -------------------------------------------------------------------------------- /src/liqfit/modeling/pooling.py: -------------------------------------------------------------------------------- 1 | from typing import Optional 2 | 3 | import torch 4 | from torch import nn 5 | 6 | 7 | class GlobalMaxPooling1D(nn.Module): 8 | """Applies Global Max Pooling on the timesteps dimension.""" 9 | 10 | def forward(self, x: torch.Tensor): 11 | return x.amax(dim=1) 12 | 13 | 14 | class FirstTokenPooling1D(nn.Module): 15 | """Takes the first token's embedding.""" 16 | 17 | def forward(self, x: torch.Tensor): 18 | return x[:, 0, :] 19 | 20 | 21 | class LastTokenPooling1D(nn.Module): 22 | """Takes the last token's embedding.""" 23 | 24 | def forward(self, x: torch.Tensor): 25 | return x[:, -1, :] 26 | 27 | 28 | class GlobalAvgPooling1D(nn.Module): 29 | """Applies Global Average Pooling on the timesteps dimension.""" 30 | 31 | def forward( 32 | self, x: torch.Tensor, attention_mask: Optional[torch.Tensor] = None 33 | ): 34 | if attention_mask is not None: 35 | attention_mask = attention_mask.repeat((1, 1, x.shape[-1])).to( 36 | dtype=x.dtype 37 | ) 38 | x = x * attention_mask 39 | return x.sum(1) / attention_mask.sum(1) 40 | else: 41 | return x.mean(dim=1) 42 | 43 | 44 | class GlobalSumPooling1D(nn.Module): 45 | """Applies Global Sum Pooling on the timesteps dimension.""" 46 | 47 | def forward(self, x: torch.Tensor, attention_mask: Optional[torch.Tensor] = None): 48 | if attention_mask is not None: 49 | x = x * attention_mask 50 | return x.sum(dim=1) 51 | 52 | 53 | class GlobalRMSPooling1D(nn.Module): 54 | """Applies Global RMS Pooling on the timesteps dimension.""" 55 | 56 | def forward(self, x: torch.Tensor, attention_mask: Optional[torch.Tensor] = None): 57 | if attention_mask is not None: 58 | attention_mask = attention_mask.repeat((1, 1, x.shape[-1])).to( 59 | dtype=x.dtype 60 | ) 61 
| x = x * attention_mask 62 | return (x.pow(2).sum(dim=1) / attention_mask.sum(1)).sqrt() 63 | else: 64 | return x.pow(2).mean(dim=1).sqrt() 65 | 66 | 67 | class GlobalAbsMaxPooling1D(nn.Module): 68 | """Applies Global Max Pooling of absolute values on the timesteps dimension.""" 69 | 70 | def forward(self, x: torch.Tensor, attention_mask: Optional[torch.Tensor] = None): 71 | if attention_mask is not None: 72 | attention_mask = attention_mask.repeat((1, 1, x.shape[-1])).to( 73 | dtype=x.dtype 74 | ) 75 | x = x * attention_mask 76 | return x.abs().amax(dim=1) 77 | 78 | 79 | class GlobalAbsAvgPooling1D(nn.Module): 80 | """Applies Global Average Pooling of absolute values on the timesteps dimension.""" 81 | 82 | def forward(self, x: torch.Tensor, attention_mask: Optional[torch.Tensor] = None): 83 | if attention_mask is not None: 84 | attention_mask = attention_mask.repeat((1, 1, x.shape[-1])).to( 85 | dtype=x.dtype 86 | ) 87 | x = (x * attention_mask).abs() 88 | return x.sum(dim=1) / attention_mask.sum(1) 89 | else: 90 | return x.abs().mean(dim=1) 91 | -------------------------------------------------------------------------------- /src/liqfit/models/__init__.py: -------------------------------------------------------------------------------- 1 | from .t5 import T5ForZeroShotClassification, T5ConfigWithLoss 2 | from .deberta import DebertaV2ForZeroShotClassification, DebertaConfigWithLoss -------------------------------------------------------------------------------- /src/liqfit/models/deberta.py: -------------------------------------------------------------------------------- 1 | # coding=utf-8 2 | # Copyright 2020, The T5 Authors and HuggingFace Inc. and Knowledagtor 3 | # 4 | # Licensed under the Apache License, Version 2.0 (the "License"); 5 | # you may not use this file except in compliance with the License. 6 | # You may obtain a copy of the License at 7 | # 8 | # http://www.apache.org/licenses/LICENSE-2.0 9 | # 10 | # Unless required by applicable law or agreed to in writing, software 11 | # distributed under the License is distributed on an "AS IS" BASIS, 12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | # See the License for the specific language governing permissions and 14 | # limitations under the License. 15 | 16 | from transformers import DebertaConfig, DebertaV2ForSequenceClassification 17 | from transformers.modeling_outputs import SequenceClassifierOutput 18 | from transformers.utils import add_end_docstrings, logging 19 | from torch.nn import BCEWithLogitsLoss, CrossEntropyLoss, MSELoss 20 | 21 | from typing import Union, Optional, Tuple 22 | import torch 23 | from torch import nn 24 | 25 | from typing import List, Union 26 | 27 | from ..losses import FocalLoss 28 | 29 | logger = logging.get_logger(__name__) 30 | 31 | SUPPORTED_LOSSES = ("focal_loss", "cross_entropy") 32 | 33 | 34 | class DebertaConfigWithLoss(DebertaConfig): 35 | """Deberta configuration with additional loss parameters. 36 | 37 | Extends Deberta to include parameters for configuring the 38 | loss function during training. 
39 | """ 40 | def __init__( 41 | self, 42 | loss_type = "focal_loss", 43 | focal_loss_alpha=0.5, 44 | focal_loss_gamma=2.0, 45 | **kwargs, 46 | ): 47 | super().__init__(**kwargs) 48 | self.loss_type= loss_type 49 | self.focal_loss_alpha = focal_loss_alpha 50 | self.focal_loss_gamma = focal_loss_gamma 51 | 52 | class DebertaV2ForZeroShotClassification(DebertaV2ForSequenceClassification): 53 | def __init__(self, config: DebertaConfigWithLoss): 54 | super().__init__(config) 55 | 56 | if self.config.loss_type not in SUPPORTED_LOSSES: 57 | raise NotImplementedError(f"{self.config.loss_type} is not implemented loss function type. ") 58 | 59 | def forward( 60 | self, 61 | input_ids: Optional[torch.Tensor] = None, 62 | attention_mask: Optional[torch.Tensor] = None, 63 | token_type_ids: Optional[torch.Tensor] = None, 64 | position_ids: Optional[torch.Tensor] = None, 65 | inputs_embeds: Optional[torch.Tensor] = None, 66 | labels: Optional[torch.Tensor] = None, 67 | output_attentions: Optional[bool] = None, 68 | output_hidden_states: Optional[bool] = None, 69 | return_dict: Optional[bool] = None, 70 | ) -> Union[Tuple, SequenceClassifierOutput]: 71 | r""" 72 | labels (`torch.LongTensor` of shape `(batch_size,)`, *optional*): 73 | Labels for computing the sequence classification/regression loss. Indices should be in `[0, ..., 74 | config.num_labels - 1]`. If `config.num_labels == 1` a regression loss is computed (Mean-Square loss), If 75 | `config.num_labels > 1` a classification loss is computed (Cross-Entropy). 76 | """ 77 | return_dict = return_dict if return_dict is not None else self.config.use_return_dict 78 | 79 | outputs = self.deberta( 80 | input_ids, 81 | token_type_ids=token_type_ids, 82 | attention_mask=attention_mask, 83 | position_ids=position_ids, 84 | inputs_embeds=inputs_embeds, 85 | output_attentions=output_attentions, 86 | output_hidden_states=output_hidden_states, 87 | return_dict=return_dict, 88 | ) 89 | 90 | encoder_layer = outputs[0] 91 | pooled_output = self.pooler(encoder_layer) 92 | pooled_output = self.dropout(pooled_output) 93 | logits = self.classifier(pooled_output) 94 | 95 | loss = None 96 | if labels is not None: 97 | if self.config.problem_type is None: 98 | if self.num_labels == 1: 99 | # regression task 100 | loss_fn = nn.MSELoss() 101 | logits = logits.view(-1).to(labels.dtype) 102 | loss = loss_fn(logits, labels.view(-1)) 103 | elif labels.dim() == 1 or labels.size(-1) == 1: 104 | label_index = (labels >= 0).nonzero() 105 | labels = labels.long() 106 | if label_index.size(0) > 0: 107 | labeled_logits = torch.gather( 108 | logits, 0, label_index.expand(label_index.size(0), logits.size(1)) 109 | ) 110 | labels = torch.gather(labels, 0, label_index.view(-1)) 111 | loss_fct = CrossEntropyLoss() 112 | loss = loss_fct(labeled_logits.view(-1, self.num_labels).float(), labels.view(-1)) 113 | else: 114 | loss = torch.tensor(0).to(logits) 115 | else: 116 | log_softmax = nn.LogSoftmax(-1) 117 | loss = -((log_softmax(logits) * labels).sum(-1)).mean() 118 | elif self.config.problem_type == "regression": 119 | loss_fct = MSELoss() 120 | if self.num_labels == 1: 121 | loss = loss_fct(logits.squeeze(), labels.squeeze()) 122 | else: 123 | loss = loss_fct(logits, labels) 124 | elif self.config.problem_type == "single_label_classification": 125 | if self.config.loss_type == "cross_entropy": 126 | loss_fct = CrossEntropyLoss() 127 | elif self.config.loss_type == "focal_loss": 128 | loss_fct = FocalLoss(alpha=self.config.focal_loss_alpha, gamma=self.config.focal_loss_gamma) 129 | loss 
= loss_fct(logits.view(-1, self.config.num_labels), labels.view(-1)) 130 | elif self.config.problem_type == "multi_label_classification": 131 | loss_fct = BCEWithLogitsLoss() 132 | loss = loss_fct(logits, labels) 133 | if not return_dict: 134 | output = (logits,) + outputs[1:] 135 | return ((loss,) + output) if loss is not None else output 136 | 137 | return SequenceClassifierOutput( 138 | loss=loss, logits=logits, hidden_states=outputs.hidden_states, attentions=outputs.attentions 139 | ) -------------------------------------------------------------------------------- /src/liqfit/models/t5.py: -------------------------------------------------------------------------------- 1 | # coding=utf-8 2 | # Copyright 2020, The T5 Authors and HuggingFace Inc. and Knowledagtor 3 | # 4 | # Licensed under the Apache License, Version 2.0 (the "License"); 5 | # you may not use this file except in compliance with the License. 6 | # You may obtain a copy of the License at 7 | # 8 | # http://www.apache.org/licenses/LICENSE-2.0 9 | # 10 | # Unless required by applicable law or agreed to in writing, software 11 | # distributed under the License is distributed on an "AS IS" BASIS, 12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | # See the License for the specific language governing permissions and 14 | # limitations under the License. 15 | 16 | from transformers import T5PreTrainedModel, T5Config, T5Model 17 | from transformers.modeling_outputs import Seq2SeqSequenceClassifierOutput 18 | from transformers.utils import add_end_docstrings, logging 19 | from torch.nn import BCEWithLogitsLoss, CrossEntropyLoss, MSELoss 20 | 21 | from typing import Union, Optional, Tuple 22 | import torch 23 | from torch import nn 24 | 25 | from typing import List, Union 26 | 27 | from ..losses import FocalLoss 28 | 29 | logger = logging.get_logger(__name__) 30 | 31 | SUPPORTED_LOSSES = ("focal_loss", "cross_entropy") 32 | 33 | class T5ConfigWithLoss(T5Config): 34 | """T5 configuration with additional loss parameters. 35 | 36 | Extends T5Config to include parameters for configuring the 37 | loss function during training. 
38 | """ 39 | def __init__( 40 | self, 41 | loss_type = "focal_loss", 42 | focal_loss_alpha=0.5, 43 | focal_loss_gamma=2.0, 44 | **kwargs, 45 | ): 46 | super().__init__(**kwargs) 47 | self.loss_type= loss_type 48 | self.focal_loss_alpha = focal_loss_alpha 49 | self.focal_loss_gamma = focal_loss_gamma 50 | 51 | class T5ClassificationHead(nn.Module): 52 | """Head for sentence-level classification tasks.""" 53 | 54 | def __init__(self, config: T5ConfigWithLoss): 55 | super().__init__() 56 | self.dense = nn.Linear(config.d_model, config.d_model) 57 | self.dropout = nn.Dropout(p=config.classifier_dropout) 58 | self.out_proj = nn.Linear(config.d_model, config.num_labels) 59 | 60 | def forward(self, hidden_states: torch.Tensor) -> torch.Tensor: 61 | hidden_states = self.dropout(hidden_states) 62 | hidden_states = self.dense(hidden_states) 63 | hidden_states = torch.tanh(hidden_states) 64 | hidden_states = self.dropout(hidden_states) 65 | hidden_states = self.out_proj(hidden_states) 66 | return hidden_states 67 | 68 | 69 | class T5ForZeroShotClassification(T5PreTrainedModel): 70 | _keys_to_ignore_on_load_unexpected = ["decoder.block.0.layer.1.EncDecAttention.relative_attention_bias.weight"] 71 | _tied_weights_keys = ["encoder.embed_tokens.weight", "decoder.embed_tokens.weight"] 72 | 73 | def __init__(self, config: T5ConfigWithLoss): 74 | super().__init__(config) 75 | 76 | if self.config.loss_type not in SUPPORTED_LOSSES: 77 | raise NotImplementedError(f"{self.config.loss_type} is not implemented loss function type. ") 78 | 79 | self.transformer = T5Model(config) 80 | self.classification_head = T5ClassificationHead(config) 81 | 82 | # Initialize weights and apply final processing 83 | self.post_init() 84 | 85 | self.model_parallel = False 86 | 87 | def forward( 88 | self, 89 | input_ids: torch.LongTensor = None, 90 | attention_mask: Optional[torch.Tensor] = None, 91 | decoder_input_ids: Optional[torch.LongTensor] = None, 92 | decoder_attention_mask: Optional[torch.LongTensor] = None, 93 | head_mask: Optional[torch.Tensor] = None, 94 | decoder_head_mask: Optional[torch.Tensor] = None, 95 | cross_attn_head_mask: Optional[torch.Tensor] = None, 96 | encoder_outputs: Optional[List[torch.FloatTensor]] = None, 97 | inputs_embeds: Optional[torch.FloatTensor] = None, 98 | decoder_inputs_embeds: Optional[torch.FloatTensor] = None, 99 | labels: Optional[torch.LongTensor] = None, 100 | use_cache: Optional[bool] = None, 101 | output_attentions: Optional[bool] = None, 102 | output_hidden_states: Optional[bool] = None, 103 | return_dict: Optional[bool] = None, 104 | ) -> Union[Tuple, Seq2SeqSequenceClassifierOutput]: 105 | r""" 106 | labels (`torch.LongTensor` of shape `(batch_size,)`, *optional*): 107 | Labels for computing the sequence classification/regression loss. Indices should be in `[0, ..., 108 | config.num_labels - 1]`. If `config.num_labels > 1` a classification loss is computed (Cross-Entropy). 
109 | Returns: 110 | """ 111 | return_dict = return_dict if return_dict is not None else self.config.use_return_dict 112 | if labels is not None: 113 | use_cache = False 114 | 115 | if input_ids is None and inputs_embeds is not None: 116 | raise NotImplementedError( 117 | f"Passing input embeddings is currently not supported for {self.__class__.__name__}" 118 | ) 119 | 120 | # Copied from models.bart.modeling_bart.BartModel.forward different to other models, T5 automatically creates 121 | # decoder_input_ids from input_ids if no decoder_input_ids are provided 122 | if decoder_input_ids is None and decoder_inputs_embeds is None: 123 | if input_ids is None: 124 | raise ValueError( 125 | "If no `decoder_input_ids` or `decoder_inputs_embeds` are " 126 | "passed, `input_ids` cannot be `None`. Please pass either " 127 | "`input_ids` or `decoder_input_ids` or `decoder_inputs_embeds`." 128 | ) 129 | decoder_input_ids = self._shift_right(input_ids) 130 | 131 | outputs = self.transformer( 132 | input_ids, 133 | attention_mask=attention_mask, 134 | decoder_input_ids=decoder_input_ids, 135 | decoder_attention_mask=decoder_attention_mask, 136 | head_mask=head_mask, 137 | decoder_head_mask=decoder_head_mask, 138 | cross_attn_head_mask=cross_attn_head_mask, 139 | encoder_outputs=encoder_outputs, 140 | inputs_embeds=inputs_embeds, 141 | decoder_inputs_embeds=decoder_inputs_embeds, 142 | use_cache=use_cache, 143 | output_attentions=output_attentions, 144 | output_hidden_states=output_hidden_states, 145 | return_dict=return_dict, 146 | ) 147 | sequence_output = outputs[0] 148 | 149 | eos_mask = decoder_input_ids.eq(self.config.eos_token_id).to(sequence_output.device) 150 | 151 | if len(torch.unique_consecutive(eos_mask.sum(1))) > 1: 152 | raise ValueError("All examples must have the same number of tokens.") 153 | batch_size, _, hidden_size = sequence_output.shape 154 | sentence_representation = sequence_output[eos_mask, :].view(batch_size, -1, hidden_size)[:, -1, :] 155 | 156 | logits = self.classification_head(sentence_representation) 157 | 158 | loss = None 159 | if labels is not None: 160 | labels = labels.to(logits.device) 161 | if self.config.problem_type is None: 162 | if self.config.num_labels == 1: 163 | self.config.problem_type = "regression" 164 | elif self.config.num_labels > 1 and (labels.dtype == torch.long or labels.dtype == torch.int): 165 | self.config.problem_type = "single_label_classification" 166 | else: 167 | self.config.problem_type = "multi_label_classification" 168 | 169 | if self.config.problem_type == "regression": 170 | loss_fct = MSELoss() 171 | if self.config.num_labels == 1: 172 | loss = loss_fct(logits.squeeze(), labels.squeeze()) 173 | else: 174 | loss = loss_fct(logits, labels) 175 | elif self.config.problem_type == "single_label_classification": 176 | if self.config.loss_type == "cross_entropy": 177 | loss_fct = CrossEntropyLoss() 178 | elif self.config.loss_type == "focal_loss": 179 | loss_fct = FocalLoss(alpha=self.config.focal_loss_alpha, gamma=self.config.focal_loss_gamma) 180 | loss = loss_fct(logits.view(-1, self.config.num_labels), labels.view(-1)) 181 | elif self.config.problem_type == "multi_label_classification": 182 | loss_fct = BCEWithLogitsLoss() 183 | loss = loss_fct(logits, labels) 184 | if not return_dict: 185 | output = (logits,) + outputs[1:] 186 | return ((loss,) + output) if loss is not None else output 187 | 188 | return Seq2SeqSequenceClassifierOutput( 189 | loss=loss, 190 | logits=logits, 191 | past_key_values=outputs.past_key_values, 192 | 
decoder_hidden_states=outputs.decoder_hidden_states, 193 | decoder_attentions=outputs.decoder_attentions, 194 | cross_attentions=outputs.cross_attentions, 195 | encoder_last_hidden_state=outputs.encoder_last_hidden_state, 196 | encoder_hidden_states=outputs.encoder_hidden_states, 197 | encoder_attentions=outputs.encoder_attentions, 198 | ) -------------------------------------------------------------------------------- /src/liqfit/pipeline/__init__.py: -------------------------------------------------------------------------------- 1 | from .inference import ZeroShotClassificationPipeline 2 | -------------------------------------------------------------------------------- /src/liqfit/pipeline/inference.py: -------------------------------------------------------------------------------- 1 | # Copyright 2020 The HuggingFace Team and Knowledgator. All rights reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | 15 | from transformers.tokenization_utils import TruncationStrategy 16 | from transformers.utils import add_end_docstrings, logging 17 | from transformers.pipelines.base import PIPELINE_INIT_ARGS, ArgumentHandler, ChunkPipeline 18 | 19 | from typing import Union 20 | import inspect 21 | from typing import List, Union 22 | import numpy as np 23 | 24 | 25 | logger = logging.get_logger(__name__) 26 | 27 | class ZeroShotClassificationArgumentHandler(ArgumentHandler): 28 | """ 29 | Handles arguments for zero-shot for text classification by turning each possible label into an NLI 30 | premise/hypothesis pair. 31 | """ 32 | 33 | def _parse_labels(self, labels): 34 | if isinstance(labels, str): 35 | labels = [label.strip() for label in labels.split(",") if label.strip()] 36 | return labels 37 | 38 | def __call__(self, sequences, labels, hypothesis_template, hypothesis_first): 39 | if len(labels) == 0 or len(sequences) == 0: 40 | raise ValueError("You must include at least one label and at least one sequence.") 41 | if hypothesis_template.format(labels[0]) == hypothesis_template: 42 | raise ValueError( 43 | ( 44 | 'The provided hypothesis_template "{}" was not able to be formatted with the target labels. ' 45 | "Make sure the passed template includes formatting syntax such as {{}} where the label should go." 46 | ).format(hypothesis_template) 47 | ) 48 | 49 | if isinstance(sequences, str): 50 | sequences = [sequences] 51 | 52 | sequence_pairs = [] 53 | if not hypothesis_first: 54 | for sequence in sequences: 55 | sequence_pairs.extend([[sequence, hypothesis_template.format(label)] for label in labels]) 56 | else: 57 | for sequence in sequences: 58 | sequence_pairs.extend([[hypothesis_template.format(label), sequence] for label in labels]) 59 | return sequence_pairs, sequences 60 | 61 | 62 | @add_end_docstrings(PIPELINE_INIT_ARGS) 63 | class ZeroShotClassificationPipeline(ChunkPipeline): 64 | """ 65 | NLI-based zero-shot classification pipeline using a `ModelForSequenceClassification` trained on NLI (natural 66 | language inference) tasks. 
Equivalent of `text-classification` pipelines, but these models don't require a 67 | hardcoded number of potential classes, they can be chosen at runtime. It usually means it's slower but it is 68 | **much** more flexible. 69 | 70 | Any combination of sequences and labels can be passed and each combination will be posed as a premise/hypothesis 71 | pair and passed to the pretrained model. Then, the logit for *entailment* is taken as the logit for the candidate 72 | label being valid. Any NLI model can be used, but the id of the *entailment* label must be included in the model 73 | config's :attr:*~transformers.PretrainedConfig.label2id*. 74 | 75 | Example: 76 | 77 | ```python 78 | >>> from transformers import pipeline 79 | 80 | >>> oracle = pipeline(model="facebook/bart-large-mnli") 81 | >>> oracle( 82 | ... "I have a problem with my iphone that needs to be resolved asap!!", 83 | ... candidate_labels=["urgent", "not urgent", "phone", "tablet", "computer"], 84 | ... ) 85 | {'sequence': 'I have a problem with my iphone that needs to be resolved asap!!', 'labels': ['urgent', 'phone', 'computer', 'not urgent', 'tablet'], 'scores': [0.504, 0.479, 0.013, 0.003, 0.002]} 86 | 87 | >>> oracle( 88 | ... "I have a problem with my iphone that needs to be resolved asap!!", 89 | ... candidate_labels=["english", "german"], 90 | ... ) 91 | {'sequence': 'I have a problem with my iphone that needs to be resolved asap!!', 'labels': ['english', 'german'], 'scores': [0.814, 0.186]} 92 | ``` 93 | 94 | Learn more about the basics of using a pipeline in the [pipeline tutorial](../pipeline_tutorial) 95 | 96 | This NLI pipeline can currently be loaded from [`pipeline`] using the following task identifier: 97 | `"zero-shot-classification"`. 98 | 99 | The models that this pipeline can use are models that have been fine-tuned on an NLI task. See the up-to-date list 100 | of available models on [huggingface.co/models](https://huggingface.co/models?search=nli). 101 | """ 102 | 103 | def __init__(self, args_parser=ZeroShotClassificationArgumentHandler(), *args, **kwargs): 104 | self._args_parser = args_parser 105 | super().__init__(*args, **kwargs) 106 | if self.entailment_id == -1: 107 | logger.warning( 108 | "Failed to determine 'entailment' label id from the label2id mapping in the model config. Setting to " 109 | "-1. Define a descriptive label2id mapping in the model config to ensure correct outputs." 
110 | ) 111 | 112 | @property 113 | def entailment_id(self): 114 | if len(self.model.config.label2id.items()) == 0: 115 | return 0 116 | for label, ind in self.model.config.label2id.items(): 117 | if label.lower().startswith("entail"): 118 | return ind 119 | return -1 120 | 121 | def _parse_and_tokenize( 122 | self, sequence_pairs, padding=True, add_special_tokens=True, truncation=TruncationStrategy.ONLY_FIRST, 123 | encoder_decoder = False, **kwargs 124 | ): 125 | """ 126 | Parse arguments and tokenize only_first so that hypothesis (label) is not truncated 127 | """ 128 | return_tensors = self.framework 129 | if self.tokenizer.pad_token is None: 130 | # Override for tokenizers not supporting padding 131 | logger.error( 132 | "Tokenizer was not supporting padding necessary for zero-shot, attempting to use " 133 | " `pad_token=eos_token`" 134 | ) 135 | self.tokenizer.pad_token = self.tokenizer.eos_token 136 | try: 137 | if encoder_decoder: 138 | sequence_pairs, decoder_input = sequence_pairs 139 | 140 | inputs = self.tokenizer( 141 | [sequence_pairs], 142 | add_special_tokens=add_special_tokens, 143 | return_tensors=return_tensors, 144 | padding=padding, 145 | truncation=truncation, 146 | ) 147 | if encoder_decoder: 148 | decoder_inputs = self.tokenizer( 149 | [decoder_input], 150 | add_special_tokens=add_special_tokens, 151 | return_tensors=return_tensors, 152 | padding=padding, 153 | truncation=truncation, 154 | ) 155 | inputs['decoder_input_ids'] = decoder_inputs['input_ids'] 156 | inputs['decoder_attention_mask'] = decoder_inputs['attention_mask'] 157 | 158 | except Exception as e: 159 | if "too short" in str(e): 160 | # tokenizers might yell that we want to truncate 161 | # to a value that is not even reached by the input. 162 | # In that case we don't want to truncate. 163 | # It seems there's not a really better way to catch that 164 | # exception. 165 | 166 | inputs = self.tokenizer( 167 | [sequence_pairs], 168 | add_special_tokens=add_special_tokens, 169 | return_tensors=return_tensors, 170 | padding=padding, 171 | truncation=TruncationStrategy.DO_NOT_TRUNCATE, 172 | ) 173 | if encoder_decoder: 174 | decoder_inputs = self.tokenizer( 175 | [decoder_input], 176 | add_special_tokens=add_special_tokens, 177 | return_tensors=return_tensors, 178 | padding=padding, 179 | truncation=TruncationStrategy.DO_NOT_TRUNCATE, 180 | ) 181 | inputs['decoder_input_ids'] = decoder_inputs['input_ids'] 182 | inputs['decoder_attention_mask'] = decoder_inputs['attention_mask'] 183 | else: 184 | raise e 185 | 186 | return inputs 187 | 188 | def _sanitize_parameters(self, **kwargs): 189 | if kwargs.get("multi_class", None) is not None: 190 | kwargs["multi_label"] = kwargs["multi_class"] 191 | logger.warning( 192 | "The `multi_class` argument has been deprecated and renamed to `multi_label`. " 193 | "`multi_class` will be removed in a future version of Transformers." 
194 | ) 195 | preprocess_params = {} 196 | if "candidate_labels" in kwargs: 197 | preprocess_params["candidate_labels"] = self._args_parser._parse_labels(kwargs["candidate_labels"]) 198 | if "hypothesis_template" in kwargs: 199 | preprocess_params["hypothesis_template"] = kwargs["hypothesis_template"] 200 | if "hypothesis_first" in kwargs: 201 | preprocess_params["hypothesis_first"] = kwargs["hypothesis_first"] 202 | if "encoder_decoder" in kwargs: 203 | preprocess_params["encoder_decoder"] = kwargs["encoder_decoder"] 204 | 205 | postprocess_params = {} 206 | if "multi_label" in kwargs: 207 | postprocess_params["multi_label"] = kwargs["multi_label"] 208 | return preprocess_params, {}, postprocess_params 209 | 210 | def __call__( 211 | self, 212 | sequences: Union[str, List[str]], 213 | *args, 214 | **kwargs, 215 | ): 216 | """ 217 | Classify the sequence(s) given as inputs. See the [`ZeroShotClassificationPipeline`] documentation for more 218 | information. 219 | 220 | Args: 221 | sequences (`str` or `List[str]`): 222 | The sequence(s) to classify, will be truncated if the model input is too large. 223 | candidate_labels (`str` or `List[str]`): 224 | The set of possible class labels to classify each sequence into. Can be a single label, a string of 225 | comma-separated labels, or a list of labels. 226 | hypothesis_template (`str`, *optional*, defaults to `"This example is {}."`): 227 | The template used to turn each label into an NLI-style hypothesis. This template must include a {} or 228 | similar syntax for the candidate label to be inserted into the template. For example, the default 229 | template is `"This example is {}."` With the candidate label `"sports"`, this would be fed into the 230 | model like `" sequence to classify This example is sports . "`. The default template 231 | works well in many cases, but it may be worthwhile to experiment with different templates depending on 232 | the task setting. 233 | multi_label (`bool`, *optional*, defaults to `False`): 234 | Whether or not multiple candidate labels can be true. If `False`, the scores are normalized such that 235 | the sum of the label likelihoods for each sequence is 1. If `True`, the labels are considered 236 | independent and probabilities are normalized for each candidate by doing a softmax of the entailment 237 | score vs. the contradiction score. 238 | 239 | Return: 240 | A `dict` or a list of `dict`: Each result comes as a dictionary with the following keys: 241 | 242 | - **sequence** (`str`) -- The sequence for which this is the output. 243 | - **labels** (`List[str]`) -- The labels sorted by order of likelihood. 244 | - **scores** (`List[float]`) -- The probabilities for each of the labels. 
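
        Example (a minimal sketch; the checkpoint and labels below mirror this repository's tests):

        ```python
        >>> from transformers import AutoTokenizer, AutoModelForSequenceClassification
        >>> from liqfit.pipeline import ZeroShotClassificationPipeline

        >>> tokenizer = AutoTokenizer.from_pretrained("knowledgator/comprehend_it-base")
        >>> model = AutoModelForSequenceClassification.from_pretrained("knowledgator/comprehend_it-base")
        >>> classifier = ZeroShotClassificationPipeline(model=model, tokenizer=tokenizer)
        >>> classifier(
        ...     "one day I will see the world",
        ...     candidate_labels=["travel", "cooking", "dancing"],
        ...     multi_label=True,
        ... )
        ```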
245 | """ 246 | if len(args) == 0: 247 | pass 248 | elif len(args) == 1 and "candidate_labels" not in kwargs: 249 | kwargs["candidate_labels"] = args[0] 250 | else: 251 | raise ValueError(f"Unable to understand extra arguments {args}") 252 | 253 | return super().__call__(sequences, **kwargs) 254 | 255 | def preprocess(self, inputs, candidate_labels=None, hypothesis_template="This example is {}.", hypothesis_first = False, encoder_decoder = False): 256 | sequence_pairs, sequences = self._args_parser(inputs, candidate_labels, hypothesis_template, hypothesis_first) 257 | 258 | for i, (candidate_label, sequence_pair) in enumerate(zip(candidate_labels, sequence_pairs)): 259 | model_input = self._parse_and_tokenize(sequence_pair, encoder_decoder = encoder_decoder) 260 | 261 | yield { 262 | "candidate_label": candidate_label, 263 | "sequence": sequences[0], 264 | "is_last": i == len(candidate_labels) - 1, 265 | **model_input, 266 | } 267 | 268 | def _forward(self, inputs): 269 | candidate_label = inputs["candidate_label"] 270 | sequence = inputs["sequence"] 271 | input_names = self.tokenizer.model_input_names 272 | input_names.extend(['decoder_input_ids', 'decoder_attention_mask']) 273 | model_inputs = {k: inputs[k] for k in input_names if k in inputs} 274 | # `XXXForSequenceClassification` models should not use `use_cache=True` even if it's supported 275 | model_forward = self.model.forward if self.framework == "pt" else self.model.call 276 | if "use_cache" in inspect.signature(model_forward).parameters.keys(): 277 | model_inputs["use_cache"] = False 278 | outputs = self.model(**model_inputs) 279 | 280 | model_outputs = { 281 | "candidate_label": candidate_label, 282 | "sequence": sequence, 283 | "is_last": inputs["is_last"], 284 | **outputs, 285 | } 286 | return model_outputs 287 | 288 | def postprocess(self, model_outputs, multi_label=False): 289 | candidate_labels = [outputs["candidate_label"] for outputs in model_outputs] 290 | sequences = [outputs["sequence"] for outputs in model_outputs] 291 | logits = np.concatenate([output["logits"].numpy() for output in model_outputs]) 292 | N = logits.shape[0] 293 | n = len(candidate_labels) 294 | num_sequences = N // n 295 | reshaped_outputs = logits.reshape((num_sequences, n, -1)) 296 | 297 | if multi_label and len(self.model.config.label2id)==0: 298 | scores = 1 / (1 + np.exp(-entail_contr_logits)) 299 | 300 | elif multi_label or len(candidate_labels) == 1: 301 | # softmax over the entailment vs. 
302 |             entailment_id = self.entailment_id
303 |             contradiction_id = -1 if entailment_id == 0 else 0
304 |             entail_contr_logits = reshaped_outputs[..., [contradiction_id, entailment_id]]
305 |             scores = np.exp(entail_contr_logits) / np.exp(entail_contr_logits).sum(-1, keepdims=True)
306 |             scores = scores[..., 1]
307 | 
308 |         else:
309 |             # softmax the "entailment" logits over all candidate labels
310 |             entail_logits = reshaped_outputs[..., self.entailment_id]
311 |             scores = np.exp(entail_logits) / np.exp(entail_logits).sum(-1, keepdims=True)
312 | 
313 |         top_inds = list(reversed(scores[0].argsort()))
314 |         return {
315 |             "sequence": sequences[0],
316 |             "labels": [candidate_labels[i] for i in top_inds],
317 |             "scores": scores[0, top_inds].tolist(),
318 |         }
--------------------------------------------------------------------------------
/src/liqfit/utils/__init__.py:
--------------------------------------------------------------------------------
1 | from .standardization import convert_to_numpy
2 | from .standardization import convert_to_torch
3 | from .transforms import tokenize_and_align_label
4 | from .transforms import transform
5 | from .metrics import Accuracy
6 | 
--------------------------------------------------------------------------------
/src/liqfit/utils/metrics.py:
--------------------------------------------------------------------------------
1 | import evaluate
2 | import numpy as np
3 | from transformers import EvalPrediction
4 | 
5 | 
6 | class Accuracy:
7 |     def __init__(self):
8 |         """Simple wrapper class around `evaluate.load("accuracy")`.
9 |         """
10 |         self.accuracy = evaluate.load("accuracy")
11 | 
12 |     def __call__(self, eval_pred: EvalPrediction):
13 |         predictions, labels = eval_pred
14 |         predictions = np.argmax(predictions, axis=1)
15 |         return self.accuracy.compute(
16 |             predictions=predictions, references=labels
17 |         )
18 | 
--------------------------------------------------------------------------------
/src/liqfit/utils/standardization.py:
--------------------------------------------------------------------------------
1 | from __future__ import annotations
2 | from typing import List, Tuple
3 | import torch
4 | import numpy as np
5 | 
6 | 
7 | def convert_to_numpy(x: torch.Tensor | Tuple | List | np.ndarray) -> np.ndarray:
8 |     """Converts torch.Tensor, Tuple, List or NumPy array to Numpy Array.
9 | 
10 |     Args:
11 |         x (torch.Tensor | Tuple | List | np.ndarray): Input to convert to
12 |             NumPy array.
13 | 
14 |     Returns:
15 |         np.ndarray: Converted NumPy array.
16 |     """
17 |     if isinstance(x, torch.Tensor):
18 |         return x.detach().cpu().numpy()
19 |     else:
20 |         return np.array(x)
21 | 
22 | 
23 | def convert_to_torch(x: torch.Tensor | Tuple | List | np.ndarray) -> torch.Tensor:
24 |     """Converts input to torch.Tensor
25 | 
26 |     Args:
27 |         x (torch.Tensor | Tuple | List | np.ndarray): Input to convert to a torch.Tensor.
28 | 
29 |     Raises:
30 |         ValueError: If the input is not a type of `torch.Tensor`,
31 |             `Tuple`, `List`, `np.ndarray`
32 | 
33 |     Returns:
34 |         torch.Tensor: Converted torch.Tensor.
35 |     """
36 |     if isinstance(x, (list, tuple)):
37 |         return torch.tensor(x)
38 |     elif isinstance(x, np.ndarray):
39 |         return torch.from_numpy(x)
40 |     elif isinstance(x, torch.Tensor):
41 |         return x
42 |     else:
43 |         raise ValueError(
44 |             "Expected `torch.Tensor`, `List`, `Tuple` or `np.ndarray`. "
45 |             f"Received: {type(x)}."
46 |         )
47 | 
--------------------------------------------------------------------------------
/src/liqfit/utils/transforms.py:
--------------------------------------------------------------------------------
1 | from typing import Callable, Dict
2 | from datasets import Dataset
3 | from ..datasets import transform_dataset
4 | 
5 | 
6 | def tokenize_and_align_label(
7 |     example: Dict,
8 |     tokenizer: Callable,
9 |     sources_column_name: str = "sources",
10 |     targets_column_name: str = "targets",
11 | ):
12 |     """Tokenizes source and target sequences and concatenates them for the NLI training task.
13 | 
14 |     Args:
15 |         example (Dict): Dictionary that contains the sources and target sequences.
16 |         tokenizer (Callable): Tokenizer function. If you are using a Huggingface
17 |             tokenizer, you can wrap it with your configuration using
18 |             `functools.partial`. Example:
19 |             tokenizer_wrapped_function = \
20 |                 functools.partial(tokenizer.batch_encode_plus, padding=True,
21 |                 truncation=True, max_length=512) then pass
22 |             `tokenizer_wrapped_function` to this function.
23 |         sources_column_name (str, optional): Sources key name in the
24 |             dictionary. Defaults to "sources".
25 |         targets_column_name (str, optional): Targets key name in the
26 |             dictionary. Defaults to "targets".
27 | 
28 |     Returns:
29 |         The tokenizer output (e.g. a `BatchEncoding`) for the tokenized source/target pair.
30 |     """
31 |     hypothesis = example[targets_column_name]
32 |     seq = example[sources_column_name]
33 |     tokenized_input = tokenizer([seq, hypothesis])
34 |     return tokenized_input
35 | 
36 | 
37 | def transform(
38 |     dataset: Dataset,
39 |     classes: list,
40 |     template: str,
41 |     normalize_negatives: bool,
42 |     positives: int,
43 |     negatives: int,
44 | ):
45 |     """Transforms the dataset for the NLI training task.
46 | 
47 |     Args:
48 |         dataset (Dataset): Huggingface Dataset instance.
49 |         classes (List[str]): List of possible class labels.
50 |         template (str): Template string for generating examples.
51 |         normalize_negatives (bool): Whether to normalize the amount of
52 |             negative examples per each positive example of a class.
53 |         positives (int): Number of positive examples to generate per source.
54 |         negatives (int): Number of negative examples to generate per source.
55 | 
56 |     Raises:
57 |         ValueError: If there is no "{}" in the template. It should exist in
58 |             order to format the template with the labels.
59 | 
60 |     Returns:
61 |         Dataset: Transformed dataset.
62 |     """
63 |     if "{}" not in template:
64 |         raise ValueError(
65 |             "Cannot apply `.format()` function on the template. "
66 |             'Expected template to have "{}". '
67 |             f"Received: {template}."
68 | ) 69 | 70 | transformed_dataset = transform_dataset( 71 | dataset, classes, template, normalize_negatives, positives, negatives 72 | ) 73 | tokenized_dataset = transformed_dataset.map(tokenize_and_align_label) 74 | return tokenized_dataset 75 | -------------------------------------------------------------------------------- /tests/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Knowledgator/LiqFit/51ba2714813ae1cf110f7e600cd7f2663cdec39c/tests/__init__.py -------------------------------------------------------------------------------- /tests/test_losses.py: -------------------------------------------------------------------------------- 1 | import unittest 2 | 3 | import torch 4 | from kornia.losses import focal_loss 5 | from liqfit.losses import focal_loss_with_mask 6 | 7 | 8 | class TestCorrectness(unittest.TestCase): 9 | def test_focal_loss_with_ignore_index(self): 10 | x = torch.tensor( 11 | [[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]], 12 | dtype=torch.float32, 13 | ) 14 | y = torch.tensor([[1, 2, 3]], dtype=torch.int64) 15 | y[:, -1] = -100 16 | loss = round( 17 | focal_loss_with_mask( 18 | x.reshape(-1, x.shape[-1]), y.reshape(-1) 19 | ).item(), 20 | 4, 21 | ) 22 | output = 0.1795 23 | self.assertEqual(loss, output) 24 | 25 | def test_modified_loss_with_kornia_impl(self): 26 | x = torch.tensor( 27 | [[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]], 28 | dtype=torch.float32, 29 | ) 30 | y = torch.tensor([[1, 2, 3]], dtype=torch.int64) 31 | modified_loss = round( 32 | focal_loss_with_mask( 33 | x.reshape(-1, x.shape[-1]), y.reshape(-1), alpha=0.5 34 | ).item(), 35 | 4, 36 | ) 37 | kornia_loss = round( 38 | focal_loss( 39 | x.reshape(-1, x.shape[-1]), 40 | y.reshape(-1), 41 | alpha=0.5, 42 | reduction="mean", 43 | ).item(), 44 | 4, 45 | ) 46 | self.assertEqual(modified_loss, kornia_loss) 47 | -------------------------------------------------------------------------------- /tests/test_models.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from transformers import AutoTokenizer, AutoModel, AutoModelForSequenceClassification 3 | from liqfit.models import T5ForZeroShotClassification, T5ConfigWithLoss, DebertaV2ForZeroShotClassification, DebertaConfigWithLoss 4 | from liqfit.modeling import LiqFitModel, ClassificationHead 5 | from liqfit.modeling.pooling import FirstTokenPooling1D 6 | from liqfit.losses import CrossEntropyLoss 7 | 8 | def test_t5(): 9 | device = "cuda" if torch.cuda.is_available() else "cpu" 10 | 11 | text = "one day I will see the world" 12 | label = "travel" 13 | 14 | tokenizer = AutoTokenizer.from_pretrained('google/flan-t5-small') 15 | 16 | input_ids = tokenizer(text, return_tensors='pt')['input_ids'] 17 | decoder_input_ids = tokenizer(label, return_tensors='pt')['input_ids'] 18 | 19 | config = T5ConfigWithLoss() 20 | model = T5ForZeroShotClassification(config).to(device) 21 | outputs = model(input_ids = input_ids, decoder_input_ids = decoder_input_ids) 22 | 23 | def test_deberta(): 24 | device = "cuda" if torch.cuda.is_available() else "cpu" 25 | 26 | text = "one day I will see the world. This example is travel." 
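    # Note: the string above already contains both the premise and the NLI hypothesis
    # ("This example is travel.") concatenated into a single input for the cross-encoder.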
27 | 28 | tokenizer = AutoTokenizer.from_pretrained('microsoft/deberta-v3-small') 29 | 30 | input_ids = tokenizer(text, return_tensors='pt')['input_ids'] 31 | 32 | config = DebertaConfigWithLoss() 33 | model = DebertaV2ForZeroShotClassification(config).to(device) 34 | outputs = model(input_ids = input_ids) 35 | 36 | def test_liqfit_model_with_automodel_for_sequence_classification(): 37 | device = "cuda" if torch.cuda.is_available() else "cpu" 38 | 39 | text = "one day I will see the world. This example is travel." 40 | 41 | tokenizer = AutoTokenizer.from_pretrained('microsoft/deberta-v3-small') 42 | 43 | input_ids = tokenizer(text, return_tensors='pt')['input_ids'] 44 | labels = torch.tensor([1]) 45 | 46 | backbone_model = AutoModelForSequenceClassification.from_pretrained('microsoft/deberta-v3-xsmall') 47 | 48 | loss_func = CrossEntropyLoss(multi_target=True) 49 | 50 | model = LiqFitModel(backbone_model.config, backbone_model, loss_func=loss_func) 51 | outputs = model(input_ids = input_ids, labels=labels) 52 | 53 | def test_liqfit_model_with_head(): 54 | device = "cuda" if torch.cuda.is_available() else "cpu" 55 | 56 | text = "one day I will see the world. This example is travel." 57 | 58 | tokenizer = AutoTokenizer.from_pretrained('microsoft/deberta-v3-small') 59 | 60 | input_ids = tokenizer(text, return_tensors='pt')['input_ids'] 61 | labels = torch.tensor([1]) 62 | 63 | backbone_model = AutoModel.from_pretrained('microsoft/deberta-v3-xsmall') 64 | 65 | pooler = FirstTokenPooling1D() 66 | loss_func = CrossEntropyLoss(multi_target=True) 67 | head = ClassificationHead(backbone_model.config.hidden_size, 3, pooler, loss_func) 68 | 69 | model = LiqFitModel(backbone_model.config, backbone_model, head) 70 | outputs = model(input_ids = input_ids, labels=labels) 71 | -------------------------------------------------------------------------------- /tests/test_pipeline.py: -------------------------------------------------------------------------------- 1 | from transformers import AutoTokenizer, AutoModelForSequenceClassification 2 | 3 | from liqfit.pipeline import ZeroShotClassificationPipeline 4 | 5 | 6 | class TestStandartModelPipeline: 7 | sequence_to_classify = "one day I will see the world" 8 | candidate_labels = ['travel', 'cooking', 'dancing'] 9 | template = 'This example is {}.' 10 | model_path = 'knowledgator/comprehend_it-base' 11 | tokenizer = AutoTokenizer.from_pretrained(model_path) 12 | model = AutoModelForSequenceClassification.from_pretrained(model_path) 13 | 14 | def test_standard_pipeline(self): 15 | classifier = ZeroShotClassificationPipeline(model=self.model, 16 | tokenizer=self.tokenizer, 17 | hypothesis_template = self.template, 18 | hypothesis_first = False) 19 | results = classifier(self.sequence_to_classify, self.candidate_labels, multi_label=True) 20 | 21 | 22 | def test_hypothesis_first_pipeline(self): 23 | classifier = ZeroShotClassificationPipeline(model=self.model, 24 | tokenizer=self.tokenizer, 25 | hypothesis_template = self.template, 26 | hypothesis_first = True) 27 | results = classifier(self.sequence_to_classify, self.candidate_labels, multi_label=True) 28 | 29 | 30 | 31 | class TestBinaryModelPipeline: 32 | sequence_to_classify = "one day I will see the world" 33 | candidate_labels = ['travel', 'cooking', 'dancing'] 34 | template = 'This example is {}.' 
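    # 'BAAI/bge-reranker-base' (below) is a binary relevance cross-encoder that scores a
    # premise/hypothesis pair with a single logit, rather than a three-way NLI model.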
35 | model_path = 'BAAI/bge-reranker-base' 36 | tokenizer = AutoTokenizer.from_pretrained(model_path) 37 | model = AutoModelForSequenceClassification.from_pretrained(model_path) 38 | 39 | def test_standard_pipeline(self): 40 | classifier = ZeroShotClassificationPipeline(model=self.model, 41 | tokenizer=self.tokenizer, 42 | hypothesis_template = self.template, 43 | hypothesis_first = False) 44 | results = classifier(self.sequence_to_classify, self.candidate_labels, multi_label=True) 45 | 46 | 47 | def test_hypothesis_first_pipeline(self): 48 | classifier = ZeroShotClassificationPipeline(model=self.model, 49 | tokenizer=self.tokenizer, 50 | hypothesis_template = self.template, 51 | hypothesis_first = True) 52 | results = classifier(self.sequence_to_classify, self.candidate_labels, multi_label=True) 53 | 54 | class TestEncoderDecoderModelPipeline: 55 | sequence_to_classify = "one day I will see the world" 56 | candidate_labels = ['travel', 'cooking', 'dancing'] 57 | template = 'This example is {}.' 58 | model_path = 'knowledgator/mt5-comprehend-it-base' 59 | tokenizer = AutoTokenizer.from_pretrained(model_path) 60 | model = AutoModelForSequenceClassification.from_pretrained(model_path) 61 | 62 | def test_standard_pipeline(self): 63 | classifier = ZeroShotClassificationPipeline(model=self.model, 64 | tokenizer=self.tokenizer, 65 | hypothesis_template = self.template, 66 | hypothesis_first = False) 67 | results = classifier(self.sequence_to_classify, self.candidate_labels, multi_label=True) 68 | 69 | 70 | def test_hypothesis_first_pipeline(self): 71 | classifier = ZeroShotClassificationPipeline(model=self.model, 72 | tokenizer=self.tokenizer, 73 | hypothesis_template = self.template, 74 | hypothesis_first = True) 75 | results = classifier(self.sequence_to_classify, self.candidate_labels, multi_label=True) 76 | 77 | 78 | def test_encoder_decoder_pipeline(self): 79 | classifier = ZeroShotClassificationPipeline(model=self.model, 80 | tokenizer=self.tokenizer, 81 | hypothesis_template = self.template, 82 | hypothesis_first = True) 83 | results = classifier(self.sequence_to_classify, self.candidate_labels, multi_label=True) --------------------------------------------------------------------------------