Losses
TripletLoss
- class oml.losses.triplet.TripletLoss(margin: Optional[float], reduction: str = 'mean', need_logs: bool = False)[source]
Bases: Module
This class combines the classical TripletMarginLoss and SoftTripletLoss. The idea of SoftTripletLoss is the following: instead of using the classical formula `loss = relu(margin + positive_distance - negative_distance)`, we use `loss = log1p(exp(positive_distance - negative_distance))`. It may help to solve the common problem of TripletMarginLoss converging to its margin value (also known as dimension collapse).
- __init__(margin: Optional[float], reduction: str = 'mean', need_logs: bool = False)[source]
- Parameters
margin – Margin value, set `None` to use SoftTripletLoss
reduction – `mean`, `sum` or `none`
need_logs – Set `True` to store some information to track in the `self.last_logs` property.
- forward(anchor: Tensor, positive: Tensor, negative: Tensor) → Tensor [source]
- Parameters
anchor – Anchor features with the shape of `(batch_size, feat)`
positive – Positive features with the shape of `(batch_size, feat)`
negative – Negative features with the shape of `(batch_size, feat)`
- Returns
Loss value
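A minimal usage sketch based on the signatures above; the batch size and embedding dimension are arbitrary:

```python
import torch

from oml.losses.triplet import TripletLoss

# margin=0.2 gives the classical TripletMarginLoss behaviour;
# pass margin=None to switch to SoftTripletLoss
criterion = TripletLoss(margin=0.2)

anchor = torch.randn(8, 128)    # (batch_size, feat)
positive = torch.randn(8, 128)
negative = torch.randn(8, 128)

loss = criterion(anchor, positive, negative)  # scalar with reduction='mean'
```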
TripletLossPlain
- class oml.losses.triplet.TripletLossPlain(margin: Optional[float], reduction: str = 'mean', need_logs: bool = False)[source]
Bases: Module
The same as TripletLoss, but works with anchor, positive and negative features stacked together.
- __init__(margin: Optional[float], reduction: str = 'mean', need_logs: bool = False)[source]
- Parameters
margin – Margin value, set `None` to use SoftTripletLoss
reduction – `mean`, `sum` or `none`
need_logs – Set `True` to store some information to track in the `self.last_logs` property.
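A minimal sketch; note that the exact stacking order of the triplets is an assumption here (repeating anchor, positive, negative), which makes the first dimension divisible by 3:

```python
import torch

from oml.losses.triplet import TripletLossPlain

criterion = TripletLossPlain(margin=0.2)

# 4 triplets stacked as [a1, p1, n1, a2, p2, n2, ...] -- assumed ordering,
# so the first dimension must be divisible by 3
features = torch.randn(12, 128)

loss = criterion(features)
```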
TripletLossWithMiner
- class oml.losses.triplet.TripletLossWithMiner(margin: ~typing.Optional[float], miner: ~oml.interfaces.miners.ITripletsMiner = <oml.miners.inbatch_all_tri.AllTripletsMiner object>, reduction: str = 'mean', need_logs: bool = False)[source]
Bases: ITripletLossWithMiner
This class combines Miner and TripletLoss.
- __init__(margin: ~typing.Optional[float], miner: ~oml.interfaces.miners.ITripletsMiner = <oml.miners.inbatch_all_tri.AllTripletsMiner object>, reduction: str = 'mean', need_logs: bool = False)[source]
- Parameters
margin – Margin value, set `None` to use SoftTripletLoss
miner – A miner that implements the logic of picking triplets to pass them to the triplet loss.
reduction – `mean`, `sum` or `none`
need_logs – Set `True` to store some information to track in the `self.last_logs` property.
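A minimal sketch, assuming the criterion is called with a batch of features and their labels, from which the miner builds triplets; the batch should contain several samples per label so that positives exist:

```python
import torch

from oml.losses.triplet import TripletLossWithMiner
from oml.miners.inbatch_all_tri import AllTripletsMiner

criterion = TripletLossWithMiner(margin=0.2, miner=AllTripletsMiner())

# 4 labels with 4 samples each, so the miner can form triplets in-batch
features = torch.randn(16, 128)
labels = torch.arange(4).repeat_interleave(4)

loss = criterion(features, labels)
```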
SurrogatePrecision
- class oml.losses.surrogate_precision.SurrogatePrecision(k: int, temperature1: float = 1.0, temperature2: float = 0.01, reduction: str = 'mean')[source]
Bases: Module
This loss is a differentiable approximation of the Precision@k metric.
The loss is described in the following paper under a slightly different name: Recall@k Surrogate Loss with Large Batches and Similarity Mixup.
The idea is that we express the formula for Precision@k using two step functions (aka Heaviside functions). Then we approximate them with two sigmoid functions parametrised by temperatures. The smaller the temperature, the closer the sigmoid is to the step function, but the sparser the gradients, and vice versa. In the original paper t1 = 1.0 and t2 = 0.01 were used.
- __init__(k: int, temperature1: float = 1.0, temperature2: float = 0.01, reduction: str = 'mean')[source]
- Parameters
k – Parameter of Precision@k.
temperature1 – Scaling factor for the 1st sigmoid, see docs above.
temperature2 – Scaling factor for the 2nd sigmoid, see docs above.
reduction – `mean`, `sum` or `none`
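The temperature trade-off described above can be illustrated with plain PyTorch; this sketch only demonstrates the sigmoid-vs-step behaviour and is not part of the library API:

```python
import torch

def smooth_step(x: torch.Tensor, t: float) -> torch.Tensor:
    # A sigmoid with temperature t approximates the Heaviside step function:
    # the smaller t, the sharper the step, but the sparser the gradients.
    return torch.sigmoid(x / t)

x = torch.linspace(-1.0, 1.0, steps=5)
print(smooth_step(x, t=1.0))   # smooth transition, dense gradients
print(smooth_step(x, t=0.01))  # nearly a hard 0/1 step
```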
ArcFaceLoss
- class oml.losses.arcface.ArcFaceLoss(in_features: int, num_classes: int, m: float = 0.5, s: float = 64, smoothing_epsilon: float = 0, label2category: Optional[Dict[Any, Any]] = None, reduction: str = 'mean')[source]
Bases: Module
ArcFace loss from the paper, with the possibility to use label smoothing. It contains a projection of size `in_features x num_classes` inside itself. Please make sure that the class labels start at 0 and end at `num_classes - 1`.
- __init__(in_features: int, num_classes: int, m: float = 0.5, s: float = 64, smoothing_epsilon: float = 0, label2category: Optional[Dict[Any, Any]] = None, reduction: str = 'mean')[source]
- Parameters
in_features – Input feature size
num_classes – Number of classes in train set
m – Margin parameter for ArcFace loss. Usually values in the range 0.3-0.5 work well
s – Scaling parameter for ArcFace loss. Usually values in the range 30-64 work well
smoothing_epsilon – Label smoothing effect strength
label2category – Optional mapping from label to its category. If provided, label smoothing will redistribute `smoothing_epsilon` only inside the category corresponding to the sample's ground truth label
reduction – CrossEntropyLoss reduction
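A minimal sketch, assuming the criterion is applied to raw embeddings and integer class labels:

```python
import torch

from oml.losses.arcface import ArcFaceLoss

criterion = ArcFaceLoss(in_features=128, num_classes=10, m=0.4, s=30)

embeddings = torch.randn(8, 128)     # (batch_size, in_features)
labels = torch.randint(0, 10, (8,))  # labels must lie in [0, num_classes - 1]

loss = criterion(embeddings, labels)
```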
ArcFaceLossWithMLP
- class oml.losses.arcface.ArcFaceLossWithMLP(in_features: int, num_classes: int, mlp_features: List[int], m: float = 0.5, s: float = 64, smoothing_epsilon: float = 0, label2category: Optional[Dict[Any, Any]] = None, reduction: str = 'mean')[source]
Bases: Module
Almost the same as `ArcFaceLoss`, but also has an MLP projector before the loss. You may want to use `ArcFaceLossWithMLP` to boost the expressive power of the ArcFace loss during training (for example, in a multi-head setup it may be a good idea to have task-specific projectors in each of the losses). Note that the criterion does not exist at validation time. Thus, if you want to keep your MLP layers, you should create them as part of the model you train.
- __init__(in_features: int, num_classes: int, mlp_features: List[int], m: float = 0.5, s: float = 64, smoothing_epsilon: float = 0, label2category: Optional[Dict[Any, Any]] = None, reduction: str = 'mean')[source]
- Parameters
in_features – Input feature size
num_classes – Number of classes in train set
mlp_features – Layer sizes for the MLP before ArcFace
m – Margin parameter for ArcFace loss. Usually values in the range 0.3-0.5 work well
s – Scaling parameter for ArcFace loss. Usually values in the range 30-64 work well
smoothing_epsilon – Label smoothing effect strength
label2category – Optional mapping from label to its category. If provided, label smoothing will redistribute `smoothing_epsilon` only inside the category corresponding to the sample's ground truth label
reduction – CrossEntropyLoss reduction
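Construction mirrors `ArcFaceLoss` with an extra `mlp_features` argument; the layer sizes below are arbitrary assumptions for illustration, and the call convention is assumed to match `ArcFaceLoss`:

```python
import torch

from oml.losses.arcface import ArcFaceLossWithMLP

# hypothetical sizes: 512-d embeddings projected through a 256 -> 128 MLP
criterion = ArcFaceLossWithMLP(
    in_features=512, num_classes=10, mlp_features=[256, 128], m=0.4, s=30
)

embeddings = torch.randn(8, 512)
labels = torch.randint(0, 10, (8,))

loss = criterion(embeddings, labels)
```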
label_smoothing
- oml.functional.label_smoothing.label_smoothing(y: Tensor, num_classes: int, epsilon: float = 0.2, categories: Optional[Tensor] = None) → Tensor [source]
This function performs label smoothing. You can also use a modified version, where the label is smoothed only within the category corresponding to the sample's ground truth label. To use this, you should provide the `categories` argument: a vector whose i-th entry is the corresponding category for label `i`.
- Parameters
y – Ground truth labels of size `batch_size`, where each element is from 0 (inclusive) to num_classes (exclusive).
num_classes – Number of classes in total
epsilon – Strength of smoothing. The biggest value in the OHE-vector will be `1 - epsilon + 1 / num_classes` after the transformation
categories – Vector whose i-th entry is the corresponding category for label `i`. Optional, used for category-based label smoothing. In that case the biggest value in the OHE-vector will be `1 - epsilon + 1 / num_classes_of_the_same_category`, and labels outside of the category will not change
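A small sketch following the signature above; the category vector in the second call is a made-up example in which labels 0 and 1 share a category:

```python
import torch

from oml.functional.label_smoothing import label_smoothing

y = torch.tensor([0, 2, 1])

# plain smoothing over all classes
smoothed = label_smoothing(y, num_classes=3, epsilon=0.2)
print(smoothed)  # each row sums to 1; the true class keeps the largest value

# category-based smoothing: categories[i] is the category of label i,
# so here labels 0 and 1 share category 0 and label 2 has category 1
categories = torch.tensor([0, 0, 1])
print(label_smoothing(y, num_classes=3, epsilon=0.2, categories=categories))
```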