Losses

TripletLoss

class oml.losses.triplet.TripletLoss(margin: Optional[float], reduction: str = 'mean', need_logs: bool = False)[source]

Bases: Module

Class which combines the classical TripletMarginLoss and SoftTripletLoss. The idea of SoftTripletLoss is the following: instead of using the classical formula loss = relu(margin + positive_distance - negative_distance), we use loss = log1p(exp(positive_distance - negative_distance)). This may help to solve the common problem of TripletMarginLoss converging to its margin value (also known as dimension collapse).

__init__(margin: Optional[float], reduction: str = 'mean', need_logs: bool = False)[source]
Parameters
  • margin – Margin value, set None to use SoftTripletLoss

  • reduction – mean, sum or none

  • need_logs – Set True if you want to store logs

forward(anchor: Tensor, positive: Tensor, negative: Tensor) → Tensor[source]
Parameters
  • anchor – Anchor features with the shape of (batch_size, feat)

  • positive – Positive features with the shape of (batch_size, feat)

  • negative – Negative features with the shape of (batch_size, feat)

Returns

Loss value

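Example (a minimal sketch; the embeddings, feature size and margin below are arbitrary placeholders):

    import torch

    from oml.losses.triplet import TripletLoss

    anchor = torch.randn(4, 128)
    positive = torch.randn(4, 128)
    negative = torch.randn(4, 128)

    # Classical variant: relu(margin + positive_distance - negative_distance)
    criterion = TripletLoss(margin=0.2)
    loss = criterion(anchor, positive, negative)

    # Soft variant: log1p(exp(positive_distance - negative_distance)); no margin is needed
    criterion_soft = TripletLoss(margin=None)
    loss_soft = criterion_soft(anchor, positive, negative)
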
TripletLossPlain

class oml.losses.triplet.TripletLossPlain(margin: Optional[float], reduction: str = 'mean', need_logs: bool = False)[source]

Bases: Module

The same as TripletLoss, but works with anchor, positive and negative features stacked together.

__init__(margin: Optional[float], reduction: str = 'mean', need_logs: bool = False)[source]
Parameters
  • margin – Margin value, set None to use SoftTripletLoss

  • reduction – mean, sum or none

  • need_logs – Set True if you want to store logs

forward(features: Tensor) → Tensor[source]
Parameters

features – Features with the shape of [batch_size, feat] and the following structure: 0, 1, 2 are the indices of the 1st triplet; 3, 4, 5 are the indices of the 2nd triplet; and so on. Thus, the batch contains batch_size / 3 triplets, and batch_size must be divisible by 3

Returns

Loss value

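Example (a minimal sketch; feature size and margin are placeholders, and the batch size must be divisible by 3):

    import torch

    from oml.losses.triplet import TripletLossPlain

    # Two triplets stacked as [anchor_1, positive_1, negative_1, anchor_2, positive_2, negative_2]
    features = torch.randn(6, 128)

    criterion = TripletLossPlain(margin=0.2)
    loss = criterion(features)
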
TripletLossWithMiner

class oml.losses.triplet.TripletLossWithMiner(margin: ~typing.Optional[float], miner: ~oml.interfaces.miners.ITripletsMiner = <oml.miners.inbatch_all_tri.AllTripletsMiner object>, reduction: str = 'mean', need_logs: bool = False)[source]

Bases: ITripletLossWithMiner

This class combines Miner and TripletLoss.

__init__(margin: ~typing.Optional[float], miner: ~oml.interfaces.miners.ITripletsMiner = <oml.miners.inbatch_all_tri.AllTripletsMiner object>, reduction: str = 'mean', need_logs: bool = False)[source]
Parameters
  • margin – Margin value, set None to use SoftTripletLoss

  • miner – A miner that implements the logic of picking triplets to pass to the triplet loss.

  • reduction – mean, sum or none

  • need_logs – Set True if you want to store logs

forward(features: Tensor, labels: Union[Tensor, List[int]]) → Tensor[source]
Parameters
  • features – Features with the shape [batch_size, feat]

  • labels – Labels with the size of batch_size

Returns

Loss value

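Example (a minimal sketch using the default AllTripletsMiner; the labels only need to allow at least one valid triplet inside the batch):

    import torch

    from oml.losses.triplet import TripletLossWithMiner

    features = torch.randn(4, 128)
    labels = torch.tensor([0, 0, 1, 1])  # two samples per class, so triplets can be mined

    criterion = TripletLossWithMiner(margin=0.2)  # uses AllTripletsMiner by default
    loss = criterion(features, labels)
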
SurrogatePrecision

class oml.losses.surrogate_precision.SurrogatePrecision(k: int, temperature1: float = 1.0, temperature2: float = 0.01, reduction: str = 'mean')[source]

Bases: Module

This loss is a differentiable approximation of the Precision@k metric.

The loss is described in the following paper under a slightly different name: Recall@k Surrogate Loss with Large Batches and Similarity Mixup.

The idea is that we express the formula for Precision@k using two step functions (aka Heaviside functions) and then approximate them with two temperature-scaled sigmoids. The smaller the temperature, the closer the sigmoid is to the step function, but the sparser the gradients, and vice versa. In the original paper t1 = 1.0 and t2 = 0.01 were used.

__init__(k: int, temperature1: float = 1.0, temperature2: float = 0.01, reduction: str = 'mean')[source]
Parameters
  • k – Parameter of Precision@k.

  • temperature1 – Scaling factor for the 1st sigmoid, see docs above.

  • temperature2 – Scaling factor for the 2nd sigmoid, see docs above.

  • reduction – mean, sum or none

forward(features: Tensor, labels: Tensor) → Tensor[source]
Parameters
  • features – Features with the shape of [batch_size, feature_size]

  • labels – Labels with the size of batch_size

Returns

Loss value

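Example (a minimal sketch with the default temperatures from the paper; features and labels are placeholders):

    import torch

    from oml.losses.surrogate_precision import SurrogatePrecision

    features = torch.randn(8, 128)
    labels = torch.tensor([0, 0, 1, 1, 2, 2, 3, 3])

    criterion = SurrogatePrecision(k=3, temperature1=1.0, temperature2=0.01)
    loss = criterion(features, labels)
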
ArcFaceLoss

class oml.losses.arcface.ArcFaceLoss(in_features: int, num_classes: int, m: float = 0.5, s: float = 64, smoothing_epsilon: float = 0, label2category: Optional[Dict[Any, Any]] = None, reduction: str = 'mean')[source]

Bases: Module

ArcFace loss from the paper, with the option to use label smoothing. It contains a projection of size in_features x num_classes inside itself. Please make sure that class labels start at 0 and end at num_classes - 1.

__init__(in_features: int, num_classes: int, m: float = 0.5, s: float = 64, smoothing_epsilon: float = 0, label2category: Optional[Dict[Any, Any]] = None, reduction: str = 'mean')[source]
Parameters
  • in_features – Input feature size

  • num_classes – Number of classes in train set

  • m – Margin parameter for ArcFace loss. Values of 0.3-0.5 usually work well

  • s – Scaling parameter for ArcFace loss. Values of 30-64 usually work well

  • smoothing_epsilon – Label smoothing effect strength

  • label2category – Optional, mapping from label to its category. If provided, label smoothing will redistribute smoothing_epsilon only inside the category corresponding to the sample’s ground truth label

  • reduction – CrossEntropyLoss reduction

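Example (a minimal sketch; the forward signature is not listed above, so the call below assumes the criterion takes embeddings and integer labels, which is the usual convention for such heads):

    import torch

    from oml.losses.arcface import ArcFaceLoss

    embeddings = torch.randn(8, 128)
    labels = torch.randint(0, 10, (8,))  # labels must lie in [0, num_classes)

    criterion = ArcFaceLoss(in_features=128, num_classes=10, m=0.4, s=64)
    loss = criterion(embeddings, labels)
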
ArcFaceLossWithMLP

class oml.losses.arcface.ArcFaceLossWithMLP(in_features: int, num_classes: int, mlp_features: List[int], m: float = 0.5, s: float = 64, smoothing_epsilon: float = 0, label2category: Optional[Dict[Any, Any]] = None, reduction: str = 'mean')[source]

Bases: Module

Almost the same as ArcFaceLoss, but with an MLP projector before the loss. You may want to use ArcFaceLossWithMLP to boost the expressive power of the ArcFace loss during training (for example, in a multi-head setup it may be a good idea to have task-specific projectors in each of the losses). Note that the criterion does not exist at validation time, so if you want to keep your MLP layers, you should create them as part of the model you train.

__init__(in_features: int, num_classes: int, mlp_features: List[int], m: float = 0.5, s: float = 64, smoothing_epsilon: float = 0, label2category: Optional[Dict[Any, Any]] = None, reduction: str = 'mean')[source]
Parameters
  • in_features – Input feature size

  • num_classes – Number of classes in train set

  • mlp_features – Layer sizes of the MLP before ArcFace

  • m – Margin parameter for ArcFace loss. Values of 0.3-0.5 usually work well

  • s – Scaling parameter for ArcFace loss. Values of 30-64 usually work well

  • smoothing_epsilon – Label smoothing effect strength

  • label2category – Optional, mapping from label to its category. If provided, label smoothing will redistribute smoothing_epsilon only inside the category corresponding to the sample’s ground truth label

  • reduction – CrossEntropyLoss reduction

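Example (a minimal sketch; as above, it assumes the criterion is called with embeddings and integer labels, and the MLP sizes are placeholders):

    import torch

    from oml.losses.arcface import ArcFaceLossWithMLP

    embeddings = torch.randn(8, 512)
    labels = torch.randint(0, 10, (8,))

    # The MLP presumably projects 512 -> 256 -> 128 before the ArcFace head
    criterion = ArcFaceLossWithMLP(in_features=512, num_classes=10, mlp_features=[256, 128], m=0.4, s=64)
    loss = criterion(embeddings, labels)
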
label_smoothing

oml.functional.label_smoothing.label_smoothing(y: Tensor, num_classes: int, epsilon: float = 0.2, categories: Optional[Tensor] = None) → Tensor[source]

This function applies label smoothing. You can also use a modified version, where the label is smoothed only within the category corresponding to the sample’s ground truth label. To use this, provide the categories argument: a vector whose i-th entry is the category for label i.

Parameters
  • y – Ground truth labels with the size of batch_size where each element is from 0 (inclusive) to num_classes (exclusive).

  • num_classes – Number of classes in total

  • epsilon – Strength of smoothing. The biggest value in the OHE vector will be 1 - epsilon + epsilon / num_classes after the transformation

  • categories – Vector whose i-th entry is the category for label i. Optional, used for category-based label smoothing. In that case the biggest value in the OHE vector will be 1 - epsilon + epsilon / num_classes_of_the_same_category; labels outside of the category will not change
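
Example (a minimal sketch of plain and category-based smoothing; the category mapping is illustrative):

    import torch

    from oml.functional.label_smoothing import label_smoothing

    y = torch.tensor([0, 2, 3])

    # Plain smoothing over all 4 classes
    smoothed = label_smoothing(y, num_classes=4, epsilon=0.2)

    # Category-based smoothing: labels 0, 1 belong to category 0 and labels 2, 3 to category 1,
    # so the smoothing mass is redistributed only within the true label's category
    categories = torch.tensor([0, 0, 1, 1])
    smoothed_cat = label_smoothing(y, num_classes=4, epsilon=0.2, categories=categories)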