Losses
TripletLoss
- class oml.losses.triplet.TripletLoss(margin: Optional[float], reduction: str = 'mean', need_logs: bool = False)
Bases: Module
Class that combines the classical TripletMarginLoss and SoftTripletLoss. The idea of SoftTripletLoss is the following: instead of using the classical formula
loss = relu(margin + positive_distance - negative_distance)
we use
loss = log1p(exp(positive_distance - negative_distance)).
This may help with a common problem where TripletMarginLoss converges to its margin value (also known as dimension collapse).
- __init__(margin: Optional[float], reduction: str = 'mean', need_logs: bool = False)
- Parameters
margin – Margin value; set it to None to use SoftTripletLoss.
reduction – mean, sum or none.
need_logs – Set it to True to store some tracking information in the self.last_logs property.
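The two formulas for TripletLoss can be compared on plain numbers. The following is a minimal sketch in pure Python, not the library's implementation: the function names triplet_margin_loss and soft_triplet_loss are hypothetical, and scalar distances stand in for the batched tensors that the real class works with.

```python
import math


def triplet_margin_loss(pos_dist: float, neg_dist: float, margin: float) -> float:
    """Classical formula: relu(margin + positive_distance - negative_distance)."""
    return max(0.0, margin + pos_dist - neg_dist)


def soft_triplet_loss(pos_dist: float, neg_dist: float) -> float:
    """Soft variant: log1p(exp(positive_distance - negative_distance))."""
    return math.log1p(math.exp(pos_dist - neg_dist))


# An easy triplet: the negative is much farther away than the positive.
print(triplet_margin_loss(0.2, 1.0, margin=0.5))  # the hinge is inactive here
print(soft_triplet_loss(0.2, 1.0))                # small, but still positive

# Collapsed embeddings: positive and negative distances are equal.
print(triplet_margin_loss(0.25, 0.25, margin=0.5))  # stuck at the margin value
print(soft_triplet_loss(0.25, 0.25))                # log(2), no margin involved
```

In the collapsed case the margin-based loss sits exactly at the margin, which is the situation described above. The soft variant has no margin hyperparameter and keeps a non-zero value even for easy triplets.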
TripletLossPlain
- class oml.losses.triplet.TripletLossPlain(margin: Optional[float], reduction: str = 'mean', need_logs: bool = False)
Bases: Module
The same as TripletLoss, but works with anchor, positive and negative features stacked together.
- __init__(margin: Optional[float], reduction: str = 'mean', need_logs: bool = False)
- Parameters
margin – Margin value; set it to None to use SoftTripletLoss.
reduction – mean, sum or none.
need_logs – Set it to True to store some tracking information in the self.last_logs property.
TripletLossWithMiner
- class oml.losses.triplet.TripletLossWithMiner(margin: typing.Optional[float], miner: oml.interfaces.miners.ITripletsMiner = <oml.miners.inbatch_all_tri.AllTripletsMiner object>, reduction: str = 'mean', need_logs: bool = False)
Bases: ITripletLossWithMiner
This class combines Miner and TripletLoss.
- __init__(margin: typing.Optional[float], miner: oml.interfaces.miners.ITripletsMiner = <oml.miners.inbatch_all_tri.AllTripletsMiner object>, reduction: str = 'mean', need_logs: bool = False)
- Parameters
margin – Margin value; set it to None to use SoftTripletLoss.
miner – A miner that implements the logic of picking triplets to pass to the triplet loss.
reduction – mean, sum or none.
need_logs – Set it to True to store some tracking information in the self.last_logs property.
SurrogatePrecision
- class oml.losses.surrogate_precision.SurrogatePrecision(k: int, temperature1: float = 1.0, temperature2: float = 0.01, reduction: str = 'mean')
Bases: Module
This loss is a differentiable approximation of the Precision@k metric.
The loss is described in the following paper under a slightly different name: Recall@k Surrogate Loss with Large Batches and Similarity Mixup.
The idea is to express the formula for Precision@k using two step functions (also known as Heaviside functions) and then approximate them with two sigmoid functions with temperatures. The smaller the temperature, the closer the sigmoid is to the step function, but the sparser the gradients, and vice versa. The original paper uses t1 = 1.0 and t2 = 0.01.
- __init__(k: int, temperature1: float = 1.0, temperature2: float = 0.01, reduction: str = 'mean')
- Parameters
k – Parameter of Precision@k.
temperature1 – Scaling factor for the 1st sigmoid, see the description above.
temperature2 – Scaling factor for the 2nd sigmoid, see the description above.
reduction – mean, sum or none.
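The trade-off between temperature and gradient sparsity can be illustrated in isolation. The sketch below is pure Python and does not reproduce the loss itself; it only shows a sigmoid with a temperature used as a smooth stand-in for a step function. The helper names soft_step and soft_step_grad are hypothetical, and dividing the argument by the temperature is an assumption about how the scaling is applied.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def soft_step(x: float, temperature: float) -> float:
    """Sigmoid approximation of the Heaviside step function H(x)."""
    return sigmoid(x / temperature)


def soft_step_grad(x: float, temperature: float) -> float:
    """Derivative of soft_step with respect to x."""
    s = soft_step(x, temperature)
    return s * (1.0 - s) / temperature


for t in (1.0, 0.01):
    # Value near the threshold, then the gradient far from the threshold.
    print(t, soft_step(0.1, t), soft_step_grad(1.0, t))
```

With the smaller temperature the value at x = 0.1 is already close to 1, as a true step function would give, while the gradient at x = 1.0 vanishes. With the larger temperature the approximation is coarser, but the gradient stays informative.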
ArcFaceLoss
- class oml.losses.arcface.ArcFaceLoss(in_features: int, num_classes: int, m: float = 0.5, s: float = 64, smoothing_epsilon: float = 0, label2category: Optional[Dict[Any, Any]] = None, reduction: str = 'mean')
Bases: Module
ArcFace loss from the paper, with optional label smoothing. It contains a projection of size in_features x num_classes inside itself. Please make sure that class labels start at 0 and end at num_classes - 1.
- __init__(in_features: int, num_classes: int, m: float = 0.5, s: float = 64, smoothing_epsilon: float = 0, label2category: Optional[Dict[Any, Any]] = None, reduction: str = 'mean')
- Parameters
in_features – Input feature size.
num_classes – Number of classes in the train set.
m – Margin parameter for ArcFace loss. Values of 0.3-0.5 are typical.
s – Scaling parameter for ArcFace loss. Values of 30-64 are typical.
smoothing_epsilon – Strength of the label smoothing effect.
label2category – Optional mapping from a label to its category. If provided, label smoothing will redistribute smoothing_epsilon only inside the category corresponding to the sample’s ground truth label.
reduction – CrossEntropyLoss reduction.
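To make the roles of m and s concrete, here is a minimal pure-Python sketch of the ArcFace formulation for a single sample: the margin m is added to the angle of the ground truth class, and all cosines are multiplied by s before the cross-entropy. The functions arcface_logits and cross_entropy are hypothetical helpers, and the sketch omits details a real implementation needs, such as the learned projection, feature normalization, label smoothing, and handling of angles where theta + m exceeds pi.

```python
import math
from typing import List


def arcface_logits(cosines: List[float], target: int, m: float = 0.5, s: float = 64.0) -> List[float]:
    """Scale cosines by s, adding the angular margin m to the target class only."""
    logits = []
    for j, cos_theta in enumerate(cosines):
        if j == target:
            # Clamp before acos to guard against tiny floating point overshoots.
            theta = math.acos(max(-1.0, min(1.0, cos_theta)))
            logits.append(s * math.cos(theta + m))
        else:
            logits.append(s * cos_theta)
    return logits


def cross_entropy(logits: List[float], target: int) -> float:
    """Cross-entropy for one sample, using the log-sum-exp trick for stability."""
    mx = max(logits)
    log_sum = mx + math.log(sum(math.exp(z - mx) for z in logits))
    return log_sum - logits[target]


# Cosine similarities between one embedding and three class centers.
cosines = [0.5, 0.4, 0.1]
with_margin = cross_entropy(arcface_logits(cosines, target=0, m=0.5, s=30.0), 0)
without_margin = cross_entropy(arcface_logits(cosines, target=0, m=0.0, s=30.0), 0)
print(with_margin, without_margin)
```

The margin lowers the logit of the correct class, so the same embedding yields a larger loss. This pushes the model to separate classes by a wider angle than plain softmax would require.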
ArcFaceLossWithMLP
- class oml.losses.arcface.ArcFaceLossWithMLP(in_features: int, num_classes: int, mlp_features: List[int], m: float = 0.5, s: float = 64, smoothing_epsilon: float = 0, label2category: Optional[Dict[Any, Any]] = None, reduction: str = 'mean')
Bases: Module
Almost the same as ArcFaceLoss, but with an MLP projector before the loss. You may want to use ArcFaceLossWithMLP to boost the expressive power of ArcFace loss during training (for example, in a multi-head setup it may be a good idea to have task-specific projectors in each of the losses). Note that the criterion does not exist at validation time. Thus, if you want to keep your MLP layers, you should create them as a part of the model you train.
- __init__(in_features: int, num_classes: int, mlp_features: List[int], m: float = 0.5, s: float = 64, smoothing_epsilon: float = 0, label2category: Optional[Dict[Any, Any]] = None, reduction: str = 'mean')
- Parameters
in_features – Input feature size.
num_classes – Number of classes in the train set.
mlp_features – Layer sizes for the MLP before ArcFace.
m – Margin parameter for ArcFace loss. Values of 0.3-0.5 are typical.
s – Scaling parameter for ArcFace loss. Values of 30-64 are typical.
smoothing_epsilon – Strength of the label smoothing effect.
label2category – Optional mapping from a label to its category. If provided, label smoothing will redistribute smoothing_epsilon only inside the category corresponding to the sample’s ground truth label.
reduction – CrossEntropyLoss reduction.
label_smoothing
- oml.functional.label_smoothing.label_smoothing(y: Tensor, num_classes: int, epsilon: float = 0.2, categories: Optional[Tensor] = None) -> Tensor
This function performs label smoothing. You can also use a modified version, in which the label is smoothed only within the category corresponding to the sample’s ground truth label. To use it, provide the categories argument: a vector whose i-th entry is the category of label i.
- Parameters
y – Ground truth labels of size batch_size, where each element is from 0 (inclusive) to num_classes (exclusive).
num_classes – Total number of classes.
epsilon – Smoothing strength. The biggest value in the OHE vector will be 1 - epsilon + 1 / num_classes after the transformation.
categories – Vector whose i-th entry is the category of label i. Optional, used for category-based label smoothing. In that case the biggest value in the OHE vector will be 1 - epsilon + 1 / num_classes_of_the_same_category, and labels outside of the category will not change.
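The difference between plain and category-based smoothing can be sketched for a single label in pure Python. This is an illustration rather than the library's code: smooth_label is a hypothetical helper, and it assumes the common formulation in which the mass epsilon is spread uniformly over the eligible classes, so the exact values it produces may differ from those of label_smoothing.

```python
from typing import List, Optional


def smooth_label(
    y: int,
    num_classes: int,
    epsilon: float = 0.2,
    categories: Optional[List[int]] = None,
) -> List[float]:
    """Return a smoothed one-hot vector for a single label y.

    Assumption: epsilon is shared equally among the eligible classes, which
    are all classes by default or only the classes in y's category.
    """
    if categories is None:
        eligible = list(range(num_classes))
    else:
        eligible = [i for i in range(num_classes) if categories[i] == categories[y]]

    target = [0.0] * num_classes
    target[y] = 1.0 - epsilon
    for i in eligible:
        target[i] += epsilon / len(eligible)
    return target


# Plain smoothing: every class receives a share of epsilon.
print(smooth_label(1, num_classes=4, epsilon=0.2))

# Category-based smoothing: labels 0 and 1 share a category, 2 and 3 another.
print(smooth_label(1, num_classes=4, epsilon=0.2, categories=[0, 0, 1, 1]))
```

In the category-based call, the entries for labels 2 and 3 stay at zero because they belong to a different category than the ground truth label.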