Metrics

EmbeddingMetrics
calc_retrieval_metrics
calc_topological_metrics
calc_cmc
calc_precision
calc_map
calc_fnmr_at_fmr
calc_pcf

EmbeddingMetrics 

class oml.metrics.embeddings.EmbeddingMetrics(embeddings_key: str = 'embeddings', labels_key: str = 'labels', is_query_key: str = 'is_query', is_gallery_key: str = 'is_gallery', extra_keys: Tuple[str, ...] = (), cmc_top_k: Tuple[int, ...] = (5,), precision_top_k: Tuple[int, ...] = (5,), map_top_k: Tuple[int, ...] = (5,), fmr_vals: Tuple[float, ...] = (), pcf_variance: Tuple[float, ...] = (0.5,), categories_key: Optional[str] = None, sequence_key: Optional[str] = None, postprocessor: Optional[IDistancesPostprocessor] = None, metrics_to_exclude_from_visualization: Iterable[str] = (), return_only_overall_category: bool = False, visualize_only_overall_category: bool = True, verbose: bool = True)[source]

Bases: IMetricVisualisable

This class accumulates the information from the batches and embeddings produced by the model at every batch in epoch. After all the samples have been stored, you can call the function which computes retrievals metrics. To get the needed information from the batches, it uses keys which have to be provided as init arguments. Please, check the usage example in Readme.

__init__(embeddings_key: str = 'embeddings', labels_key: str = 'labels', is_query_key: str = 'is_query', is_gallery_key: str = 'is_gallery', extra_keys: Tuple[str, ...] = (), cmc_top_k: Tuple[int, ...] = (5,), precision_top_k: Tuple[int, ...] = (5,), map_top_k: Tuple[int, ...] = (5,), fmr_vals: Tuple[float, ...] = (), pcf_variance: Tuple[float, ...] = (0.5,), categories_key: Optional[str] = None, sequence_key: Optional[str] = None, postprocessor: Optional[IDistancesPostprocessor] = None, metrics_to_exclude_from_visualization: Iterable[str] = (), return_only_overall_category: bool = False, visualize_only_overall_category: bool = True, verbose: bool = True)[source]

Parameters

embeddings_key – Key to take the embeddings from the batches
labels_key – Key to take the labels from the batches
is_query_key – Key to take the information whether every batch sample belongs to the query
is_gallery_key – Key to take the information whether every batch sample belongs to the gallery
extra_keys – Keys to accumulate some additional information from the batches
cmc_top_k – Values of k to calculate cmc@k (Cumulative Matching Characteristic)
precision_top_k – Values of k to calculate precision@k
map_top_k – Values of k to calculate map@k (Mean Average Precision)
fmr_vals – Values of fmr (measured in quantiles) to calculate fnmr@fmr (False Non Match Rate at the given False Match Rate). For example, if fmr_values is (0.2, 0.4) we will calculate fnmr@fmr=0.2 and fnmr@fmr=0.4. Note, computing this metric requires additional memory overhead, that is why it’s turned off by default.
pcf_variance – Values in range [0, 1]. Find the number of components such that the amount of variance that needs to be explained is greater than the percentage specified by pcf_variance.
categories_key – Key to take the samples’ categories from the batches (if you have ones)
sequence_key – Key to take sequence ids from the batches (if you have ones)
postprocessor – Postprocessor which applies some techniques like query reranking
metrics_to_exclude_from_visualization – Names of the metrics to exclude from the visualization. It will not affect calculations.
return_only_overall_category – Set True if you want to return only the aggregated metrics
visualize_only_overall_category – Set False if you want to visualize each category separately
verbose – Set True if you want to print metrics

setup(num_samples: int) → None[source]: Method for preparing metrics to work: memory allocation, placeholder preparation, etc. Has to be called before the first call of self.update_data().

update_data(data_dict: Dict[str, Any], indices: Optional[List[int]] = None) → None[source]

Parameters

data_dict – Batch of data containing records of the same size: bs.
indices – Global indices of the elements in your records within the range of (0, dataset_size - 1). Indices are needed in DDP (because data is gathered shuffled, additionally you may also get some duplicates due to padding). In the single device regime it’s may be useful if you accumulate data in shuffled order.

compute_metrics() → Dict[Union[str, int], Dict[str, Dict[Union[int, float], Union[float, Tensor]]]][source]

The output must be in the following format:

{
    "self.overall_categories_key": {"metric1": ..., "metric2": ...},
    "category1": {"metric1": ..., "metric2": ...},
    "category2": {"metric1": ..., "metric2": ...}
}

Where category1 and category2 are optional.

get_plot_for_queries(query_ids: List[int], n_instances: int, verbose: bool = True) → Figure[source]

Visualize the predictions for the query with the indicies <query_ids>.

Parameters

query_ids – Index of the query
n_instances – Amount of the predictions to show
verbose – wether to show image paths or not

get_worst_queries_ids(metric_name: str, n_queries: int) → List[int][source]

get_plot_for_worst_queries(metric_name: str, n_queries: int, n_instances: int, verbose: bool = False) → Figure[source]

visualize() → Tuple[Collection[Figure], Collection[str]][source]: Visualize worst queries by all the available metrics.

calc_retrieval_metrics 

oml.functional.metrics.calc_retrieval_metrics(retrieved_ids: LongTensor, gt_ids: List[LongTensor], cmc_top_k: Tuple[int, ...] = (5,), precision_top_k: Tuple[int, ...] = (5,), map_top_k: Tuple[int, ...] = (5,), reduce: bool = True) → Dict[str, Dict[Union[int, float], Union[float, Tensor]]][source]

Function to count different retrieval metrics.

Parameters

retrieved_ids – Top N gallery ids retrieved for every query with the shape of [n_query, top_n]. Every element is within the range (0, n_gallery - 1).
gt_ids – Gallery ids relevant to every query, list of n_query elements where every element may have an arbitrary length. Every element is within the range (0, n_gallery - 1)
cmc_top_k – Values of k to calculate cmc@k (Cumulative Matching Characteristic)
precision_top_k – Values of k to calculate precision@k
map_top_k – Values of k to calculate map@k (Mean Average Precision)
reduce – If False return metrics for each query without averaging

Returns

Metrics dictionary.

calc_topological_metrics 

oml.functional.metrics.calc_topological_metrics(embeddings: Tensor, pcf_variance: Tuple[float, ...]) → Dict[str, Dict[Union[int, float], Union[float, Tensor]]][source]

Function to evaluate different topological metrics.

Parameters

embeddings – Embeddings matrix with the shape of [n_embeddings, embeddings_dim].
pcf_variance – Values in range [0, 1]. Find the number of components such that the amount of variance that needs to be explained is greater than the percentage specified by pcf_variance.

Returns

Metrics dictionary.

calc_cmc 

oml.functional.metrics.calc_cmc(gt_tops: Tensor, top_k: Tuple[int, ...]) → List[Tensor][source]

Function to compute Cumulative Matching Characteristics (CMC) at cutoffs top_k.

cmc@k for a given query equals to 1 if there is at least 1 instance related to this query in top k gallery instances sorted by distances to the query, and 0 otherwise. The final cmc@k could be obtained by averaging the results calculated for each query.

Parameters

gt_tops – Matrix where the (i, j) element indicates if j-th gallery sample is related to i-th query or not. Obtained from the full ground truth matrix by taking max(top_k) elements with the smallest distances to the corresponding queries.
top_k – Values of k to calculate cmc@k.

Returns

List of cmc@k tensors.

\[\begin{split}\textrm{cmc}@k = \begin{cases} 1, & \textrm{if top-}k \textrm{ ranked gallery samples include an output relevant to the query}, \\ 0, & \textrm{otherwise}. \end{cases}\end{split}\]

Example

>>> gt_tops = torch.tensor([
...                         [1, 0],
...                         [0, 1],
...                         [0, 0]
... ], dtype=torch.bool)
>>> calc_cmc(gt_tops, top_k=(1, 2))
[tensor([1., 0., 0.]), tensor([1., 1., 0.])]

calc_precision 

oml.functional.metrics.calc_precision(gt_tops: Tensor, n_gt: Tensor, top_k: Tuple[int, ...]) → List[Tensor][source]

Function to compute Precision at cutoffs top_k.

precision@k for a given query is a fraction of the relevant gallery instances among the top k instances sorted by distances from the query to the gallery. The final precision@k could be obtained by averaging the results calculated for each query.

Parameters

gt_tops – Matrix where the (i, j) element indicates if j-th gallery sample is related to i-th query or not. Obtained from the full ground truth matrix by taking max(top_k) elements with the smallest distances to the corresponding queries.
n_gt – Array where the i-th element is the total number of elements in the gallery relevant to i-th query.
top_k – Values of k to calculate precision@k.

Returns

List of precision@k tensors.

Given a list \(g=[g_1, \ldots, g_k]\) of ground truth top \(k\) closest elements from the gallery to a given query (\(g_i\) is 1 if \(i\)-th element from the gallery is relevant to the query and 0 otherwise), and the total number of relevant elements from the gallery \(n\), the \(\textrm{precision}@k\) for the query is defined as

\[\textrm{precision}@k = \frac{1}{\min{\left(k, n\right)}}\sum\limits_{i = 1}^k g_i\]

It’s worth mentioning that OML version of \(\textrm{precision}@k\) differs from the commonly used by the denominator of the fraction. The OML version takes into account the total amount of relevant elements in the gallery, so it will not penalize the ideal model if \(n < k\).

For instance, let \(n = 3\) and \(g = [1, 1, 1, 0, 0]\). Then by using the common definition of \(\textrm{precision}@k\) we get

\[\begin{split}\begin{align} \textrm{precision}@1 &= \frac{1}{1}, \textrm{precision}@2 = \frac{2}{2}, \textrm{precision}@3 = \frac{3}{3}, \\ \textrm{precision}@4 &= \frac{3}{4}, \textrm{precision}@5 = \frac{3}{5}, \textrm{precision}@6 = \frac{3}{6} \\ \end{align}\end{split}\]

But with OML definition of \(\textrm{precision}@k\) we get

\[\begin{split}\begin{align} \textrm{precision}@1 &= \frac{1}{1}, \textrm{precision}@2 = \frac{2}{2}, \textrm{precision}@3 = \frac{3}{3} \\ \textrm{precision}@4 &= \frac{3}{3}, \textrm{precision}@5 = \frac{3}{3}, \textrm{precision}@6 = \frac{3}{3} \\ \end{align}\end{split}\]

See:: Evaluation measures (information retrieval). Precision@k

Example

>>> gt_tops = torch.tensor([
...                         [1, 0],
...                         [0, 1],
...                         [0, 0]
... ], dtype=torch.bool)
>>> n_gt = torch.tensor([2, 3, 5])
>>> calc_precision(gt_tops, n_gt, top_k=(1, 2))
[tensor([1., 0., 0.]), tensor([0.5000, 0.5000, 0.0000])]

calc_map 

oml.functional.metrics.calc_map(gt_tops: Tensor, n_gt: Tensor, top_k: Tuple[int, ...]) → List[Tensor][source]

Function to compute Mean Average Precision (MAP) at cutoffs top_k.

map@k for a given query is the average value of the precision considered as a function of the recall. The final map@k could be obtained by averaging the results calculated for each query.

Parameters

gt_tops – Matrix where the (i, j) element indicates if j-th gallery sample is related to i-th query or not. Obtained from the full ground truth matrix by taking max(top_k) elements with the smallest distances to the corresponding queries.
n_gt – Array where the i-th element is the total number of elements in the gallery relevant to i-th query.
top_k – Values of k to calculate map@k.

Returns

List of map@k tensors.

Given a list \(g=[g_1, \ldots, g_k]\) of ground truth top \(k\) closest elements from the gallery to a given query (\(g_i\) is 1 if \(i\)-th element from the gallery is relevant to the query and 0 otherwise), and the total number of relevant elements from the gallery \(n\), the \(\textrm{map}@k\) for the query is defined as

\[\begin{split} \textrm{map}@k &= \frac{1}{n_k}\sum\limits_{i = 1}^k \frac{n_i}{i} \times \textrm{rel}(i) \end{split}\]

where \(\textrm{rel}(i)\) is 1 if \(i\)-th element from the top \(i\) closest elements from the gallery to the query is relevant to the query, and 0 otherwise; and \(n_i = \sum\limits_{j = 1}^{i}g_j\), which is the number of the relevant predictions among the first \(i\) outputs.

See:

Evaluation measures (information retrieval). Mean Average Precision

Mean Average Precision (MAP) For Recommender Systems

Example

>>> gt_tops = torch.tensor([
...                         [1, 0],
...                         [0, 1],
...                         [0, 0]
... ], dtype=torch.bool)
>>> n_gt = torch.tensor([2, 3, 5])
>>> calc_map(gt_tops, n_gt, top_k=(1, 2))
[tensor([1., 0., 0.]), tensor([1.0000, 0.5000, 0.0000])]

calc_fnmr_at_fmr 

oml.functional.metrics.calc_fnmr_at_fmr(pos_dist: Tensor, neg_dist: Tensor, fmr_vals: Tuple[float, ...] = (0.1,)) → Tensor[source]

Function to compute False Non Match Rate (FNMR) value when False Match Rate (FMR) value is equal to fmr_vals.

The metric calculates the quantile of positive distances higher than a given \(q\)-th quantile of negative distances.

Parameters

pos_dist – Distances between relevant samples.
neg_dist – Distances between non-relevant samples.
fmr_vals – Values of fmr (measured in quantiles) to compute the corresponding fnmr. For example, if fmr_values is (0.2, 0.4) we will calculate fnmr@fmr=0.2 and fnmr@fmr=0.4

Returns

Tensor of fnmr@fmr values.

Given a vector of \(N\) distances between relevant samples, \(u\), the false non-match rate (\(\textrm{FNMR}\)) is computed as the proportion of \(u\) below some threshold, \(T\):

\[\textrm{FNMR}(T) = \frac{1}{N}\sum\limits_{i = 1}^{N}H\left(u_i - T\right) = 1 - \frac{1}{N}\sum\limits_{i = 1}^{N}H\left(T - u_i\right)\]

where \(H(x)\) is the unit step function, and \(H(0)\) taken to be \(1\).

Similarly, given a vector of \(N\) distances between non-relevant samples, \(v\), the false match rate (\(\textrm{FMR}\)) is computed as the proportion of \(v\) above some threshold, \(T\):

\[\textrm{FMR}(T) = 1 - \frac{1}{N}\sum\limits_{i = 1}^{N}H\left(v_i - T\right) = \frac{1}{N}\sum\limits_{i = 1}^{N}H\left(T - v_i\right)\]

Given some interesting false match rate values \(\textrm{FMR}_k\) one can find thresholds \(T_k\) corresponding to \(\textrm{FMR}\) measurements

\[T_k = Q_v\left(\textrm{FMR}_k\right)\]

where \(Q\) is the quantile function, and evaluate the corresponding values of \(\textrm{FNMR}@\textrm{FMR}\left(T_k\right) \stackrel{\text{def}}{=} \textrm{FNMR}\left(T_k\right)\).

See:

Biometrics Performance

BIOMETRIC RECOGNITION: A MODERN ERA FOR SECURITY

Example

>>> pos_dist = torch.tensor([0, 0, 1, 1, 2, 2, 5, 5, 9, 9])
>>> neg_dist = torch.tensor([3, 3, 4, 4, 6, 6, 7, 7, 8, 8])
>>> calc_fnmr_at_fmr(pos_dist, neg_dist, fmr_vals=(0.1, 0.5))
tensor([0.4000, 0.2000])

calc_pcf 

oml.functional.metrics.calc_pcf(embeddings: Tensor, pcf_variance: Tuple[float, ...]) → List[Tensor][source]

Function estimates the Principal Components Fraction (PCF) of embeddings using Principal Component Analysis. The metric is defined as a fraction of components needed to explain the required variance in data.

Parameters

embeddings – Embeddings matrix with the shape of [n_embeddings, embeddings_dim].
pcf_variance – Values in range [0, 1]. Find the number of components such that the amount of variance that needs to be explained is greater than the fraction specified by pcf_variance.

Returns

List of linear dimensions as a fractions of the embeddings dimension.

Let \(X\) be a set of \(d\) dimensional embeddings. Let \(\lambda_1, \ldots, \lambda_d\in\mathbb{R}\) be a set of eigenvalues of the covariance matrix of \(X\) sorted in descending order. Then for a given value of desired explained variance \(r\), the number of principal components that explaines \(r\cdot 100\%%\) variance is the largest integer \(n\) such that

\[\frac{\sum\limits_{i = 1}^{n - 1}\lambda_i}{\sum\limits_{i = 1}^{d}\lambda_i} \leq r\]

The function returns

\[\frac{n}{d}\]

See:

Principal Components Analysis

Example

In the example bellow there are 4 vectors of length 10, and only first 4 dimensions have non-zero values. Its covariance matrix will have only 4 eigenvalues that are greater than 0, i.e. there are only 4 principal axes. So, in order to keep at least 50% of the information from the set, we need to keep 2 principal axes, and in order to keep all the information we need to keep 5 principal axes (one additional axis appears because the number of principal axes is superior to the desired explained variance threshold).

>>> embeddings = torch.eye(4, 10, dtype=torch.float)
>>> calc_pcf(embeddings, pcf_variance=(0.5, 1))
tensor([0.2000, 0.5000])