Metrics
calc_retrieval_metrics
- oml.functional.metrics.calc_retrieval_metrics(retrieved_ids: Sequence[LongTensor], gt_ids: Sequence[LongTensor], query_categories: Optional[Union[LongTensor, ndarray]] = None, cmc_top_k: Tuple[int, ...] = (5,), precision_top_k: Tuple[int, ...] = (5,), map_top_k: Tuple[int, ...] = (5,), reduce: bool = True, verbose: bool = True) → Dict[str, Any] [source]
Function to compute different retrieval metrics.
- Parameters
retrieved_ids – First gallery indices retrieved for every query; the sequence has the size of n_query. Every index is within the range of (0, n_gallery - 1).
gt_ids – Gallery indices relevant to every query; the sequence has the size of n_query. Every element is within the range of (0, n_gallery - 1).
query_categories – Categories of queries with the size of n_query to compute metrics for each category.
cmc_top_k – Values of k to calculate cmc@k (Cumulative Matching Characteristic).
precision_top_k – Values of k to calculate precision@k.
map_top_k – Values of k to calculate map@k (Mean Average Precision).
reduce – If False, return metrics for each query without averaging.
verbose – Set True to make the function verbose.
- Returns
Metrics dictionary.
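A minimal usage sketch with toy data (the retrieved and ground-truth indices below are made up for illustration, and the exact layout of the returned dictionary may vary between OML versions):

from torch import LongTensor
from oml.functional.metrics import calc_retrieval_metrics

# Two queries over a gallery of 4 items; values are gallery indices.
retrieved_ids = [LongTensor([0, 2, 1]), LongTensor([3, 1])]  # ranked retrievals per query
gt_ids = [LongTensor([2]), LongTensor([0, 3])]               # relevant gallery items per query

metrics = calc_retrieval_metrics(
    retrieved_ids=retrieved_ids,
    gt_ids=gt_ids,
    cmc_top_k=(1, 2),
    precision_top_k=(2,),
    map_top_k=(2,),
    verbose=False,
)
print(metrics)  # metric values keyed by metric name and cutoff k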
calc_retrieval_metrics_rr
- oml.metrics.embeddings.calc_retrieval_metrics_rr(rr: RetrievalResults, query_categories: Optional[Union[LongTensor, ndarray]] = None, cmc_top_k: Tuple[int, ...] = (5,), precision_top_k: Tuple[int, ...] = (5,), map_top_k: Tuple[int, ...] = (5,), reduce: bool = True, verbose: bool = True) → Dict[str, Any] [source]
Function to compute different retrieval metrics.
- Parameters
rr – An instance of RetrievalResults.
query_categories – Categories of queries with the size of n_query to compute metrics for each category.
cmc_top_k – Values of k to calculate cmc@k (Cumulative Matching Characteristic).
precision_top_k – Values of k to calculate precision@k.
map_top_k – Values of k to calculate map@k (Mean Average Precision).
reduce – If False, return metrics for each query without averaging.
verbose – Set True to make the function verbose.
- Returns
Metrics dictionary.
calc_cmc
- oml.functional.metrics.calc_cmc(gt_tops: Sequence[BoolTensor], n_gts: List[int], top_k: Tuple[int, ...], verbose: bool = False) → List[FloatTensor] [source]
Function to compute Cumulative Matching Characteristics (CMC) at cutoffs top_k.
cmc@k for a given query equals 1 if there is at least one instance related to this query among the top k gallery instances sorted by distance to the query, and 0 otherwise. The final cmc@k is obtained by averaging the results calculated for each query.
- Parameters
gt_tops – Indicators that show whether retrieved items are correct or not: gt_tops[i][j] is True if the j-th gallery item is related to the i-th query item.
n_gts – Number of existing ground truths for every query.
top_k – Values of k to calculate cmc@k.
verbose – Set True to see a progress bar.
- Returns
List of cmc@k tensors computed for every query.
\[\begin{split}\textrm{cmc}@k = \begin{cases} 1, & \textrm{if top-}k \textrm{ ranked gallery samples include an output relevant to the query}, \\ 0, & \textrm{otherwise}. \end{cases}\end{split}\]
Example
>>> gt_tops = [
...     BoolTensor([1, 0]),
...     BoolTensor([0, 1, 1]),
...     BoolTensor([0, 0]),
...     BoolTensor([])
... ]
>>> n_gts = [2, 2, 1, 0]
>>> calc_cmc(gt_tops, n_gts, top_k=(1, 2))
[tensor([1., 0., 0., 1.]), tensor([1., 1., 0., 1.])]
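The rule above is easy to transcribe by hand. The sketch below is not OML's implementation, just a direct reading of the definition; it reproduces the doctest, including the convention that a query without ground truths scores 1:

import torch
from torch import BoolTensor

def cmc_at_k(gt_tops, n_gts, k):
    # cmc@k per query: 1 if any of the first k retrieved items is relevant;
    # queries without ground truths score 1, matching the doctest above.
    values = [1.0 if n_gt == 0 else float(tops[:k].any()) for tops, n_gt in zip(gt_tops, n_gts)]
    return torch.tensor(values)

gt_tops = [BoolTensor([1, 0]), BoolTensor([0, 1, 1]), BoolTensor([0, 0]), BoolTensor([])]
print(cmc_at_k(gt_tops, n_gts=[2, 2, 1, 0], k=1))  # tensor([1., 0., 0., 1.])
print(cmc_at_k(gt_tops, n_gts=[2, 2, 1, 0], k=2))  # tensor([1., 1., 0., 1.])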
calc_precision
- oml.functional.metrics.calc_precision(gt_tops: Sequence[BoolTensor], n_gts: List[int], top_k: Tuple[int, ...], verbose: bool = False) → List[FloatTensor] [source]
Function to compute Precision at cutoffs top_k.
precision@k for a given query is the fraction of relevant gallery instances among the top k instances sorted by distance from the query. The final precision@k is obtained by averaging the results calculated for each query.
- Parameters
gt_tops – Indicators that show whether retrieved items are correct or not: gt_tops[i][j] is True if the j-th gallery item is related to the i-th query item.
n_gts – Number of existing ground truths for every query.
top_k – Values of k to calculate precision@k.
verbose – Set True to see a progress bar.
- Returns
List of precision@k tensors computed for every query.
Given a list \(g=[g_1, \ldots, g_k]\) of ground truth top \(k\) closest elements from the gallery to a given query (\(g_i\) is 1 if \(i\)-th element from the gallery is relevant to the query and 0 otherwise), and the total number of relevant elements from the gallery \(n\), the \(\textrm{precision}@k\) for the query is defined as
\[\textrm{precision}@k = \frac{1}{\min{\left(k, n\right)}}\sum\limits_{i = 1}^k g_i\]
It's worth mentioning that the OML version of \(\textrm{precision}@k\) differs from the commonly used one in the denominator of the fraction. The OML version takes into account the total number of relevant elements in the gallery, so it will not penalize an ideal model when \(n < k\).
For instance, let \(n = 3\) and \(g = [1, 1, 1, 0, 0, 0]\). Then by using the common definition of \(\textrm{precision}@k\) we get
\[\begin{split}\begin{align} \textrm{precision}@1 &= \frac{1}{1}, \textrm{precision}@2 = \frac{2}{2}, \textrm{precision}@3 = \frac{3}{3}, \\ \textrm{precision}@4 &= \frac{3}{4}, \textrm{precision}@5 = \frac{3}{5}, \textrm{precision}@6 = \frac{3}{6} \\ \end{align}\end{split}\]
But with the OML definition of \(\textrm{precision}@k\) we get
\[\begin{split}\begin{align} \textrm{precision}@1 &= \frac{1}{1}, \textrm{precision}@2 = \frac{2}{2}, \textrm{precision}@3 = \frac{3}{3} \\ \textrm{precision}@4 &= \frac{3}{3}, \textrm{precision}@5 = \frac{3}{3}, \textrm{precision}@6 = \frac{3}{3} \\ \end{align}\end{split}\]
A short numerical sketch of this difference follows the example below.
Example
>>> gt_tops = [
...     BoolTensor([1, 0]),
...     BoolTensor([0, 1, 1]),
...     BoolTensor([0, 0]),
...     BoolTensor([])
... ]
>>> n_gts = [2, 3, 5, 2]
>>> calc_precision(gt_tops, n_gts, top_k=(1, 2))
[tensor([1., 0., 0., 0.]), tensor([0.5000, 0.5000, 0.0000, 0.0000])]
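To make the denominator difference concrete, here is a small sketch (not OML code) that evaluates both definitions on the worked example above:

# Worked example from above: n = 3 relevant items, g padded to length 6.
g = [1, 1, 1, 0, 0, 0]
n = 3

for k in range(1, 7):
    hits = sum(g[:k])
    common = hits / k        # common definition: divide by k
    oml = hits / min(k, n)   # OML definition: divide by min(k, n)
    print(f"precision@{k}: common = {common:.2f}, oml = {oml:.2f}")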
calc_map
- oml.functional.metrics.calc_map(gt_tops: Sequence[BoolTensor], n_gts: List[int], top_k: Tuple[int, ...], verbose: bool = False) → List[FloatTensor] [source]
Function to compute Mean Average Precision (MAP) at cutoffs top_k.
map@k for a given query is the average value of precision considered as a function of recall. The final map@k is obtained by averaging the results calculated for each query.
- Parameters
gt_tops – Indicators that show whether retrieved items are correct or not: gt_tops[i][j] is True if the j-th gallery item is related to the i-th query item.
n_gts – Number of existing ground truths for every query.
top_k – Values of k to calculate map@k.
verbose – Set True to see a progress bar.
- Returns
List of map@k tensors computed for every query.
Given a list \(g=[g_1, \ldots, g_k]\) of ground truth top \(k\) closest elements from the gallery to a given query (\(g_i\) is 1 if \(i\)-th element from the gallery is relevant to the query and 0 otherwise), and the total number of relevant elements from the gallery \(n\), the \(\textrm{map}@k\) for the query is defined as
\[\begin{split} \textrm{map}@k &= \frac{1}{n_k}\sum\limits_{i = 1}^k \frac{n_i}{i} \times \textrm{rel}(i) \end{split}\]
where \(\textrm{rel}(i)\) is 1 if the \(i\)-th closest element from the gallery is relevant to the query and 0 otherwise, and \(n_i = \sum\limits_{j = 1}^{i}g_j\) is the number of relevant predictions among the first \(i\) outputs. A hand-rolled sketch of this formula follows the example below.
Example
>>> gt_tops = [
...     BoolTensor([1, 0]),
...     BoolTensor([0, 1]),
...     BoolTensor([0, 0, 0, 0]),
...     BoolTensor([])
... ]
>>> n_gts = [1, 1, 2, 0]
>>> calc_map(gt_tops, n_gts, top_k=(1, 2))
[tensor([1., 0., 0., 1.]), tensor([1.0000, 0.5000, 0.0000, 1.0000])]
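As with cmc@k, the formula above can be transcribed directly. This sketch is not OML's implementation, but it reproduces the doctest, including the edge cases of a query with no ground truths (scores 1) and a query with no hits in the top k (scores 0):

import torch
from torch import BoolTensor

def map_at_k(gt_tops, n_gts, k):
    values = []
    for tops, n_gt in zip(gt_tops, n_gts):
        if n_gt == 0:
            values.append(1.0)  # no ground truths: score 1, as in the doctest
            continue
        tops_k = tops[:k].float()
        n_i = torch.cumsum(tops_k, dim=0)  # relevant hits among the first i items
        if len(n_i) == 0 or n_i[-1] == 0:
            values.append(0.0)  # no relevant items retrieved in the top k
            continue
        ranks = torch.arange(1, len(tops_k) + 1, dtype=torch.float)
        values.append(((n_i / ranks * tops_k).sum() / n_i[-1]).item())
    return torch.tensor(values)

gt_tops = [BoolTensor([1, 0]), BoolTensor([0, 1]), BoolTensor([0, 0, 0, 0]), BoolTensor([])]
print(map_at_k(gt_tops, n_gts=[1, 1, 2, 0], k=2))  # tensor([1.0000, 0.5000, 0.0000, 1.0000])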
calc_fnmr_at_fmr
- oml.functional.metrics.calc_fnmr_at_fmr(pos_dist: ndarray, neg_dist: ndarray, fmr_vals: Tuple[float, ...] = (0.1,)) → FloatTensor [source]
Function to compute the False Non-Match Rate (FNMR) at the points where the False Match Rate (FMR) equals fmr_vals.
The metric calculates the proportion of positive distances lying above a given \(q\)-th quantile of the negative distances.
- Parameters
pos_dist – Distances between relevant samples.
neg_dist – Distances between non-relevant samples.
fmr_vals – Values of fmr (measured in quantiles) to compute the corresponding fnmr. For example, if fmr_vals is (0.2, 0.4), we will calculate fnmr@fmr=0.2 and fnmr@fmr=0.4.
- Returns
Tensor of fnmr@fmr values.
Given a vector of \(N\) distances between relevant samples, \(u\), the false non-match rate (\(\textrm{FNMR}\)) is computed as the proportion of \(u\) below some threshold, \(T\):
\[\textrm{FNMR}(T) = \frac{1}{N}\sum\limits_{i = 1}^{N}H\left(u_i - T\right) = 1 - \frac{1}{N}\sum\limits_{i = 1}^{N}H\left(T - u_i\right)\]
where \(H(x)\) is the unit step function, with \(H(0)\) taken to be \(1\).
Similarly, given a vector of \(N\) distances between non-relevant samples, \(v\), the false match rate (\(\textrm{FMR}\)) is computed as the proportion of \(v\) above some threshold, \(T\):
\[\textrm{FMR}(T) = 1 - \frac{1}{N}\sum\limits_{i = 1}^{N}H\left(v_i - T\right) = \frac{1}{N}\sum\limits_{i = 1}^{N}H\left(T - v_i\right)\]
Given some false match rate values of interest \(\textrm{FMR}_k\), one can find the thresholds \(T_k\) corresponding to those \(\textrm{FMR}\) measurements
\[T_k = Q_v\left(\textrm{FMR}_k\right)\]
where \(Q\) is the quantile function, and evaluate the corresponding values of \(\textrm{FNMR}@\textrm{FMR}\left(T_k\right) \stackrel{\text{def}}{=} \textrm{FNMR}\left(T_k\right)\).
Example
>>> pos_dist = np.array([0, 0, 1, 1, 2, 2, 5, 5, 9, 9])
>>> neg_dist = np.array([3, 3, 4, 4, 6, 6, 7, 7, 8, 8])
>>> calc_fnmr_at_fmr(pos_dist, neg_dist, fmr_vals=(0.1, 0.5))
tensor([0.4000, 0.2000])
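The quantile construction above maps directly onto numpy. The sketch below is not OML's implementation (tie handling at the threshold may differ slightly), but it reproduces the doctest:

import numpy as np

def fnmr_at_fmr(pos_dist, neg_dist, fmr_vals):
    # Thresholds T_k: the q-th quantiles of the negative (non-relevant) distances.
    thresholds = np.quantile(neg_dist, fmr_vals)
    # FNMR(T_k): the fraction of positive distances at or above each threshold.
    return np.array([(pos_dist >= t).mean() for t in thresholds])

pos_dist = np.array([0, 0, 1, 1, 2, 2, 5, 5, 9, 9])
neg_dist = np.array([3, 3, 4, 4, 6, 6, 7, 7, 8, 8])
print(fnmr_at_fmr(pos_dist, neg_dist, fmr_vals=(0.1, 0.5)))  # [0.4 0.2]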
calc_fnmr_at_fmr_rr
calc_topological_metrics
- oml.functional.metrics.calc_topological_metrics(embeddings: Tensor, pcf_variance: Tuple[float, ...], categories: Optional[Union[LongTensor, ndarray]] = None, verbose: bool = False) → Dict[str, Any] [source]
Function to evaluate different topological metrics.
- Parameters
embeddings – Embeddings matrix with the shape of [n_embeddings, embeddings_dim].
categories – Categories of embeddings to compute category-wise metrics.
pcf_variance – Values in range [0, 1]. Find the number of components such that the amount of variance that needs to be explained is greater than the fraction specified by pcf_variance.
verbose – Set True to see a progress bar.
- Returns
Metrics dictionary.
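A minimal usage sketch with random toy embeddings (the exact keys of the returned dictionary may vary between OML versions):

import torch
from oml.functional.metrics import calc_topological_metrics

embeddings = torch.randn(128, 16)  # toy matrix: 128 embeddings of dimension 16
metrics = calc_topological_metrics(embeddings, pcf_variance=(0.5, 0.9))
print(metrics)  # pcf values for each requested variance level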
calc_pcf
- oml.functional.metrics.calc_pcf(embeddings: Tensor, pcf_variance: Tuple[float, ...]) → List[Tensor] [source]
The function estimates the Principal Components Fraction (PCF) of embeddings using Principal Component Analysis. The metric is defined as the fraction of components needed to explain the required variance in the data.
- Parameters
embeddings – Embeddings matrix with the shape of [n_embeddings, embeddings_dim].
pcf_variance – Values in range [0, 1]. Find the number of components such that the amount of variance that needs to be explained is greater than the fraction specified by pcf_variance.
- Returns
List of linear dimensions as fractions of the embedding dimension.
Let \(X\) be a set of \(d\)-dimensional embeddings. Let \(\lambda_1, \ldots, \lambda_d\in\mathbb{R}\) be the eigenvalues of the covariance matrix of \(X\), sorted in descending order. Then for a given value of desired explained variance \(r\), the number of principal components that explains \(r \cdot 100\%\) of the variance is the largest integer \(n\) such that
\[\frac{\sum\limits_{i = 1}^{n - 1}\lambda_i}{\sum\limits_{i = 1}^{d}\lambda_i} \leq r\]
The function returns
\[\frac{n}{d}\]
Example
In the example below there are 4 vectors of length 10, and only the first 4 dimensions have non-zero values. The covariance matrix will have only 4 eigenvalues greater than 0, i.e. there are only 4 principal axes. So, in order to keep at least 50% of the information from the set, we need to keep 2 principal axes, while in order to keep all the information we need to keep 5 principal axes (one extra axis appears because, by the definition above, \(n\) is the largest integer whose first \(n - 1\) components still satisfy the threshold).
>>> embeddings = torch.eye(4, 10, dtype=torch.float)
>>> calc_pcf(embeddings, pcf_variance=(0.5, 1))
tensor([0.2000, 0.5000])
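The \(r = 0.5\) case can be verified by hand with a centered PCA. This sketch mirrors the definition above rather than OML's implementation (the \(r = 1\) case is left out because it hinges on the counting convention just discussed):

import torch

def pcf_sketch(embeddings: torch.Tensor, r: float) -> float:
    # Centered PCA via singular values; cumulated explained-variance ratios.
    x = embeddings - embeddings.mean(dim=0)
    svals = torch.linalg.svdvals(x)  # singular values, in descending order
    explained = (svals ** 2).cumsum(dim=0) / (svals ** 2).sum()
    # Number of axes needed for the cumulative explained variance to reach r.
    n = int((explained < r).sum()) + 1
    return n / embeddings.shape[1]

print(pcf_sketch(torch.eye(4, 10, dtype=torch.float), r=0.5))  # 0.2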
EmbeddingMetrics
- class oml.metrics.embeddings.EmbeddingMetrics(dataset: Optional[IQueryGalleryLabeledDataset], cmc_top_k: Tuple[int, ...] = (5,), precision_top_k: Tuple[int, ...] = (5,), map_top_k: Tuple[int, ...] = (5,), fmr_vals: Tuple[float, ...] = (), pcf_variance: Tuple[float, ...] = (0.5,), postprocessor: Optional[IRetrievalPostprocessor] = None, metrics_to_exclude_from_visualization: Iterable[str] = (), return_only_overall_category: bool = False, visualize_only_overall_category: bool = True, verbose: bool = True)[source]
Bases: IMetricVisualisable
This class is designed to accumulate model outputs produced for every batch. Since retrieval metrics are not additive, we can compute them only after all data has been collected.
- __init__(dataset: Optional[IQueryGalleryLabeledDataset], cmc_top_k: Tuple[int, ...] = (5,), precision_top_k: Tuple[int, ...] = (5,), map_top_k: Tuple[int, ...] = (5,), fmr_vals: Tuple[float, ...] = (), pcf_variance: Tuple[float, ...] = (0.5,), postprocessor: Optional[IRetrievalPostprocessor] = None, metrics_to_exclude_from_visualization: Iterable[str] = (), return_only_overall_category: bool = False, visualize_only_overall_category: bool = True, verbose: bool = True)[source]
- Parameters
dataset – Annotated dataset having query-gallery split.
cmc_top_k – Values of k to calculate cmc@k (Cumulative Matching Characteristic).
precision_top_k – Values of k to calculate precision@k.
map_top_k – Values of k to calculate map@k (Mean Average Precision).
fmr_vals – Values of fmr (measured in quantiles) to calculate fnmr@fmr (False Non-Match Rate at the given False Match Rate). For example, if fmr_vals is (0.2, 0.4), we will calculate fnmr@fmr=0.2 and fnmr@fmr=0.4. Note that computing this metric requires additional memory overhead, which is why it is turned off by default.
pcf_variance – Values in range [0, 1]. Find the number of components such that the amount of variance that needs to be explained is greater than the fraction specified by pcf_variance.
postprocessor – Postprocessor which applies techniques like query reranking.
metrics_to_exclude_from_visualization – Names of the metrics to exclude from visualization. This does not affect the calculations.
return_only_overall_category – Set True if you want to return only the aggregated metrics.
visualize_only_overall_category – Set False if you want to visualize each category separately.
verbose – Set True if you want to print metrics.
- setup(num_samples: Optional[int] = None) → None [source]
Method for preparing metrics to work: memory allocation, placeholder preparation, etc. Has to be called before the first call of self.update().
- update(embeddings: FloatTensor, indices: Union[LongTensor, List[int]]) → None [source]
- Parameters
embeddings – Representations of the dataset items contained in the current batch.
indices – Global indices of the dataset items within the range of (0, dataset_size - 1). Indices are needed to make sure that we can align dataset items and the collected information.
- compute_metrics() → Dict[str, Any] [source]
The output must be in the following format:
{ "self.overall_categories_key": {"metric1": ..., "metric2": ...}, "category1": {"metric1": ..., "metric2": ...}, "category2": {"metric1": ..., "metric2": ...} }
Where
category1
andcategory2
are optional.
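A sketch of the accumulate-then-compute workflow. Here dataset stands for any IQueryGalleryLabeledDataset and extract_embeddings is a placeholder for your own embedding step, not an OML function:

calculator = EmbeddingMetrics(dataset=dataset, cmc_top_k=(1, 5), map_top_k=(5,))
calculator.setup(num_samples=len(dataset))

batch_size = 128
for start in range(0, len(dataset), batch_size):
    indices = list(range(start, min(start + batch_size, len(dataset))))
    embeddings = extract_embeddings(indices)  # FloatTensor of shape [len(indices), dim]
    calculator.update(embeddings=embeddings, indices=indices)

metrics = calculator.compute_metrics()
print(metrics[calculator.overall_categories_key])  # aggregated metrics across all categories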
- get_plot_for_queries(query_ids: List[int], n_instances: int, verbose: bool = True) → Figure [source]
- Parameters
query_ids – Indices of the queries.
n_instances – Number of retrieved items to show.
verbose – Set True for additional information.
- get_plot_for_worst_queries(metric_name: str, n_queries: int, n_instances: int, verbose: bool = False) → Figure [source]
- visualize() → Tuple[Collection[Figure], Collection[str]] [source]
Visualize worst queries by all the available metrics.