Logging & Visualization
Logging in Pipelines
There are several loggers integrated with Pipelines. You can also use your own custom logger (see the sketch at the end of the Lightning example below).
TensorBoard is active by default if there is no `logger` defined in the config:

```yaml
...

logger:
  name: tensorboard
  args:
    save_dir: "."

...
```
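To view the logs, you can point the standard TensorBoard CLI at the directory given in `save_dir` (assuming TensorBoard is installed in your environment):

```bash
tensorboard --logdir .
```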
Neptune

```yaml
...

logger:
  name: neptune  # requires <NEPTUNE_API_TOKEN> as a global env variable
  args:
    project: "oml-team/test"

...
```

```bash
export NEPTUNE_API_TOKEN=your_token; python train.py
```
WandB

```yaml
...

logger:
  name: wandb
  args:
    project: "test_project"

...
```

```bash
export WANDB_API_KEY=your_token; python train.py
```
MLFlow

```yaml
...

logger:
  name: mlflow
  args:
    experiment_name: "test_project"
    tracking_uri: "file:./ml-runs"  # another way: export MLFLOW_TRACKING_URI=file:./ml-runs

...
```
ClearML
```yaml
...

logger:
  name: clearml
  args:
    project_name: "test_project"
    task_name: "test"
    offline_mode: False  # if True, logging is directed to a local dir

...
```
[Image: an example of logging via Neptune in the feature extractor pipeline.]

So, you get:

- Metrics such as `CMC@1`, `Precision@5`, and `MAP@5`, which were provided in the config file as `metric_args`. Note that you can set `metric_args.return_only_overall_category: False` to log metrics independently for each category (if your dataset has them); see the sketch after this list.
- Loss values averaged over batches and epochs. Some of OML's built-in losses come with their own additional statistics, which are also logged. We used `TripletLossWithMargin` in our example, which comes along with tracking of positive distances, negative distances, and the fraction of active triplets (those for which the loss is greater than zero).
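For reference, here is a sketch of what such a `metric_args` block might look like. The key names mirror the arguments of `EmbeddingMetrics` (`cmc_top_k`, `precision_top_k`, `map_top_k`), but treat the exact layout as an assumption and check it against your pipeline's config schema:

```yaml
metric_args:
  cmc_top_k: [1]        # logs CMC@1
  precision_top_k: [5]  # logs Precision@5
  map_top_k: [5]        # logs MAP@5
  return_only_overall_category: False  # also log metrics per category
```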

The image above shows the model's worst predictions in terms of the MAP@5 metric. In particular, each row contains:

- A query (blue)
- The five gallery items closest to the given query, along with the corresponding distances (all red here because they are all irrelevant to the query)
- At most two ground truths (grey), to give an idea of what the model should have returned
You also get some artifacts for reproducibility, such as:

- Source code
- Config
- Dataframe
- Tags
Logging in Python
Using Lightning
Take a look at the example below, which shows how to use the following loggers: TensorBoard, MLFlow, ClearML, Neptune, or WandB.
```python
import pytorch_lightning as pl
from torch.optim import Adam
from torch.utils.data import DataLoader

from oml.datasets import ImageLabeledDataset, ImageQueryGalleryLabeledDataset
from oml.lightning import ExtractorModule, MetricValCallback, logging
from oml.losses import ArcFaceLoss
from oml.metrics import EmbeddingMetrics
from oml.models import ViTExtractor
from oml.retrieval import ConstantThresholding
from oml.samplers import BalanceSampler
from oml.utils import get_mock_images_dataset

df_train, df_val = get_mock_images_dataset(global_paths=True, df_name="df_with_category.csv")

# model
extractor = ViTExtractor("vits16_dino", arch="vits16", normalise_features=True)

# train
optimizer = Adam(extractor.parameters(), lr=1e-6)
train_dataset = ImageLabeledDataset(df_train)
criterion = ArcFaceLoss(in_features=extractor.feat_dim, num_classes=df_train["label"].nunique())
batch_sampler = BalanceSampler(train_dataset.get_labels(), n_labels=2, n_instances=3)
train_loader = DataLoader(train_dataset, batch_sampler=batch_sampler)

# val
val_dataset = ImageQueryGalleryLabeledDataset(df_val)
val_loader = DataLoader(val_dataset, batch_size=4)
metric_callback = MetricValCallback(
    metric=EmbeddingMetrics(dataset=val_dataset, postprocessor=ConstantThresholding(0.8)),
    log_images=True,
)

# 1) Logging with TensorBoard
logger = logging.TensorBoardPipelineLogger(".")

# 2) Logging with Neptune
# logger = logging.NeptunePipelineLogger(api_key="", project="", log_model_checkpoints=False)

# 3) Logging with Weights and Biases
# import os
# os.environ["WANDB_API_KEY"] = ""
# logger = logging.WandBPipelineLogger(project="test_project", log_model=False)

# 4) Logging with MLFlow locally
# logger = logging.MLFlowPipelineLogger(experiment_name="exp", tracking_uri="file:./ml-runs")

# 5) Logging with ClearML
# logger = logging.ClearMLPipelineLogger(project_name="exp", task_name="test")

# run
pl_model = ExtractorModule(extractor, criterion, optimizer)
trainer = pl.Trainer(max_epochs=3, callbacks=[metric_callback], num_sanity_val_steps=0, logger=logger)
trainer.fit(pl_model, train_dataloaders=train_loader, val_dataloaders=val_loader)
```
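The built-in loggers above are passed straight to `pl.Trainer`, so in the Python API a custom logger (mentioned at the top of this section) can be any PyTorch Lightning `Logger` subclass. Below is a minimal sketch following Lightning's documented custom-logger interface; the class name and its print-to-stdout behavior are our own illustration, and registering a custom logger with the config-based Pipelines is a separate question:

```python
from typing import Any, Dict, Optional

from pytorch_lightning.loggers import Logger
from pytorch_lightning.utilities import rank_zero_only


class PrintLogger(Logger):
    """A toy logger that prints metrics to stdout instead of a tracking service."""

    @property
    def name(self) -> str:
        return "print_logger"

    @property
    def version(self) -> str:
        return "0"

    @rank_zero_only
    def log_hyperparams(self, params: Dict[str, Any]) -> None:
        print("hyperparameters:", params)

    @rank_zero_only
    def log_metrics(self, metrics: Dict[str, float], step: Optional[int] = None) -> None:
        print(f"step={step}:", metrics)


# drop-in replacement for the loggers above:
# trainer = pl.Trainer(max_epochs=3, callbacks=[metric_callback], logger=PrintLogger())
```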
Using plain Python
Log whatever information you want using the tool of your choice. Just take a look at:
- Criterion (loss). Some of OML's built-in losses track their own additional statistics, which are stored in the `last_logs` field. See the training example, and the sketch below.
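As a minimal sketch of reading `last_logs`, assuming `TripletLossWithMiner` as the criterion and TensorBoard's `SummaryWriter` as the tool of choice (the mock batch is our own):

```python
import torch
from torch.utils.tensorboard import SummaryWriter

from oml.losses import TripletLossWithMiner

writer = SummaryWriter(log_dir=".")
criterion = TripletLossWithMiner(margin=0.1)

# mock batch: 8 embeddings of dim 16 covering 2 labels
embeddings = torch.randn(8, 16, requires_grad=True)
labels = torch.tensor([0, 0, 0, 0, 1, 1, 1, 1])

loss = criterion(embeddings, labels)
loss.backward()

# log the loss value itself plus the extra statistics the criterion stored
writer.add_scalar("loss", loss.item(), global_step=0)
for name, value in criterion.last_logs.items():
    writer.add_scalar(name, value, global_step=0)
```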