Logging & Visualization

Logging in Pipelines

Several loggers are integrated with Pipelines. You can also use your own custom logger.

  • TensorBoard — active by default if no logger is specified in the config.

    ...
    logger:
      name: tensorboard
      args:
        save_dir: "."
    ...
    
  • Neptune

    ...
    logger:
      name: neptune  # requires <NEPTUNE_API_TOKEN> as global env
      args:
        project: "oml-team/test"
    ...
    
    export NEPTUNE_API_TOKEN=your_token; python train.py
    
  • Weights and Biases

    ...
    logger:
      name: wandb
      args:
        project: "test_project"
    ...
    
    export WANDB_API_KEY=your_token; python train.py
    
  • MLFlow

    ...
    logger:
      name: mlflow
      args:
        experiment_name: "test_project"
        tracking_uri: "file:./ml-runs"  # another way: export MLFLOW_TRACKING_URI=file:./ml-runs
    ...
    
  • ClearML

    ...
    logger:
      name: clearml
      args:
        project_name: "test_project"
        task_name: "test"
        offline_mode: False  # if True, logging is directed to a local dir
    ...
    

An example of logging via Neptune in the feature extractor pipeline.

Graphs

As a result, you get:

  • Metrics such as CMC@1, Precision@5 and MAP@5, which were provided in the config file as metric_args. Note that you can set metric_args.return_only_overall_category: False to log metrics independently for each category (if your dataset has them); see the sketch after this list.

  • Loss values averaged over batches and epochs. Some of OML's built-in losses come with their own additional statistics, which are also logged. We used TripletLossWithMargin in our example, which additionally tracks positive distances, negative distances and the fraction of active triplets (those for which the loss is greater than zero).
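
For reference, below is a minimal sketch of the equivalent setup via the Python API. It assumes that the keyword arguments of EmbeddingMetrics mirror the metric_args keys of the pipeline config, as in recent OML versions; treat the exact argument names as assumptions rather than a definitive reference.

from oml.datasets import ImageQueryGalleryLabeledDataset
from oml.metrics import EmbeddingMetrics
from oml.utils import get_mock_images_dataset

_, df_val = get_mock_images_dataset(global_paths=True, df_name="df_with_category.csv")
val_dataset = ImageQueryGalleryLabeledDataset(df_val)

# assumed Python counterpart of metric_args: CMC@1, Precision@5, MAP@5,
# computed for each category separately as well as overall
metrics = EmbeddingMetrics(
    dataset=val_dataset,
    cmc_top_k=(1,),
    precision_top_k=(5,),
    map_top_k=(5,),
    return_only_overall_category=False,
)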

Model's mistakes

The image above shows the model's worst predictions in terms of the MAP@5 metric; a sketch of how to build a similar visualization in code follows the list below. In particular, each row contains:

  • A query (blue)

  • The five gallery items closest to the given query and the corresponding distances (they are all red because they are irrelevant to the query)

  • At most two ground truths (grey), to give an idea of what the model should return
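
The pipeline produces these images for you, but you can build a similar visualization yourself. The sketch below assumes the inference and RetrievalResults API of recent OML versions; the exact signatures may differ in yours.

from oml.datasets import ImageQueryGalleryLabeledDataset
from oml.inference import inference
from oml.models import ViTExtractor
from oml.retrieval import RetrievalResults
from oml.utils import get_mock_images_dataset

_, df_val = get_mock_images_dataset(global_paths=True)

extractor = ViTExtractor("vits16_dino", arch="vits16", normalise_features=True).eval()
dataset = ImageQueryGalleryLabeledDataset(df_val)

# compute embeddings, then retrieve the 5 closest gallery items for every query
embeddings = inference(extractor, dataset, batch_size=4, num_workers=0)
rr = RetrievalResults.from_embeddings(embeddings, dataset, n_items=5)

# draw the retrieved items and the ground truths for a couple of queries
rr.visualize(query_ids=[0, 1], dataset=dataset, show=True)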

You also get some artifacts for reproducibility, such as:

  • Source code

  • Config

  • Dataframe

  • Tags

Logging in Python

Using Lightning

Take a look at the example below, which shows how to use the following loggers: TensorBoard, MLFlow, ClearML, Neptune or WandB.

import pytorch_lightning as pl
from torch.utils.data import DataLoader
from torch.optim import Adam

from oml.datasets import ImageLabeledDataset, ImageQueryGalleryLabeledDataset
from oml.lightning import ExtractorModule
from oml.lightning import MetricValCallback
from oml.losses import ArcFaceLoss
from oml.metrics import EmbeddingMetrics
from oml.models import ViTExtractor
from oml.samplers import BalanceSampler
from oml.utils import get_mock_images_dataset
from oml.lightning import logging
from oml.retrieval import ConstantThresholding

df_train, df_val = get_mock_images_dataset(global_paths=True, df_name="df_with_category.csv")

# model
extractor = ViTExtractor("vits16_dino", arch="vits16", normalise_features=True)

# train
optimizer = Adam(extractor.parameters(), lr=1e-6)
train_dataset = ImageLabeledDataset(df_train)
criterion = ArcFaceLoss(in_features=extractor.feat_dim, num_classes=df_train["label"].nunique())
batch_sampler = BalanceSampler(train_dataset.get_labels(), n_labels=2, n_instances=3)
train_loader = DataLoader(train_dataset, batch_sampler=batch_sampler)

# val
val_dataset = ImageQueryGalleryLabeledDataset(df_val)
val_loader = DataLoader(val_dataset, batch_size=4)
metric_callback = MetricValCallback(
    metric=EmbeddingMetrics(dataset=val_dataset, postprocessor=ConstantThresholding(0.8)),
    log_images=True
)

# 1) Logging with Tensorboard
logger = logging.TensorBoardPipelineLogger(".")

# 2) Logging with Neptune
# logger = logging.NeptunePipelineLogger(api_key="", project="", log_model_checkpoints=False)

# 3) Logging with Weights and Biases
# import os
# os.environ["WANDB_API_KEY"] = ""
# logger = logging.WandBPipelineLogger(project="test_project", log_model=False)

# 4) Logging with MLFlow locally
# logger = logging.MLFlowPipelineLogger(experiment_name="exp", tracking_uri="file:./ml-runs")

# 5) Logging with ClearML
# logger = logging.ClearMLPipelineLogger(project_name="exp", task_name="test")

# run
pl_model = ExtractorModule(extractor, criterion, optimizer)
trainer = pl.Trainer(max_epochs=3, callbacks=[metric_callback], num_sanity_val_steps=0, logger=logger)
trainer.fit(pl_model, train_dataloaders=train_loader, val_dataloaders=val_loader)

Using plain Python

Log whatever information you want using the tool of your choice. Just take a look at: