Logging & Visualization

Logging in Pipelines

Several loggers are integrated with Pipelines. You can also use your own custom logger.

  • TensorBoard — active by default if no logger is specified in the config.

    ...
    logger:
      name: tensorboard
      args:
        save_dir: "."
    ...
    
  • Neptune

    ...
    logger:
      name: neptune  # requires <NEPTUNE_API_TOKEN> as global env
      args:
        project: "oml-team/test"
    ...
    
    export NEPTUNE_API_TOKEN=your_token; python train.py
    
  • Weights and Biases

    ...
    logger:
      name: wandb
      args:
        project: "test_project"
    ...
    
    export WANDB_API_KEY=your_token; python train.py
    
  • MLFlow

    ...
    logger:
      name: mlflow
      args:
        experiment_name: "test_project"
        tracking_uri: "file:./ml-runs"  # another way: export MLFLOW_TRACKING_URI=file:./ml-runs
    ...
    
  • ClearML

    ...
    logger:
      name: clearml
      args:
        project_name: "test_project"
        task_name: "test"
        offline_mode: False  # if True, logging is directed to a local dir
    ...
    

An example of logging via Neptune in the feature extractor pipeline.

Graphs

As a result, you get:

  • Metrics such as CMC@1, Precision@5 and MAP@5, which were provided in the config file as metric_args. Note that you can set metric_args.return_only_overall_category: False to log metrics independently for each category (if your dataset has them); see the sketch after this list.

  • Loss values averaged over batches and epochs. Some of OML's built-in losses come with their own additional statistics, which are also logged. We used TripletLossWithMargin in our example, which additionally tracks positive distances, negative distances and the fraction of active triplets (those for which the loss is greater than zero).
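
For reference, below is a minimal sketch of the equivalent setup via the Python API. It assumes that the keyword arguments of EmbeddingMetrics mirror the metric_args keys of the pipeline config, as in recent OML versions; treat the exact argument names as assumptions rather than a definitive reference.

from oml.datasets import ImageQueryGalleryLabeledDataset
from oml.metrics import EmbeddingMetrics
from oml.utils import get_mock_images_dataset

_, df_val = get_mock_images_dataset(global_paths=True, df_name="df_with_category.csv")
val_dataset = ImageQueryGalleryLabeledDataset(df_val)

# assumed Python counterpart of metric_args: CMC@1, Precision@5, MAP@5,
# computed for each category separately as well as overall
metrics = EmbeddingMetrics(
    dataset=val_dataset,
    cmc_top_k=(1,),
    precision_top_k=(5,),
    map_top_k=(5,),
    return_only_overall_category=False,
)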

Model's mistakes

The image above shows the model's worst predictions in terms of the MAP@5 metric; a sketch of how to build a similar visualization in code follows the list below. In particular, each row contains:

  • A query (blue)

  • The five gallery items closest to the given query and the corresponding distances (they are all red because they are irrelevant to the query)

  • At most two ground truths (grey), to give an idea of what the model should return
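
The pipeline produces these images for you, but you can build a similar visualization yourself. The sketch below assumes the inference and RetrievalResults API of recent OML versions; the exact signatures may differ in yours.

from oml.datasets import ImageQueryGalleryLabeledDataset
from oml.inference import inference
from oml.models import ViTExtractor
from oml.retrieval import RetrievalResults
from oml.utils import get_mock_images_dataset

_, df_val = get_mock_images_dataset(global_paths=True)

extractor = ViTExtractor("vits16_dino", arch="vits16", normalise_features=True).eval()
dataset = ImageQueryGalleryLabeledDataset(df_val)

# compute embeddings, then retrieve the 5 closest gallery items for every query
embeddings = inference(extractor, dataset, batch_size=4, num_workers=0)
rr = RetrievalResults.from_embeddings(embeddings, dataset, n_items=5)

# draw the retrieved items and the ground truths for a couple of queries
rr.visualize(query_ids=[0, 1], dataset=dataset, show=True)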

You also get some artifacts for reproducibility, such as:

  • Source code

  • Config

  • Dataframe

  • Tags

Logging in Python

Using Lightning

Take a look at the example below, which shows how to use the following loggers: TensorBoard, MLFlow, ClearML, Neptune or WandB.

import pytorch_lightning as pl
from torch.utils.data import DataLoader
from torch.optim import Adam

from oml.datasets import ImageLabeledDataset, ImageQueryGalleryLabeledDataset
from oml.lightning import ExtractorModule
from oml.lightning import MetricValCallback
from oml.losses import ArcFaceLoss
from oml.metrics import EmbeddingMetrics
from oml.models import ViTExtractor
from oml.samplers import BalanceSampler
from oml.utils import get_mock_images_dataset
from oml.lightning import logging
from oml.retrieval import ConstantThresholding

df_train, df_val = get_mock_images_dataset(global_paths=True, df_name="df_with_category.csv")

# model
extractor = ViTExtractor("vits16_dino", arch="vits16", normalise_features=True)

# train
optimizer = Adam(extractor.parameters(), lr=1e-6)
train_dataset = ImageLabeledDataset(df_train)
criterion = ArcFaceLoss(in_features=extractor.feat_dim, num_classes=df_train["label"].nunique())
batch_sampler = BalanceSampler(train_dataset.get_labels(), n_labels=2, n_instances=3)
train_loader = DataLoader(train_dataset, batch_sampler=batch_sampler)

# val
val_dataset = ImageQueryGalleryLabeledDataset(df_val)
val_loader = DataLoader(val_dataset, batch_size=4)
metric_callback = MetricValCallback(
    metric=EmbeddingMetrics(dataset=val_dataset, postprocessor=ConstantThresholding(0.8)),
    log_images=True
)

# 1) Logging with Tensorboard
logger = logging.TensorBoardPipelineLogger(".")

# 2) Logging with Neptune
# logger = logging.NeptunePipelineLogger(api_key="", project="", log_model_checkpoints=False)

# 3) Logging with Weights and Biases
# import os
# os.environ["WANDB_API_KEY"] = ""
# logger = logging.WandBPipelineLogger(project="test_project", log_model=False)

# 4) Logging with MLFlow locally
# logger = logging.MLFlowPipelineLogger(experiment_name="exp", tracking_uri="file:./ml-runs")

# 5) Logging with ClearML
# logger = logging.ClearMLPipelineLogger(project_name="exp", task_name="test")

# run
pl_model = ExtractorModule(extractor, criterion, optimizer)
trainer = pl.Trainer(max_epochs=3, callbacks=[metric_callback], num_sanity_val_steps=0, logger=logger)
trainer.fit(pl_model, train_dataloaders=train_loader, val_dataloaders=val_loader)

Using plain Python

Log whatever information you want using the tool of your choice. Just take a look at: