Logging & Visualization
Logging in Pipelines
There are several loggers integrated with Pipelines. You can also use your own custom logger.
Tensorboard is active by default if there is no `logger` specified in the config:

```yaml
...
logger:
  name: tensorboard
  args:
    save_dir: "."
...
```
Neptune

```yaml
...
logger:
  name: neptune  # requires <NEPTUNE_API_TOKEN> as a global env variable
  args:
    project: "oml-team/test"
...
```

```shell
export NEPTUNE_API_TOKEN=your_token; python train.py
```
WandB

```yaml
...
logger:
  name: wandb
  args:
    project: "test_project"
...
```

```shell
export WANDB_API_KEY=your_token; python train.py
```
MLFlow

```yaml
...
logger:
  name: mlflow
  args:
    experiment_name: "test_project"
    tracking_uri: "file:./ml-runs"  # another way: export MLFLOW_TRACKING_URI=file:./ml-runs
...
```
ClearML

```yaml
...
logger:
  name: clearml
  args:
    project_name: "test_project"
    task_name: "test"
    offline_mode: False  # if True, logging is directed to a local dir
...
```
An example of logging via Neptune in the feature extractor pipeline.
So, you get:

- Metrics such as `CMC@1`, `Precision@5`, and `MAP@5`, which were provided in the config file as `metric_args`. Note that you can set `metric_args.return_only_overall_category: False` to log metrics independently for each of the categories (if your dataset has them).
- Loss values averaged over batches and epochs. Some of OML's built-in losses come with unique additional statistics that are also logged. We used `TripletLossWithMargin` in our example, which tracks positive distances, negative distances, and the fraction of active triplets (those for which the loss is greater than zero).
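To make the retrieval metrics above concrete, here is a minimal NumPy sketch of `CMC@k` and `Precision@k` computed from a query-to-gallery distance matrix. This is an illustrative simplification, not OML's implementation; the function names and toy data are ours.

```python
import numpy as np

def cmc_at_k(distances, query_labels, gallery_labels, k=1):
    # CMC@k: fraction of queries whose k closest gallery items
    # contain at least one item with the query's label.
    order = np.argsort(distances, axis=1)[:, :k]           # indices of k closest gallery items
    hits = gallery_labels[order] == query_labels[:, None]  # label matches within top-k
    return hits.any(axis=1).mean()

def precision_at_k(distances, query_labels, gallery_labels, k=5):
    # Precision@k (simplified): average share of relevant items among the top-k.
    order = np.argsort(distances, axis=1)[:, :k]
    hits = gallery_labels[order] == query_labels[:, None]
    return hits.mean()

# Toy data: 2 queries, 4 gallery items
distances = np.array([[0.1, 0.9, 0.8, 0.7],
                      [0.6, 0.2, 0.3, 0.9]])
query_labels = np.array([0, 1])
gallery_labels = np.array([0, 1, 1, 0])

print(cmc_at_k(distances, query_labels, gallery_labels, k=1))      # 1.0: each query's nearest item is relevant
print(precision_at_k(distances, query_labels, gallery_labels, k=2))
```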
The image above shows the model's worst predictions in terms of the MAP@5 metric. In particular, each row contains:

- A query (blue)
- The five gallery items closest to the given query, with the corresponding distances (all red because they are irrelevant to the query)
- At most two ground truths (grey), to give an idea of what the model should have returned
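Building such rows boils down to two argsorts: one over the gallery distances per query to find the closest items, and one over the per-query metric to pick the worst cases. A minimal NumPy sketch of that selection logic (function names and data are ours, not OML's API):

```python
import numpy as np

def top_k_neighbors(distances, k=5):
    # For each query (row), indices of the k closest gallery items.
    return np.argsort(distances, axis=1)[:, :k]

def worst_queries(per_query_metric, n_worst=2):
    # Indices of the queries with the lowest per-query metric (e.g. MAP@5).
    return np.argsort(per_query_metric)[:n_worst]

per_query_map = np.array([0.9, 0.1, 0.5, 0.0])
print(worst_queries(per_query_map))  # queries 3 and 1 are the model's worst cases
```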
You also get some artifacts for reproducibility, such as:

- Source code
- Config
- Dataframe
- Tags
Logging in Python
Using Lightning
Take a look at the following example: Training + Validation [Lightning and logging]. It shows how to use each of Tensorboard, MLFlow, ClearML, Neptune, and WandB.
Using plain Python
Log whatever information you want using the tool of your choice; here are some tips on where to find that information. There are two main sources of logs:

- Criterion (loss). Some of OML's built-in losses come with unique additional statistics, which are stored in the `last_logs` field. See Training in the examples.
- Metrics calculator — EmbeddingMetrics. It has plenty of methods useful for logging. See Validation in the examples.
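The `last_logs` pattern can be sketched as follows. The class below is a toy stand-in that mimics how a criterion can expose extra statistics alongside its loss value; it is not OML's `TripletLossWithMargin`, and all names are ours.

```python
class ToyTripletLoss:
    """Toy criterion mimicking the `last_logs` pattern: after each forward
    pass, extra statistics are stored in a dict for external logging."""

    def __init__(self, margin=0.2):
        self.margin = margin
        self.last_logs = {}

    def __call__(self, dist_pos, dist_neg):
        # Triplet loss on precomputed distances: max(d_pos - d_neg + margin, 0)
        losses = [max(dp - dn + self.margin, 0.0) for dp, dn in zip(dist_pos, dist_neg)]
        self.last_logs = {
            "avg_positive_dist": sum(dist_pos) / len(dist_pos),
            "avg_negative_dist": sum(dist_neg) / len(dist_neg),
            "active_triplets_share": sum(l > 0 for l in losses) / len(losses),
        }
        return sum(losses) / len(losses)

criterion = ToyTripletLoss(margin=0.2)
loss = criterion(dist_pos=[0.3, 0.9], dist_neg=[1.0, 0.5])
for name, value in criterion.last_logs.items():
    print(f"{name}: {value}")  # forward these to the logger of your choice
```

After each batch you read `criterion.last_logs` and send its entries to whatever logger you use, exactly as you would with the loss value itself.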
We also recommend you take a look at:
Visualisation notebook for interactive error analysis and visualizing attention maps.