DDP

Note that this is an advanced section for developers or curious users. Normally, you don’t even need to know about the existence of the classes and functions below.

ModuleDDP

class oml.lightning.modules.ddp.ModuleDDP(loaders_train: Optional[Union[DataLoader, Sequence[DataLoader], Dict[str, DataLoader]]] = None, loaders_val: Optional[Union[DataLoader, Sequence[DataLoader]]] = None)[source]

Bases: LightningModule

The module automatically patches the training and validation dataloaders to DDP mode by splitting the available indices between devices. Note: don’t pass dataloaders to trainer.fit(...) or trainer.validate(...), because in that case PyTorch Lightning will ignore our patching.

__init__(loaders_train: Optional[Union[DataLoader, Sequence[DataLoader], Dict[str, DataLoader]]] = None, loaders_val: Optional[Union[DataLoader, Sequence[DataLoader]]] = None)[source]
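
For a rough picture of the intended usage, here is a minimal sketch: the dataloaders are passed to the module so that they can be patched, and the trainer is launched without dataloaders. Here YourModuleDDP stands for any concrete subclass of ModuleDDP (such as those below); the datasets, batch sizes, and Trainer arguments are assumptions.

    import pytorch_lightning as pl
    from torch.utils.data import DataLoader

    # `train_dataset`, `val_dataset` and `YourModuleDDP` (a subclass of ModuleDDP)
    # are assumed to be defined elsewhere
    loader_train = DataLoader(train_dataset, batch_size=32, shuffle=True)
    loader_val = DataLoader(val_dataset, batch_size=32, shuffle=False)

    # the loaders go to the module, not to the trainer, so ModuleDDP can patch them
    pl_module = YourModuleDDP(loaders_train=loader_train, loaders_val=loader_val)

    trainer = pl.Trainer(accelerator="gpu", devices=2, strategy="ddp", max_epochs=10)
    trainer.fit(pl_module)       # note: no dataloaders are passed here
    trainer.validate(pl_module)  # the module provides its patched loaders itself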

ExtractorModuleDDP

class oml.lightning.modules.extractor.ExtractorModuleDDP(loaders_train: Optional[Any] = None, loaders_val: Optional[Any] = None, *args: Any, **kwargs: Any)[source]

Bases: ExtractorModule, ModuleDDP

This is a base module for training your model with Lightning in DDP mode; a usage sketch follows the parameter list below.

__init__(loaders_train: Optional[Any] = None, loaders_val: Optional[Any] = None, *args: Any, **kwargs: Any)[source]
Parameters
  • extractor – Extractor to train

  • criterion – Criterion to optimize

  • optimizer – Optimizer

  • scheduler – Learning rate scheduler

  • scheduler_interval – Interval of calling the scheduler (must be "step" or "epoch")

  • scheduler_frequency – Frequency of calling the scheduler

  • input_tensors_key – Key to get tensors from the batches

  • labels_key – Key to get labels from the batches

  • embeddings_key – Key to get embeddings from the batches

  • scheduler_monitor_metric – Metric to monitor for the schedulers that depend on the metric value

  • freeze_n_epochs – Number of epochs to freeze the model (for n > 0 the model has to implement the IFreezable interface). The model is unfrozen when current_epoch >= freeze_n_epochs. Note that epochs start from 0.
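
As a minimal sketch of how these parameters fit together (the extractor, criterion, and dataloaders are assumed to be created elsewhere, e.g. an extractor and a triplet loss with a miner), the module is constructed with the loaders and then trained exactly as in the ModuleDDP sketch above:

    import pytorch_lightning as pl
    import torch

    from oml.lightning.modules.extractor import ExtractorModuleDDP

    # `extractor`, `criterion`, `loader_train` and `loader_val` are assumed to exist
    optimizer = torch.optim.Adam(extractor.parameters(), lr=1e-5)

    pl_module = ExtractorModuleDDP(
        extractor=extractor,
        criterion=criterion,
        optimizer=optimizer,
        loaders_train=loader_train,  # loaders are given to the module so they get patched
        loaders_val=loader_val,
    )

    trainer = pl.Trainer(accelerator="gpu", devices=2, strategy="ddp", max_epochs=5)
    trainer.fit(pl_module)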

PairwiseModuleDDP

class oml.lightning.modules.pairwise_postprocessing.PairwiseModuleDDP(loaders_train: Optional[Any] = None, loaders_val: Optional[Any] = None, *args: Any, **kwargs: Any)[source]

Bases: PairwiseModule, ModuleDDP

__init__(loaders_train: Optional[Any] = None, loaders_val: Optional[Any] = None, *args: Any, **kwargs: Any)[source]
Parameters
  • pairwise_model – Pairwise model to train

  • pairs_miner – Miner of pairs

  • criterion – Criterion to optimize

  • optimizer – Optimizer

  • scheduler – Learning rate scheduler

  • scheduler_interval – Interval of calling the scheduler (must be "step" or "epoch")

  • scheduler_frequency – Frequency of calling the scheduler

  • input_tensors_key – Key to get tensors from the batches

  • labels_key – Key to get labels from the batches

  • embeddings_key – Key to get embeddings from the batches

  • scheduler_monitor_metric – Metric to monitor for the schedulers that depend on the metric value

  • freeze_n_epochs – Number of epochs to freeze the model (for n > 0 the model has to implement the IFreezable interface). The model is unfrozen when current_epoch >= freeze_n_epochs. Note that epochs start from 0.
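
Its usage mirrors ExtractorModuleDDP; a minimal sketch, assuming the pairwise model, pairs miner, criterion, and dataloaders are built elsewhere:

    import torch

    from oml.lightning.modules.pairwise_postprocessing import PairwiseModuleDDP

    # `pairwise_model`, `pairs_miner`, `criterion`, `loader_train`, `loader_val`
    # are assumed to exist; training then proceeds exactly as in the sketches above
    pl_module = PairwiseModuleDDP(
        pairwise_model=pairwise_model,
        pairs_miner=pairs_miner,
        criterion=criterion,
        optimizer=torch.optim.Adam(pairwise_model.parameters(), lr=1e-4),
        loaders_train=loader_train,
        loaders_val=loader_val,
    )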

DDPSamplerWrapper

class oml.ddp.patching.DDPSamplerWrapper(sampler: Union[BatchSampler, Sampler, IBatchSampler], shuffle_samples_between_gpus: bool = True, pad_data_to_num_gpus: bool = True)[source]

Bases: DistributedSampler

This is a wrapper that allows using a custom sampler in DDP mode.

The default DistributedSampler allows us to build a sampler for a dataset in DDP mode. Usually we can easily replace the default SequentialSampler [when DataLoader(shuffle=False, ...)] or RandomSampler [when DataLoader(shuffle=True, ...)] with DistributedSampler. But for a custom sampler, we need an extra wrapper.

Thus, this wrapper distributes the indices produced by the sampler among several devices for further usage.

__init__(sampler: Union[BatchSampler, Sampler, IBatchSampler], shuffle_samples_between_gpus: bool = True, pad_data_to_num_gpus: bool = True)[source]
Parameters
  • sampler – Sequential or batch sampler

  • pad_data_to_num_gpus – When using DDP we have to manage the behavior of the last batch, because each device must receive the same amount of data. If the sampler length is not evenly divisible by the number of devices, we either duplicate part of the data (pad_data_to_num_gpus=True) or discard part of it (pad_data_to_num_gpus=False).

  • shuffle_samples_between_gpus – Shuffle the available indices before distributing them to the GPUs. Note that any shuffling inside a GPU after the distribution is determined by the behavior of the wrapped sampler.

Note: The wrapper can be used with the default SequentialSampler or RandomSampler from PyTorch, as well as with custom samplers.

_reload() → None[source]

We need to re-instantiate the wrapper in order to update the available indices for the new epoch. We don’t perform this step at epoch 0, so that the results there remain comparable with the non-DDP setup.
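
A minimal sketch of wrapping a sampler by hand is shown below. It assumes an already initialised torch.distributed process group (the wrapper, being a DistributedSampler, needs the world size and rank) and a dataset defined elsewhere; in practice, patch_dataloader_to_ddp below applies this wrapping to an existing DataLoader for you.

    from torch.utils.data import DataLoader, SequentialSampler

    from oml.ddp.patching import DDPSamplerWrapper

    # `dataset` is assumed to exist; any custom sampler could be used instead of
    # the plain SequentialSampler shown here
    sampler_ddp = DDPSamplerWrapper(
        sampler=SequentialSampler(dataset),
        shuffle_samples_between_gpus=True,  # shuffle indices before splitting across devices
        pad_data_to_num_gpus=True,          # duplicate a few indices so devices get equal amounts
    )
    loader = DataLoader(dataset, sampler=sampler_ddp, batch_size=32)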

patch_dataloader_to_ddp

oml.ddp.patching.patch_dataloader_to_ddp(loader: DataLoader) → DataLoader[source]

The function inspects the loader and modifies its sampler to work in DDP mode.

Note

We ALWAYS pad the samples (in terms of the number of batches or the number of samples per epoch) so that each device works with the same amount of data in DDP. Thus, the behavior with and without DDP may differ slightly (e.g. metrics values).
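
A minimal usage sketch, assuming a dataset defined elsewhere and an already initialised DDP process group:

    from torch.utils.data import DataLoader

    from oml.ddp.patching import patch_dataloader_to_ddp

    loader = DataLoader(dataset, batch_size=32, shuffle=True)
    # the returned loader has its sampler replaced so that each device
    # iterates over its own (padded) share of the indices
    loader_ddp = patch_dataloader_to_ddp(loader)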

sync_dicts_ddp

oml.ddp.utils.sync_dicts_ddp(outputs_from_device: Dict[str, Any], world_size: int, device: Union[device, str] = 'cpu') → Dict[str, Any][source]

The function allows you to combine and merge data stored in dictionaries from all devices. You can place this function anywhere in your code; upon reaching it, all the devices will wait for each other, synchronize, and merge their dictionaries.

Note

Under the hood, the function pickles all objects, converts the bytes to a tensor, and unpickles them after syncing. With the NCCL DDP backend (the default one), the intermediate tensors are stored on CUDA.
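
A minimal sketch of the call each process would make; the embeddings and labels are assumed to be tensors computed on the current device:

    import torch

    from oml.ddp.utils import sync_dicts_ddp

    outputs = {"embeddings": embeddings.cpu(), "labels": labels.cpu()}

    # every process reaching this line waits for the others, then receives
    # the combined dictionary with values gathered from all devices
    synced = sync_dicts_ddp(outputs, world_size=torch.distributed.get_world_size(), device="cpu")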