
torch.distributed.init_process_group(backend, init_method=None, timeout=datetime.timedelta(0, 1800), world_size=-1, rank=-1, store=None, group_name='')[source]
Initializes the default distributed process group, which will also initialize the distributed package.
There are two main ways to initialize a process group:
1. Specify store, rank, and world_size explicitly.
2. Specify init_method (a URL string), which indicates where/how to discover peers. Optionally specify rank and world_size, or encode all required parameters in the URL and omit them.
If neither is specified, init_method is assumed to be “env://”.
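The default "env://" path can be sketched with a minimal single-process example. This assumes a PyTorch build with the gloo backend available; the address and port values are illustrative, not prescribed by the API.

```python
import os
import torch.distributed as dist

# With init_method left as the default "env://", rank, world size, and the
# rendezvous address are read from environment variables.
os.environ["MASTER_ADDR"] = "127.0.0.1"
os.environ["MASTER_PORT"] = "29500"  # any free port
os.environ["RANK"] = "0"
os.environ["WORLD_SIZE"] = "1"

dist.init_process_group(backend="gloo")  # init_method defaults to "env://"
rank = dist.get_rank()
world_size = dist.get_world_size()
dist.destroy_process_group()
```

In a real job each process sets its own RANK (and all processes agree on MASTER_ADDR, MASTER_PORT, and WORLD_SIZE); launchers typically export these variables for you.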
Parameters:
backend (str or Backend): The backend to use. Depending on build-time configuration, valid values include mpi, gloo, and nccl. This field should be given as a lowercase string (e.g., "gloo"), which can also be accessed via Backend attributes (e.g., Backend.GLOO). If using multiple processes per machine with the nccl backend, each process must have exclusive access to every GPU it uses, as sharing GPUs between processes can result in deadlocks.
init_method (str, optional): URL specifying how to initialize the process group. Default is "env://" if no init_method or store is specified. Mutually exclusive with store.
world_size (int, optional): Number of processes participating in the job. Required if store is specified.
rank (int, optional): Rank of the current process. Required if store is specified.
store (Store, optional): Key/value store accessible to all workers, used to exchange connection/address information. Mutually exclusive with init_method.
timeout (timedelta, optional): Timeout for operations executed against the process group. The default value equals 30 minutes. This is applicable for the gloo backend. For nccl, this is applicable only if the environment variable NCCL_BLOCKING_WAIT is set to 1.
group_name (str, optional, deprecated): Group name.
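The store-based initialization path can be sketched as follows, again as a single-process gloo example; the host, port, and timeout values are illustrative.

```python
import datetime
import torch.distributed as dist

# A TCPStore reachable by all workers; this process acts as the store's master.
store = dist.TCPStore("127.0.0.1", 29501, 1, True)

# When store is given, rank and world_size are required,
# and init_method must be omitted (the two are mutually exclusive).
dist.init_process_group(
    backend="gloo",
    store=store,
    rank=0,
    world_size=1,
    timeout=datetime.timedelta(minutes=30),  # matches the documented default
)
initialized = dist.is_initialized()
dist.destroy_process_group()
```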
To enable backend == Backend.MPI, PyTorch needs to be built from source on a system that supports MPI.