PipeFusion: Displaced Patch Pipeline Parallelism for Diffusion Models

Chinese Blog 1; Chinese Blog 2

PipeFusion is the innovative method first proposed by us. It is a sequence-level pipeline parallel method, similar to TeraPipe, demonstrates significant advantages in weakly interconnected network hardware such as PCIe/Ethernet.

PipeFusion innovatively harnesses input temporal redundancy—the similarity between inputs and activations across diffusion steps, a diffusion-specific characteristics also employed in DistriFusion. PipeFusion not only reduces communication volume but also streamlines pipeline parallelism with TeraPipe, avoiding the load balancing issues inherent in LLM models with Causal Attention. It significantly surpasses other methods in communication efficiency, particularly in multi-node setups connected via Ethernet and multi-GPU configurations linked with PCIe.

The above picture compares DistriFusion and PipeFusion. (a) DistriFusion replicates DiT parameters on two devices. It splits an image into 2 patches and employs asynchronous allgather for activations of every layer. (b) PipeFusion shards DiT parameters on two devices. It splits an image into 4 patches and employs asynchronous P2P for activations across two devices.

We briefly explain the workflow of PipeFusion. It partitions an input image into $M$ non-overlapping patches. The DiT network is partitioned into $N$ stages ($N$ < $L$), which are sequentially assigned to $N$ computational devices. Note that $M$ and $N$ can be unequal, which is different from the image-splitting approaches used in sequence parallelism and DistriFusion. Each device processes the computation task for one patch of its assigned stage in a pipelined manner.

The PipeFusion pipeline workflow when $M$ = $N$ =4 is shown in the following picture.

We have evaluated the accuracy of PipeFusion, DistriFusion and the baseline as shown bolow. To conduct the FID experiment, follow the detailed instructions provided in the documentation.

For more details, please refer to the following paper.

@article{wang2024pipefusion,
      title={PipeFusion: Displaced Patch Pipeline Parallelism for Inference of Diffusion Transformer Models}, 
      author={Jiannan Wang and Jiarui Fang and Jinzhe Pan and Aoyu Li and PengCheng Yang},
      year={2024},
      eprint={2405.07719},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

pipefusion.md

pipefusion.md

PipeFusion: Displaced Patch Pipeline Parallelism for Diffusion Models

Files

pipefusion.md

Latest commit

History

pipefusion.md

File metadata and controls

PipeFusion: Displaced Patch Pipeline Parallelism for Diffusion Models