Powered by
ThinkDiffusion

ComfyUI-MultiGPU


Last updated
2025-04-17

This tool is a custom node for ComfyUI that enhances memory management by allowing users to offload layers of UNet and CLIP models to different memory devices, optimizing GPU VRAM usage. It supports multi-GPU integration and simplifies the management of model components, enabling efficient processing of larger models.

  • Provides one-click "Virtual VRAM" functionality to maximize GPU performance by managing layer offloading.
  • Integrates seamlessly with WanVideoWrapper, offering specialized nodes for multi-GPU setups.
  • Features intuitive model-driven allocation options, allowing precise control over resource distribution across devices.
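The idea behind this allocation can be sketched in plain Python. Note this is an illustrative model of component-to-device offloading, not the node's actual API; the function and parameter names here are hypothetical, and device strings follow PyTorch conventions.

```python
# Hypothetical sketch of component-to-device offloading (not the node's real API).
# Keeps compute-heavy parts on the main GPU and offloads the rest.

def plan_offload(components, main_device="cuda:0", offload_device="cpu",
                 keep_on_main=("unet",)):
    """Assign each model component to a device: the UNet stays on the main
    GPU for computation; other components are offloaded."""
    return {name: main_device if name in keep_on_main else offload_device
            for name in components}

plan = plan_offload(["unet", "clip", "vae"])
# unet -> cuda:0; clip and vae -> cpu, freeing main-GPU VRAM
```

In practice the node manages this per layer rather than per whole component, but the principle is the same: reserve the primary GPU for the work that benefits from it.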

Context

The ComfyUI-MultiGPU tool is designed to enhance ComfyUI by optimizing memory management for AI models. Its primary purpose is to maximize the GPU VRAM available for computation by offloading model components, such as UNet and CLIP, to alternative memory devices, thus improving performance during image generation tasks.

Key Features & Benefits

This tool offers several practical features. Universal support for .safetensors and GGUF models ensures compatibility across formats. GGUF models run up to 10% faster at inference than in previous versions, making workflows more efficient. The bespoke integration with WanVideoWrapper provides tailored nodes that streamline managing multiple GPUs, allowing for a more organized and effective setup.

Advanced Functionalities

ComfyUI-MultiGPU includes advanced functionalities such as two expert modes for model allocation: 'bytes' and 'ratio'. These modes enable users to specify exactly how model components are distributed across available devices. The 'bytes' mode allows for precise allocation in terms of memory size, while the 'ratio' mode simplifies distribution based on percentage splits, catering to both novice and advanced users.
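The distinction between the two modes can be illustrated with a small sketch. The string formats and function names below are hypothetical, chosen only to show how an explicit-size split differs from a percentage split; they are not the node's documented syntax.

```python
# Illustrative sketch of 'bytes'-style vs 'ratio'-style allocation
# (hypothetical helpers, not the node's real interface).

GB = 1024 ** 3

def split_by_bytes(total_bytes, allocations):
    """'bytes' style: each device gets an explicit size; whatever is left
    over goes to the last device listed."""
    plan, remaining = {}, total_bytes
    for device, size in allocations[:-1]:
        plan[device] = min(size, remaining)
        remaining -= plan[device]
    plan[allocations[-1][0]] = remaining
    return plan

def split_by_ratio(total_bytes, ratios):
    """'ratio' style: distribute the model by percentage splits."""
    total = sum(r for _, r in ratios)
    return {device: int(total_bytes * r / total) for device, r in ratios}

# 12 GB model: pin 4 GB on the GPU, spill the rest to system RAM
plan_a = split_by_bytes(12 * GB, [("cuda:0", 4 * GB), ("cpu", 0)])
# Same model split 1:2 between GPU and CPU gives the same result
plan_b = split_by_ratio(12 * GB, [("cuda:0", 1), ("cpu", 2)])
```

The 'bytes' style suits users who know exactly how much VRAM they can spare; the 'ratio' style is easier to reuse across models of different sizes.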

Practical Benefits

This tool significantly improves workflow efficiency by freeing up GPU VRAM, allowing for the execution of larger models without the need for complex configurations. By offloading layers to other memory sources, users can dedicate their primary GPU's VRAM to actual computation, enhancing the overall quality and speed of image generation processes. The straightforward management of model components leads to a more streamlined and productive user experience.
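The VRAM saving is straightforward to estimate. The numbers below are assumed for illustration, not measurements from the tool:

```python
# Back-of-the-envelope estimate of main-GPU VRAM reclaimed by offloading
# (assumed figures, purely illustrative).

GB = 1024 ** 3

def vram_freed(model_bytes, offload_fraction):
    """Bytes of main-GPU VRAM freed by moving a fraction of the model's
    layers to another memory device."""
    return int(model_bytes * offload_fraction)

# e.g. offloading half of a 10 GB UNet frees 5 GB on the main GPU
freed = vram_freed(10 * GB, 0.5)
```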

Credits/Acknowledgments

The tool is currently maintained by pollockjj and was originally developed by Alexander Dzhoganov. Special thanks are extended to City96 for their contributions.