ComfyUI-VideoBasicLatentSync

Author jax-explorer

https://github.com/jax-explorer/ComfyUI-VideoBasicLatentSync

Last updated 2025-04-07


ComfyUI-VideoBasicLatentSync is a ComfyUI custom node for video lip synchronization built on ByteDance's LatentSync 1.5 model. It addresses out-of-memory (OOM) issues while improving performance and language support.

Synchronizes lip movements in a video with an audio input, with better temporal consistency.
Significantly reduces VRAM requirements, making the model usable on consumer GPUs such as the RTX 3090.
Exposes parameters for lip movement intensity and inference steps, so results can be tuned to the project.

Context

This tool is an unofficial implementation of the LatentSync 1.5 model within the ComfyUI framework. Its purpose is to synchronize lip movements in video content with an audio track, improving the overall quality of the result.

Key Features & Benefits

One of the standout features is improved temporal consistency: lip movements stay closely aligned with the spoken audio across frames, which makes the result look more realistic. The tool also offers better support for the Chinese language. In addition, the reduced VRAM requirements let users run the model without hitting memory errors.

Advanced Functionalities

The tool applies optimizations such as gradient checkpointing and PyTorch's native FlashAttention-2 support, which reduce memory usage and improve performance. Users can also adjust the expressiveness of lip movements through a dedicated parameter, giving more control over how realistic the output looks.
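To show what these two optimizations look like in general, here is a minimal PyTorch sketch. It is not taken from the LatentSync or ComfyUI-VideoBasicLatentSync code: `TinyAttentionBlock` and `run_blocks` are hypothetical names, and the sizes are arbitrary. It assumes PyTorch 2.x, where `scaled_dot_product_attention` can dispatch to a fused FlashAttention kernel on supported GPUs, and where `torch.utils.checkpoint` recomputes activations during the backward pass instead of storing them.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.checkpoint import checkpoint


class TinyAttentionBlock(nn.Module):
    """Hypothetical self-attention block, used only for illustration."""

    def __init__(self, dim: int, heads: int) -> None:
        super().__init__()
        self.heads = heads
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, n, d = x.shape
        # Split the fused projection into q, k, v shaped (b, heads, n, d // heads).
        q, k, v = (
            t.reshape(b, n, self.heads, d // self.heads).transpose(1, 2)
            for t in self.qkv(x).chunk(3, dim=-1)
        )
        # PyTorch selects the attention kernel itself; a fused
        # FlashAttention path is used only where the hardware supports it.
        out = F.scaled_dot_product_attention(q, k, v)
        return x + self.proj(out.transpose(1, 2).reshape(b, n, d))


def run_blocks(blocks: nn.ModuleList, x: torch.Tensor, use_checkpoint: bool) -> torch.Tensor:
    for block in blocks:
        if use_checkpoint:
            # Activations inside the block are recomputed during backward
            # rather than kept in memory, trading compute for memory.
            x = checkpoint(block, x, use_reentrant=False)
        else:
            x = block(x)
    return x


torch.manual_seed(0)
blocks = nn.ModuleList([TinyAttentionBlock(dim=32, heads=4) for _ in range(3)])
x = torch.randn(2, 16, 32, requires_grad=True)
plain = run_blocks(blocks, x, use_checkpoint=False)
saved = run_blocks(blocks, x, use_checkpoint=True)
```

Both calls produce the same output; the checkpointed path only changes what is kept in memory for the backward pass, so the saving appears when gradients are computed.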

Practical Benefits

By integrating this tool into their workflow, users can achieve higher quality lip synchronization in their videos, improving both the visual and auditory experience. The ability to customize key parameters also streamlines the process, enabling users to balance quality and processing speed according to their specific project needs.
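The quality versus speed trade-off can be sketched as a small settings object. Everything here is an assumption made for illustration: the names `LipSyncSettings`, `lips_expression`, `inference_steps`, and `relative_cost`, the default values, and the premise that runtime grows roughly linearly with the number of inference steps. The node's actual parameter names and ranges should be checked in the repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LipSyncSettings:
    """Hypothetical settings container; field names and defaults are assumptions."""

    lips_expression: float = 1.5  # assumed: higher means stronger lip movement
    inference_steps: int = 20     # assumed: more steps means slower, finer output

    def __post_init__(self) -> None:
        # Reject values that cannot describe a valid run.
        if self.lips_expression <= 0:
            raise ValueError("lips_expression must be positive")
        if self.inference_steps < 1:
            raise ValueError("inference_steps must be at least 1")


def relative_cost(settings: LipSyncSettings, baseline_steps: int = 20) -> float:
    """Rough cost estimate, assuming runtime scales linearly with step count."""
    return settings.inference_steps / baseline_steps


# A quick preview pass and a slower final pass.
draft = LipSyncSettings(lips_expression=1.5, inference_steps=10)
final = LipSyncSettings(lips_expression=2.0, inference_steps=40)
```

Under these assumptions, the draft preset costs about half of a 20-step run and the final preset about twice as much, which is the kind of balance the paragraph above describes.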

Credits/Acknowledgments

This project is based on the work of ByteDance Research for the LatentSync 1.5 model and is developed for use with ComfyUI. It is licensed under the Apache License 2.0, allowing for open-source collaboration and improvement.
