This repository integrates FLOAT into ComfyUI. FLOAT generates talking-portrait motion from audio input, so users can produce animated portraits that stay synchronized with a speech track.
- Generates audio-driven talking portraits from within ComfyUI.
- Downloads the required models automatically, which shortens setup.
- Exposes adjustable output parameters such as emotion intensity and frame rate.
Context
This tool is a wrapper for FLOAT, which implements Generative Motion Latent Flow Matching for audio-driven talking portraits. Its goal is to animate a portrait so that its speech and expressions follow the audio.
Key Features & Benefits
The tool runs inside ComfyUI. Users supply a reference image and the corresponding audio, and the required models are downloaded automatically, which reduces setup time and complexity.
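The "download only when missing" behavior described above can be sketched roughly as follows. This is an illustrative sketch, not the repository's actual code: `ensure_model`, the `fetch` callback, and the file name `float_demo.bin` are all hypothetical names chosen for the example, and the fake fetcher writes a placeholder file instead of contacting any server.

```python
import tempfile
from pathlib import Path
from typing import Callable


def ensure_model(path: Path, fetch: Callable[[Path], None]) -> bool:
    """Call fetch(path) only if the file is missing.

    Returns True when a download was triggered, False when the
    file was already present. (Hypothetical helper for illustration.)
    """
    if path.exists():
        return False
    # Create the target folder before handing the path to the fetcher.
    path.parent.mkdir(parents=True, exist_ok=True)
    fetch(path)
    return True


# Demo with a fake fetcher that records each call and writes a placeholder.
calls = []


def fake_fetch(target: Path) -> None:
    calls.append(target)
    target.write_bytes(b"placeholder weights")


model_path = Path(tempfile.mkdtemp()) / "models" / "float_demo.bin"
first = ensure_model(model_path, fake_fetch)   # file missing: fetch runs
second = ensure_model(model_path, fake_fetch)  # file present: fetch skipped
print(first, second, len(calls))
```

Passing the fetcher as a callback keeps the check separate from the transport, so the same logic can be exercised without network access.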
Advanced Functionalities
Users can specify an emotional tone for more nuanced and expressive animation. Parameters such as the audio classifier-free guidance scale and the emotion intensity can be adjusted to customize the output video.
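To give a feel for what a classifier-free guidance scale does, here is a minimal sketch of the standard guidance formula, which moves an unconditional prediction toward (or past) the conditioned one. It assumes the generic formulation `uncond + scale * (cond - uncond)`; the exact way FLOAT combines audio and emotion guidance may differ, and the function name `apply_cfg` and the toy vectors are made up for illustration.

```python
from typing import List


def apply_cfg(uncond: List[float], cond: List[float], scale: float) -> List[float]:
    """Standard classifier-free guidance blend (generic form, an assumption here).

    scale = 0 returns the unconditional prediction, scale = 1 returns the
    conditioned one, and scale > 1 extrapolates beyond the conditioned one.
    """
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


# Toy predictions standing in for model outputs.
uncond_pred = [0.0, 1.0]
audio_pred = [1.0, 3.0]

print(apply_cfg(uncond_pred, audio_pred, 0.0))  # [0.0, 1.0]
print(apply_cfg(uncond_pred, audio_pred, 1.0))  # [1.0, 3.0]
print(apply_cfg(uncond_pred, audio_pred, 2.0))  # [2.0, 5.0]
```

In this generic form, a larger scale makes the output follow the conditioning signal more strongly, usually at some cost in naturalness.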
Practical Benefits
Automated model management and adjustable output settings shorten the workflow. The generated video stays synchronized with the audio, and the adjustable settings give users more control over the result.
Credits/Acknowledgments
The original authors of the FLOAT project are Taekyung Ki, Dongchan Min, and Gyeongsu Chae. FLOAT is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license (CC BY-NC-SA 4.0).





