HunyuanVideo Foley: Create a Lifelike Sound

Generates in about 3 mins 3 secs

floyoofficial

Nodes & Models

ComfyUI_HunyuanVideoFoley: HunyuanVideoFoleyModelLoader, HunyuanVideoFoleyDependenciesLoader, HunyuanVideoFoleyTorchCompile, HunyuanVideoFoleyGeneratorAdvanced

ComfyUI Official: LoadVideo, PreviewAny, GetVideoComponents, Reroute, PreviewAudio

ComfyUI-VideoHelperSuite: VHS_VideoCombine

Overview

HunyuanVideo-Foley is Tencent’s state-of-the-art, open-source AI system for generating Foley sound: lifelike audio effects synchronized precisely to video content. Using multimodal diffusion transformers, large-scale data curation, and advanced latent alignment, it automatically creates high-fidelity, context-aware Foley audio for everything from silent AI-generated clips to complex film, gaming, or advertising projects.

Key Features

End-to-End Foley Synthesis: From silent or original videos, HunyuanVideo-Foley automatically generates professional-grade synchronized audio effects such as footsteps, doors, ambient noises, and action sounds, removing the need for manual SFX editing or laborious sound library searches.
Multi-Scenario Adaptability: Ideal for short videos, feature films, advertisements, and game content, thanks to robust support for diverse visual scenes and cues.
Scalable Multimodal Pipeline: Trained on over 100,000 hours of video, audio, and text pairings, the model uses automated scene detection, audio annotation, and semantic captioning to ensure broad coverage and balance across content types.
Semantic-Temporal Precision: A dual-stream transformer architecture interprets both visual and textual instructions and fuses them via cross-attention with tight event-level temporal synchronization, so the sound effects match not just the timing but also the intent and emotion of each scene.
High-Fidelity Output: Uses a 48 kHz audio variational autoencoder for professional quality; the audio output is suitable for production-grade use in film, broadcast, or interactive media.
Open-Source & Efficient: Designed for ease of use, rapid synthesis, and seamless integration into automated video workflows; democratizes high-level sound design for all creators, not just studios.
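
To make the 48 kHz output and the frame-level synchronization above concrete, here is a minimal arithmetic sketch of how a video frame index maps to an audio sample position. It is not part of HunyuanVideo-Foley or its ComfyUI nodes; the function names `frame_to_sample` and `total_samples` and the 24 fps example clip are assumptions chosen for illustration.

```python
# Illustrative helpers only; these names are hypothetical and do not
# come from HunyuanVideo-Foley or any ComfyUI node pack.

SAMPLE_RATE_HZ = 48_000  # the 48 kHz output rate named in the feature list


def frame_to_sample(frame_index: int, fps: float,
                    sample_rate: int = SAMPLE_RATE_HZ) -> int:
    """Return the audio sample index at which a given video frame begins."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return round(frame_index * sample_rate / fps)


def total_samples(frame_count: int, fps: float,
                  sample_rate: int = SAMPLE_RATE_HZ) -> int:
    """Return how many audio samples are needed to cover the whole clip."""
    return frame_to_sample(frame_count, fps, sample_rate)


if __name__ == "__main__":
    # Assumed example: a 5-second clip at 24 fps (120 frames).
    print(frame_to_sample(36, 24.0))   # a footstep landing on frame 36
    print(total_samples(120, 24.0))    # samples needed for the full clip
```

At 24 fps each frame spans 2,000 samples, so an event on frame 36 starts at sample 72,000 (1.5 seconds in), and the 120-frame clip needs 240,000 samples. A check like this can be useful when verifying that generated audio lines up with the source video before combining them.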

Who Benefits

Video Content Creators: Elevate short clips, vlogs, documentaries, or feature films with instantly tailored sound design.
Filmmakers & Game Developers: Replace manual SFX workflows with scalable, context-aware sound generation.
Advertisers & Marketers: Synchronize product or event videos with immersive, professionally matched audio cues.
AI Developers & Researchers: Integrate advanced auditory intelligence into creative and research pipelines with open-source flexibility.
