Wan2.1 FusionX and MultiTalk - Image to Video
Turn any portrait (artwork, photo, or digital character) into a speaking, expressive video that stays in sync with the audio input. MultiTalk generates lip movements, facial expressions, and body motion automatically.
Animation
Filmmaking
Image to Video
Lipsync
Marketing
Multitalk
Wan2.1
Nodes & Models
WanVideoBlockSwap
WanVideoTorchCompileSettings
LoadWanVideoT5TextEncoder
umt5-xxl-enc-bf16.safetensors
WanVideoVAELoader
Wan2_1_VAE_bf16.safetensors
WanVideoLoraSelect
detailz-wan.safetensors
Wan21_T2V_14B_lightx2v_cfg_step_distill_lora_rank32.safetensors
WanVideoModelLoader
Wan2.1_14B_FusionX.safetensors
WanVideoTeaCache
WanVideoEnhanceAVideo
DownloadAndLoadWav2VecModel
MultiTalkModelLoader
Wan2_1-InfiniTetalk-Single_fp16.safetensors
WanVideoTextEncodeSingle
WanVideoApplyNAG
WanVideoClipVisionEncode
MultiTalkWav2VecEmbeds
WanVideoImageToVideoMultiTalk
WanVideoSampler
WanVideoDecode
AudioCrop
AudioSeparation
ImageResizeKJv2
VHS_VideoCombine
MultiTalk is an open-source AI framework, built by MeiGen AI, that converts a static image into a realistic talking video driven by audio input. It syncs lip movements and facial expressions to speech or singing and supports both single and multi-person scenes.
Text prompts control emotion and behavior, and the model works with both real and stylized characters. Integrated into ComfyUI and optimized for fast performance, it suits digital artists, content creators, educators, and developers who want to bring portraits, avatars, or original characters to life.
Key Inputs
Load Image: Upload an image of a single person or multiple people
Load Audio: Upload an audio clip of speech or singing
Prompt: Describe the motion and speech you want in the video
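For scripted runs, the three inputs above can also be set programmatically on a workflow exported in ComfyUI's API format and queued through the server's /prompt endpoint. The sketch below is illustrative only: the three-node excerpt, the node IDs, the file names, the prompt text, and the assumption that the text-encode node takes an input named "prompt" are all hypothetical, and set_inputs and build_request are helper names made up for this example. Check the input names against your own exported workflow before relying on it.

```python
import json
import urllib.request


def set_inputs(workflow, class_type, **inputs):
    """Update inputs on every node of the given class_type.

    Returns the number of nodes that were changed.
    """
    changed = 0
    for node in workflow.values():
        if node.get("class_type") == class_type:
            node.setdefault("inputs", {}).update(inputs)
            changed += 1
    return changed


def build_request(workflow, server="http://127.0.0.1:8188"):
    """Build (but do not send) a POST request that queues the workflow."""
    body = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        f"{server}/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical excerpt of an API-format workflow; a real export of this
# workflow contains many more nodes, and its node IDs will differ.
workflow = {
    "1": {"class_type": "LoadImage", "inputs": {"image": "placeholder.png"}},
    "2": {"class_type": "LoadAudio", "inputs": {"audio": "placeholder.wav"}},
    # Assumption: the text-encode node exposes an input named "prompt".
    "3": {"class_type": "WanVideoTextEncodeSingle", "inputs": {"prompt": ""}},
}

# Key Inputs: image, audio, and prompt (example values only).
set_inputs(workflow, "LoadImage", image="portrait.png")
set_inputs(workflow, "LoadAudio", audio="speech.wav")
set_inputs(
    workflow,
    "WanVideoTextEncodeSingle",
    prompt="A woman speaks calmly to the camera",
)

request = build_request(workflow)
# To queue the job on a running ComfyUI server, uncomment:
# urllib.request.urlopen(request)
```

Matching nodes by class_type rather than by node ID keeps the script working if the workflow is re-exported and the IDs change, at the cost of updating every node of that class at once.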
