ThinkDiffusion

HunyuanVideo 1.5 for Image to Video

Animation

Filmmaking

HunyuanVideo 1.5

Image2Video

floyoofficial

Nodes & Models

ComfyUI Official

MarkdownNote

Note

RandomNoise

KSamplerSelect

UNETLoader

hunyuanvideo1.5_720p_i2v_fp16.safetensors

DualCLIPLoader

qwen_2.5_vl_7b_fp8_scaled.safetensors

byt5_small_glyphxl_fp16.safetensors

CLIPVisionLoader

sigclip_vision_patch14_384.safetensors

VAELoader

hunyuanvideo15_vae_fp16.safetensors

LoadImage

EasyCache

CLIPTextEncode

CLIPVisionEncode

BasicScheduler

ModelSamplingSD3

HunyuanVideo15ImageToVideo

CFGGuider

SamplerCustomAdvanced

VAEDecode

VAEDecodeTiled

CreateVideo

SaveVideo

HunyuanVideo 1.5 Image‑to‑Video takes a single still image plus a short prompt and turns it into a 5–10 second 480p–720p clip with smooth, cinematic motion while preserving the original composition and subject.

What HunyuanVideo 1.5 I2V is

It is Tencent’s lightweight 8.3B‑parameter image‑to‑video model within the HunyuanVideo family, designed specifically to animate a single reference frame into a coherent sequence.
The model emphasizes structure preservation: characters, layout, and key details stay consistent while camera and environment move, avoiding jitter and “melty” drift.

Core capabilities

Single‑image animation: Takes one input image and a guiding text prompt, generating 5–8 (sometimes 10) second clips at 480p or native 720p.
Cinematic motion: Supports gentle push‑ins, pans, tilts, subtle subject motion (breathing, hair, clothing, water, clouds), and background parallax without collapsing structure.
Fast, low‑VRAM options: A distilled 480p I2V variant can run in 8–12 steps, cutting generation time by about 75% and making it practical to run on a single consumer GPU (such as an RTX 4090) or through hosted APIs.

Typical image‑to‑video workflow

Provide a clean, well‑lit source frame (photo or AI image), ideally already close to your target aspect ratio (for example 1280×720).
Add a short motion prompt that clearly states what should move versus what should stay fixed, using spatial language like “foreground”, “background”, “center frame”.
Choose duration (commonly 5 or 8 seconds) and resolution (480p for previews, 720p for final), then generate and iterate, adjusting seed, duration, or motion wording one variable at a time.
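
The preparation steps above can be sketched in a few lines of Python. This is a minimal sketch and not part of the workflow itself: the helper names `center_crop_box`, `frames_for_duration`, and `one_at_a_time` are hypothetical, and both the 24 fps frame rate and the "multiple of 4, plus 1" frame-count rule are assumptions that should be checked against the frame-count setting on the HunyuanVideo15ImageToVideo node in your ComfyUI build.

```python
from fractions import Fraction


def center_crop_box(width, height, target_w=1280, target_h=720):
    """Return (left, top, right, bottom) for the largest centered crop
    that matches the target aspect ratio (1280x720 by default)."""
    target = Fraction(target_w, target_h)
    if Fraction(width, height) > target:
        # Source is too wide: keep the full height and trim the sides.
        new_w = int(height * target)
        left = (width - new_w) // 2
        return (left, 0, left + new_w, height)
    # Source is too tall, or already matches: keep the full width.
    new_h = int(width / target)
    top = (height - new_h) // 2
    return (0, top, width, top + new_h)


def frames_for_duration(seconds, fps=24):
    """Frame count for a clip length.

    ASSUMPTION: 24 fps output and frame counts of the form 4k + 1.
    """
    return 4 * round(seconds * fps / 4) + 1


def one_at_a_time(base, alternatives):
    """Yield copies of `base` that each differ in exactly one setting."""
    for key, values in alternatives.items():
        for value in values:
            if value != base[key]:
                yield {**base, key: value}


# A square AI image cropped to 16:9, then a 5-second preview plan.
print(center_crop_box(1024, 1024))   # (0, 224, 1024, 800)
print(frames_for_duration(5))        # 121

base = {"seed": 1, "seconds": 5, "prompt": "slow push-in, background clouds drift"}
for variant in one_at_a_time(base, {"seed": [2, 3], "seconds": [8]}):
    print(variant)
```

Each variant changes the seed or the duration but never both, so a better or worse result can be traced to a single setting.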

Strengths in a workflow stack

Pairs well with high‑end image models: generate a sharp keyframe (for example with HunyuanImage 3.0 or Qwen‑Image‑2512), then hand it to HunyuanVideo 1.5 I2V to animate.
Has native ComfyUI support and example workflows, so it slots neatly into node‑based pipelines with clear nodes for latent creation, I2V conditioning, and decode.
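
As a rough map of how the node list on this page lines up with those stages, the sketch below groups the node names by role. Only the node names come from the Nodes & Models list; the stage labels and the grouping are assumptions made for illustration and do not describe the workflow's actual wiring. The `stage_of` helper is hypothetical.

```python
# Node names are copied from the page's Nodes & Models list.
# ASSUMPTION: the stage grouping is illustrative, not the real graph connections.
STAGES = {
    "load": ["UNETLoader", "DualCLIPLoader", "CLIPVisionLoader",
             "VAELoader", "LoadImage"],
    "encode": ["CLIPTextEncode", "CLIPVisionEncode"],
    "latent + I2V conditioning": ["HunyuanVideo15ImageToVideo"],
    "sampling": ["ModelSamplingSD3", "EasyCache", "RandomNoise",
                 "KSamplerSelect", "BasicScheduler", "CFGGuider",
                 "SamplerCustomAdvanced"],
    "decode": ["VAEDecode", "VAEDecodeTiled"],
    "output": ["CreateVideo", "SaveVideo"],
}


def stage_of(node_name):
    """Return the assumed stage for a node name, or None if it is not listed."""
    for stage, nodes in STAGES.items():
        if node_name in nodes:
            return stage
    return None


print(stage_of("HunyuanVideo15ImageToVideo"))  # latent + I2V conditioning
print(stage_of("VAEDecodeTiled"))              # decode
```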

When to choose HunyuanVideo 1.5 I2V

Short character vignettes, product hero shots, or landscape moves where the still frame is already strong and you mainly need motion.
Scenarios where consumer‑GPU friendliness and predictable 5–8 second 720p clips matter more than ultra‑long or 4K video.