Kling O3 Video to Video — Standard Reference

API

Video to Video

Generates in about 5 mins 3 secs

floyoofficial

Nodes & Models

Floyo Partner Nodes

KlingO3StandardVideoToVideoReference_floyo

Ver Private

Comm Use

VideoToFrames

ComfyUI Official

WorkflowGraphics

LoadVideo

ComfyUI-VideoHelperSuite

VHS_VideoCombine

ComfyUI-S3-IO

VHS_VideoCombine

Upload a source video and a reference image, describe the scene, and Kling O3 generates a new clip where the subject matches your reference. The source video provides scene and motion context. The reference image controls who or what appears in the output.

This is the reference mode. Use it when the subject's appearance needs to match something specific.

How do you use Kling O3 video to video with a reference image?

Upload a source clip and a reference image, write a prompt describing the action and scene, and Kling O3 generates a video where the subject matches your reference. Duration, aspect ratio, and shot type are all configurable.

Prompt: Describe the action and setting. The example in the workflow: "man walking in the streets of NYC." The reference image handles the subject's appearance, so your prompt doesn't need to describe them. Focus on what they're doing and where.

Reference image (image_1 / element_1_frontal_image): The core input. Upload a clear image of the character or subject you want in the video. The frontal image slot is for a face or full-body reference. The closer the reference matches what you're describing, the more consistent the output.

Duration: 5 seconds by default. Shorter clips give the model less to manage and tend to hold subject consistency better. Increase it if the action needs more time to play out.

Shot type: "Customize" by default, letting your prompt steer the framing. Switch to a specific option to lock in a camera angle regardless of the prompt.

Aspect ratio: "Auto" by default, which matches the source video. Override it if you need a specific output format (16:9, 9:16, 1:1).

Keep audio: On by default. The original audio track carries into the output. Turn it off if the audio doesn't match the new scene.
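If you drive the workflow from a script, it can help to keep these settings in one place. The sketch below is only an illustration of the defaults described above, not the Floyo or Kling API: the class name, the field names (other than keep_audio), and the list of aspect ratios are assumptions, and the real node may accept other values.

```python
from dataclasses import dataclass, asdict

# Assumption: only the ratios named on this page; the real node may allow more.
ASPECT_RATIOS = {"auto", "16:9", "9:16", "1:1"}


@dataclass
class ReferenceVideoSettings:
    """Hypothetical container for the inputs described above."""
    prompt: str                    # action and setting, not the subject's looks
    source_video: str              # hypothetical field: path to the source clip
    reference_image: str           # goes into the image_1 / frontal image slot
    duration_seconds: int = 5      # default stated on this page
    shot_type: str = "customize"   # default: the prompt steers the framing
    aspect_ratio: str = "auto"     # default: match the source video
    keep_audio: bool = True        # default: carry the source audio over

    def validate(self) -> None:
        """Raise ValueError if a setting contradicts the guidance above."""
        if not self.prompt.strip():
            raise ValueError("prompt must describe the action and setting")
        if not self.source_video or not self.reference_image:
            raise ValueError("both a source video and a reference image are required")
        if self.duration_seconds <= 0:
            raise ValueError("duration must be a positive number of seconds")
        if self.aspect_ratio not in ASPECT_RATIOS:
            raise ValueError(f"unsupported aspect ratio: {self.aspect_ratio}")


settings = ReferenceVideoSettings(
    prompt="man walking in the streets of NYC",
    source_video="source.mp4",        # placeholder file name
    reference_image="reference.png",  # placeholder file name
)
settings.validate()
payload = asdict(settings)  # plain dict, e.g. for logging or a request body
```

Only the prompt and the two files need to be supplied; everything else falls back to the defaults listed above, so a run with no overrides behaves like the workflow as published.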

What is Kling O3 reference-controlled video editing good for?

Reference mode is for when you need a specific person or character to appear in generated video. The edit mode gives you action control. Reference mode adds appearance control on top of that: the subject in the output matches the reference image you upload.

Good scenarios: you have a character reference (portrait, product shot, concept art) and need them in a scene with natural motion. You're producing character-driven content across multiple clips and need face consistency. You want to match a real person's likeness to AI-generated footage.

Reference mode is not the right tool for pure style edits or action changes where subject appearance doesn't matter. For those, the standard edit workflow is faster and has one fewer input to manage.

FAQ

What's the difference between Kling O3 edit mode and reference mode?
Edit mode rewrites the action in your clip from a text prompt alone. Reference mode does the same but also uses an image you provide to anchor what the subject looks like in the output. Same base model, one extra input, much more control over the subject.

What makes a good reference image for Kling O3?
A well-lit, front-facing image with the subject clearly visible against a clean background. Avoid partial crops or heavily stylized images. The cleaner the reference, the more reliably the model carries the appearance into the video.

How long can Kling O3 reference videos be?
Default is 5 seconds. Shorter clips hold subject consistency better. Increase duration if the action genuinely needs it, but expect more variance over longer clips.

Does Kling O3 reference mode preserve the original audio?
Yes. keep_audio is on by default and the source audio passes into the output MP4. Turn it off if you're scoring the video separately.
