floyo logo
Powered by
ThinkDiffusion

Wan 2.1 Vid2Vid Style Transfer with Ditto

Upload any video, describe a new style, and Wan 2.1 rewrites every frame. Ditto keeps motion and structure intact across anime, Pixar, clay, and dozens more.


Nodes & Models

VAELoader
wan_2.1_vae.safetensors
GetNode
INTConstant
Note
CLIPLoader
umt5_xxl_fp16.safetensors
DiffusionModelSelector
ditto_global_comfy.safetensors
SetNode
DiffusionModelLoaderKJ
Wan2_1-T2V-14B_fp8_e4m3fn.safetensors
ImageResizeKJv2
TorchCompileModelWanVideoV2
wanBlockSwap
PatchModelPatcherOrder
ModelSamplingSD3
CLIPTextEncode
WanVaceToVideo
KSampler
TrimVideoLatent
VAEDecode
VHS_LoadVideo
VHS_VideoInfo
VHS_VideoCombine

Description:

Restyle any video using Wan 2.1 14B with the Ditto model for video-to-video style transfer.

Upload a video, write a short style prompt like "Make it Ghibli style" or "Turn it into a pencil sketch," and the workflow transforms your clip frame by frame. The CausVid LoRA brings sampling down to 4 steps, so you get results fast without sacrificing coherence.

Output is 832x480 at 24fps, with 125 frames per run by default (about 5 seconds of video).

How do you restyle a video with Wan 2.1 and Ditto?

Upload a source video, describe the target style in the positive prompt, and hit run. The Ditto model handles the style conversion while Wan 2.1's VACE pipeline keeps the motion and structure of your original clip intact. CausVid LoRA drops the step count to 4 for faster generation.

Source Video: Your input clip. The workflow loads it, resizes it to your target resolution, and uses it as the control video. Shorter clips process faster. Start with 2-3 second clips while you dial in a style, then go longer.

Positive Prompt: This is where you tell the model what style to apply. Short, direct instructions work best. "Make it Ghibli style." "Turn it into pixel art." "Make it a charcoal drawing." You can combine ideas: "Make it cyberpunk with neon rain." The Note in the workflow has 30+ tested style prompts to get you started.

Negative Prompt: Pre-filled with quality terms in Chinese and English that work well with Wan 2.1. You can leave this alone for most runs.

Width and Height: Default is 832x480. Both values need to be divisible by 16. Going higher increases VRAM usage and generation time. For testing styles, drop to 512x320 to iterate faster.
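If your source clip has an unusual size, the divisible-by-16 rule can be checked before you type values into the workflow. The helper below is only an illustrative sketch: snap_to_multiple and snap_resolution are hypothetical names, not part of the workflow or of ComfyUI, and rounding to the nearest multiple is an assumption about what you want (you may prefer to always round down).

```python
def snap_to_multiple(value: int, multiple: int = 16) -> int:
    """Round value to the nearest multiple (never below one multiple)."""
    return max(multiple, round(value / multiple) * multiple)


def snap_resolution(width: int, height: int, multiple: int = 16) -> tuple:
    """Return a (width, height) pair where both sides are divisible by `multiple`."""
    return snap_to_multiple(width, multiple), snap_to_multiple(height, multiple)


# The documented defaults already satisfy the rule and pass through unchanged.
print(snap_resolution(832, 480))   # (832, 480)
print(snap_resolution(512, 320))   # (512, 320)
# A common 854x480 clip is nudged to the nearest valid width.
print(snap_resolution(854, 480))   # (848, 480)
```

Snapping changes the aspect ratio slightly, so expect a small crop or stretch when the workflow resizes the source.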

Frames: Default is 125. This controls how many frames of your source video get processed. More frames means a longer output video and a longer generation time.
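To relate the frame count to clip length, divide by the output frame rate: 125 frames at 24fps is a little over 5 seconds. The sketch below does that arithmetic and also goes the other way. The function names are hypothetical. The 4n+1 rounding in frames_for_seconds is an assumption: Wan-family workflows are commonly run with frame counts of that form (125 fits the pattern), but this page does not state it as a requirement.

```python
def clip_seconds(frames: int, fps: float = 24.0) -> float:
    """Length of the output clip in seconds."""
    return frames / fps


def frames_for_seconds(seconds: float, fps: float = 24.0) -> int:
    """Frame count closest to `seconds`, rounded to the form 4n + 1.

    The 4n + 1 shape is an assumption, not something this workflow documents.
    """
    target = seconds * fps
    n = max(0, round((target - 1) / 4))
    return 4 * n + 1


print(round(clip_seconds(125), 2))  # 5.21
print(frames_for_seconds(2.0))      # 49
print(frames_for_seconds(3.0))      # 73
```

The 2-3 second test clips suggested above therefore correspond to roughly 49 to 73 frames at 24fps.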

Seed: Set to randomize by default. Lock it to a specific number when you want to compare different prompts on the same source video with identical noise.

Steps: Set to 4, thanks to the CausVid LoRA. Increasing beyond 4 gives diminishing returns with this LoRA active.

CFG: Set to 1.2. Low CFG values work better with CausVid. Going above 2 can introduce artifacts.
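The defaults and rules of thumb above can be collected in one place, which is handy if you keep notes on the runs you try. This is a minimal sketch, not the workflow's actual format: RunSettings and its field names are hypothetical, and the checks simply restate the guidance in this section.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RunSettings:
    """Hypothetical record of the workflow inputs described above."""
    width: int = 832
    height: int = 480
    frames: int = 125
    steps: int = 4
    cfg: float = 1.2
    seed: Optional[int] = None  # None stands for "randomize"

    def warnings(self) -> List[str]:
        """Return reminders for values outside the documented guidance."""
        notes = []
        if self.width % 16 or self.height % 16:
            notes.append("width and height should both be divisible by 16")
        if self.steps > 4:
            notes.append("steps above 4 give diminishing returns with CausVid")
        if self.cfg > 2:
            notes.append("CFG above 2 can introduce artifacts")
        return notes


# Defaults raise no warnings; a quick style test keeps everything but the size.
print(RunSettings().warnings())                          # []
print(RunSettings(width=512, height=320).warnings())     # []
print(RunSettings(width=500, cfg=3.0).warnings())
```

Setting seed to a fixed integer in such a record mirrors locking the seed in the workflow when you compare prompts.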

What is Wan 2.1 Ditto video style transfer good for?

Ditto excels at global style changes where you want the motion and composition of the original video to stay the same but the visual style to change completely. It handles artistic styles (anime, sketch, painting) and material transformations (gold, ice, chocolate) with strong temporal consistency.

This workflow shines when you have a real-world clip and want to see it in a completely different visual language. Product videos restyled as animations. Live-action footage turned into concept art. Personal clips transformed into cartoon or illustration styles.

The 4-step CausVid setup makes experimentation practical. You can try ten different styles in the time a standard 30-step workflow would finish one.

Ditto works best for full-frame style changes. If you need localized edits (changing one object while keeping the rest), a different approach like inpainting or targeted vid2vid would serve you better.

FAQ

What style prompts work best with Wan 2.1 Ditto?

Short, direct instructions. "Make it Ghibli style" outperforms long descriptive paragraphs. The workflow includes 30+ tested prompts covering anime, sketch, painting, sculpture, and material styles. Start with those and tweak from there.

How many frames can Wan 2.1 Ditto process at once?

The default is 125 frames at 832x480. You can go higher, but VRAM and processing time scale up. For longer videos, consider splitting them into segments and running each one separately.
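Splitting a longer video into segments comes down to computing frame ranges no longer than the per-run frame count. The sketch below shows one way to plan those ranges. split_into_segments is a hypothetical helper, and the mapping to loader inputs in the comment is an assumption about how you would feed each range into the workflow.

```python
from typing import List, Tuple


def split_into_segments(total_frames: int, segment_frames: int = 125) -> List[Tuple[int, int]]:
    """Return (start, end) frame index pairs, with end exclusive."""
    if total_frames < 0 or segment_frames <= 0:
        raise ValueError("total_frames must be >= 0 and segment_frames > 0")
    return [
        (start, min(start + segment_frames, total_frames))
        for start in range(0, total_frames, segment_frames)
    ]


# A 300-frame clip becomes two full runs and one shorter final run.
for start, end in split_into_segments(300):
    # Assumption: `start` is the number of frames to skip in the video loader
    # and `end - start` is the number of frames to load for that run.
    print(start, end - start)
```

Each segment is stylized independently, so the look may shift slightly at the seams. Using the same prompt and a locked seed for every segment is a reasonable starting point, though it does not guarantee a seamless join.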

What resolution should I use for Wan 2.1 Ditto style transfer?

832x480 is the default and works well for most styles. Keep both dimensions divisible by 16. Drop to 512x320 for quick style tests, then scale up once you find a look you like.

Can I change the CausVid LoRA strength in this workflow?

It is set to 1 by default, which is optimized for 4-step generation. Lowering it means you will need more sampling steps to get clean results. For most use cases, leave it at 1 and keep steps at 4.

How do I run Wan 2.1 Ditto style transfer online?

You can run Wan 2.1 Ditto style transfer online through Floyo. No installation, no setup. Open the workflow in your browser, upload your video, and hit run. Free to try.
