floyo logo
Powered by
ThinkDiffusion
Pricing
Wan 2.7 is now live. Check it out 👉🏼

WAN LORAS

Train & use custom WAN LoRAs on your own products, characters, styles and ideas so the model learns your exact look and style.

Step 1: Build Your Dataset

Qwen 2511 Edit - Single Image to Character Dataset

Character Dataset

Prompt List

Qwen 2511

Create a 60-image character dataset from one character image or sheet.

Image to Character Sheet with Kontext

Character Sheet

Flux

Image to Image

Kontext

Create a character sheet with multiple poses and expressions from a single image!

A great LoRA starts with a great dataset. If you already have a set of images ready, skip ahead to Step 2. If you need to generate one from scratch, the Qwen 2511 Edit workflow produces a consistent, varied character dataset from a single reference image. You can optionally start from a character sheet, which is the most reliable way to keep your character consistent from front and back across the whole dataset. To make one, use the Image to Character Sheet with Kontext workflow or your favorite text-to-image workflow.

The Qwen 2511 Edit workflow uses Qwen Image Edit to generate multiple variations of your character with different poses, angles, expressions, and lighting while keeping facial identity consistent. It outputs a ready-to-use dataset folder.

What Makes a Good Dataset?

  • 15-25 images is the sweet spot for character and subject LoRAs. As few as 9 can work if they are excellent quality. Quality always beats quantity.

  • Vary your angles - front, side, three-quarter, from above.

  • Vary your lighting - natural, studio, warm, cool, dramatic, soft.

  • Vary poses and expressions - do not rely on the same headshot repeated.

  • Include non-portrait shots - environmental, hands, partial body. This prevents the model from locking into one composition.

  • Mix your backgrounds - if every image has the same backdrop, the model will associate your subject with that setting.

  • No watermarks, heavy filters, blur, or compression artifacts.

  • Avoid near-duplicate images - slight crop variations of the same photo increase overfitting risk.
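
Some of the checks above can be automated before you upload a dataset. The following is a minimal sketch, not part of any Floyo workflow: the function name audit_dataset and the list of image extensions are assumptions made for illustration. It counts images against the ranges given above and flags byte-identical files. It does not detect near-duplicates such as slightly different crops, which still need a visual review.

```python
import hashlib
from pathlib import Path

# Assumed set of image extensions; adjust to match your own files.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}


def audit_dataset(folder):
    """Return (image_count, duplicate_groups, notes) for a dataset folder."""
    images = sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTS
    )

    # Group files by content hash. This only catches byte-identical copies,
    # not re-crops or re-encodes of the same photo.
    by_hash = {}
    for path in images:
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        by_hash.setdefault(digest, []).append(path.name)
    duplicates = [names for names in by_hash.values() if len(names) > 1]

    # Compare the image count with the ranges quoted in the guide.
    notes = []
    count = len(images)
    if count < 9:
        notes.append("fewer than 9 images: likely too small")
    elif count < 15:
        notes.append("9-14 images: workable only if quality is excellent")
    elif count > 25:
        notes.append("above the 15-25 range the guide recommends")
    if duplicates:
        notes.append("exact duplicate files found")
    return count, duplicates, notes
```

Calling audit_dataset("my_dataset") on a local copy of the folder returns the image count, any groups of identical files, and a list of plain-text notes to review.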

Step 2: Caption Your Images

Detailed Auto Caption
jacob

Captioning

Flux

LORA Training

Generate high-quality captions for LoRA training and automatically resize images to SDXL/Flux-compatible resolution.

Captions tell the model what it is looking at, which helps it learn your subject without accidentally baking in backgrounds, lighting, or other unintended details.

Each image needs a matching .txt file with the same filename:

my_dataset/
  image01.png
  image01.txt
  image02.png
  image02.txt

Use the auto caption workflow to generate these automatically.
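
If you write or edit captions by hand, it helps to confirm that the pairing rule above holds before training. This is a small sketch under assumptions: find_caption_problems is a hypothetical helper, the extension list is illustrative, and the check runs on a local copy of the dataset folder.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your own files.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}


def find_caption_problems(folder):
    """Report images without captions, captions without images, and empty captions."""
    folder = Path(folder)
    image_stems = {
        p.stem for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTS
    }
    caption_stems = {p.stem for p in folder.glob("*.txt")}

    # An image and its caption must share the same filename stem.
    missing = sorted(image_stems - caption_stems)
    orphaned = sorted(caption_stems - image_stems)

    # A caption file that exists but contains only whitespace is also a problem.
    empty = sorted(
        stem for stem in image_stems & caption_stems
        if not (folder / f"{stem}.txt").read_text(encoding="utf-8").strip()
    )
    return {"missing_caption": missing, "orphaned_caption": orphaned, "empty_caption": empty}
```

A dataset is correctly paired when all three lists in the result are empty.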

Caption Tips

  • Use a unique trigger word - a short nonsense token like zxqperson or floyochar works best. Avoid real words that might already exist in the model vocabulary.

  • Keep captions descriptive and factual: zxqperson, woman with long dark hair, blue denim jacket, standing in park, natural lighting

  • Do not over-describe mood or aesthetics. Words like "moody," "cinematic," or "ethereal" get amplified by the model and become hard to escape at inference time.

  • Stay consistent - use the same caption structure across all images so the model has a clear pattern to learn.
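
The tips above can be turned into a quick lint pass over your captions. The sketch below is illustrative only: lint_caption is a hypothetical helper, the mood-word list contains just the three examples named above, and placing the trigger word first is an assumption taken from the example caption rather than a stated requirement.

```python
import re

# Only the example words from the guide; extend this set for your own style.
MOOD_WORDS = {"moody", "cinematic", "ethereal"}


def lint_caption(caption, trigger_word):
    """Return a list of problems found in one comma-separated caption."""
    problems = []

    # Assumption: the trigger word is the first comma-separated tag,
    # as in the example caption in the guide.
    tags = [tag.strip() for tag in caption.split(",")]
    if tags[0] != trigger_word:
        problems.append("trigger word is not the first tag")

    # Flag mood or aesthetic words that the model may amplify.
    words = set(re.findall(r"[a-z]+", caption.lower()))
    for word in sorted(MOOD_WORDS & words):
        problems.append(f"mood word: {word}")
    return problems
```

Running it on the example caption with trigger word zxqperson returns an empty list, while a caption such as "woman, moody cinematic lighting" is flagged on both counts.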

Step 3: Train Your LoRA

With your dataset captioned and ready, it is time to train with the Musubi Wan Trainer.

Musubi Tuner - Wan LoRA Trainer

LoRA

Musubi Tuner

Training

Wan

Train Wan LoRAs directly on Floyo with this simple all-in-one custom node.

This workflow uses three loader nodes connected to the Musubi Wan Trainer node, which handles the full training loop:

  • Musubi DiT Loader - loads the Wan 2.2 DiT model (wan2.2_t2v_low_noise_14B)

  • Musubi VAE Loader - loads the Wan 2.1 VAE (wan_2.1_vae.safetensors)

  • Musubi Text Encoder Loader - loads the T5 text encoder (t5_umt5-xxl-enc-bf16.pth)

All defaults are already optimized for Wan 2.2. Most users only need to do two things before hitting Queue: set the dataset path and name the LoRA.

Choosing a Task Type

The task setting determines which type of training is performed. The default is t2v-A14B (Wan 2.2 text-to-video), which is the recommended starting point for most users training with image datasets. The trainer supports the following tasks:

  • t2v-1.3B - Wan 2.1 text to video (1.3B). Lightweight, lower resource requirements.

  • t2v-14B - Wan 2.1 text to video (14B). Higher quality output.

  • i2v-14B - Wan 2.1 image to video (14B). Requires CLIP vision encoder and video clip datasets.

  • t2i-14B - Wan 2.1 text to image (14B).

  • t2v-A14B - Wan 2.2 text to video (14B active, MoE architecture). Default and recommended.

  • i2v-A14B - Wan 2.2 image to video (14B active, MoE architecture). Requires video clip datasets.

Important: I2V tasks (i2v-14B and i2v-A14B) require video clip datasets, not standalone images. The I2V training pipeline uses the first frame of each clip as a conditioning image, so a plain image dataset will fail. If you are training with images, use a T2V task instead. T2V LoRAs generally work well for both T2V and I2V inference.
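
The rule above can be expressed as a small pre-flight check. This is a sketch, not trainer code: check_task and the "images"/"video" dataset labels are hypothetical names, and the task identifiers are the ones listed above.

```python
# Task names as listed in the guide.
T2X_TASKS = {"t2v-1.3B", "t2v-14B", "t2i-14B", "t2v-A14B"}
I2V_TASKS = {"i2v-14B", "i2v-A14B"}


def check_task(task, dataset_kind):
    """Return (ok, message) for a task and a dataset kind ("images" or "video")."""
    if task not in T2X_TASKS | I2V_TASKS:
        raise ValueError(f"unknown task: {task}")
    if dataset_kind not in {"images", "video"}:
        raise ValueError(f"unknown dataset kind: {dataset_kind}")

    # I2V training conditions on the first frame of each clip,
    # so a plain image dataset cannot be used.
    if task in I2V_TASKS and dataset_kind == "images":
        return False, "I2V tasks need video clips; use a T2V task for image datasets"
    return True, ""
```

For example, check_task("i2v-A14B", "images") reports a problem, while check_task("t2v-A14B", "images") passes.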

Image Datasets vs. Video Datasets

This guide focuses on image datasets, and we recommend starting there. Image-based training is faster, more cost-effective, and produces excellent results for character and style LoRAs, even though the output model generates video.

Video dataset training requires significantly more compute time and FloTime. A training run that takes 1-2 hours with images can take many hours or even days with video clips, depending on clip count, resolution, and frame length. The costs scale accordingly, and longer training runs leave less room for the iteration that good LoRA training usually requires. Most users find that 2-4 training attempts are needed to dial in the right settings, and that is much more practical at image training speeds and costs.

If you do train with video clips, keep them short (9-25 frames), use a small dataset, and save checkpoints frequently so you can evaluate progress without committing to a full run.

Setting Your Dataset Path

  1. Open the Floyo file browser by clicking the folder icon on the left side of the canvas.

  2. Navigate to your dataset folder inside #inputs.

  3. Click the three-dot menu on your dataset folder and select "Copy Path".

  4. Paste the path into the data path field on the trainer node. Your path should look like #inputs/my_dataset. If your dataset lives outside of #inputs, use "Copy path as input" instead, which adds the (as-input) prefix automatically.

For more detailed guidance on the Floyo file browser, see the Floyo documentation.

Name Your LoRA

In the output_name field, type the name you want your LoRA saved as. The trained file will appear in #models/loras when complete.

Key Settings

  • task (default: t2v-A14B) - Training task type. See "Choosing a Task Type" above.

  • data_path (no default) - Path to your dataset folder (required).

  • output_dir (default: #models/loras) - Where your trained LoRA is saved.

  • output_name (default: floyo_wan_lora) - Name of your saved LoRA file (change this!).

  • resolution (default: 848,480) - Training resolution.

  • batch_size (default: 1) - Images processed per training step.

  • max_train_epochs (default: 16) - Number of full passes through your dataset.

  • save_every_n_epochs (default: 0) - Save a checkpoint every N epochs. 0 = only save at the end.

  • learning_rate (default: 0.0001) - How fast the model learns. Lower is more stable.

  • network_dim (default: 16) - LoRA rank, higher captures more detail but creates a larger file.

  • discrete_flow_shift (default: 0.0) - Flow matching shift value. Leave at default unless you know what you're doing.

  • gradient_checkpointing (default: false) - Trades speed for lower memory usage.

  • fp8_base (default: false) - Use fp8 precision for the base model.

  • blocks_to_swap (default: 0) - Offload transformer blocks to CPU to save GPU memory.

  • target_frames (default: 81) - Number of frames the model targets per training sample.

  • frame_extraction (default: head) - How frames are extracted from training data.
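
To get a feel for how batch_size, max_train_epochs, and save_every_n_epochs interact, here is a rough back-of-the-envelope sketch. It assumes each image is seen once per epoch with no dataset repeats or gradient accumulation, and that the final epoch is always saved; the actual trainer may count steps differently. Both function names are hypothetical.

```python
import math


def estimate_steps(num_images, batch_size=1, epochs=16):
    """Estimate total optimizer steps: batches per epoch times epochs."""
    steps_per_epoch = math.ceil(num_images / batch_size)
    return steps_per_epoch * epochs


def checkpoint_epochs(epochs=16, save_every_n_epochs=0):
    """List the epochs at which a checkpoint would be written."""
    if save_every_n_epochs <= 0:
        # 0 means only the final LoRA is saved.
        return [epochs]
    saved = list(range(save_every_n_epochs, epochs + 1, save_every_n_epochs))
    # Assumption: the final result is saved even if it is not a multiple of N.
    if not saved or saved[-1] != epochs:
        saved.append(epochs)
    return saved
```

With the defaults, a 20-image dataset works out to 20 steps per epoch and 320 steps in total under these assumptions, and setting save_every_n_epochs to 4 would give checkpoints at epochs 4, 8, 12, and 16.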

Quick Tuning Tips

  • Want to compare checkpoints? Set save_every_n_epochs to 1 so you can test each epoch and pick the best one.

  • Want more detail captured? Try network_dim at 32. The file will be larger but the LoRA can learn finer features.

  • Overfitting or too prompt-rigid? Lower the learning rate to 0.00005 or reduce epochs.

  • Training too slow or running out of memory? Enable gradient_checkpointing and raise blocks_to_swap above 0 to offload transformer blocks to CPU.
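
The tips above amount to a small set of setting overrides. The sketch below records them as data so that changes between training attempts are easy to track. The symptom labels and the apply_tips helper are hypothetical, and the blocks_to_swap value of 8 is only a placeholder, since the guide does not recommend a specific number.

```python
# Overrides taken from the tips above; keys are node setting names.
TIPS = {
    "compare_checkpoints": {"save_every_n_epochs": 1},
    "more_detail": {"network_dim": 32},
    "overfitting": {"learning_rate": 0.00005},
    # Placeholder value: the guide does not give a number for blocks_to_swap.
    "out_of_memory": {"gradient_checkpointing": True, "blocks_to_swap": 8},
}


def apply_tips(settings, symptoms):
    """Return a copy of settings with the overrides for each symptom applied."""
    tuned = dict(settings)
    for symptom in symptoms:
        if symptom not in TIPS:
            raise KeyError(f"no tip recorded for: {symptom}")
        tuned.update(TIPS[symptom])
    return tuned
```

Because apply_tips returns a new dictionary, the original settings stay available for comparison with the tuned run.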

Step 4: Test Your LoRA

Wan 2.2 14b - Text to Video w/ LoRA

LoRA

Text to Video

Wan2.2

Run Wan 2.2 14b with a custom LoRA

Once training is complete, your LoRA is saved to #models/loras and ready to use immediately. Load it into the testing workflow to see your results.

Testing Tips

  • Start with a LoRA strength of 0.7 to 0.8 and adjust from there.

  • If outputs look too locked-in to your training data, reduce the strength.

  • If your concept is not showing up strongly enough, increase it.

  • Use your trigger word in the prompt to activate the learned concept.

  • If you saved multiple epoch checkpoints, test each one and compare. Earlier epochs are often more flexible, later epochs more faithful to identity.
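
When comparing several checkpoints, it can help to plan the test runs as a grid of checkpoints and strengths using one fixed prompt. The following is a minimal sketch: build_test_plan is a hypothetical helper, the checkpoint filenames you pass in are whatever your own run produced, and the plan is only a list of settings to enter in the testing workflow, not something that runs inference.

```python
from itertools import product


def build_test_plan(checkpoints, prompt, trigger_word, strengths=(0.7, 0.8)):
    """Pair every checkpoint with every strength for a fixed prompt."""
    # The trigger word must be in the prompt to activate the learned concept.
    if trigger_word not in prompt:
        raise ValueError("prompt does not contain the trigger word")
    return [
        {"lora": checkpoint, "strength": strength, "prompt": prompt}
        for checkpoint, strength in product(checkpoints, strengths)
    ]
```

Keeping the prompt fixed across the grid makes differences between epochs and strengths easier to attribute to the LoRA itself.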

OVERVIEW

Train a custom Wan 2.2 LoRA from scratch, from building your dataset all the way to testing your results. This guide walks through four simple steps, each backed by a dedicated Floyo workflow.