
WAN LoRAs
Train & use custom WAN LoRAs on your own products, characters, styles and ideas so the model learns your exact look and style.
Step 1: Build Your Dataset
Qwen 2511 Edit - Single Image to Character Dataset
Create a 60 image character dataset from one character image or sheet.
Image to Character Sheet with Kontext
Create a character sheet with multiple poses and expressions from a single image!

A great LoRA starts with a great dataset. If you already have a set of images ready, you can skip ahead to Step 2. If you need to generate one from scratch, the Qwen 2511 Edit workflow makes it easy to produce a consistent, varied character dataset from a single reference image. Optionally, start from a character sheet: it is the best way to keep your character consistent from the front and back across the whole dataset. To make one, use the Image to Character Sheet with Kontext workflow, or your favorite text-to-image workflow!
This workflow uses Qwen Image Edit to generate multiple variations of your character with different poses, angles, expressions, and lighting while keeping facial identity consistent. It outputs a ready-to-use dataset folder.
What Makes a Good Dataset?
15-25 images is the sweet spot for character and subject LoRAs. As few as 9 can work if they are excellent quality. Quality always beats quantity.
Vary your angles - front, side, three-quarter, from above.
Vary your lighting - natural, studio, warm, cool, dramatic, soft.
Vary poses and expressions - do not rely on the same headshot repeated.
Include non-portrait shots - environmental, hands, partial body. This prevents the model from locking into one composition.
Mix your backgrounds - if every image has the same backdrop, the model will associate your subject with that setting.
No watermarks, heavy filters, blur, or compression artifacts.
Avoid near-duplicate images - slight crop variations of the same photo increase overfitting risk.
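Parts of this checklist can be sanity-checked automatically before you move on to captioning. A minimal sketch (the folder name is a placeholder, and only byte-identical duplicates are detected, not near-duplicates):

```python
import hashlib
from pathlib import Path

IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}

def audit(folder: str) -> dict:
    """Count images and flag byte-identical duplicate files in a dataset folder."""
    root = Path(folder)
    images = [p for p in sorted(root.iterdir()) if p.suffix.lower() in IMAGE_EXTS] if root.exists() else []
    seen, dupes = {}, []
    for p in images:
        digest = hashlib.sha256(p.read_bytes()).hexdigest()
        if digest in seen:
            dupes.append((seen[digest], p.name))  # exact duplicate pair
        else:
            seen[digest] = p.name
    return {
        "count": len(images),
        "in_sweet_spot": 15 <= len(images) <= 25,  # guideline from this section
        "duplicates": dupes,
    }

print(audit("my_dataset"))  # "my_dataset" is a placeholder path
```

Subtler problems like repeated angles, identical backgrounds, or slight crops of the same photo still need a manual pass.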
Step 2: Caption Your Images
Detailed Auto Caption
Generate high-quality captions for LoRA training and automatically resize images to SDXL/Flux-compatible resolution.

Captions tell the model what it is looking at, which helps it learn your subject without accidentally baking in backgrounds, lighting, or other unintended details.
Each image needs a matching .txt file with the same filename:
my_dataset/
  image01.png
  image01.txt
  image02.png
  image02.txt
Use the auto caption workflow to generate these automatically.
Caption Tips
Use a unique trigger word - a short nonsense token like zxqperson or floyochar works best. Avoid real words that might already exist in the model vocabulary.
Keep captions descriptive and factual: zxqperson, woman with long dark hair, blue denim jacket, standing in park, natural lighting
Do not over-describe mood or aesthetics. Words like "moody," "cinematic," or "ethereal" get amplified by the model and become hard to escape at inference time.
Stay consistent - use the same caption structure across all images so the model has a clear pattern to learn.
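Before training, it is worth verifying that every image has a matching caption file and that each caption starts with your trigger word. A small sketch (folder name and trigger word are placeholders):

```python
from pathlib import Path

IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}

def check_captions(folder: str, trigger: str) -> list:
    """List pairing problems: images missing .txt captions, or captions without the trigger word."""
    root = Path(folder)
    images = sorted(root.iterdir()) if root.exists() else []
    problems = []
    for img in images:
        if img.suffix.lower() not in IMAGE_EXTS:
            continue
        txt = img.with_suffix(".txt")  # caption shares the image's filename
        if not txt.exists():
            problems.append(f"{img.name}: missing caption file")
        elif not txt.read_text().strip().startswith(trigger):
            problems.append(f"{img.name}: caption does not start with '{trigger}'")
    return problems

for issue in check_captions("my_dataset", "zxqperson"):  # both arguments are placeholders
    print(issue)
```

An empty result means every image is paired and triggered consistently.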
Step 3: Train Your LoRA
With your dataset captioned and ready, it is time to train with the Musubi Wan Trainer.
Musubi Tuner - Wan LoRA Trainer
Train Wan LoRAs directly on Floyo with this simple all-in-one custom node.

This workflow uses three loader nodes connected to the Musubi Wan Trainer node, which handles the full training loop:
Musubi DiT Loader - loads the Wan 2.2 DiT model (wan2.2_t2v_low_noise_14B)
Musubi VAE Loader - loads the Wan 2.1 VAE (wan_2.1_vae.safetensors)
Musubi Text Encoder Loader - loads the T5 text encoder (t5_umt5-xxl-enc-bf16.pth)
All defaults are already optimized for Wan 2.2. Most users only need to do two things before hitting Queue.
Choosing a Task Type
The task setting determines which type of training is performed. The default is t2v-A14B (Wan 2.2 text-to-video), which is the recommended starting point for most users training with image datasets. The trainer supports the following tasks:
t2v-1.3B - Wan 2.1 text to video (1.3B). Lightweight, lower resource requirements.
t2v-14B - Wan 2.1 text to video (14B). Higher quality output.
i2v-14B - Wan 2.1 image to video (14B). Requires CLIP vision encoder and video clip datasets.
t2i-14B - Wan 2.1 text to image (14B).
t2v-A14B - Wan 2.2 text to video (14B active, MoE architecture). Default and recommended.
i2v-A14B - Wan 2.2 image to video (14B active, MoE architecture). Requires video clip datasets.
Important: I2V tasks (i2v-14B and i2v-A14B) require video clip datasets, not standalone images. The I2V training pipeline uses the first frame of each clip as a conditioning image, so a plain image dataset will fail. If you are training with images, use a T2V task instead. T2V LoRAs generally work well for both T2V and I2V inference.
Image Datasets vs. Video Datasets
This guide focuses on image datasets, and we recommend starting there. Image-based training is faster, more cost-effective, and produces excellent results for character and style LoRAs, even though the output model generates video.
Video dataset training requires significantly more compute time and FloTime. A training run that takes 1-2 hours with images can take many hours or even days with video clips, depending on clip count, resolution, and frame length. The costs scale accordingly, and longer training runs leave less room for the iteration that good LoRA training usually requires. Most users find that 2-4 training attempts are needed to dial in the right settings, and that is much more practical at image training speeds and costs.
If you do train with video clips, keep them short (9-25 frames), use a small dataset, and save checkpoints frequently so you can evaluate progress without committing to a full run.
Setting Your Dataset Path
Open the Floyo file browser by clicking the folder icon on the left side of the canvas.
Navigate to your dataset folder inside #inputs.
Click the three-dot menu on your dataset folder and select "Copy Path".
Paste the path into the data path field on the trainer node. Your path should look like #inputs/my_dataset. If your dataset lives outside of #inputs, use "Copy path as input" instead, which adds the (as-input) prefix automatically.

For more detailed guidance on the Floyo file browser, see the documentation.
Name Your LoRA
In the output_name field, type the name you want your LoRA saved as. The trained file will appear in #models/loras when complete.
Key Settings
task (default: t2v-A14B) - Training task type. See "Choosing a Task Type" above.
data_path (no default) - Path to your dataset folder (required).
output_dir (default: #models/loras) - Where your trained LoRA is saved.
output_name (default: floyo_wan_lora) - Name of your saved LoRA file (change this!).
resolution (default: 848,480) - Training resolution.
batch_size (default: 1) - Images processed per training step.
max_train_epochs (default: 16) - Number of full passes through your dataset.
save_every_n_epochs (default: 0) - Save a checkpoint every N epochs; 0 = only save at the end.
learning_rate (default: 0.00010) - How fast the model learns; lower is more stable.
network_dim (default: 16) - LoRA rank; higher captures more detail but creates a larger file.
discrete_flow_shift (default: 0.0) - Flow matching shift value. Leave at default unless you know what you're doing.
gradient_checkpointing (default: false) - Trades speed for lower memory usage.
fp8_base (default: false) - Use fp8 precision for the base model.
blocks_to_swap (default: 0) - Offload transformer blocks to CPU to save GPU memory.
target_frames (default: 81) - Number of frames the model targets per training sample.
frame_extraction (default: head) - How frames are extracted from training data.
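For a back-of-the-envelope sense of run length: each epoch is one full pass over the dataset, so total optimizer steps come out to roughly images × epochs ÷ batch size (per-image repeats, if configured, multiply this). A quick sketch with a hypothetical 20-image dataset and the defaults above:

```python
import math

def total_steps(num_images: int, epochs: int, batch_size: int, repeats: int = 1) -> int:
    """Optimizer steps for a full run: ceil(images * repeats / batch_size) per epoch."""
    return math.ceil(num_images * repeats / batch_size) * epochs

# 20 captioned images, default 16 epochs, batch size 1:
print(total_steps(20, 16, 1))  # 320
```

This is useful when comparing settings: doubling epochs or repeats doubles the step count, and with it the training time and cost.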
Quick Tuning Tips
Want to compare checkpoints? Set save_every_n_epochs to 1 so you can test each epoch and pick the best one.
Want more detail captured? Try network_dim at 32. The file will be larger but the LoRA can learn finer features.
Overfitting or too prompt-rigid? Lower the learning rate to 0.00005 or reduce epochs.
Training too slow or running out of memory? Enable gradient_checkpointing and try setting blocks_to_swap to offload some work to CPU.
Step 4: Test Your LoRA
Wan 2.2 14b - Text to Video w/ LoRA
Run Wan 2.2 14b with a custom LoRA

Once training is complete, your LoRA is saved to #models/loras and ready to use immediately. Load it into the testing workflow to see your results.
Testing Tips
Start with a LoRA strength of 0.7 to 0.8 and adjust from there.
If outputs look too locked-in to your training data, reduce the strength.
If your concept is not showing up strongly enough, increase it.
Use your trigger word in the prompt to activate the learned concept.
If you saved multiple epoch checkpoints, test each one and compare. Earlier epochs are often more flexible, later epochs more faithful to identity.
That is the full process: a custom Wan 2.2 LoRA trained from scratch, from building your dataset all the way to testing your results, with each of the four steps backed by a dedicated Floyo workflow.