Horizontal to Vertical Video Reframe
Video to Video
Wan2.1
Nodes & Models
VHS_LoadVideoFFmpeg
VHS_VideoInfoSource
VHS_LoadVideoFFmpeg
VHS_VideoInfoSource
UNETLoader
wan2.1_vace_14B_fp16.safetensors
PrimitiveInt
WorkflowGraphics
MarkdownNote
LoraLoader
#community_models/loras/Wan2.1_I2V_14B_FusionX_LoRA-horizontal-to-vertical-v-e3eVCrtp.safetensors
ModelSamplingSD3
CLIPTextEncode
ImagePadForOutpaint
MaskToImage
RepeatImageBatch
ImageToMask
WanVaceToVideo
KSampler
VAEDecode
CreateVideo
SaveVideo
ImageResize+
SimpleMath+
ImageResize+
DisplayAny
SimpleMath+
DisplayAny
easy compare
Horizontal to Vertical Video Conversion Workflow (Wan 2.1 VACE)
This workflow converts horizontal videos into vertical format using AI outpainting with Wan 2.1 VACE: instead of cropping important content, the scene is expanded naturally above and below the original frame.
The workflow begins by loading the source video using the VHS Load Video FFmpeg node. This extracts all frames and retrieves video metadata such as FPS, frame count, width, and height, which are used to determine the original aspect ratio and guide the resizing process.
The frames are then resized while maintaining proportions so the original horizontal content fits correctly within a vertical canvas (typically 720×1280). Mathematical nodes calculate the required padding values to center the original video and determine how much space should be generated above and below the frame.
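The resizing and padding arithmetic can be sketched as follows. This is a minimal stand-in for what the SimpleMath+ nodes compute; the 1920×1080 source size and the exact rounding behaviour are assumptions, not taken from the workflow graph.

```python
# Sketch of the scale-and-pad math for fitting a horizontal frame
# into a 720x1280 vertical canvas. Assumes the source is scaled to
# the canvas width; the workflow's node expressions may differ.

def fit_and_pad(src_w, src_h, canvas_w=720, canvas_h=1280):
    """Scale the source to the canvas width, then split the leftover
    height into top/bottom padding for the AI to outpaint."""
    scale = canvas_w / src_w
    new_h = round(src_h * scale)
    pad_total = canvas_h - new_h
    pad_top = pad_total // 2
    pad_bottom = pad_total - pad_top  # absorbs any odd leftover pixel
    return new_h, pad_top, pad_bottom

# Example: a 1920x1080 source scales to 720x405, leaving 875 px
# of vertical padding split between top and bottom.
new_h, top, bottom = fit_and_pad(1920, 1080)
```

Splitting the padding this way keeps the original footage vertically centered, which matches how the masks are built in the next step.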
To generate these missing areas, the workflow creates masks for the padded regions using ImagePadForOutpaint. These masked areas represent the sections that will be filled by AI.
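Conceptually, the mask marks the padded rows as "to be generated" and the original frame as "keep". The pure-Python sketch below illustrates the idea with nested lists standing in for the node's image tensor; the function name and tiny dimensions are illustrative only.

```python
# Toy illustration of the outpaint mask ImagePadForOutpaint produces:
# 1 where new content must be generated (the padded rows),
# 0 over the original footage.

def outpaint_mask(width, height, pad_top, pad_bottom):
    mask = []
    for y in range(height):
        in_padding = y < pad_top or y >= height - pad_bottom
        mask.append([1 if in_padding else 0] * width)
    return mask

# A 4x8 canvas with 2 padded rows on top and bottom.
mask = outpaint_mask(width=4, height=8, pad_top=2, pad_bottom=2)
```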
The generation stage uses the Wan 2.1 VACE 14B diffusion model combined with FusionX LoRA to improve consistency and motion quality in video generation. The system also uses:
- UMT5 text encoder for prompt conditioning
- Wan 2.1 VAE for decoding generated latents
- Optional positive and negative prompts to guide scene generation and reduce artifacts
The WanVaceToVideo node converts the masked frames into a latent video representation while preserving the original motion from the input video. The KSampler then performs the diffusion process to generate realistic visual content in the masked regions across the entire frame sequence.
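The key property of this masked generation is that only the padded regions are replaced, so the original motion survives untouched. The toy composite below shows that principle with scalars standing in for latent values; it is a conceptual illustration, not the actual sampler code.

```python
# Toy illustration of mask-conditioned generation: the sampler's
# output only replaces masked positions, so unmasked (original)
# content passes through unchanged.

def composite(original, generated, mask):
    """Keep original where mask == 0, take generated where mask == 1."""
    return [g if m else o for o, g, m in zip(original, generated, mask)]

# Positions 0 and 3 are masked, so only they take generated values.
result = composite([1, 2, 3, 4], [9, 9, 9, 9], [1, 0, 0, 1])
```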
After generation, the latent frames are decoded using the VAE and assembled back into a video using the CreateVideo node. The output video maintains the original FPS for smooth playback and is saved using the SaveVideo node.
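The CreateVideo and SaveVideo nodes handle assembly inside ComfyUI, but the equivalent step on the command line would be an ffmpeg invocation that reuses the source FPS. The sketch below only builds the argument list; the frame pattern and output path are illustrative placeholders.

```python
# Hedged sketch of assembling decoded frames into a video with ffmpeg,
# preserving the source FPS. Paths are placeholders.

def ffmpeg_assemble_cmd(fps, frame_pattern="frames/%05d.png",
                        out_path="vertical_output.mp4"):
    return [
        "ffmpeg",
        "-framerate", str(fps),   # reuse the source FPS for smooth playback
        "-i", frame_pattern,      # numbered frame sequence
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",    # widest player compatibility
        out_path,
    ]

cmd = ffmpeg_assemble_cmd(fps=30)
# Run with e.g. subprocess.run(cmd, check=True) once frames exist.
```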
Key Features
- Converts horizontal video to vertical format
- Uses AI outpainting instead of cropping
- Maintains original motion and timing
- Supports prompt-guided scene extension
- Optimized with Wan 2.1 + FusionX LoRA
Output
The final result is a vertical video with expanded top and bottom regions, where the AI-generated areas blend naturally with the original footage while keeping the main subject centered.


