VOID Video Inpainting + SAM3 Text Masking

Remove objects from video with VOID's two-pass model. Type what to erase, SAM3 builds the mask, then VOID fills the holes coherently across every frame.

Inpainting

Video to Video

Generates in about 4 mins 12 secs

floyoofficial

Nodes & Models

ComfyUI Official

MarkdownNote

INTConstant

ComfyUI-VideoHelperSuite

VHS_LoadVideo

VHS_VideoCombine

ComfyUI_StarNodes

VHS_LoadVideo

VHS_VideoCombine

ComfyUI-S3-IO

VHS_LoadVideo

VHS_VideoCombine

Video object removal that erases things across time, not frame by frame.

Upload a clip, type what you want gone ("person in blue jacket"), and SAM3 builds the mask automatically. Then VOID's two-pass video model fills the holes with coherent motion, lighting, and even the shadows the object was casting.

No mask drawing. Defaults already work. Pick your video, name the thing, write what should be there instead.

How do you remove objects from a video with VOID?

Load your video, write a SAM3 prompt naming what to erase ("person in blue jacket"), then write a positive prompt describing what the empty space should look like ("empty sidewalk, daylight"). VOID runs two passes: Pass 1 fills the hole, Pass 2 stabilizes it across frames. No manual masking required.

SAM3 object prompt

This is the "what to remove" field. Use a short referring phrase like "red cup on table" or "person in blue jacket". Concrete and specific wins. Vague prompts produce vague masks, and the rest of the pipeline cannot recover from a bad mask.

Positive prompt (inpaint fill)

Describe the result, not "remove X". Write what the scene looks like after the object is gone. "Empty kitchen counter, daylight, tiles visible" beats "remove the cup". The model is filling a hole, so tell it what should be there.

Negative prompt

Leave it blank unless you see repeating defects. If outputs come back with watermarks, blur, or extra limbs, add those terms here. Most clips need nothing in this field.

Skip Pass 2 toggle

Default is off, so both passes run. Pass 1 fills the masked region. Pass 2 cleans up temporal jitter so the fill stops shimmering between frames. Turn the toggle on (skipping Pass 2) for faster previews on short, simple clips. Leave it off for longer cuts or textured backgrounds where flickering shows.

Resolution (672 x 384 default)

Tuned for the VOID model. Push higher if your source has fine detail, but generation time climbs fast on video. Keep the aspect ratio close to your input or you will get cropping.

Steps and CFG (30 steps, CFG 6)

Good defaults for both passes. Drop steps to 20 for faster iteration. Raise CFG to 7 or 8 if the fill drifts away from your prompt. Change one variable at a time so you can tell which setting made the difference.
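The settings above can be captured as a small preset object when you are scripting runs or keeping notes between iterations. The sketch below is illustrative only: `VoidSettings`, its field names, `preview`, and `prompt_warnings` are hypothetical and do not correspond to actual node inputs or any published API. Only the default values (672 x 384, 30 steps, CFG 6, Skip Pass 2 off) come from this page, and the prompt checks are crude heuristics.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VoidSettings:
    """Hypothetical container for the workflow inputs described above."""
    sam3_prompt: str            # what to remove, e.g. "person in blue jacket"
    positive_prompt: str        # what the scene looks like afterwards
    negative_prompt: str = ""   # leave blank unless defects repeat
    skip_pass_2: bool = False   # off by default, so both passes run
    width: int = 672
    height: int = 384
    steps: int = 30
    cfg: float = 6.0


def preview(settings: VoidSettings) -> VoidSettings:
    """Return a faster variant: fewer steps and Pass 2 skipped."""
    return replace(settings, steps=20, skip_pass_2=True)


def prompt_warnings(settings: VoidSettings) -> list[str]:
    """Flag prompt patterns the guidance above advises against (heuristic)."""
    warnings = []
    fill = settings.positive_prompt.strip().lower()
    if fill.startswith(("remove", "erase", "delete")):
        warnings.append("positive prompt should describe the result, not the removal")
    if len(settings.sam3_prompt.split()) < 2:
        warnings.append("SAM3 prompt may be too vague; name the object specifically")
    return warnings


final = VoidSettings(
    sam3_prompt="person in blue jacket",
    positive_prompt="empty sidewalk, daylight",
)
draft = preview(final)  # steps=20, skip_pass_2=True; `final` is unchanged
```

Keeping the preview as a derived copy rather than a second hand-written preset means only the two speed-related fields differ between the draft and the final run.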

What is VOID video inpainting good for?

VOID handles object removal where you need coherent motion, lighting, and causal cues, not frame-by-frame patching. Use it to delete people, vehicles, props, or watermarks from clips and have the background behave as if they were never there. It is a single-purpose tool: removal, not general editing.

It is useful for VFX cleanup (rigging, crew in the shot, microphones, signs), continuity fixes in film production, removing accidentally visible branding, and any clip where a person or object needs to be gone without flickering edges or shifting backgrounds.

The model goes further than naive erase. Occluded pixels, including shadows the object cast and things it was blocking, fill in as if the object was never there. Lighting and seams stay believable.

When not to use it: chaotic motion, ambiguous masks where the target blends into the background, or objects that take up most of the frame. Prompting cannot fix a bad SAM3 mask. If the segmentation is wrong, fix that first.
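One of the cases above, an object that takes up most of the frame, can be screened for before spending generation time, provided you can export the mask frames. The following is a rough sketch, not part of the workflow: it assumes each mask is a list of rows of 0/1 values, and the 0.5 threshold is an arbitrary illustrative choice, not a documented limit of VOID.

```python
def mask_coverage(mask: list[list[int]]) -> float:
    """Fraction of pixels in one frame that the mask marks for removal."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask is empty")
    covered = sum(1 for row in mask for px in row if px)
    return covered / total


def flag_large_masks(masks: list[list[list[int]]], threshold: float = 0.5) -> list[int]:
    """Indices of frames whose mask covers more than `threshold` of the frame."""
    return [i for i, m in enumerate(masks) if mask_coverage(m) > threshold]


frames = [
    [[1, 1], [1, 0]],  # 75% of the frame masked
    [[0, 0], [0, 1]],  # 25% of the frame masked
]
print(flag_large_masks(frames))  # [0]
```

If many frames are flagged, the target is probably too dominant in the shot for the fill to have enough surrounding context.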

FAQ

What is the difference between Pass 1 and Pass 2 in VOID?

Pass 1 fills the masked region and is the main generation step. Pass 2 refines temporal stability so the fill stops flickering between frames. On short, simple clips you can skip Pass 2 to save time. On longer cuts or textured backgrounds, Pass 2 is the difference between watchable and unusable output.

Do I need to draw the mask myself for VOID video inpainting?

No. SAM3 builds the mask from your text prompt. Type what you want gone, like "person in blue jacket" or "red cup on table", and SAM3 segments it across every frame. You only draw a mask manually if SAM3 cannot find the object or you need a narrow region.

What resolution does VOID work at?

This workflow defaults to 672 by 384, which the VOID model is tuned for. You can push higher if your source has fine detail, but video generation time climbs fast. Keep the aspect ratio close to your input or you will get cropping at the edges.
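To keep the aspect ratio close to your input, one option is to hold the pixel count near the 672 x 384 default and solve for the width and height that match the source ratio. The helper below is a minimal sketch under that assumption; `fit_resolution` is a hypothetical name, and snapping to multiples of 16 is a common convention for video models that this page does not confirm for VOID.

```python
def fit_resolution(src_w: int, src_h: int,
                   base_w: int = 672, base_h: int = 384,
                   multiple: int = 16) -> tuple[int, int]:
    """Pick a size near the default pixel budget that keeps the source aspect ratio."""
    if src_w <= 0 or src_h <= 0:
        raise ValueError("source dimensions must be positive")
    budget = base_w * base_h      # pixel count of the default resolution
    aspect = src_w / src_h
    # Solve width * height = budget subject to width / height = aspect.
    height = (budget / aspect) ** 0.5
    width = height * aspect

    def snap(value: float) -> int:
        # Round to the nearest multiple (assumed constraint), at least one multiple.
        return max(multiple, int(round(value / multiple)) * multiple)

    return snap(width), snap(height)


print(fit_resolution(1920, 1080))  # (672, 384)
print(fit_resolution(1080, 1920))  # (384, 672)
print(fit_resolution(1000, 1000))  # (512, 512)
```

A 16:9 source lands exactly on the default, while portrait and square clips get a size with roughly the same generation cost instead of being cropped to fit.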

Why does my VOID output have flickering or jitter?

Usually a Pass 2 issue. Make sure "Skip Pass 2" is off. If it still flickers, the SAM3 mask is probably unstable between frames, so try a more specific object prompt. Chaotic motion and targets that nearly leave the frame are the hardest cases.
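If you can export the per-frame masks, a quick way to check for the unstable mask described above is to measure the overlap (intersection over union) between consecutive frames. This is a rough diagnostic sketch rather than anything built into the workflow: the masks are assumed to be lists of rows of 0/1 values, and the 0.7 cutoff is an arbitrary starting point. A fast-moving object will also score low, so treat flagged frames as places to look, not as proof of a bad mask.

```python
def iou(a: list[list[int]], b: list[list[int]]) -> float:
    """Intersection over union of two same-sized binary masks."""
    inter = union = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            if pa and pb:
                inter += 1
            if pa or pb:
                union += 1
    # Two empty masks are treated as identical.
    return 1.0 if union == 0 else inter / union


def unstable_transitions(masks: list[list[list[int]]], min_iou: float = 0.7) -> list[int]:
    """Indices i where the mask changes sharply between frame i and frame i + 1."""
    return [i for i in range(len(masks) - 1)
            if iou(masks[i], masks[i + 1]) < min_iou]


masks = [
    [[1, 1, 0]],
    [[1, 1, 0]],  # identical to the previous frame
    [[0, 0, 1]],  # mask jumps to a different region
]
print(unstable_transitions(masks))  # [1]
```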
