floyo
Powered by ThinkDiffusion

SAM3 for Video Masking using Points

Create video masks using SAM 3 and point prompts only.


SAM 3 with point prompts lets you build precise, interactive video masks by clicking on objects, then having the model track and segment them through the whole clip.

Overview

SAM 3 is a unified segmentation model that supports both concept prompts (text) and visual prompts (points, boxes, masks). When you use point prompts, it behaves like an advanced “click‑to‑segment” tool: foreground clicks say “include this,” background clicks say “exclude this,” and the model refines the mask accordingly. For video, SAM 3 then propagates that mask across frames with tracking, so the same object stays masked over time.
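The include/exclude semantics of clicks can be shown with a small sketch. This is plain Python, not the SAM 3 API; the helper name and coordinates are illustrative:

```python
# Each click is an (x, y) pixel coordinate plus a label:
# 1 = foreground ("include this"), 0 = background ("exclude this").
def split_clicks(points, labels):
    """Partition clicks into include/exclude sets, mirroring how
    SAM-style predictors interpret labels. Illustrative helper only."""
    if len(points) != len(labels):
        raise ValueError("each point needs exactly one label")
    include = [p for p, lbl in zip(points, labels) if lbl == 1]
    exclude = [p for p, lbl in zip(points, labels) if lbl == 0]
    return include, exclude

# Two positive clicks on the target object, one negative click on the background.
inc, exc = split_clicks([(210, 140), (225, 160), (40, 50)], [1, 1, 0])
```

Adding more clicks to either list is how you iteratively refine the mask without touching any text prompt.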

How point‑based video masking works

  • You load a video and select one or more frames (often the first frame or a key frame) where you click on the target object.

  • You pass those point coordinates and labels (1 = foreground, 0 = background) to SAM 3’s video predictor.

  • The model generates masks for the clicked object on that frame, and then tracks and updates those masks across the rest of the video, producing a mask (and ID) per frame.

  • You threshold or directly export these masks as per‑frame alpha mattes for compositing, background edits, or feeding into other video models.
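The final export step above can be sketched in plain Python. This is not the SAM 3 API: it only shows the thresholding idea, and a real pipeline would apply it (typically with numpy) to the per-frame soft masks the video predictor returns:

```python
def mask_to_alpha(soft_mask, threshold=0.5):
    """Threshold one frame's soft mask (values in [0, 1]) into an
    8-bit alpha matte: 255 inside the object, 0 outside."""
    return [[255 if v > threshold else 0 for v in row] for row in soft_mask]

# A toy 2x2 soft mask for a single frame.
frame_mask = [[0.9, 0.2],
              [0.7, 0.4]]
alpha = mask_to_alpha(frame_mask)  # [[255, 0], [255, 0]]
```

Repeating this per frame yields the per-frame alpha mattes used for compositing or background edits.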

Why use points instead of only text

  • Points give pixel‑accurate control over which instance to track when text like “car” or “person” matches multiple objects in the scene.

  • Positive and negative clicks let you quickly refine the mask (add missing regions, remove stray areas) without rewriting prompts or re‑running the whole model.

  • For difficult or unusual objects where text is ambiguous, a single click can be more reliable than open‑vocabulary detection.

Typical use cases

  • Isolating a specific character, prop, or vehicle in a crowded scene by clicking on it and tracking it through the clip.

  • Creating clean masks for VFX tasks like background replacement, localized color grading, or stylizing only the subject.

  • Combining text and points: use text to find all “people,” then point‑click to refine or pick just one to mask and track.
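For the background-replacement use case, the per-frame alpha mattes plug into a standard over-composite. A grayscale toy sketch (illustrative, not part of the workflow's nodes):

```python
def composite(fg, bg, alpha):
    """Per-pixel over-composite for one grayscale frame: keep the
    subject where alpha is 255, show the new background where it is 0."""
    out = []
    for fg_row, bg_row, a_row in zip(fg, bg, alpha):
        out.append([(a / 255) * f + (1 - a / 255) * b
                    for f, b, a in zip(fg_row, bg_row, a_row)])
    return out

subject = [[200, 200], [200, 200]]   # original frame
new_bg  = [[10, 10], [10, 10]]       # replacement background
alpha   = [[255, 0], [0, 255]]       # matte from the tracked mask
result = composite(subject, new_bg, alpha)  # [[200.0, 10.0], [10.0, 200.0]]
```

In the actual workflow this role is played by downstream compositing nodes fed with the SAM 3 masks.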


Nodes & Models

WorkflowGraphics
MaskPreview
MaskToImage
LoadSAM3Model
SAM3PointCollector
SAM3VideoSegmentation
SAM3Propagate
SAM3VideoOutput
VHS_LoadVideo
VHS_VideoInfo
VHS_VideoCombine
