Powered by ThinkDiffusion

comfyui_EdgeTAM


Last updated
2025-06-27

The EdgeTAM Wrapper is a custom node for ComfyUI that enables efficient, interactive video object tracking using the EdgeTAM model, which is optimized for on-device performance. It significantly accelerates video object segmentation, offering a user-friendly interface for both interactive and automated workflows.

  • Interactive video object tracking lets users create masks manually in real time.
  • Supports automated processing through JSON input for batch operations.
  • High-performance capabilities enable real-time inference on standard consumer hardware.
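As a rough illustration of the JSON-driven batch input, a mask specification might look like the following. The field names (`frame`, `points`, `labels`) are assumptions for illustration only, not the node's documented schema; check the project's README for the actual format.

```python
import json

# Hypothetical mask-point specification for automation mode.
# Field names ("frame", "points", "labels") are illustrative only.
mask_spec = {
    "frame": 0,                           # annotate the first frame
    "points": [[320, 180], [400, 260]],   # (x, y) click coordinates
    "labels": [1, 0],                     # 1 = foreground click, 0 = background
}

# Serialize to JSON so it can be passed to an automated workflow.
payload = json.dumps(mask_spec)
print(payload)
```

Point-plus-label prompts of this shape are the common convention in segment-anything-style models, which is why the sketch uses them here.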

Context

The EdgeTAM Wrapper for ComfyUI is a specialized tool designed to facilitate video object tracking and segmentation directly within the ComfyUI environment. By leveraging the EdgeTAM model, it aims to enhance the efficiency and interactivity of video processing tasks.

Key Features & Benefits

This tool provides a unique interactive mask editor that allows users to pause video playback and draw segmentation masks directly on the first frame. It also supports automation through JSON input, which is crucial for users who need to process multiple videos efficiently without manual intervention.

Advanced Functionalities

The interactive mask editor can operate in two modes: interactive and automation. In interactive mode, users can dynamically create masks by clicking on the video frames, while automation mode allows for pre-defined mask points to be processed without user input, making it suitable for batch processing scenarios.
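The two-mode behavior described above can be sketched as a simple dispatch: automation mode reads pre-defined points from JSON, while interactive mode defers to a user-driven editor. The function and parameter names below (`load_mask_points`, `prompt_user`) are hypothetical, not part of the node's actual API.

```python
import json

def load_mask_points(mode, json_text=None, prompt_user=None):
    """Illustrative dispatch between interactive and automation modes.

    `prompt_user` stands in for the interactive mask editor's
    click-collection step; both names are assumptions.
    """
    if mode == "automation":
        if json_text is None:
            raise ValueError("automation mode requires JSON mask points")
        # No user input: points come straight from the pre-defined JSON.
        return json.loads(json_text)["points"]
    elif mode == "interactive":
        # Defer to the user: collect clicks from the mask editor.
        return prompt_user()
    raise ValueError(f"unknown mode: {mode}")

# Automation mode processes a batch entry without any interaction.
pts = load_mask_points("automation", '{"points": [[320, 180]]}')
print(pts)  # [[320, 180]]
```

In a batch scenario, the same call would simply run once per video with its own JSON entry, which is what makes the automation path suitable for unattended processing.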

Practical Benefits

The EdgeTAM Wrapper enhances workflow efficiency by allowing real-time interaction with video frames, thus enabling precise object tracking. Its ability to switch between interactive and automated modes provides users with flexibility, improving overall control and reducing the time required for video processing tasks.

Credits/Acknowledgments

This tool is based on the EdgeTAM model developed by Meta Reality Labs, with the original paper titled "EdgeTAM: On-Device Track Anything Model" presented at CVPR 2025. The project adheres to the EdgeTAM license (Apache 2.0), and further details can be found in the LICENSE file.