ComfyUI-DeZoomer-Nodes

Author De-Zoomer

https://github.com/De-Zoomer/ComfyUI-DeZoomer-Nodes

Last updated

2025-06-28

ComfyUI-DeZoomer-Nodes is a collection of custom nodes for ComfyUI that generate captions for video and refine existing captions. The nodes use vision-language and language models to produce detailed descriptions and make caption text more coherent.

Supports video frame analysis and caption generation using multiple AI models.
Includes functionality for refining captions, enhancing their quality and coherence.
Optimized for performance with options for memory management and model loading.
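To illustrate the model-loading option, here is a minimal sketch of how a "keep the model loaded" setting can trade memory for speed. It is not the package's actual code: `ModelCache`, `run_captioning`, `fake_loader`, and the `keep_model_loaded` flag name are all hypothetical, and the loader is a stub rather than a real model.

```python
import gc


class ModelCache:
    """Holds at most one loaded model between calls (hypothetical helper)."""

    def __init__(self, loader):
        self._loader = loader  # callable: model name -> model object
        self._name = None
        self._model = None
        self.load_count = 0    # how many times the loader actually ran

    def get(self, name):
        # Reload only when nothing is cached or a different model is requested.
        if self._model is None or self._name != name:
            self.release()
            self._model = self._loader(name)
            self._name = name
            self.load_count += 1
        return self._model

    def release(self):
        # Drop the reference so the memory can be reclaimed.
        self._model = None
        self._name = None
        gc.collect()


def run_captioning(cache, model_name, frames, keep_model_loaded):
    """Caption the frames, then optionally free the model (assumed flag name)."""
    model = cache.get(model_name)
    caption = model(frames)
    if not keep_model_loaded:
        cache.release()
    return caption


def fake_loader(name):
    # Stand-in for an expensive model load; returns a trivial "model".
    return lambda frames: f"{name}: {len(frames)} frames"


kept = ModelCache(fake_loader)
run_captioning(kept, "demo-model", [0, 1, 2], keep_model_loaded=True)
run_captioning(kept, "demo-model", [0, 1, 2], keep_model_loaded=True)

freed = ModelCache(fake_loader)
run_captioning(freed, "demo-model", [0, 1, 2], keep_model_loaded=False)
run_captioning(freed, "demo-model", [0, 1, 2], keep_model_loaded=False)
```

In this sketch, keeping the model loaded means the second call reuses the cached object, while releasing it after each call forces a reload but frees memory between runs.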

Context

This repository features a set of custom nodes specifically for ComfyUI, aimed at improving video captioning and the refinement of captions. The nodes leverage advanced AI models to generate and enhance textual descriptions, making them valuable for users working with video content.

Key Features & Benefits

The tool includes two primary nodes: the Video Captioning Node and the Caption Refinement Node. The Video Captioning Node generates comprehensive captions from video frames, while the Caption Refinement Node enhances existing captions by making them more coherent and detailed, which is crucial for clarity in video content.

Advanced Functionalities

The Video Captioning Node uses the Qwen2.5-VL model and also supports SkyCaptioner-V1 and ShotVL for different captioning results. Its parameters include a temperature setting that controls randomness, memory optimization settings, and an option to keep the model loaded in memory between runs. The Caption Refinement Node uses the Qwen2.5 model and lets users adjust how specific and coherent the output is.
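As a rough illustration of what a temperature parameter does, the sketch below applies the standard temperature-scaled softmax to a few made-up token scores. The nodes' actual sampling code is not shown in this description, so the function name and the example values are assumptions chosen only to show the effect.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens

# Low temperature concentrates probability on the top-scoring token.
sharp = softmax_with_temperature(logits, temperature=0.3)

# High temperature spreads probability more evenly across tokens.
flat = softmax_with_temperature(logits, temperature=1.5)
```

Lower values make the most likely token dominate, which tends to give more repeatable captions; higher values give the less likely tokens more weight, which tends to give more varied wording.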

Practical Benefits

This tool automates caption generation and refinement, which saves time and improves the quality of the output. Because the nodes analyze video frames and produce detailed descriptions, users gain more control and precision over captioning tasks in ComfyUI.

Credits/Acknowledgments

The project credits the models it builds on: Qwen2.5-VL and ShotVL for video captioning, and Qwen2.5 for caption refinement. The Qwen models are developed by Alibaba Cloud. The project is licensed under the GPL License.
