API

Pricing

Workflows

API

Pricing

ComfyUI-GPT4V-Image-Captioner

Author 438443467

https://github.com/438443467/ComfyUI-GPT4V-Image-Captioner

Last updated

2025-04-06

Run hundreds of ComfyUI nodes and workflows in your browser.

You can efficiently utilize GPT-4 Vision (GPT4V) to annotate images by filling in the necessary API key and URL. This tool is a personal adaptation of the GPT4V-Image-Captioner project, enhancing its functionality within the ComfyUI environment.

Enables seamless integration with GPT4V for image recognition and labeling.
Automates image processing to streamline workflows without manual adjustments.
Offers customizable labeling options, including seed values and exclusion of specific terms.

Context

This tool serves as an extension for ComfyUI, designed to facilitate the integration of the GPT4V model for automatic image annotation. It allows users to quickly and efficiently generate descriptive labels for images, enhancing the overall functionality of AI-driven art workflows.

Key Features & Benefits

The primary feature of this tool is its ability to automate image processing, which removes the need for users to manually scale images. Additionally, it provides a straightforward method to connect with the GPT4V API by entering the required key and URL, making it accessible for users to annotate images effortlessly. The tool also supports different prompt types and weighted labels, allowing for tailored outputs that meet specific user needs.

Advanced Functionalities

This extension includes advanced options such as seed value management, which helps maintain consistency in labeling results. Users can modify the seed to generate varied outputs if the initial results are unsatisfactory. Furthermore, the tool allows for the exclusion of unwanted words from labels, providing greater control over the annotation process.

Practical Benefits

By integrating this tool into their workflows, ComfyUI users can significantly enhance their image annotation efficiency, gaining better control over the labeling process and improving the quality of outputs. The automation of image processing and the customizable labeling options contribute to a more streamlined and effective workflow, ultimately saving time and effort.

Credits/Acknowledgments

This project is based on the original work of the authors of the GPT4V-Image-Captioner repository, with acknowledgments to their contributions and efforts.

Discover most popular workflows

Hand-picked based on what hundreds of other artists looked at.

Z-Image Turbo: Fast Image Generation in Seconds

floyoofficial

21.9k

Marketing

Photography

Production

Text2Image

Z-Image Turbo

Fast Image Generation in Seconds

Z-Image Turbo: Fast Image Generation in Seconds

Fast Image Generation in Seconds

Nano Banana 2: Fast Image Generation & Editing

floyoofficial

4.6k

API

gemini flash image

Image2Image

Text2Image

typography

The top-ranked image model on Artificial Analysis and LM Arena. 4K output, text rendering, and subject consistency across 5 characters.

Nano Banana 2: Fast Image Generation & Editing

The top-ranked image model on Artificial Analysis and LM Arena. 4K output, text rendering, and subject consistency across 5 characters.

floyoofficial

25.2k

AiVideo

API

image to video

video generation

wan 2.5

Wan 2.5: Image to Video with Audio

goshnii

10.7k

Face swap

Flux

flux 2 klein

Flux 2 Klein face swap

Flux face swap

head swap

image 2 image

image editing

Instead of using outdated or unstable techniques, this workflow was designed to take full advantage of FLUX 2 KLEIN's editing capabilities—using a face image and a reference character image to produce clean, highly consistent results.

Flux 2 Klein 9b - Perfect Face swap

floyoofficial

4.7k

API

Image to Video

LTX2.3

LTX 2.3

LTX 2.3 Pro Image to Video

LTX 2.3

Author

438443467