image-caption-comfyui

Author alpertunga-bile

https://github.com/alpertunga-bile/image-caption-comfyui

Last updated

2025-05-21

image-caption-comfyui is a set of ComfyUI nodes that extract prompts from images using pretrained image captioning models. Given an input image, it generates a text prompt that can be reused in AI art generation workflows.

Supports integration of pretrained image caption models for prompt generation.
Includes an Insert Prompt Node to easily incorporate custom prompts into workflows.
Offers customizable variables for fine-tuning the output of the image captioning process.

Context

This tool serves as an image captioning node within ComfyUI: it generates descriptive prompts from images. Its primary purpose is to let users take visual content as the starting point for creating or refining artistic outputs.

Key Features & Benefits

The image caption node loads a pretrained model and generates a prompt from an input image, which reduces the time and effort needed to write descriptive text by hand. The Insert Prompt Node lets users add their own prompts to the workflow, so generated and hand-written text can be used together.
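The exact behavior of the Insert Prompt Node is not described here. As a minimal sketch, assuming it joins a user-written prompt with the generated caption, the idea could look like the following. The function name `combine_prompts` and the `separator` parameter are hypothetical and are not taken from the project.

```python
def combine_prompts(custom_prompt: str, caption: str, separator: str = ", ") -> str:
    """Join a user-written prompt and a generated caption into one prompt.

    Hypothetical helper for illustration only; the real node may behave
    differently. Empty or whitespace-only parts are skipped.
    """
    parts = [part.strip() for part in (custom_prompt, caption) if part and part.strip()]
    return separator.join(parts)


if __name__ == "__main__":
    print(combine_prompts("masterpiece, best quality", "a cat sitting on a sofa"))
```

Skipping empty parts keeps the result free of stray separators when either the custom prompt or the caption is missing.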

Advanced Functionalities

The tool exposes several adjustable parameters. min_new_tokens and max_new_tokens set the lower and upper bounds on the length of the generated prompt. num_beams sets how many candidate sequences beam search keeps at each step, and repetition_penalty discourages repeated tokens in the output. Together these settings let users tailor prompt generation to their needs.
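To make the relationship between these parameters concrete, here is a minimal sketch that validates them and collects them into a dictionary. The parameter names come from the text above; the function name `build_generation_kwargs` and the default values are assumptions chosen for illustration, not the node's actual defaults.

```python
def build_generation_kwargs(
    min_new_tokens: int = 5,          # assumed default, for illustration
    max_new_tokens: int = 30,         # assumed default, for illustration
    num_beams: int = 3,               # assumed default, for illustration
    repetition_penalty: float = 1.2,  # assumed default, for illustration
) -> dict:
    """Validate caption-generation settings and return them as a dict."""
    if min_new_tokens < 0:
        raise ValueError("min_new_tokens must be >= 0")
    if max_new_tokens < 1 or max_new_tokens < min_new_tokens:
        raise ValueError("max_new_tokens must be >= 1 and >= min_new_tokens")
    if num_beams < 1:
        # num_beams == 1 means no beam search is performed.
        raise ValueError("num_beams must be >= 1")
    if repetition_penalty <= 0:
        # 1.0 means no penalty; values above 1.0 discourage repetition.
        raise ValueError("repetition_penalty must be > 0")
    return {
        "min_new_tokens": min_new_tokens,
        "max_new_tokens": max_new_tokens,
        "num_beams": num_beams,
        "repetition_penalty": repetition_penalty,
    }


if __name__ == "__main__":
    print(build_generation_kwargs(max_new_tokens=20, num_beams=4))
```

Because these are the same keyword names that the Hugging Face Transformers `generate()` method accepts, a dictionary like this could plausibly be unpacked into such a call. How the node itself forwards the values is not shown in the source.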

Practical Benefits

With image captioning built into the workflow, prompts in ComfyUI can be derived directly from a reference image instead of being written from scratch. The adjustable generation parameters give users control over how long and how varied those prompts are.

Credits/Acknowledgments

image-caption-comfyui is developed by alpertunga-bile and contributors from the ComfyUI community. It relies on the Hugging Face Transformers library for image captioning. Contributions are encouraged, and the project provides guidelines for those interested in improving the tool.
