IF_Gemini

Author if-ai

https://github.com/if-ai/ComfyUI-IF_Gemini

Last updated

2025-07-06

Use the Google Gemini API inside ComfyUI to generate images, transcribe audio, and summarize videos. The extension is a streamlined reimplementation of earlier IF_AI tools, intended to be easier to install.

Seamlessly integrates with ComfyUI to enhance multimedia capabilities.
Supports multi-modal input, allowing users to combine text and images in prompts.
Offers customizable parameters for tailored outputs, including temperature and output tokens.

Context

ComfyUI-IF_Gemini brings the Google Gemini API into ComfyUI, so users can generate images, transcribe audio, and summarize videos from within a workflow. The extension simplifies the integration of these capabilities, making them accessible to users who want to add AI-generated content to their workflows.

Key Features & Benefits

The primary features of ComfyUI-IF_Gemini are text generation, image analysis, and image generation. Multi-modal input lets users combine text and images in a single prompt to build more complex requests. The tool also exposes customizable parameters such as temperature, which controls how random the generated content is, and output tokens, which caps the length of the response.
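To make the temperature parameter concrete, here is a minimal sketch of how temperature is commonly applied when sampling from a language model: scores are divided by the temperature before being turned into probabilities. This illustrates the general technique only. It is not code from ComfyUI-IF_Gemini or the Gemini API, and the function name and example scores are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by a sampling temperature.

    Illustrative only: the function name and inputs are hypothetical and
    are not taken from ComfyUI-IF_Gemini or the Gemini API.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Dividing by the temperature sharpens (T < 1) or flattens (T > 1) the scores.
    scaled = [score / temperature for score in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


# Hypothetical scores for three candidate tokens.
scores = [2.0, 1.0, 0.5]
focused = softmax_with_temperature(scores, 0.2)  # low temperature
varied = softmax_with_temperature(scores, 2.0)   # high temperature
```

With a low temperature most of the probability lands on the top-scoring candidate, so outputs are more repeatable. A high temperature spreads probability across candidates, so outputs vary more between runs.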

Advanced Functionalities

ComfyUI-IF_Gemini includes advanced functionalities like batch processing, which allows users to generate multiple outputs from a single prompt, and a chat mode that maintains conversation history for interactive sessions. Additionally, users can configure a custom Gemini API endpoint through environment variables or configuration files, enhancing flexibility in how the tool can be used.
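As a rough illustration of configuring a custom endpoint through environment variables or configuration files, the sketch below checks an environment variable first, then a JSON config file, then falls back to a default. The variable name `GEMINI_BASE_URL`, the `base_url` config key, the function name, and the lookup order are all assumptions made for this example. The extension's actual names and precedence may differ, so check its README before relying on them.

```python
import json
import os
from pathlib import Path

# Google's public Gemini API host, used here as the fallback value.
DEFAULT_ENDPOINT = "https://generativelanguage.googleapis.com"


def resolve_endpoint(env=None, config_path=None):
    """Pick an API endpoint: environment variable, then config file, then default.

    GEMINI_BASE_URL and the "base_url" key are hypothetical names chosen
    for this sketch; they are not confirmed by the extension.
    """
    env = os.environ if env is None else env

    # 1. An environment variable takes priority when it is set and non-empty.
    value = (env.get("GEMINI_BASE_URL") or "").strip()
    if value:
        return value.rstrip("/")

    # 2. Otherwise look for a JSON config file with a "base_url" entry.
    if config_path is not None:
        path = Path(config_path)
        if path.is_file():
            data = json.loads(path.read_text(encoding="utf-8"))
            value = str(data.get("base_url") or "").strip()
            if value:
                return value.rstrip("/")

    # 3. Fall back to the default endpoint.
    return DEFAULT_ENDPOINT
```

Calling `resolve_endpoint(env={})` with no config file returns the default, while passing `env={"GEMINI_BASE_URL": "https://proxy.example.com/"}` returns the proxy URL with the trailing slash removed.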

Practical Benefits

The extension speeds up workflows by letting users generate and analyze multimedia content without leaving ComfyUI. Adjustable parameters give users more control over output quality, and support for several content types makes it easier to produce varied results from one tool.

Credits/Acknowledgments

The tool is developed by if-ai and contributors and is released under the MIT license. Users are encouraged to support the project through platforms such as GitHub and Patreon to help fund continued development.
