floyo logo
Powered by
ThinkDiffusion
Pricing
Wan 2.7 is now live. Check it out 👉🏼
Last updated
2026-02-06

This is a collection of custom nodes for ComfyUI that integrates closed-source AI models through their APIs. It lets users run image generation, image editing, text-to-speech, and video creation directly within the ComfyUI environment.

  • Provides access to models such as Google's Gemini, OpenAI's GPT-Image-1, and FLUX.
  • Supports multimodal workflows that combine images, text, and audio.
  • Includes specialized nodes for generating 3D models from text and images.

Context

This tool is a set of ComfyUI nodes that connects local workflows to closed-source AI models via their APIs. Its purpose is to bring the capabilities of these hosted models into ComfyUI workflows, so that API-backed steps can sit alongside local ones in the same graph.
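As a rough illustration of how a node of this kind can wrap a remote API, the sketch below follows the standard ComfyUI custom-node conventions (`INPUT_TYPES`, `RETURN_TYPES`, `FUNCTION`, `CATEGORY`, `NODE_CLASS_MAPPINGS`). The class name `HypotheticalApiTextNode`, its inputs, and the `fake_api_call` stand-in are assumptions made for this example; they are not taken from this node pack, and a real node would replace the stand-in with an actual HTTP request to the provider.

```python
from typing import Callable, Dict, Tuple


def fake_api_call(payload: Dict[str, str]) -> str:
    """Stand-in for a real HTTP request to a hosted model (hypothetical)."""
    return f"[{payload['model']}] response to: {payload['prompt']}"


class HypotheticalApiTextNode:
    """Illustrative node only; not one of the nodes shipped with this pack."""

    CATEGORY = "examples/api"      # menu location inside ComfyUI
    RETURN_TYPES = ("STRING",)     # one string output socket
    FUNCTION = "run"               # method ComfyUI calls to execute the node

    def __init__(self, transport: Callable[[Dict[str, str]], str] = fake_api_call):
        # The transport is injectable so the node can be tested offline.
        self.transport = transport

    @classmethod
    def INPUT_TYPES(cls):
        # Declares the input sockets/widgets shown on the node.
        return {
            "required": {
                "prompt": ("STRING", {"multiline": True, "default": ""}),
                "model": ("STRING", {"default": "example-model"}),
                "api_key": ("STRING", {"default": ""}),
            }
        }

    def run(self, prompt: str, model: str, api_key: str) -> Tuple[str]:
        if not api_key:
            raise ValueError("api_key is required")
        payload = {"prompt": prompt, "model": model, "api_key": api_key}
        # ComfyUI expects a tuple matching RETURN_TYPES.
        return (self.transport(payload),)


# ComfyUI discovers nodes through these module-level mappings.
NODE_CLASS_MAPPINGS = {"HypotheticalApiTextNode": HypotheticalApiTextNode}
NODE_DISPLAY_NAME_MAPPINGS = {"HypotheticalApiTextNode": "API Text (example)"}
```

Keeping the network call behind an injectable function is a design choice for this sketch: it lets the node logic be exercised without credentials or network access.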

Key Features & Benefits

The tool includes nodes for specific tasks such as image editing, video generation, and speech synthesis. Notable features include generating images from text prompts with Google's Imagen, performing detailed image segmentation with Gemini, and creating high-quality speech from text with OpenAI and ElevenLabs models. Together, these features broaden the range of creative projects that can be completed within ComfyUI.

Advanced Functionalities

Advanced functionalities include multimodal capabilities such as Gemini Chat, which supports interactive queries about images and can generate prompts or descriptions. The tool also provides 3D model generation with Tripo AI, enabling users to create complex 3D assets from both text and images. Additionally, speaker diarization and audio separation extend the tool to audio processing tasks.

Practical Benefits

This tool streamlines workflows by integrating these AI functionalities into a single interface, reducing the need to switch between applications. Prompt-based editing gives users finer control over creative outputs, allowing precise adjustments and modifications. Keeping these steps inside one workflow can also shorten project turnaround times.

Credits/Acknowledgments

The developers acknowledge the ComfyUI team for their foundational work on the platform. Special thanks also go to Google, OpenAI, and Black Forest Labs for their models, and to Replicate for providing accessible API interfaces to these technologies.

Inner Nodes

NanoBananaNode