ComfyUI-Orpheus-TTS

Author: ShmuelRonen

https://github.com/ShmuelRonen/ComfyUI-Orpheus-TTS

Last updated: 2025-05-03


This extension adds Text-to-Speech (TTS) to ComfyUI through the Orpheus TTS model, producing natural-sounding speech with emotional nuance and multilingual options.

Supports a variety of emotional expressions and voice styles.
Provides audio effects such as pitch shifting, speed adjustment, and reverb for detailed audio customization.
Runs on Windows, Linux/WSL, and macOS.

Context

The extension adds Text-to-Speech capabilities to ComfyUI using the Orpheus TTS model. Its primary aim is to let developers and creators generate lifelike speech that conveys emotion and supports multiple languages.

Key Features & Benefits

The extension synthesizes natural-sounding speech for applications that need voice output. Support for emotional expressions makes the audio more dynamic and relatable, and long texts are split into chunks automatically so that output stays consistent even with extensive input.
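As an illustration of what automatic chunking of long text can look like, here is a minimal sketch that splits input at sentence boundaries and packs sentences into chunks under a character limit. The function name `chunk_text` and the `max_chars` parameter are hypothetical and are not taken from the extension; its actual chunking logic and limits may differ.

```python
import re


def chunk_text(text: str, max_chars: int = 200) -> list[str]:
    """Split text into chunks of at most max_chars characters.

    Hypothetical helper for illustration only; not the extension's code.
    Chunks break at sentence ends where possible.
    """
    # Split after '.', '!' or '?' when followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at the last space
        # that fits, or hard-cut if it contains no space at all.
        while len(sentence) > max_chars:
            cut = sentence.rfind(" ", 0, max_chars + 1)
            if cut <= 0:
                cut = max_chars
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:cut].strip())
            sentence = sentence[cut:].strip()
        if not sentence:
            continue
        # Pack the sentence into the current chunk if it still fits.
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be synthesized separately and the resulting audio segments concatenated in order. Breaking at sentence ends rather than at a fixed character offset avoids cutting a word or phrase in half, which would otherwise be audible in the joined output.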

Advanced Functionalities

The extension includes audio processing features: customizable pitch shifting, speed adjustment, and effects such as reverb and echo. Users can fine-tune the output to achieve a specific sound, which suits applications ranging from storytelling to interactive media.
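To make one of these effects concrete, the sketch below applies a single-tap echo to a list of audio samples: a delayed, attenuated copy of the signal is mixed back into the original. The function `add_echo` and its parameters `delay_s` and `decay` are hypothetical names chosen for this example; the extension's own effect implementation and parameter names are not shown in the source and may differ.

```python
def add_echo(
    samples: list[float],
    sample_rate: int,
    delay_s: float = 0.25,
    decay: float = 0.5,
) -> list[float]:
    """Mix one delayed, attenuated copy of the signal into itself.

    Hypothetical helper for illustration only; not the extension's code.
    """
    # Convert the delay from seconds to a whole number of samples.
    delay = int(delay_s * sample_rate)
    if delay <= 0:
        return list(samples)
    # Extend the output so the tail of the echo is not cut off.
    out = list(samples) + [0.0] * delay
    for i, sample in enumerate(samples):
        out[i + delay] += decay * sample
    return out
```

A larger `decay` makes the repeat louder and a larger `delay_s` pushes it further from the original sound. Reverb can be thought of as many such delayed copies with different delays and gains, though production reverbs are usually more elaborate than this.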

Practical Benefits

With TTS integrated into ComfyUI, users control audio output directly within their workflows. Multiple voice options and emotional expressions give more creative range in audio projects.

Credits/Acknowledgments

The original implementation of the Orpheus TTS model was developed by Canopy AI. The project also utilizes the SNAC model for audio synthesis, with contributions from Hubert Siuzdak. This extension is built on the foundation of ComfyUI, which is maintained by the ComfyUI community.
