floyo logo
Powered by
ThinkDiffusion

ComfyUI-Sopro

Last updated
2026-01-19

Sopro TTS is a set of custom nodes for ComfyUI that provides text-to-speech with zero-shot voice cloning. It runs on CPU, which keeps it lightweight and fast, and it plugs into existing ComfyUI audio workflows.

  • Efficient CPU performance allows for quick audio generation, with a real-time factor (RTF) of 0.25: generating a clip takes about a quarter of the clip's playback duration.
  • The zero-shot voice cloning feature enables users to generate speech in a specific voice using just a short audio sample.
  • The tool is designed to work with all ComfyUI audio nodes, enhancing flexibility and user experience.
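To make the real-time factor figure concrete: RTF is conventionally defined as compute time divided by the duration of the audio produced, so values below 1.0 mean faster than real time. The sketch below only does that arithmetic. The function names are illustrative and are not part of Sopro or ComfyUI, and the 0.25 default is the figure quoted above, which will vary with hardware.

```python
def real_time_factor(compute_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent generating / duration of the generated audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return compute_seconds / audio_seconds


def estimated_generation_time(audio_seconds: float, rtf: float = 0.25) -> float:
    """Rough compute-time estimate for a clip of the given length.

    The default rtf of 0.25 is the figure quoted in this listing;
    actual speed depends on the CPU.
    """
    if audio_seconds < 0 or rtf <= 0:
        raise ValueError("audio_seconds must be >= 0 and rtf must be > 0")
    return audio_seconds * rtf
```

Under these assumptions, a 30-second clip would take roughly `estimated_generation_time(30.0)` = 7.5 seconds to generate.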

Context

Sopro TTS extends ComfyUI's audio capabilities with custom nodes for text-to-speech (TTS). Its aim is to give users a lightweight way to generate speech from text and to clone a voice from a minimal audio sample.

Key Features & Benefits

The main features of Sopro TTS include its efficient CPU-based operation, which ensures that audio generation is quick and responsive. The zero-shot voice cloning capability allows users to create speech that mimics a specific voice using just a short reference audio clip, making it highly versatile for various applications. Additionally, its compatibility with all ComfyUI audio workflows means that users can easily integrate it into their existing projects without significant adjustments.
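Compatibility with other audio nodes comes down to exchanging ComfyUI's standard `AUDIO` value, a dictionary holding a `waveform` tensor shaped `[batch, channels, samples]` and an integer `sample_rate`. As a rough illustration, here is a minimal custom node that consumes such a value and reports its duration. The class `AudioDurationNode` is hypothetical and is not one of the Sopro nodes; it only shows the kind of downstream node that could be connected to a TTS output.

```python
class AudioDurationNode:
    """Hypothetical example node; NOT part of ComfyUI-Sopro."""

    @classmethod
    def INPUT_TYPES(cls):
        # A single required input of ComfyUI's AUDIO type.
        return {"required": {"audio": ("AUDIO",)}}

    RETURN_TYPES = ("FLOAT",)
    FUNCTION = "measure"
    CATEGORY = "audio"

    def measure(self, audio):
        # AUDIO is a dict: {"waveform": tensor[batch, channels, samples],
        #                   "sample_rate": int}
        num_samples = audio["waveform"].shape[-1]
        duration_seconds = num_samples / audio["sample_rate"]
        # ComfyUI node functions return a tuple matching RETURN_TYPES.
        return (duration_seconds,)
```

Because the node reads only the `waveform` shape and the `sample_rate`, it would work with audio from any node that emits the standard `AUDIO` type.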

Advanced Functionalities

Sopro TTS exposes adjustable speed and temperature settings for audio generation. The speed parameter controls how quickly the generated speech is delivered. The temperature setting controls how much the output varies: lower values typically give more consistent, predictable speech, while higher values give more varied delivery. Together these let users fine-tune the TTS output to their needs.
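For intuition about the temperature setting, the sketch below shows the generic temperature-scaled sampling technique used by many generative models: scores are divided by the temperature before being turned into probabilities. Whether Sopro applies temperature in exactly this way is an assumption, and the function names are illustrative rather than taken from the project.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(logits, temperature, rng=random):
    """Draw one index according to the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

With the same scores, a low temperature concentrates probability on the top-scoring option, so repeated runs sound more alike, while a high temperature spreads probability more evenly and produces more variation between runs.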

Practical Benefits

The nodes streamline generating and saving audio inside ComfyUI. Users can quickly produce speech outputs that can be processed further by other audio nodes or saved in various formats. Voice cloning also widens the creative options, allowing for personalized audio.

Credits/Acknowledgments

The Sopro TTS tool was developed by samuel-vitorino, with contributions from the open-source community. The project is available under a license that supports collaborative development and usage.

Inner Nodes

Sopro TTS Generator, Sopro Load Reference Audio, Sopro Save Audio