Sopro for Text to Speech

Turn your text into excellent speech using SoproTTS

Audio2Audio

SoproTTS

Text to Speech

TTS

Generates in about 17 secs

floyoofficial

Nodes & Models

ComfyUI Official

WorkflowGraphics

LoadAudio

SaveAudioMP3

SoproTTS is a lightweight text‑to‑speech system with zero‑shot voice cloning that can turn text into speech in almost real time, even on CPU‑only machines.

What it is

An open‑source TTS model (~135–169M parameters, depending on version) that generates English speech from text.
Designed to run efficiently on CPUs with real‑time or faster‑than‑real‑time performance, and exposed in ComfyUI via Sopro TTS custom nodes.

Key features

Zero‑shot voice cloning: give it a few seconds of reference audio and it mimics that speaker’s voice for new text.
CPU‑friendly speed: a real‑time factor (RTF) of around 0.05–0.25, meaning audio is generated roughly 4× to 20× faster than it plays back (up to ~20× real time on an M3 CPU, or ~4× on typical CPUs).
Streaming and non‑streaming modes, so you can get low‑latency first audio or batch‑generate longer clips.
ComfyUI integration: Sopro TTS nodes accept text plus optional reference audio, and output waveform audio for the rest of your graph.
Adjustable speech speed and “temperature” for pacing and variation control.
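
To make the speed figures above concrete, here is a minimal sketch of the arithmetic, assuming the usual definition of real-time factor (generation time divided by audio duration). The function names are hypothetical helpers written for this illustration and are not part of SoproTTS or ComfyUI; actual timings will vary with hardware and text length.

```python
def speedup_from_rtf(rtf: float) -> float:
    """Convert a real-time factor into a 'times faster than playback' figure."""
    if rtf <= 0:
        raise ValueError("rtf must be positive")
    return 1.0 / rtf


def generation_seconds(audio_seconds: float, rtf: float) -> float:
    """Estimate how long it takes to generate a clip of the given duration."""
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    return audio_seconds * rtf


if __name__ == "__main__":
    # RTF values taken from the range quoted above (0.05 to 0.25).
    for label, rtf in [("fast case (RTF 0.05)", 0.05), ("typical CPU (RTF 0.25)", 0.25)]:
        speed = speedup_from_rtf(rtf)
        secs = generation_seconds(60.0, rtf)
        print(f"{label}: about {speed:.0f}x real time, "
              f"a 60 s clip takes roughly {secs:.0f} s to generate")
```

Under this definition, an RTF of 0.05 corresponds to the ~20× figure and an RTF of 0.25 to the ~4× figure.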

Best‑fit use cases

Local, low‑resource voiceovers for videos or tutorials when you only have a CPU and want open‑source TTS.
Voice cloning for characters or narrators in ComfyUI workflows, using short reference samples.
Interactive tools and prototypes where you need quick speech feedback without cloud TTS or big GPU models.
