Multi Model for Voice Conversion and Text to Speech

A TTS Audio Suite workflow that can use different types of audio models.

Tags: ChatterBox, Higgs, Text to Speech, TTS, VibeVoice

Generates in about 2 mins 20 secs

floyoofficial

Nodes & Models

ComfyUI Official

F5TTSEngineNode

ChatterBoxEngineNode

LoadAudio

HiggsAudioEngineNode

RVCEngineNode

VibeVoiceEngineNode

ChatterBoxOfficial23LangEngineNode

Note

WorkflowGraphics

Text Multiline

Reroute

UnifiedTTSTextNode

UnifiedVoiceChangerNode

PreviewAudio

PreviewAny

ComfyUI-Easy-Use

easy anythingIndexSwitch

A TTS Audio Suite workflow is a unified ComfyUI setup that handles both text‑to‑speech and voice conversion while letting you switch between different audio engines in one graph.

Why use it

Centralizes TTS and voice conversion so you do not need separate tools or projects for narration, cloning, and re‑voicing.
Allows engine A/B testing (naturalness, speed, multilingual support) on the same input, helping pick the best model per job with minimal graph changes.
Keeps pipelines reproducible and shareable: one workflow file can encapsulate complex audio behavior, including model loading and VRAM management.
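
The engine A/B testing idea above can be sketched outside ComfyUI as a small comparison loop. This is a minimal illustration under stated assumptions, not the workflow's actual implementation: `compare_engines` and `fake_engine` are hypothetical names, and each "engine" is a stand-in callable that takes text and returns raw audio bytes. Real engines would be wired in through their own nodes or APIs.

```python
import time
from typing import Callable, Dict, List, Tuple

# Hypothetical engine interface: text in, raw audio bytes out.
Engine = Callable[[str], bytes]


def compare_engines(text: str, engines: Dict[str, Engine]) -> List[Tuple[str, float, int]]:
    """Run the same text through every engine and record time and output size."""
    results = []
    for name, synth in engines.items():
        start = time.perf_counter()
        audio = synth(text)
        elapsed = time.perf_counter() - start
        results.append((name, elapsed, len(audio)))
    # Fastest engine first, so speed differences are easy to see.
    return sorted(results, key=lambda row: row[1])


def fake_engine(bytes_per_char: int) -> Engine:
    """Build a placeholder engine that emits silence (assumption: demo only)."""
    def synth(text: str) -> bytes:
        return b"\x00" * (len(text) * bytes_per_char)
    return synth


if __name__ == "__main__":
    engines = {"engine_a": fake_engine(2), "engine_b": fake_engine(4)}
    for name, seconds, size in compare_engines("Hello there.", engines):
        print(f"{name}: {seconds:.6f}s, {size} bytes")
```

Timing and output size are only rough proxies. Naturalness and multilingual quality still need to be judged by listening to each engine's output for the same input.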

Use cases

Creating narrations, tutorials, or audiobooks from scripts, with engine‑specific choices for emotion or language.
Re‑voicing existing dialogue for dubbing, localization, or anonymity by converting into a consistent target voice.
Building character voices for games, VTubers, or story content by generating lines via TTS then passing them through voice conversion.
Rapidly prototyping different vocal styles (casual, corporate, dramatic) for marketing videos or explainer content.
