floyo logo
Powered by
ThinkDiffusion
⚡️Nano Banana 2 ⚡️ just landed. Start creating now.

Multi-Model Workflow for Voice Conversion and Text to Speech

A TTS Audio Suite workflow that can use different types of audio models.


Nodes & Models

F5TTSEngineNode
ChatterBoxEngineNode
HiggsAudioEngineNode
RVCEngineNode
VibeVoiceEngineNode
ChatterBoxOfficial23LangEngineNode
UnifiedTTSTextNode
UnifiedVoiceChangerNode
LoadAudio
Note
WorkflowGraphics
Text Multiline
Reroute
PreviewAudio
PreviewAny
easy anythingIndexSwitch

A TTS Audio Suite workflow is a unified ComfyUI setup that handles both text‑to‑speech and voice conversion while letting you switch between different audio engines in one graph.
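
The engine-switching idea can be sketched in plain Python. This is only an illustration of the pattern, not the TTS Audio Suite or ComfyUI API: the engine names are copied from the node list above, while `index_switch` and `run_unified_node` are hypothetical helpers that stand in for an index-switch node and a unified text node.

```python
from typing import List, Sequence

# Engine names copied from the "Nodes & Models" list; in the real graph each
# is a ComfyUI node, here they are just labels.
ENGINES: List[str] = [
    "F5TTSEngineNode",
    "ChatterBoxEngineNode",
    "HiggsAudioEngineNode",
    "RVCEngineNode",
    "VibeVoiceEngineNode",
    "ChatterBoxOfficial23LangEngineNode",
]


def index_switch(options: Sequence[str], index: int) -> str:
    """Hypothetical stand-in for an index-switch node: pick one input by index."""
    if not 0 <= index < len(options):
        raise IndexError(f"index {index} is outside 0..{len(options) - 1}")
    return options[index]


def run_unified_node(text: str, engine: str) -> str:
    """Hypothetical placeholder: describes the job instead of producing audio."""
    return f"[{engine}] {text}"


# Changing one integer reroutes the same text to a different engine,
# while the rest of the graph stays untouched.
selected = index_switch(ENGINES, 1)
print(run_unified_node("Hello there", selected))
```

The point of the sketch is that the downstream step receives whichever engine the switch selects, so comparing engines means changing an index rather than rewiring the graph.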

Why use it

  • Centralizes TTS and voice conversion, so you do not need separate tools or projects for narration, cloning, and re‑voicing.

  • Allows A/B testing of engines (naturalness, speed, multilingual support) on the same input, helping you pick the best model for each job with minimal graph changes.

  • Keeps pipelines reproducible and shareable: one workflow file can encapsulate complex audio behavior, including model loading and VRAM management.

Use cases

  • Creating narrations, tutorials, or audiobooks from scripts, with engine‑specific choices for emotion or language.

  • Re‑voicing existing dialogue for dubbing, localization, or anonymity by converting it into a consistent target voice.

  • Building character voices for games, VTubers, or story content by generating lines via TTS and then passing them through voice conversion.

  • Rapidly prototyping different vocal styles (casual, corporate, dramatic) for marketing videos or explainer content.
