Multi Model for Voice Conversion and Text to Speech
A TTS Audio Suite workflow that can use different types of audio models.
ChatterBox
Higgs
Text to Speech
TTS
VibeVoice
Nodes & Models
F5TTSEngineNode
ChatterBoxEngineNode
HiggsAudioEngineNode
RVCEngineNode
VibeVoiceEngineNode
ChatterBoxOfficial23LangEngineNode
UnifiedTTSTextNode
UnifiedVoiceChangerNode
LoadAudio
Note
WorkflowGraphics
Text Multiline
Reroute
PreviewAudio
PreviewAny
easy anythingIndexSwitch
A TTS Audio Suite workflow is a unified ComfyUI setup that handles both text‑to‑speech and voice conversion while letting you switch between different audio engines in one graph.
Why use it
Centralizes TTS and voice conversion so you do not need separate tools or projects for narration, cloning, and re‑voicing.
Allows engine A/B testing (naturalness, speed, multilingual support) on the same input, so you can pick the best model per job with minimal graph changes.
Keeps pipelines reproducible and shareable: one workflow file can encapsulate complex audio behavior, including model loading and VRAM management.
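The engine A/B testing idea above can be sketched in plain Python. This is only an illustration of the index-switch pattern, not the TTS Audio Suite or ComfyUI API: `fake_engine`, `ENGINES`, `run_with_engine`, and `ab_test` are hypothetical names, and the stand-in engines return labeled strings rather than audio.

```python
from typing import Callable, List


def fake_engine(name: str) -> Callable[[str], str]:
    """Build a hypothetical stand-in for an engine node.

    A real engine would return audio; this one returns a labeled string
    so the routing logic can be checked without any models installed.
    """
    def synthesize(text: str) -> str:
        return f"[{name}] {text}"
    return synthesize


# Assumed engine order; the names mirror the workflow's tags only.
ENGINES: List[Callable[[str], str]] = [
    fake_engine("ChatterBox"),
    fake_engine("Higgs"),
    fake_engine("VibeVoice"),
]


def run_with_engine(index: int, text: str) -> str:
    """Route the same text to the engine selected by index."""
    if not 0 <= index < len(ENGINES):
        raise IndexError(f"engine index {index} out of range")
    return ENGINES[index](text)


def ab_test(text: str) -> List[str]:
    """Run one input through every engine for side-by-side comparison."""
    return [run_with_engine(i, text) for i in range(len(ENGINES))]


if __name__ == "__main__":
    for result in ab_test("Hello from the same script line."):
        print(result)
```

Changing one index swaps the engine while the input and the rest of the pipeline stay the same, which is roughly what an index switch node does inside the graph.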
Use cases
Creating narrations, tutorials, or audiobooks from scripts, with engine‑specific choices for emotion or language.
Re‑voicing existing dialogue for dubbing, localization, or anonymity by converting into a consistent target voice.
Building character voices for games, VTubers, or story content by generating lines via TTS, then passing them through voice conversion.
Rapidly prototyping different vocal styles (casual, corporate, dramatic) for marketing videos or explainer content.
