ComfyUI-Muyan-TTS

Author Yuan-ManX

https://github.com/Yuan-ManX/ComfyUI-Muyan-TTS

Last updated: 2025-05-08

ComfyUI-Muyan-TTS brings Muyan-TTS, a text-to-speech (TTS) model aimed at podcast-style speech, into ComfyUI. It can generate speech in a custom voice from a small amount of target speech, which makes it useful for a range of audio projects.

Supports zero-shot TTS synthesis, so a voice can be generated immediately without extensive training.
Offers speaker adaptation, customizing the output voice from just a few minutes of target speech.
Pre-trained on over 100,000 hours of podcast audio, which supports natural-sounding voice generation.

Context

Muyan-TTS is a text-to-speech tool within the ComfyUI framework, designed for audio content creation. Its primary purpose is to generate realistic, customizable voice output, particularly for podcasting and similar applications.

Key Features & Benefits

Zero-shot synthesis lets users generate speech without collecting extensive voice data beforehand. Speaker adaptation lets users tailor the voice output to match a specific individual, which suits personalized audio projects.

Advanced Functionalities

Muyan-TTS can adapt to a new speaker's voice from only a few minutes of recorded speech. Users can build distinct voice profiles this way, which widens the range of applications and gives listeners a more personalized experience.

Practical Benefits

Running Muyan-TTS inside ComfyUI lets users fold voice synthesis into their audio production workflow. The tool gives more control over voice synthesis and the quality of generated speech, and makes it quicker to produce finished audio content.

Credits/Acknowledgments

Muyan-TTS was developed by the team at MYZY-AI and is available under an open-source license. The model builds on extensive pre-training and on contributions from various developers.
