ComfyUI-MegaTTS

Author 1038lab

https://github.com/1038lab/ComfyUI-MegaTTS

Last updated 2025-06-19

ComfyUI-MegaTTS is a custom node for ComfyUI that uses ByteDance's MegaTTS3 model for text-to-speech (TTS) synthesis, including voice cloning in both Chinese and English. It is aimed at users who want realistic speech generation with customizable voice attributes.

Synthesizes high-fidelity speech that closely mimics natural speech patterns.
Clones voices from minimal audio samples.
Manages memory to keep performance usable on systems with limited GPU resources.

Context

ComfyUI-MegaTTS extends ComfyUI with a custom node based on ByteDance's MegaTTS3 model. Its primary function is to synthesize speech from text, with voice cloning for both English and Chinese, which makes it useful for developers and artists working with audio generation.

Key Features & Benefits

This tool offers several practical features that enhance its usability:

High-Quality Voice Synthesis: Users can generate speech that sounds natural and engaging, which is crucial for applications like virtual assistants or content creation.
Voice Cloning: The ability to clone voices from short audio samples allows for personalized applications, making it easier to create unique voiceovers without extensive recordings.
Bilingual Support: The node can handle both Chinese and English text, including code-switching, which is essential for projects targeting multilingual audiences.

Advanced Functionalities

ComfyUI-MegaTTS provides advanced parameter controls that let users fine-tune the quality of speech generation. Users can adjust settings related to pronunciation accuracy and voice similarity, enabling a high degree of customization. This is particularly useful for creating expressive speech or maintaining specific accents in the generated audio.

Practical Benefits

This tool brings the TTS process into ComfyUI, so speech can be generated inside an existing workflow. Users gain control over voice generation quality and can manage GPU resources, which helps on machines with limited memory. Models are downloaded automatically, which simplifies setup and lets users focus on creating rather than managing dependencies.

Credits/Acknowledgments

The original MegaTTS3 model was developed by ByteDance, and the project is licensed under GPL-3.0. For more information, see the original ByteDance MegaTTS3 GitHub repository and the corresponding Hugging Face model.
