ComfyUI_Spark_TTS

Author KERRY-YUAN

https://github.com/KERRY-YUAN/ComfyUI_Spark_TTS

Last updated: 2025-06-10


ComfyUI_Spark_TTS is a custom node package for ComfyUI built on the Spark-TTS text-to-speech model. It provides nodes for controllable speech synthesis and for voice cloning, so users can generate speech with chosen characteristics or reproduce a voice from reference audio.

Provides two main nodes: one generates speech with adjustable gender, pitch, and speed; the other clones a voice from input audio.
Offers settings that affect output quality and model performance, including CPU execution and memory management.
Integrates with existing ComfyUI workflows, so text-to-speech can sit alongside other nodes.

Context

This package is a ComfyUI extension that exposes the Spark-TTS model. It focuses on controllable speech synthesis and voice cloning, which suits applications that need personalized voice output.

Key Features & Benefits

The Spark_TTS_Creation node generates speech from parameters such as gender, pitch, and speed. The Spark_TTS_Clone node clones a voice from either a reference audio file or a preset speaker, which makes the generated audio more realistic and personalized.
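To make the idea of parameterized nodes concrete, here is a minimal sketch of how a ComfyUI custom node can declare inputs such as gender, pitch, and speed. It follows ComfyUI's general custom-node convention (INPUT_TYPES, RETURN_TYPES, FUNCTION, CATEGORY, NODE_CLASS_MAPPINGS), but the class name SketchTTSCreation, the input names, the five level names, and the string output are all assumptions made for illustration. It is not the package's actual code, and a real TTS node would return audio rather than a string.

```python
# Hypothetical sketch of a ComfyUI-style node declaration.
# NOT the actual Spark_TTS_Creation implementation: names, option
# lists, and the STRING output are illustrative assumptions.


class SketchTTSCreation:
    # Assumed discrete levels for pitch and speed.
    LEVELS = ["very_low", "low", "moderate", "high", "very_high"]

    @classmethod
    def INPUT_TYPES(cls):
        # ComfyUI reads this dict to build the node's input widgets.
        return {
            "required": {
                "text": ("STRING", {"multiline": True, "default": ""}),
                "gender": (["female", "male"],),
                "pitch": (cls.LEVELS, {"default": "moderate"}),
                "speed": (cls.LEVELS, {"default": "moderate"}),
            }
        }

    RETURN_TYPES = ("STRING",)  # a real TTS node would return audio
    FUNCTION = "describe"       # name of the method ComfyUI calls
    CATEGORY = "sketch/tts"

    def describe(self, text, gender, pitch, speed):
        # Validate the inputs, then return a one-element tuple that
        # matches RETURN_TYPES. No speech is synthesized here.
        if not text.strip():
            raise ValueError("text must not be empty")
        if pitch not in self.LEVELS or speed not in self.LEVELS:
            raise ValueError("pitch and speed must be one of LEVELS")
        summary = f"gender={gender}, pitch={pitch}, speed={speed}: {text}"
        return (summary,)


# ComfyUI discovers nodes through this module-level mapping.
NODE_CLASS_MAPPINGS = {"SketchTTSCreation": SketchTTSCreation}
```

Outside ComfyUI the class can still be instantiated and called directly, for example SketchTTSCreation().describe("Hello", "female", "moderate", "high"), which is a convenient way to check the input handling before wiring the node into a workflow.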

Advanced Functionalities

Advanced options include a temperature setting, which controls how much randomness is used when sampling the output, and an option to keep models loaded in memory so that subsequent generations start faster. Users can also force execution on the CPU, which accommodates devices that may not support GPU scheduling.
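As a rough illustration of what a temperature setting does in a sampling-based generator, the sketch below applies temperature to a list of scores before drawing one index. This is the general technique, not code from Spark-TTS or this package; the function names and the example scores are assumptions chosen for illustration.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(logits, temperature, rng):
    """Draw one index according to the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


if __name__ == "__main__":
    scores = [2.0, 1.0, 0.0]  # made-up scores for three candidates
    rng = random.Random(0)    # seeded so runs are repeatable
    for t in (0.5, 1.0, 2.0):
        probs = softmax_with_temperature(scores, t)
        print(t, [round(p, 3) for p in probs], sample_index(scores, t, rng))
```

Lower temperatures concentrate probability on the highest-scoring candidate, which tends to give more predictable output; higher temperatures flatten the distribution and allow more variation.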

Practical Benefits

The package simplifies adding text-to-speech to a project. Its nodes and configuration options give control over audio output quality and efficiency, which supports rapid prototyping and deployment of voice synthesis applications.

Credits/Acknowledgments

This tool is based on the original Spark-TTS project, whose developers provided the foundational model and library components. The project is released under the Apache License 2.0, which permits open use and collaboration within the community.
