ComfyUI-FishAudioS2

Author Saganaki22

https://github.com/Saganaki22/ComfyUI-FishAudioS2

ComfyUI-FishAudioS2 is a ComfyUI extension that integrates the Fish Audio S2 Pro text-to-speech (TTS) model. It adds voice cloning, multi-speaker synthesis, and emotional expression control to ComfyUI, so users can generate realistic, expressive speech from text input.

Supports zero-shot voice cloning from short audio samples, so a new voice profile needs only a brief reference recording.
Offers inline emotion and prosody control through a simple tagging system for more expressive speech.
Generates multi-speaker conversations in a single pass and can isolate an audio track for each speaker, which helps in applications such as animation and gaming.

Context

The extension makes Fish Audio S2 Pro's TTS capabilities available inside ComfyUI workflows. Its primary purpose is to generate high-quality speech that conveys a range of emotions and tones, which suits creative projects that need nuanced vocal performances.

Key Features & Benefits

The extension's main features are:

Zero-Shot Voice Cloning: Users can clone voices using just a short audio reference, making it easy to create custom voice profiles without extensive training.
Emotive Tags: More than 1,500 emotive tags let users control the emotional tone of the speech, adding depth and character to the generated audio.
Multi-Speaker Support: The tool can synthesize conversations with multiple speakers, providing a more dynamic and engaging audio experience.
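
To make the tagging and multi-speaker ideas above concrete, here is a minimal Python sketch of how a tagged, multi-speaker script could be split into speaker turns before synthesis. Everything in it is an assumption for illustration: the "Name: text" line format, the square-bracket tag syntax, and the names `parse_script`, `speakers`, and `Turn` are hypothetical and are not taken from the extension. Check the repository for the syntax its nodes actually accept.

```python
import re
from typing import List, NamedTuple

# Hypothetical inline tag syntax: a word or phrase in square brackets, e.g. "[whisper]".
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")
# Hypothetical line format: "Speaker name: text".
LINE_PATTERN = re.compile(r"^\s*([^:]+?)\s*:\s*(.+)$")


class Turn(NamedTuple):
    speaker: str
    text: str        # text as written, tags included
    tags: List[str]  # tags found in the text, in order of appearance


def parse_script(script: str) -> List[Turn]:
    """Split a script into speaker turns and collect the tags in each turn."""
    turns = []
    for line in script.splitlines():
        match = LINE_PATTERN.match(line)
        if match is None:
            continue  # skip blank lines and lines without a speaker label
        speaker, text = match.group(1), match.group(2).strip()
        if not text:
            continue  # skip a label with nothing to say
        turns.append(Turn(speaker, text, TAG_PATTERN.findall(text)))
    return turns


def speakers(turns: List[Turn]) -> List[str]:
    """Return the distinct speakers in order of first appearance."""
    seen = []
    for turn in turns:
        if turn.speaker not in seen:
            seen.append(turn.speaker)
    return seen
```

A script such as "Alice: [excited] We did it!" followed by "Bob: [whisper] Keep it down." would yield two turns, each carrying its own tag list, which could then be matched to a cloned voice per speaker.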

Advanced Functionalities

The extension synthesizes conversations between distinct voices in a single operation. It also provides per-speaker audio isolation, which gives each speaker a separate audio track; this is particularly useful in projects that require lip-syncing or detailed audio editing.
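
The idea behind per-speaker isolation can be sketched in a few lines of Python. This is not the extension's implementation; it only illustrates one plausible approach, assuming the conversation is available as an ordered list of (speaker, samples) segments. The function name `isolate_tracks` is hypothetical. Each speaker gets a track as long as the full conversation, with silence wherever someone else is talking, so every track stays time-aligned with the mixed audio.

```python
from typing import Dict, List, Sequence, Tuple


def isolate_tracks(
    segments: Sequence[Tuple[str, List[float]]]
) -> Dict[str, List[float]]:
    """Build one full-length track per speaker from ordered segments.

    Each segment is (speaker, samples). Samples from other speakers are
    replaced with silence (0.0) so all tracks share the same timeline.
    """
    total = sum(len(samples) for _, samples in segments)
    # Start every speaker with a silent track covering the whole conversation.
    tracks = {speaker: [0.0] * total for speaker, _ in segments}
    offset = 0
    for speaker, samples in segments:
        # Copy this segment into its speaker's track at the current position.
        tracks[speaker][offset:offset + len(samples)] = samples
        offset += len(samples)
    return tracks
```

Because the tracks share one timeline, a lip-sync tool could read a single speaker's track and still line up with the mixed conversation.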

Practical Benefits

Adding this extension to a ComfyUI workflow keeps speech generation inside the same graph as the rest of the project. Users can generate realistic speech with varied emotions and tones, create multi-speaker dialogues in one pass, and manage the resulting audio tracks for animation, game development, or other creative work that needs nuanced vocal performances.

Credits/Acknowledgments

The extension is authored by Saganaki22 and builds on the Fish Audio S2 Pro model; credit for the model goes to its original authors and contributors. The project operates under the Fish Audio Research License, which permits research and non-commercial use and requires a separate license for commercial applications. For further details, see the original repository and its documentation.
