floyo logobeta logo
Powered by
ThinkDiffusion

ComfyUI-AudioSuiteAdvanced

13

Last updated
2025-10-25

ComfyUI-AudioSuiteAdvanced adds nodes to ComfyUI for long-text processing and audio synthesis. Its nodes split text files, concatenate audio, align subtitles with audio timestamps, and separate audio tracks by speaker.

  • Supports various subtitle formats for easy integration and text extraction.
  • Provides advanced audio processing features, including speaker separation and audio mixing.
  • Facilitates the creation of synthesized speech from text, streamlining workflows for TTS applications.

Context

ComfyUI-AudioSuiteAdvanced extends ComfyUI with nodes for handling long text and audio synthesis. It targets audio production workflows, particularly text-to-speech (TTS) applications, by providing tools for text segmentation, audio merging, and speaker differentiation.

Key Features & Benefits

The plugin includes several practical features:

  1. Long Text Splitter divides text by sentence or by custom delimiter, so that long input can be passed to TTS processing in manageable segments.
  2. Subtitle File Loader automatically loads various subtitle formats and extracts their text, making it easier to reuse existing scripts in audio projects.
  3. Audio Concatenation and Separation tools merge multiple audio files into one and split a track into its component parts.
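
The splitting behaviour described in item 1 can be pictured with a short Python sketch. This is not the plugin's code: the function name `split_long_text` and the parameters `delimiter` and `max_chars` are hypothetical, chosen only to illustrate sentence-based versus delimiter-based splitting followed by grouping into size-limited chunks.

```python
import re

def split_long_text(text, delimiter=None, max_chars=200):
    """Illustrative splitter (hypothetical, not the plugin's implementation).

    Splits on a custom delimiter if one is given, otherwise on sentence-ending
    punctuation, then greedily groups pieces into chunks of up to max_chars.
    """
    if delimiter:
        pieces = text.split(delimiter)
    else:
        # Split after Latin punctuation followed by whitespace, or directly
        # after full-width CJK punctuation (which is usually not spaced).
        pieces = re.split(r"(?<=[.!?])\s+|(?<=[。！？])", text)
    pieces = [p.strip() for p in pieces if p.strip()]

    chunks, current = [], ""
    for piece in pieces:
        candidate = f"{current} {piece}".strip()
        if current and len(candidate) > max_chars:
            # Adding this piece would exceed the limit: close the chunk.
            chunks.append(current)
            current = piece
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


print(split_long_text("Hello world. How are you? Fine.", max_chars=20))
```

A single piece longer than `max_chars` is kept whole rather than cut mid-sentence, and pieces are rejoined with a space, which may not suit every delimiter. Both are simplifications of this sketch.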

Advanced Functionalities

The plugin offers advanced functionalities such as:

  • Audio Separation, which uses the Hybrid Demucs model to split audio into four tracks (bass, drums, vocals, and other), useful in music production and audio analysis.
  • Speaker Separation, which can isolate up to four speakers from an audio file, making dialogue-heavy content easier to edit and process.

Practical Benefits

Integrating this tool into a workflow automates routine steps such as text splitting and audio merging, and gives users finer control over the result through audio separation and alignment with subtitles. This leads to a more streamlined approach to creating TTS content and handling complex audio projects.
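
To make the idea of aligning audio with subtitle timestamps concrete, here is a minimal Python sketch under stated assumptions. It is not taken from the plugin: `parse_srt` and `align_clips` are hypothetical helpers, the input is assumed to be SRT-style text, and audio is modelled as a plain list of mono float samples rather than any real ComfyUI audio type.

```python
import re

# Matches SRT-style timestamps such as 00:01:02,500 (or with a dot).
TIMESTAMP = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")

def to_seconds(parts):
    """Convert an (h, m, s, ms) tuple of digit strings to seconds."""
    h, m, s, ms = (int(p) for p in parts)
    return h * 3600 + m * 60 + s + ms / 1000

def parse_srt(content):
    """Return a list of (start_seconds, end_seconds, text) cues."""
    cues = []
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            stamps = TIMESTAMP.findall(line)
            if "-->" in line and len(stamps) >= 2:
                text = " ".join(l.strip() for l in lines[i + 1:])
                cues.append((to_seconds(stamps[0]), to_seconds(stamps[1]), text))
                break
    return cues

def align_clips(cues, clips, sample_rate):
    """Place each clip at its cue's start time, padding gaps with silence."""
    timeline = []
    for (start, _end, _text), clip in zip(cues, clips):
        offset = round(start * sample_rate)
        if offset > len(timeline):
            timeline.extend([0.0] * (offset - len(timeline)))
        # If the previous clip overran this cue, the clip simply follows it.
        timeline.extend(clip)
    return timeline


srt = """1
00:00:01,000 --> 00:00:02,000
First line.

2
00:00:03,000 --> 00:00:04,000
Second line.
"""
cues = parse_srt(srt)
track = align_clips(cues, [[1.0, 1.0], [2.0, 2.0]], sample_rate=4)
print(cues)
print(len(track))
```

The tiny sample rate is only there to keep the example readable. A real workflow would use the synthesized clips and their actual sample rate, and would need a policy for clips that run longer than their subtitle slot.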

Credits/Acknowledgments

The plugin is credited to CyberDickLang and acknowledges other contributors and projects, such as ComfyUI-KJNodes and WhisperX, for their foundational audio processing techniques. It is designed to be compatible with various audio and subtitle formats for use within the ComfyUI ecosystem.