ComfyUI_WhisperSRT

Author bikiam

https://github.com/bikiam/ComfyUI_WhisperSRT

Last updated

2025-06-01

ComfyUI_WhisperSRT is a custom node for ComfyUI that transcribes audio files into text and generates SRT (SubRip Subtitle) files. It uses OpenAI's Whisper speech recognition model, so users who need text output from audio sources can get it inside a ComfyUI workflow.

Transcribes audio using OpenAI's Whisper speech recognition model.
Generates SRT files for easy subtitle integration with video content.
Keeps audio-to-text conversion inside the ComfyUI environment.

Context

ComfyUI_WhisperSRT integrates with ComfyUI to transcribe audio files. Its primary purpose is to convert spoken content into written text and to create SRT files, a subtitle format widely used in video editing and production.

Key Features & Benefits

The tool employs OpenAI's Whisper model, known for its high accuracy in speech recognition. The resulting transcriptions can be used directly for captioning or documentation, saving time and effort compared to manual transcription.

Advanced Functionalities

In addition to plain transcription, the tool generates SRT files, which pair each subtitle line with start and end timestamps so the text stays synchronized with video playback. This is particularly useful for content creators who need accurate subtitles without formatting them by hand.
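The SRT format itself is simple enough to sketch. The example below is a minimal illustration, not the node's actual implementation (which this page does not show): it assumes the transcription is available as a list of segments with `start`, `end` (in seconds), and `text` keys, which is the shape of the `segments` list returned by `transcribe` in the `openai-whisper` Python package. The function names `format_timestamp` and `segments_to_srt` are hypothetical.

```python
from typing import Iterable, Mapping


def format_timestamp(seconds: float) -> str:
    """Convert a time in seconds to the SRT form HH:MM:SS,mmm."""
    total_ms = max(0, round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: Iterable[Mapping]) -> str:
    """Render segments (assumed keys: 'start', 'end', 'text') as SRT text.

    Each SRT cue is a 1-based index, a timing line, the subtitle text,
    and a blank line separating it from the next cue.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        text = seg["text"].strip()
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    return "\n".join(blocks)


# Hypothetical segments standing in for real transcription output.
example_segments = [
    {"start": 0.0, "end": 2.5, "text": " Hello world."},
    {"start": 2.5, "end": 4.0, "text": " Second line."},
]
srt_text = segments_to_srt(example_segments)
```

Writing `srt_text` to a file with a `.srt` extension gives a subtitle file that most video players and editors can load alongside the video.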

Practical Benefits

Incorporating this tool into a workflow lets users transcribe audio without leaving ComfyUI. This streamlines audio-to-text conversion and keeps the output under the user's control, which matters for projects involving multimedia content.

Credits/Acknowledgments

This tool is built upon the capabilities of OpenAI's Whisper model, and special thanks are due to the original developers and contributors who have made this integration possible.
