floyo logo
Powered by
ThinkDiffusion

Fish Audio S2 TTS - Expressive Text to Speech


Generates in about 2 secs

Nodes & Models

FishAudioTTSAdvanced_floyo
CR Prompt Text
WorkflowGraphics
SaveAudio

Fish Audio S2 text-to-speech with fine-grained emotion and tone control built into the prompt.

Write your script, drop in emotion tags like [happy], [whispering], or [professional broadcast tone], and the model reads it back with those qualities baked in. No post-production. No separate audio editing pass. What you write is what you hear.

Runs the S2 Pro FP8 model. One input field, one audio file out.

How do you control emotion and tone in Fish Audio S2 TTS?

Add emotion tags directly in your text prompt. Place a tag such as [angry], [laughing], or [soft tone] before the line you want affected. The model adjusts delivery at that exact point. You can use multiple tags in a single script to shift tone sentence by sentence.

Text prompt: This is your script. Drop emotion tags anywhere in the line to change how it's read. "[happy] Welcome back!" sounds warm and upbeat. "[whispering] Don't tell anyone." drops to a near-silent delivery. Mix them freely across sentences.

Here's a quick reference for what's available:

Basic emotions: [angry] [sad] [excited] [happy] [fearful] [surprised] [satisfied] [nervous] [confused] [curious]

Tone and delivery: [whispering] [soft tone] [shouting] [professional broadcast tone] [pitch up] [pitch down] [in a hurry tone]

Sound effects: [laughing] [sobbing] [sighing] [chuckling] [inhale] [exhale] [pause] [short pause] [clearing throat]

Volume: [loud] [low volume] [whisper in small voice]

Tags apply from where you place them until the next tag. Experiment. The same sentence reads completely differently with [excited] vs [serious] in front of it.
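To see which tag governs which stretch of text, the "until the next tag" rule can be sketched in a few lines of Python. This is an illustration only, not how the model parses prompts: the function name `split_by_tag` is hypothetical, and it treats every bracketed tag the same way, including sound effects such as [pause].

```python
import re

# Matches one bracketed tag, e.g. [happy] or [professional broadcast tone].
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def split_by_tag(script):
    """Return (tag, text) pairs; tag is None for text before the first tag."""
    segments = []
    current_tag = None
    position = 0
    for match in TAG_PATTERN.finditer(script):
        # Text between the previous tag and this one belongs to the previous tag.
        text = script[position:match.start()].strip()
        if text:
            segments.append((current_tag, text))
        current_tag = match.group(1)
        position = match.end()
    # Whatever follows the last tag stays under that tag.
    tail = script[position:].strip()
    if tail:
        segments.append((current_tag, tail))
    return segments


if __name__ == "__main__":
    demo = "[excited] This is great news! [soft tone] But here's the catch."
    for tag, text in split_by_tag(demo):
        print(f"{tag}: {text}")
```

Running a draft script through a helper like this before generating makes it easy to spot a sentence that is still under a tag you meant to end.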

Temperature / top-p: Default is 0.8 for both. Want more variation between runs? Push toward 1.0. Need consistent, predictable output for a repeating character voice? Drop toward 0.5. The tradeoff: higher values are more expressive, lower values are more stable.

Repetition penalty: Default is 1.1. If the output stutters or repeats a phrase, nudge this up slightly. If it sounds clipped or unnatural, pull it back toward 1.0.

Seed: Set to a fixed number to reproduce a specific output. Leave on randomize when exploring delivery options.
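If you keep settings outside the workflow, for example in notes for a recurring character voice, the defaults and ranges above can be written down as a small Python structure. This is a sketch: the class name, field names, preset names, and the seed value 1234 are all hypothetical and do not reflect the node's actual input names.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class TTSSettings:
    # Field names are illustrative; the defaults are the ones quoted on this page.
    temperature: float = 0.8
    top_p: float = 0.8
    repetition_penalty: float = 1.1
    seed: Optional[int] = None  # None stands in for "randomize"


# Presets that follow the guidance above (names and seed are made up).
EXPLORE = TTSSettings(temperature=1.0, top_p=1.0)
STABLE_CHARACTER = TTSSettings(temperature=0.5, top_p=0.5, seed=1234)


def check(settings):
    """Return warnings for values outside the ranges discussed on this page."""
    warnings = []
    for name in ("temperature", "top_p"):
        value = getattr(settings, name)
        if not 0.5 <= value <= 1.0:
            warnings.append(f"{name}={value} is outside the 0.5 to 1.0 range discussed here")
    if settings.repetition_penalty < 1.0:
        warnings.append("repetition_penalty below 1.0 is not covered by this guide")
    return warnings


if __name__ == "__main__":
    print(asdict(STABLE_CHARACTER))
    print(check(TTSSettings(temperature=1.4)))
```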

What is Fish Audio S2 TTS good for?

Fish Audio S2 is built for scripted speech that needs emotional range. Think character dialogue, narration with mood shifts, interactive fiction, social content, or any use case where flat monotone delivery breaks the experience.

Script work where tone carries the scene. A horror narration needs [fearful] delivery. A kids' story needs [excited] and [delighted]. Fish S2 handles the shift without needing multiple voice actors or a separate editing pass.

Podcast-style content where you want a polished, broadcast-quality voice with controlled pacing via [pause] and [short pause] tags.

Not the right choice for: long-form audiobooks where consistent voice identity matters across hours of output. For that, a cloning-based TTS that works from reference audio is a better fit.

FAQ

What emotion tags does Fish Audio S2 support?

Over 60 tags across six categories: basic emotions, advanced emotions, tone and delivery, sound effects, volume control, and dynamic effects. Drop any tag in brackets directly before the text it should affect. Tags apply inline, so you can shift tone multiple times within a single line.
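A misspelled tag is easy to miss in a long script. The sketch below checks a script against the tags listed in the quick reference earlier on this page. Because that list covers only part of the supported set, an unrecognized tag should be read as "double-check this", not as an error. The function name `unknown_tags` is hypothetical, and the case-insensitive matching is an assumption of this sketch, not documented model behavior.

```python
import re

# Tags copied from the quick reference on this page. The full set is larger,
# so anything missing here is only flagged for review.
KNOWN_TAGS = {
    # Basic emotions
    "angry", "sad", "excited", "happy", "fearful",
    "surprised", "satisfied", "nervous", "confused", "curious",
    # Tone and delivery
    "whispering", "soft tone", "shouting", "professional broadcast tone",
    "pitch up", "pitch down", "in a hurry tone",
    # Sound effects
    "laughing", "sobbing", "sighing", "chuckling", "inhale",
    "exhale", "pause", "short pause", "clearing throat",
    # Volume
    "loud", "low volume", "whisper in small voice",
}


def unknown_tags(script):
    """Return the sorted, de-duplicated tags in script that are not in KNOWN_TAGS."""
    found = re.findall(r"\[([^\[\]]+)\]", script)
    return sorted({tag for tag in found if tag.strip().lower() not in KNOWN_TAGS})


if __name__ == "__main__":
    print(unknown_tags("[happy] Welcome back! [hapy] Did you miss me?"))
```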

How do I make Fish Audio S2 sound consistent across runs?

Set a fixed seed number and keep your temperature and top-p values stable. With the same seed and settings, the output is reproducible. Switch to randomize when you're exploring delivery options.

Can I combine multiple emotion tags in one script?

Yes. Tags carry forward from where you place them until the next tag. "[excited] This is great news! [soft tone] But here's the catch." shifts delivery mid-script. Chain as many as your script needs.
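When a script comes from a spreadsheet or a list of dialogue lines, chaining tags by hand gets tedious. A minimal sketch of assembling the prompt from (tag, text) pairs follows; the helper name `compose` is hypothetical, and the output is simply the bracketed-tag text format described on this page.

```python
def compose(lines):
    """Join (tag, text) pairs into one tagged script string.

    A tag of None leaves the text untagged; empty text emits the tag alone,
    which suits standalone tags such as [pause].
    """
    parts = []
    for tag, text in lines:
        prefix = f"[{tag}]" if tag else ""
        piece = " ".join(p for p in (prefix, text.strip()) if p)
        if piece:
            parts.append(piece)
    return " ".join(parts)


if __name__ == "__main__":
    script = compose([
        ("excited", "This is great news!"),
        ("soft tone", "But here's the catch."),
    ])
    print(script)
```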

What's the difference between temperature and top-p in TTS?

Both affect variation in output. Temperature controls how adventurous the model's choices are. Top-p limits which options are on the table. Lower values: stable, predictable delivery. Higher values: more expressive variation between runs.
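The distinction is easier to see on a toy example. The sketch below applies the standard definitions of temperature scaling and top-p (nucleus) filtering to a made-up list of four scores. It is a generic illustration of the two knobs and of why a fixed seed reproduces a result. It makes no claim about how Fish Audio S2 samples internally, and all function names are hypothetical.

```python
import math
import random


def apply_temperature(logits, temperature):
    """Softmax over logits divided by temperature; lower values sharpen the distribution."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(value - peak) for value in scaled]
    total = sum(weights)
    return [weight / total for weight in weights]


def top_p_filter(probs, top_p):
    """Keep the most likely options until their combined mass reaches top_p, then renormalize."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for index in ranked:
        kept.append(index)
        mass += probs[index]
        if mass >= top_p:
            break
    return {index: probs[index] / mass for index in kept}


def sample(logits, temperature=0.8, top_p=0.8, seed=None):
    """Draw one option index; the same seed and settings give the same draw."""
    rng = random.Random(seed)
    candidates = top_p_filter(apply_temperature(logits, temperature), top_p)
    options = list(candidates)
    return rng.choices(options, weights=[candidates[i] for i in options], k=1)[0]


if __name__ == "__main__":
    scores = [2.0, 1.0, 0.5, -1.0]  # made-up scores for four candidate options
    print(apply_temperature(scores, 0.5))
    print(apply_temperature(scores, 1.0))
    print(sorted(top_p_filter(apply_temperature(scores, 1.0), 0.5)))
    print(sample(scores, seed=7) == sample(scores, seed=7))
```

Temperature reshapes the whole distribution, while top-p cuts off its tail, which is why the two are usually tuned together.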

How do I run Fish Audio S2 TTS online?

You can run Fish Audio S2 TTS online through Floyo. No installation, no setup. Open the workflow in your browser, type your script with emotion tags, and hit run. Free to try.
