Powered by
ThinkDiffusion

ACE-Step 1.5 for Music Generation

Create stunning music using ACE-Step 1.5


ACE‑Step 1.5 is an open‑source music foundation model that can generate and edit full songs (up to ~10 minutes) from text prompts, running locally on consumer GPUs with <4 GB VRAM.

What it is

  • Text‑to‑music and music‑editing model with a hybrid architecture: a language model plans song structure, and a diffusion transformer renders high‑quality audio.

  • Designed to reach or beat many commercial music models on quality while staying extremely fast (seconds per song on common GPUs).

Key features

  • Generates full songs from simple prompts, from short loops to ~10‑minute tracks, with coherent structure, style, and lyrics if requested.

  • Strong style and prompt control, supporting 50+ languages for captions/lyrics and fine‑grained genre, mood, instrument, and tempo steering.

  • Unified tasks: text‑to‑music, cover generation, “repainting” sections, continuations, vocal‑to‑BGM, and track extraction in one model.

  • Runs locally with low VRAM; LoRA‑style personalization lets you capture your own musical style from a few songs.

  • Native ComfyUI support, with nodes and example workflows so you can integrate it like any other model in your graph.

Best‑fit use cases

  • Creating royalty‑free background music and themes for videos, streams, or games, fully offline.

  • Rapid idea sketching for producers: generate drafts in a target style, then rework stems in a DAW.

  • Covers and remixes: re‑render songs in a new style, repaint sections, or continue/reshape existing tracks.

  • Localized music content (jingles, songs with lyrics) in many languages for marketing or education.



Nodes & Models

WorkflowGraphics
PrimitiveStringMultiline
CheckpointLoaderSimple
ace_step_1.5_turbo_aio.safetensors
ModelSamplingAuraFlow
EmptyAceStep1.5LatentAudio
TextEncodeAceStepAudio1.5
ConditioningZeroOut
KSampler
VAEDecodeAudio
SaveAudioMP3
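
The node list above only names the parts of the workflow, not how they connect. As a rough illustration, the sketch below arranges those node names into a dependency graph and derives one valid execution order. The node names come from the list; the wiring between them is an assumption based on how ComfyUI text-to-audio graphs are commonly laid out, not something this page states. WorkflowGraphics is left out because its role is not described here.

```python
from graphlib import TopologicalSorter

# Keys are node names from the "Nodes & Models" list.
# Values are the nodes each one is ASSUMED to take input from.
# This wiring is hypothetical; check the actual workflow graph before relying on it.
ASSUMED_INPUTS = {
    "CheckpointLoaderSimple": [],        # loads ace_step_1.5_turbo_aio.safetensors
    "PrimitiveStringMultiline": [],      # holds the prompt / lyrics text
    "EmptyAceStep1.5LatentAudio": [],    # blank latent that sets the track length
    "ModelSamplingAuraFlow": ["CheckpointLoaderSimple"],
    "TextEncodeAceStepAudio1.5": ["CheckpointLoaderSimple", "PrimitiveStringMultiline"],
    "ConditioningZeroOut": ["TextEncodeAceStepAudio1.5"],  # assumed negative conditioning
    "KSampler": [
        "ModelSamplingAuraFlow",
        "TextEncodeAceStepAudio1.5",
        "ConditioningZeroOut",
        "EmptyAceStep1.5LatentAudio",
    ],
    "VAEDecodeAudio": ["KSampler", "CheckpointLoaderSimple"],
    "SaveAudioMP3": ["VAEDecodeAudio"],
}


def execution_order(inputs):
    """Return one topological ordering: every node appears after its inputs."""
    return list(TopologicalSorter(inputs).static_order())


order = execution_order(ASSUMED_INPUTS)
for step, node in enumerate(order, start=1):
    print(f"{step}. {node}")
```

Under these assumptions the loader, prompt, and empty latent come first, sampling happens in KSampler, and SaveAudioMP3 is always the final step. The relative order of independent nodes may vary.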
