floyo (Powered by ThinkDiffusion)

Z-Image Base

The model weights have not been released yet but are expected soon.


Z-Image “Base” has effectively become Z-Image Omni Base: an all‑in‑one 6B model for text‑to‑image generation and image editing. It has been announced and wired into tooling, but its weights are still marked “to be released.”

What “Base / Omni Base” is

  • Z‑Image is a 6‑billion‑parameter foundation model using a Scalable Single‑Stream DiT, processing text and image tokens in a single stream for efficiency and quality.

  • The originally teased “Z‑Image‑Base” (non‑distilled foundation) is now renamed and positioned as Z‑Image‑Omni‑Base, described as an all‑in‑one model for both generation and editing, not just a pure generation base.

  • It sits alongside Z‑Image‑Turbo (distilled, 8‑step fast model) and Z‑Image‑Edit (editing‑focused variant) as the “full‑fat” checkpoint meant for flexibility and community fine‑tuning.
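The single‑stream design mentioned above can be illustrated with a toy sketch: instead of separate cross‑attention branches per modality, text and image tokens are concatenated into one sequence and attended over jointly. This is a minimal illustrative example in numpy, not Z‑Image's actual (unreleased) implementation; all dimensions and weight shapes here are made up.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def single_stream_attention(text_tokens, image_tokens, Wq, Wk, Wv):
    """Toy single-stream attention: concatenate text and image tokens into
    one sequence and run ONE joint self-attention over both modalities,
    rather than separate per-modality branches with cross-attention."""
    x = np.concatenate([text_tokens, image_tokens], axis=0)  # (T+I, d)
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    attn = softmax(q @ k.T / np.sqrt(k.shape[-1]))           # (T+I, T+I)
    out = attn @ v
    # Split the joint output back into the two modalities.
    return out[: len(text_tokens)], out[len(text_tokens):]

rng = np.random.default_rng(0)
d = 16
text = rng.standard_normal((4, d))    # 4 text tokens (hypothetical)
image = rng.standard_normal((8, d))   # 8 image-latent tokens (hypothetical)
Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
t_out, i_out = single_stream_attention(text, image, Wq, Wk, Wv)
print(t_out.shape, i_out.shape)
```

Because every text token can attend to every image token (and vice versa) in one pass, the stream stays unified end to end, which is the efficiency/quality argument behind the single‑stream DiT.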

Key features (from what’s known)

  • All‑in‑one tasks: Omni Base is explicitly described as unifying text‑to‑image and image editing in a single model, avoiding separate base/edit checkpoints.

  • Strong realism & text: Z‑Image as a family is reported to match commercial systems on photorealism and bilingual text rendering, with Turbo already ranking strongly on public leaderboards.

  • Fine‑tuning focus: The non‑distilled Base/Omni Base is being positioned as the community workhorse for LoRA training, ControlNet, and other custom adapters rather than just a consumer “Turbo‑like” model.

  • Tooling‑ready: Code and configs for Omni Base (e.g., ZImageControlNet, Image2LoRA, SigLIP2 encoder, low‑VRAM wrapper) are already merged into ecosystems like ModelScope/Diffusers so it can slot straight into pipelines once weights drop.
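The "low‑VRAM wrapper" idea in the tooling list can be sketched in miniature: keep pipeline stages offloaded and only "load" one at a time while it runs, so peak memory is bounded by the largest single stage. The class, stage names, and arithmetic below are invented for illustration; the real ModelScope/Diffusers wrapper works on actual GPU modules, not toy callables.

```python
class LowVRAMWrapper:
    """Toy sketch of sequential offloading: each stage is 'loaded' just
    before it runs and 'offloaded' right after, so only one stage is
    resident at any moment (illustrative only)."""

    def __init__(self, stages):
        self.stages = stages   # list of (name, callable) pipeline stages
        self.events = []       # load/offload log, for inspection

    def run(self, x):
        for name, fn in self.stages:
            self.events.append(f"load {name}")     # bring stage "onto GPU"
            x = fn(x)
            self.events.append(f"offload {name}")  # free it immediately
        return x

# Hypothetical three-stage pipeline standing in for encoder → DiT → decoder.
pipe = LowVRAMWrapper([
    ("text_encoder", lambda x: x + 1),
    ("dit",          lambda x: x * 2),
    ("vae_decode",   lambda x: x - 3),
])
result = pipe.run(5)
print(result, pipe.events)
```

The trade‑off is the usual one: extra transfer time per stage in exchange for a much lower peak‑memory requirement, which is why such wrappers matter for running a 6B model on consumer GPUs.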

Why people care / expected use cases

  • Creator workflows: Use a single checkpoint for txt2img, img2img, inpaint/outpaint, and iterative editing without swapping models, similar to how Flux.2 Klein 9B is used today.

  • LoRA and ControlNet stacks: Omni Base is anticipated as the main base for training and stacking style/character LoRAs and ControlNets, including image‑to‑LoRA flows, with better support than Turbo.

  • Higher‑fidelity alternative to Turbo: Turbo is favored for speed; Omni Base is expected to trade some latency for more control, consistency, and compatibility with heavy workflows (large prompts, multi‑control signals, complex edits).
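The LoRA‑stacking use case above comes down to simple weight arithmetic: each adapter contributes a low‑rank update to a frozen base weight, W' = W + Σᵢ (αᵢ/rᵢ)·Bᵢ·Aᵢ, and adapters can be merged additively. This is the standard LoRA formulation in a self‑contained numpy sketch; the matrix sizes, ranks, and alpha values are arbitrary examples, not Z‑Image's.

```python
import numpy as np

def apply_loras(W, loras):
    """Merge stacked LoRA adapters into a frozen base weight:
    W' = W + sum_i (alpha_i / r_i) * B_i @ A_i  (standard LoRA update)."""
    out = W.copy()
    for A, B, alpha in loras:
        r = A.shape[0]                 # adapter rank
        out += (alpha / r) * (B @ A)   # low-rank additive update
    return out

rng = np.random.default_rng(1)
d_in, d_out, r = 8, 8, 2               # toy dimensions
W = rng.standard_normal((d_out, d_in)) # frozen base weight
# Two hypothetical adapters, e.g. a style LoRA and a character LoRA.
A1, B1 = rng.standard_normal((r, d_in)), rng.standard_normal((d_out, r))
A2, B2 = rng.standard_normal((r, d_in)), rng.standard_normal((d_out, r))
W_merged = apply_loras(W, [(A1, B1, 0.8), (A2, B2, 0.5)])
print(W_merged.shape)
```

Training such adapters against a non‑distilled base is generally easier than against a distilled model like Turbo, which is the core of the "community workhorse" expectation for Omni Base.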

“Coming soon” status

  • Official resources list Z‑Image‑Omni‑Base as “to be released” on Hugging Face/ModelScope, and recent commits add full pipeline support but not the checkpoint itself yet.

  • Community posts this month highlight those commits and describe Omni Base/Base as “coming soon,” with expectations that once it lands, we’ll see a wave of community fine‑tunes and direct comparisons against Flux.2 Klein 9B and others.

