FLUX.2 Klein 9B for Image Editing
Unified workflow: one model for text‑to‑image, image‑to‑image, and image editing
Tags: Flux, Flux.2 Klein, Image2image
Flux.2 Klein 9B is a 9‑billion‑parameter image model from Black Forest Labs that unifies text‑to‑image, image‑to‑image, and powerful image editing (including multi‑reference) in a single fast architecture.
What it is
A rectified flow transformer pairing a 9B image backbone with an 8B Qwen3 text encoder, designed for high‑quality generation and editing at 1024×1024.
Released in both Base and Distilled variants: Base for maximum control and quality, Distilled for 4‑step, sub‑second inference on modern GPUs.
Positioned as the “small flagship” of the FLUX.2 line, matching or beating much larger models in quality while staying efficient enough for interactive use.
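The "rectified flow" idea behind the 4‑step inference can be illustrated with a toy Euler sampler. This is a generic sketch of rectified‑flow sampling, not Black Forest Labs' code; the oracle velocity function stands in for the learned 9B transformer.

```python
import numpy as np

def euler_sample(velocity, x0, steps=4):
    """Integrate dx/dt = velocity(x, t) from t=0 (noise) to t=1 (image)
    with a fixed-step Euler scheme -- the regime a few-step distilled
    rectified-flow model is trained to perform well under."""
    x, dt = x0.copy(), 1.0 / steps
    for i in range(steps):
        t = i * dt
        x = x + velocity(x, t) * dt
    return x

# Toy demo: rectified flow learns straight noise-to-data paths, so the
# ideal velocity is simply (data - noise), constant along the path.
rng = np.random.default_rng(0)
noise = rng.standard_normal((8, 8))
data = np.ones((8, 8))
oracle = lambda x, t: data - noise  # stand-in for the learned model

out = euler_sample(oracle, noise, steps=4)
print(np.allclose(out, data))  # straight paths integrate exactly: True
```

With perfectly straight paths, even 4 Euler steps are exact; distillation pushes the real model's paths toward this regime, which is why the Distilled variant stays sharp at very low step counts.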
Key features
Unified workflow: one model for text‑to‑image, image‑to‑image, and image editing (single‑reference and multi‑reference).
4‑step distillation in the 9B Distilled variant for very low latency while keeping strong visual fidelity.
Multi‑reference editing: accepts multiple reference images (up to three) to control subject, style, or composition.
Strong prompt adherence and output diversity, with good handling of detailed instructions and photorealistic scenes.
Runs on high‑end consumer GPUs; the distilled 9B is explicitly targeted at production, real‑time, and interactive applications.
Typical use cases
Fast, high‑quality image editing:
Text‑guided edits (“make it nighttime”, “change the outfit to a blue jacket”, “remove the logo”).
Object replacement/removal and style transformation for photos, product shots, and scenes.
Multi‑reference creative work:
Combine elements from several images (e.g., “place the person from image 1 in the room from image 2 with the lighting of image 3”).
Interactive and production tools:
Real‑time or near‑real‑time preview in editors, design tools, and web apps where latency is critical.
As a default image model in pipelines that need both generation and later fine‑grained edits without swapping checkpoints.
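For the interactive and production cases above, ComfyUI exposes an HTTP API: a workflow graph (like the one on this page, exported in API format) is queued by POSTing JSON to the server's `/prompt` endpoint. A minimal sketch, assuming a local ComfyUI server on the default port 8188 and a `workflow.json` exported via "Save (API Format)":

```python
import json
import urllib.request

def build_payload(workflow: dict, client_id: str = "demo") -> dict:
    """Wrap an API-format workflow graph in the body /prompt expects."""
    return {"prompt": workflow, "client_id": client_id}

def queue_workflow(workflow: dict, host: str = "127.0.0.1:8188") -> dict:
    """POST the workflow to a running ComfyUI server; returns its reply
    (which includes a prompt_id for tracking the job)."""
    body = json.dumps(build_payload(workflow)).encode("utf-8")
    req = urllib.request.Request(
        f"http://{host}/prompt", data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

if __name__ == "__main__":
    with open("workflow.json") as f:  # exported with "Save (API Format)"
        graph = json.load(f)
    print(queue_workflow(graph))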
Read more
Nodes & Models
WorkflowGraphics
KSamplerSelect
RandomNoise
CLIPLoader
qwen_3_8b_fp8mixed.safetensors
VAELoader
flux2-vae.safetensors
UNETLoader
flux-2-klein-9b.safetensors
Fast Groups Bypasser (rgthree)
LoadImage
CLIPTextEncode
ImageScaleToTotalPixels
VAEEncode
GetImageSize
Flux2Scheduler
EmptyFlux2LatentImage
ReferenceLatent
CFGGuider
SamplerCustomAdvanced
VAEDecode
SaveImage
PreviewImage
AddLabel
ImageConcanate
AddLabel
ImageConcanate
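In ComfyUI's API format, each node in the list above becomes a JSON entry keyed by node id, with a `class_type` and `inputs` that reference upstream nodes as `[node_id, output_index]`. An illustrative fragment wiring a few of the loaders above (file names come from this workflow; the exact input field values, such as the CLIPLoader `type`, are assumptions about the stock nodes, not taken from the workflow file):

```python
import json

# Illustrative ComfyUI API-format fragment (field values are assumptions).
graph = {
    "1": {"class_type": "UNETLoader",
          "inputs": {"unet_name": "flux-2-klein-9b.safetensors",
                     "weight_dtype": "default"}},
    "2": {"class_type": "CLIPLoader",
          "inputs": {"clip_name": "qwen_3_8b_fp8mixed.safetensors",
                     "type": "flux2"}},  # assumed model-type tag
    "3": {"class_type": "VAELoader",
          "inputs": {"vae_name": "flux2-vae.safetensors"}},
    "4": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "make it nighttime",
                     "clip": ["2", 0]}},  # [source node id, output index]
}
print(json.dumps(graph, indent=2))
```

The full workflow extends this pattern: LoadImage → VAEEncode → ReferenceLatent feeds the edit references, Flux2Scheduler and CFGGuider drive SamplerCustomAdvanced, and VAEDecode → SaveImage produces the result.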