FLUX.2 [klein]
Sub-second image generation and editing. Unified text-to-image, image editing, and multi-reference generation in one compact model - runs on consumer GPUs.
Run directly in your browser on Floyo with no installation, no setup, and no API configuration required.
Free to try - No installation - Runs in browser
What is FLUX.2 [klein]?
FLUX.2 [klein] is Black Forest Labs' fastest image model family, delivering sub-second generation and editing on consumer hardware. The name "klein" comes from the German word for "small," reflecting compact model sizes of 4B and 9B parameters. Despite their compact size, these models match or exceed much larger ones in quality while generating in under half a second.
Unlike previous generation models that required separate pipelines for generation and editing, FLUX.2 [klein] unifies text-to-image, single-reference editing, and multi-reference generation in one architecture. You can generate from scratch, edit existing images, or blend multiple reference images - all with the same model.
What's new in FLUX.2 [klein]?
FLUX.2 [klein] represents a shift toward interactive visual intelligence. While the larger FLUX.2 [max] and [pro] models chase maximum photorealism, [klein] is purpose-built for speed and accessibility - targeting real-time applications and consumer hardware that previous models couldn't reach.
Generate or edit images in under 0.5 seconds on modern hardware. Step-distilled to just 4 inference steps without sacrificing quality.
One model handles text-to-image, image editing, and multi-reference generation. No need for separate pipelines or adapters like ControlNets.
The 4B model fits in approximately 13GB VRAM, running on GPUs such as the RTX 3090 and RTX 4070. FP8 and NVFP4 quantization reduces requirements further.
The 4B model is fully open under Apache 2.0 - use it commercially, modify it, redistribute it. The 9B models use the FLUX Non-Commercial License.
What can you create with FLUX.2 [klein]?
FLUX.2 [klein]'s speed makes it ideal for workflows where iteration matters - exploring ideas quickly, testing variations, and refining concepts in real-time. The unified architecture handles generation and editing without switching models.
Explore visual directions in seconds. Test dozens of variations, styles, and compositions without waiting between generations.
Take an existing image and transform its style instantly. Blend reference images to create new aesthetics while preserving structure.
Generate product shots, packaging concepts, and marketing visuals quickly. Iterate on colorways and presentations in real-time.
Use multi-reference capabilities to maintain character identity across multiple outputs. Combine reference images for consistent results.
Build real-time generation into apps, games, or tools. Sub-second latency enables responsive user experiences.
The Base variants (undistilled) are designed for custom training, LoRA development, and research pipelines where control matters more than speed.
Which FLUX.2 [klein] variant should you use?
FLUX.2 [klein] comes in four main variants: distilled versions (4B and 9B) optimized for speed, and Base versions designed for fine-tuning and maximum flexibility. The 9B distilled model is the flagship, matching larger models at a fraction of the latency.
| Variant | Parameters | VRAM | Steps | License | Best for |
|---|---|---|---|---|---|
| klein 4B | 4 billion | ~13GB | 4 | Apache 2.0 | Commercial apps, consumer GPUs |
| klein 9B | 9 billion | ~29GB | 4 | FLUX NCL | Maximum quality + speed |
| klein 4B Base | 4 billion | ~13GB | 50 | Apache 2.0 | Fine-tuning, LoRA training |
| klein 9B Base | 9 billion | ~29GB | 50 | FLUX NCL | Research, custom pipelines |
FP8 and NVFP4 quantized versions are also available for all variants, developed with NVIDIA. FP8 offers up to 1.6x faster inference with 40% lower VRAM. NVFP4 offers up to 2.7x faster with 55% lower VRAM.
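The quoted savings translate into rough footprint estimates. This is back-of-envelope arithmetic from the percentages above, not measured numbers - actual usage varies by GPU, runtime, and resolution:

```python
# Illustrative arithmetic from the quoted figures: FP8 = up to 1.6x faster
# with 40% lower VRAM; NVFP4 = up to 2.7x faster with 55% lower VRAM.
BASE_VRAM_GB = {"klein 4B": 13.0, "klein 9B": 29.0}
QUANT = {
    "FP8":   {"vram_saving": 0.40, "speedup": 1.6},
    "NVFP4": {"vram_saving": 0.55, "speedup": 2.7},
}

def quantized_vram(variant: str, mode: str) -> float:
    """Estimated VRAM after quantization, in GB (rough estimate only)."""
    return BASE_VRAM_GB[variant] * (1 - QUANT[mode]["vram_saving"])

print(f"{quantized_vram('klein 4B', 'FP8'):.2f} GB")    # ~7.80 GB
print(f"{quantized_vram('klein 9B', 'NVFP4'):.2f} GB")  # ~13.05 GB
```

By this estimate, an NVFP4 build of the 9B model would fit comfortably on 16GB consumer cards that the bf16 version cannot reach.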
What are FLUX.2 [klein]'s capabilities?
FLUX.2 [klein] supports the full range of FLUX.2 capabilities in a unified architecture: text-to-image generation, single-reference image editing, and multi-reference composition. All modes run through the same model with sub-second latency.
Generate images from text descriptions with photorealistic quality. Supports complex prompts with good text rendering and spatial understanding.
Edit existing images with text instructions. Transform styles, change elements, or adjust compositions while preserving structure.
Combine multiple input images to blend concepts, maintain character identity, or transfer styles across compositions.
Rapidly explore different aesthetics by scrubbing through style options in real-time. Test variations instantly.
FP8 and NVFP4 versions maintain capabilities while reducing VRAM by up to 55% and increasing speed by up to 2.7x on RTX GPUs.
Official ComfyUI workflow templates are available from Black Forest Labs. Drop them into existing pipelines immediately.
What are FLUX.2 [klein]'s technical specifications?
FLUX.2 [klein] is built on a rectified flow transformer architecture, distilled from larger FLUX.2 models to achieve sub-second inference in just 4 steps. The 9B model pairs a 9B flow model with an 8B Qwen3 text embedder for superior prompt understanding.
| Specification | Value |
|---|---|
| Architecture | Rectified flow transformer |
| Model sizes | 4B and 9B parameters |
| Text encoder (9B) | 8B Qwen3 |
| Inference steps (distilled) | 4 steps |
| Inference steps (base) | 50 steps |
| Default resolution | 1024 x 1024 |
| VRAM (4B) | ~13GB (RTX 3090/4070) |
| VRAM (9B) | ~29GB (RTX 4090) |
| Quantization | FP8, NVFP4 (with NVIDIA) |
| License (4B) | Apache 2.0 |
| License (9B) | FLUX Non-Commercial License |
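As a rough sanity check on the VRAM figures in the table above, weight storage alone scales with parameter count times bytes per parameter. This is a simplified estimate: the quoted totals additionally depend on the text encoder, activations, and which components are resident at what precision:

```python
def weight_gb(params_billion: float, bytes_per_param: float = 2.0) -> float:
    """Rough weight-only memory in GB (bf16/fp16 = 2 bytes per parameter)."""
    return params_billion * 1e9 * bytes_per_param / 1e9

# 4B flow model in bf16: ~8 GB of weights alone; the quoted ~13GB total
# also covers the text encoder, activations, and runtime overhead.
print(weight_gb(4))                       # 8.0
print(weight_gb(9))                       # 18.0
print(weight_gb(9, bytes_per_param=1.0))  # 9.0 (FP8 halves bytes per weight)
```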
How does FLUX.2 [klein] work?
FLUX.2 [klein] uses a rectified flow transformer architecture that maps noise to images through a learned flow process. The model is trained to generate images under text conditioning, but the same architecture supports editing by initializing from existing images instead of pure noise.
The key to [klein]'s speed is distillation - a process where a larger, more complex model "teaches" a smaller one to approximate its outputs in fewer steps. The distilled [klein] variants require only 4 steps to generate an image, compared to 50 steps for the Base variants. This turns generation from a multi-second process into a sub-second one.
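The flow process described above can be sketched as plain Euler integration of a learned velocity field from pure noise (t=1) back to an image (t=0). The toy NumPy version below substitutes an exact closed-form velocity for the real flow transformer (which predicts velocity from text conditioning); `NUM_STEPS = 4` mirrors the distilled schedule:

```python
import numpy as np

NUM_STEPS = 4  # distilled schedule; the Base variants run ~50 steps

def velocity(x, t, target):
    """Stand-in for the flow transformer. On a straight rectified-flow path
    x_t = (1 - t) * image + t * noise, the velocity pointing a sample back
    toward a known target image is (x - target) / t. The real model has to
    predict this quantity from the text prompt instead."""
    return (x - target) / t

def sample(target, steps=NUM_STEPS, seed=0):
    """Euler-integrate the flow from Gaussian noise (t=1) to an image (t=0)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(target.shape)   # start from pure noise
    ts = np.linspace(1.0, 0.0, steps + 1)
    for t_cur, t_next in zip(ts[:-1], ts[1:]):
        x = x - velocity(x, t_cur, target) * (t_cur - t_next)
    return x

image = np.full((4, 4), 0.5)                # toy "image"
print(np.allclose(sample(image), image))    # True: straight paths land on target
```

Because rectified-flow paths are straight lines, Euler steps follow them exactly in this toy setting - which is also the intuition for why a distilled model can get away with only 4 steps.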
The 9B flagship model pairs its flow transformer with an 8B Qwen3 text embedder, providing strong semantic understanding and world knowledge. This combination allows [klein] to handle complex prompts with good spatial logic, lighting, and composition - capabilities usually reserved for much larger models.
For editing and multi-reference tasks, [klein] operates on latent representations of input images. The flow process updates these latents under text guidance while preserving structural information, enabling style transfer and composition blending without losing the source content.
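Editing can be sketched as an SDEdit-style variant of flow sampling: start from a partially noised source latent instead of pure noise, then integrate only the remaining portion of the flow. This is a toy NumPy illustration with a closed-form stub velocity, not BFL's published implementation:

```python
import numpy as np

def edit(source_latent, target, strength=0.6, steps=4, seed=0):
    """Begin the flow at t = strength from a noised source latent, then
    Euler-integrate down to t = 0. In a real model, lower strength keeps
    more of the source structure because the velocity is predicted from
    the partially noised input; this exact stub always reaches `target`."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(source_latent.shape)
    x = (1 - strength) * source_latent + strength * noise  # partially noised
    ts = np.linspace(strength, 0.0, steps + 1)
    for t_cur, t_next in zip(ts[:-1], ts[1:]):
        v = (x - target) / t_cur          # stub velocity toward the edit target
        x = x - v * (t_cur - t_next)
    return x

src = np.zeros((4, 4))
tgt = np.ones((4, 4))
print(np.allclose(edit(src, tgt), tgt))   # True with the exact stub velocity
```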
Frequently Asked Questions
Is FLUX.2 [klein] free to use?
The 4B model is released under Apache 2.0, making it free for commercial and non-commercial use. The 9B models use the FLUX Non-Commercial License - free for personal and research use, but commercial use requires a separate agreement with Black Forest Labs. On Floyo, you can run FLUX.2 [klein] with 20 free minutes of generation time daily.
How do I use FLUX.2 [klein] without installing anything?
On Floyo, FLUX.2 [klein] runs directly in your browser. No local installation, no Python setup, no API keys needed. Just pick a workflow and start generating. The model and all dependencies are pre-loaded on cloud GPUs.
Who made FLUX.2 [klein]?
Black Forest Labs is a German AI research company founded by former Stability AI engineers. They created the original FLUX.1 models and the FLUX.2 family including [max], [pro], [dev], and [klein]. The quantized versions were developed in collaboration with NVIDIA.
How does FLUX.2 [klein] compare to FLUX.2 [dev]?
FLUX.2 [dev] is the full 32B parameter model designed for maximum quality and flexibility, requiring data-center GPUs. FLUX.2 [klein] is distilled for speed and accessibility - it runs on consumer GPUs in under a second. The 9B [klein] matches [dev] quality for most tasks at a fraction of the latency and hardware requirements.
Can I use FLUX.2 [klein] images commercially?
Images generated with the 4B model (Apache 2.0) can be used commercially. For the 9B model, check the FLUX Non-Commercial License terms - commercial use requires a separate license from Black Forest Labs.
What resolution does FLUX.2 [klein] support?
The default generation resolution is 1024 x 1024. The model supports various aspect ratios at this scale. Higher resolutions may require more VRAM and longer generation times.
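A common way to pick non-square sizes at this scale is to hold the total pixel count near 1024 x 1024 and snap each side to an alignment the latent grid accepts. The snippet below uses 64 as the alignment; treat that constant and the supported ratios as assumptions, not published FLUX.2 specs:

```python
import math

def dims_for_ratio(aspect: float, target_pixels: int = 1024 * 1024,
                   multiple: int = 64) -> tuple[int, int]:
    """Width/height with width/height ~= aspect and ~target_pixels total,
    each snapped to the nearest multiple of `multiple` (assumed alignment)."""
    width = math.sqrt(target_pixels * aspect)
    height = width / aspect

    def snap(v: float) -> int:
        return max(multiple, round(v / multiple) * multiple)

    return snap(width), snap(height)

print(dims_for_ratio(16 / 9))  # (1344, 768)
print(dims_for_ratio(1.0))     # (1024, 1024)
```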
What GPU do I need for FLUX.2 [klein]?
The 4B model runs on GPUs with approximately 13GB VRAM (RTX 3090, RTX 4070 and above). The 9B model requires approximately 29GB VRAM (RTX 4090 and above). FP8 and NVFP4 quantized versions reduce these requirements further. On Floyo, you don't need any local GPU - workflows run on cloud H100s.
Where can I find FLUX.2 [klein] workflows for ComfyUI?
Black Forest Labs released official ComfyUI workflow templates on their GitHub repository. On Floyo, pre-built workflows are available in the workflow library - just pick one and run it without any setup.
Start generating with FLUX.2 [klein]
Sub-second image generation in your browser. No installation, no setup, no API configuration.
Try FLUX.2 [klein] Free
Free to try - No installation - Runs in browser