Z-Image Turbo Inpainting

controlnet

inpainting

z-image-turbo

Generates in about 23 secs

floyoofficial

Nodes & Models

ComfyUI Official

UNETLoader

z_image_turbo_bf16.safetensors

Ver Private

Comm Use

CLIPLoader

qwen_3_4b.safetensors

Ver Private

Comm Use

VAELoader

ae.safetensors

Ver Private

Comm Use

LoadImage

ModelPatchLoader

Z-Image-Turbo-Fun-Controlnet-Union-2.1.safetensors

Ver Private

Comm Use

ModelSamplingAuraFlow

CLIPTextEncode

VAEEncode

ConditioningZeroOut

ZImageFunControlnet

DifferentialDiffusion

KSampler

VAEDecode

SaveImage

ComfyUI-Inference-Core-Nodes

AIO_Preprocessor

comfyui_controlnet_aux

AIO_Preprocessor

Z-Image Turbo depth-aware inpainting. Paint a mask over what you want to change, describe the edit, and run.

The workflow combines three components: Z-Image Turbo for fast, high-quality inpainting, DepthAnything V2 for structure-aware depth guidance via ControlNet, and Differential Diffusion for smooth edge blending at the mask boundary. The result is an edit that fits the existing image's perspective and lighting rather than looking pasted in.

The default prompt is "change the colour of the car to blood red," a targeted, single-instruction edit that shows the workflow's precision. The workflow runs eight steps with the res_multistep sampler, so edits complete quickly.

How do you use Z-Image Turbo Inpainting?

Paint a mask over the region you want to change, describe the edit in the prompt, and run. DepthAnything V2 reads the depth map of the full image and passes it to the ControlNet, keeping your edit structurally consistent with the original scene. Differential Diffusion handles the mask boundary for clean blending.

Input image with mask

Upload your image and paint a mask over the region to edit. The white area of the mask defines what the model can change. Everything outside stays untouched. Paint precisely around the subject for clean inpainting; looser masks produce broader edits.

For face corrections: mask the specific feature (a region of skin, eyes, mouth area). For object changes: mask the full object, including its shadow and reflection if visible. For outfit changes: mask the garment precisely, stopping at skin and background edges. For background edits: mask the background area while excluding the subject.
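The mask rule described above (white is editable, everything else is kept) can be illustrated with a toy example. This is a minimal sketch on a tiny grayscale grid, not the workflow's actual implementation: the real pipeline works on encoded latents and blends softly at the boundary. The helper names `rect_mask`, `composite`, and `coverage` are hypothetical and exist only for this illustration.

```python
def rect_mask(width, height, box):
    """Build a mask as rows of 0/255 values; 255 (white) marks the editable box."""
    left, top, right, bottom = box
    return [
        [255 if left <= x < right and top <= y < bottom else 0 for x in range(width)]
        for y in range(height)
    ]


def composite(original, generated, mask):
    """Take generated pixels where the mask is white, original pixels elsewhere."""
    return [
        [g if m == 255 else o for o, g, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, mask)
    ]


def coverage(mask):
    """Fraction of the image that the mask allows the model to change."""
    total = sum(len(row) for row in mask)
    white = sum(1 for row in mask for m in row if m == 255)
    return white / total


# Toy 4x3 grayscale image: the original is all 10s, the "edit" is all 99s.
original = [[10] * 4 for _ in range(3)]
generated = [[99] * 4 for _ in range(3)]
mask = rect_mask(4, 3, box=(1, 0, 3, 2))  # white over columns 1-2, rows 0-1

result = composite(original, generated, mask)
for row in result:
    print(row)
```

Only the four pixels under the white box take the new value; the rest of the grid is unchanged. Checking `coverage(mask)` before a run is one quick way to notice a mask that is looser than intended.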

Prompt

Describe the specific change you want inside the masked region. The default "change the colour of the car to blood red" is a targeted, single-instruction edit. Keep prompts focused on what changes inside the mask; don't redescribe the surrounding image.

For color changes: "change the jacket to navy blue," "make the shirt white." For texture edits: "add realistic stubble to the face," "change the fabric to leather." For object replacement: "replace the cup with a wine glass on the table." For removal: "remove the person and fill with matching background."

DepthAnything V2 (depth preprocessor)

Before conditioning the ControlNet, the workflow runs DepthAnything V2 at 512px resolution to extract a depth map from the full image. This depth map tells the ControlNet the spatial structure of the scene: which elements are closer, which are further back. The ControlNet (Z-Image-Turbo-Fun-Controlnet-Union-2.1) uses this depth information at strength 0.7 to constrain the inpainted region to match the original perspective and depth relationships.

This is what separates depth-aware inpainting from basic mask inpainting. A standard inpaint can generate content that ignores the scene's perspective: faces that don't match the angle, objects that float at the wrong depth. The depth guidance keeps edits grounded in the original structure.

Differential Diffusion

Differential Diffusion is active in this workflow. It applies a graduated diffusion process at the mask boundary, smoothing the transition between the edited region and the surrounding image. Hard mask edges produce visible seams; Differential Diffusion reduces this by treating the boundary area with a lighter touch.
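One way to picture the "lighter touch" at the boundary is per-pixel thresholding: each pixel has a change strength between 0 and 1, and a threshold that falls over the run decides when that pixel is released for regeneration. The sketch below is a simplified illustration of that idea, assuming a linear threshold schedule and a one-dimensional strip of pixels; it is not the DifferentialDiffusion node's code, and `free_pixels_per_step` is a hypothetical name.

```python
def free_pixels_per_step(strengths, num_steps):
    """For each step, mark which pixels are free to change (1) or held (0).

    strengths: per-pixel change strength in [0, 1]; 1.0 is fully editable,
    0.0 is locked. The threshold falls linearly from 1.0 toward 0.0
    (an assumed schedule), so strong pixels are released at the first step
    and weak boundary pixels only near the end.
    """
    schedule = []
    for step in range(num_steps):
        threshold = 1.0 - step / num_steps
        schedule.append([1 if s >= threshold else 0 for s in strengths])
    return schedule


# A strip crossing the mask edge: locked, soft boundary, soft boundary, fully masked.
strengths = [0.0, 0.25, 0.5, 1.0]
schedule = free_pixels_per_step(strengths, num_steps=4)

# Count how many steps each pixel spends being regenerated.
steps_free = [sum(step[i] for step in schedule) for i in range(len(strengths))]
print(steps_free)
```

In this toy schedule the fully masked pixel is regenerated on every step, the boundary pixels only on the later, low-noise steps, and the locked pixel never. That graded exposure is why the seam softens without manual feathering.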

ControlNet strength (default: 0.7)

The depth ControlNet runs at 0.7 strength. This is the balance between structural constraint and edit freedom. Higher strength (toward 1.0) locks the edit more tightly to the original depth structure. Lower strength (toward 0.5) gives the model more freedom, which can be useful for edits that need to deviate from the original depth.

Steps (default: 8)

The workflow runs 8 steps with the res_multistep sampler. Z-Image Turbo is optimized for low step counts, and 8 steps produce production-quality output for most inpainting tasks. Reduce to 4-6 for faster preview runs.

Flow shift (default: 3)

The flow shift value of 3 is set for Z-Image Turbo's architecture. Leave it at the default.
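The defaults described above can be collected in one place for reference. This is a minimal sketch, assuming you want to track the settings in your own scripts; `InpaintSettings` and `preview` are hypothetical names, not part of Floyo's or ComfyUI's API, and only the values come from this page.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class InpaintSettings:
    """Workflow defaults as stated on this page (field names are illustrative)."""
    steps: int = 8                    # optimized for low step counts
    sampler: str = "res_multistep"
    controlnet_strength: float = 0.7  # raise toward 1.0 to lock depth, lower toward 0.5 for freedom
    flow_shift: float = 3.0           # leave at default
    depth_resolution: int = 512       # DepthAnything V2 preprocessor resolution

    def __post_init__(self):
        if self.steps < 1:
            raise ValueError("steps must be at least 1")


def preview(settings, steps=4):
    """Return a copy with a reduced step count (4-6) for faster preview runs."""
    return replace(settings, steps=steps)


defaults = InpaintSettings()
fast = preview(defaults)
print(defaults)
print(fast)
```

Keeping the object frozen and deriving a preview copy means a quick test run never overwrites the settings used for the final generation.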

What is Z-Image Turbo Inpainting good for?

Z-Image Turbo depth-aware inpainting is strongest for edits where the replaced or modified content must match the original scene's perspective and depth. Face corrections, outfit swaps, color changes, and object replacement all benefit from depth guidance keeping the edit grounded in the original image's spatial structure.

Face corrections and enhancements. Mask a specific facial feature and describe the change. The depth guidance keeps the edited feature at the correct depth relative to the rest of the face, avoiding the pasted-on look of basic inpainting. Natural skin texture, corrected features, and makeup changes all work well.

Outfit and accessory changes. Mask a garment and describe the replacement. The workflow handles fabric texture, lighting matches, and correct depth placement of the new item relative to the body.

Object replacement and removal. Replace an object with a different one or remove it entirely. "Remove the bag from the table and fill with matching surface." Depth guidance ensures the replacement sits at the correct spatial position in the scene.

Background edits. Change specific background elements while preserving the foreground subject at its correct depth. The ControlNet prevents the foreground from shifting as the background changes.

Color and material changes. The default prompt demonstrates this directly: change a car's color. For any color or material change where the structure must stay identical, depth-guided inpainting ensures only the surface appearance changes.

Honest notes: the depth preprocessor runs at 512px. For fine structural detail in large images, the depth map may not capture all fine-grained spatial relationships. For edits where depth accuracy is critical at high resolution, check the depth map preview before running the full generation. Differential Diffusion improves blending but may soften edge detail slightly at the boundary. For hard-edge subjects (text, sharp geometric forms), inspect the mask boundary after generation.
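To get a feel for how much detail a 512px depth map can hold, you can estimate how many source pixels each depth pixel covers. The sketch below assumes the preprocessor scales the image so its shorter side equals the resolution setting; that scaling rule is an assumption for illustration, so treat the numbers as rough guidance. The function names are hypothetical.

```python
def depth_map_size(width, height, resolution=512):
    """Estimated depth map size, ASSUMING the shorter side is scaled to `resolution`."""
    scale = resolution / min(width, height)
    return round(width * scale), round(height * scale)


def source_pixels_per_depth_pixel(width, height, resolution=512):
    """Linear downscale factor: how many source pixels one depth pixel spans per axis."""
    return min(width, height) / resolution


size = depth_map_size(2048, 1536)
factor = source_pixels_per_depth_pixel(2048, 1536)
print(size, factor)
```

For a 2048x1536 input under this assumption, each depth pixel spans roughly a 3x3 block of source pixels, so structures only a few pixels wide may not register in the depth map. That is the case where checking the depth map preview first is worthwhile.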

How does Z-Image Turbo Inpainting compare to standard mask inpainting?

Standard mask inpainting replaces the masked region guided only by the prompt and surrounding pixel context. Z-Image Turbo Inpainting adds depth guidance: DepthAnything V2 reads the scene's spatial structure and passes it to the ControlNet, keeping the edited region consistent with the original perspective. Edits look integrated, not inserted.

Standard inpainting works well for simple, flat edits where depth matching is not critical. For portrait retouching where the scene is relatively flat, a standard inpaint often suffices. For any edit where the replaced content exists in 3D space (objects at depth, architectural elements, full-body portrait changes), depth-guided inpainting produces more convincing results.

Differential Diffusion is the other addition. Standard inpainting produces hard edges at the mask boundary that need manual feathering. Differential Diffusion handles this automatically, producing smoother transitions without additional post-processing.

FAQ

What does DepthAnything V2 do in this inpainting workflow?
It extracts a depth map from the full image before inpainting. This depth map is passed to the Z-Image Turbo ControlNet, which uses it to constrain the inpainted content to match the original scene's spatial structure. Objects placed by the inpaint sit at the correct depth; faces match the original camera angle.

What does Differential Diffusion do in this workflow?
It smooths the transition between the masked (edited) region and the surrounding image. Standard mask inpainting can produce visible seams at the boundary. Differential Diffusion applies a graduated diffusion process at the edge, reducing seam visibility without manual feathering.

How many steps does Z-Image Turbo Inpainting need?
The default is 8 steps with the res_multistep sampler. Z-Image Turbo is optimized for low step counts, so 8 steps is sufficient for production-quality inpainting. Use 4-6 for fast previews.

What ControlNet strength should I use for Z-Image Turbo Inpainting?
Default is 0.7. Increase toward 1.0 for edits where matching the original depth structure is the priority. Decrease toward 0.5 if the edit needs to deviate significantly from the original depth.

How do I run Z-Image Turbo Inpainting online?
You can run this workflow online through Floyo. No installation, no setup. Open the workflow in your browser, upload your image with mask, and hit run. Free to try.