Z-Image Turbo ControlNet Union 2.1: Complete Multi-Control Workflow Guide

June 6, 2026

Abstract: Z-Image Turbo ControlNet Union 2.1 is a unified ControlNet model open-sourced by Alibaba's PAI team. It supports depth, edge, normal, semantic segmentation, grayscale, and several other control conditions. This article moves from basic concepts to advanced multi-control workflows and shows, step by step, how to combine multiple control conditions in Z-Image.


Introduction

ControlNet is a core technique for precise spatial control in AI image generation. In early 2026, Alibaba's PAI team open-sourced Z-Image-Turbo-Fun-ControlNet-Union 2.1, a unified ControlNet model trained specifically for Z-Image Turbo.

Unlike traditional approaches requiring separate ControlNet models for each control condition, ControlNet Union consolidates all control conditions into a single model file, significantly simplifying workflows and improving flexibility.


1. What is ControlNet Union?

1.1 Basic Concepts

ControlNet Union is a unified ControlNet model supporting these control conditions:

Control Type | Description | Use Cases
Depth | Spatial structure from depth maps | Architecture, scene composition
Canny | Contour control via Canny edge detection | Line art coloring, sketch refinement
Normal | Surface orientation from normal maps | 3D rendering, material replacement
Seg | Region control via semantic segmentation | Scene editing, object replacement
Gray | Light/dark control from grayscale | B&W to color, tone control
Pose | Human pose control | Portrait generation, action control
Tile | Local control via tile maps | Image expansion, detail enhancement

1.2 Technical Background


2. ComfyUI Environment Setup

2.1 Install ComfyUI and Extensions

# Install ComfyUI
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
pip install -r requirements.txt

# Install ComfyUI Manager
cd custom_nodes
git clone https://github.com/ltdrdata/ComfyUI-Manager.git

2.2 Download Model Files

Z-Image Turbo Main Model (place in models/unet/):

z-image-turbo-fp8-e4m3fn.safetensors

ControlNet Union 2.1 (place in models/controlnet/):

Z-Image-Turbo-Fun-Controlnet-Union.safetensors

CLIP Text Encoder (place in models/clip/):

qwen3-4b-fp8-scaled.safetensors
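
Before launching ComfyUI, it can help to confirm that the three files sit in the folders listed above. The following is a minimal sketch using only the Python standard library. The helper name `missing_model_files` is hypothetical, and the ComfyUI root path is whatever directory you cloned into.

```python
from pathlib import Path

# Folder -> file name, transcribed from the list above.
EXPECTED_FILES = {
    "models/unet": "z-image-turbo-fp8-e4m3fn.safetensors",
    "models/controlnet": "Z-Image-Turbo-Fun-Controlnet-Union.safetensors",
    "models/clip": "qwen3-4b-fp8-scaled.safetensors",
}


def missing_model_files(comfyui_root):
    """Return the expected model paths (relative to the root) that are absent."""
    root = Path(comfyui_root)
    missing = []
    for folder, filename in EXPECTED_FILES.items():
        candidate = root / folder / filename
        if not candidate.is_file():
            missing.append(candidate.relative_to(root).as_posix())
    return missing
```

Calling `missing_model_files("ComfyUI")` returns an empty list when all three files are in place, and otherwise lists the relative paths that still need to be downloaded.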

2.3 Minimum Hardware Requirements

Configuration | VRAM | Notes
Entry Level | 8GB | FP8 quantization + basic resolution
Recommended | 12GB | FP8 + high resolution + multi-ControlNet
Ideal | 16GB+ | BF16 precision + high resolution
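
The table can be read as a simple lookup from available VRAM to a configuration tier. The sketch below encodes exactly the three rows above. The function name `pick_tier` is hypothetical, and anything below 8GB returns `None` because the table does not cover it.

```python
# Tiers transcribed from the table above: (minimum VRAM in GB, name, notes).
TIERS = [
    (16, "Ideal", "BF16 precision + high resolution"),
    (12, "Recommended", "FP8 + high resolution + multi-ControlNet"),
    (8, "Entry Level", "FP8 quantization + basic resolution"),
]


def pick_tier(vram_gb):
    """Return (name, notes) for the highest tier the given VRAM satisfies."""
    for minimum, name, notes in TIERS:
        if vram_gb >= minimum:
            return name, notes
    return None  # below the minimum listed in the table
```

For example, `pick_tier(12)` returns the "Recommended" row, while `pick_tier(6)` returns `None`.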

3. Basic Workflows

3.1 Single Control Condition Workflow

The most basic ControlNet workflow includes these nodes:

Load Checkpoint (Z-Image-Turbo)
  → ControlNet Loader (Union 2.1)
  → ControlNet Preprocessor (select control type)
  → Apply ControlNet
  → KSampler (8 steps)
  → Save Image

Canny Edge Detection Example:

  • Control type: canny
  • ControlNet strength: 0.8-1.0
  • Use case: Line art coloring, sketch refinement

3.2 Depth Map Control Workflow

Depth control is one of the most commonly used scenarios:

  1. Input: Depth map (can be auto-generated from RGB)
  2. Control type: depth
  3. Application: Change style while maintaining spatial structure

# Key ComfyUI workflow parameters
{
    "control_net_name": "Z-Image-Turbo-Fun-Controlnet-Union.safetensors",
    "control_net_type": "depth",
    "strength": 0.9,
    "start_percent": 0.0,
    "end_percent": 1.0,
    "num_inference_steps": 8,
    "guidance_scale": 7.5
}
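
A parameter block like the one above is easy to mistype, so a small sanity check can catch problems before a run. This is a sketch only: the function name `validate_control_params` is hypothetical, and the lowercase type strings other than `canny`, `depth`, and `gray` are assumed from the control types listed in section 1.1.

```python
# Lowercase names assumed from the control-type table in section 1.1.
KNOWN_CONTROL_TYPES = {"depth", "canny", "normal", "seg", "gray", "pose", "tile"}


def validate_control_params(params):
    """Return a list of problems found in a parameter dict shaped like the one above."""
    problems = []

    control_type = params.get("control_net_type")
    if control_type not in KNOWN_CONTROL_TYPES:
        problems.append(f"unknown control type: {control_type!r}")

    if params.get("strength", 0.0) <= 0.0:
        problems.append("strength must be positive")

    start = params.get("start_percent", 0.0)
    end = params.get("end_percent", 1.0)
    if not (0.0 <= start < end <= 1.0):
        problems.append("need 0.0 <= start_percent < end_percent <= 1.0")

    return problems
```

An empty list means the block passed these basic checks; it does not guarantee that the values produce a good image.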

4. Multi-Control Combination Workflows (Core Feature)

The main advantage of ControlNet Union 2.1 is that it can apply several control conditions at once. That capability is the focus of this article.

4.1 Depth + Canny Combination

Application: Precisely control spatial structure while maintaining contour details.

ControlNet 1: Depth (strength 0.8)
ControlNet 2: Canny (strength 0.6)
  ↓ Combined application
KSampler (8 steps)
  ↓
Output: Structurally precise + detail-rich images

Tips:

  • Depth controls spatial structure (apply it first, with the higher strength)
  • Canny controls edge details (apply it second, with the lower strength)
  • Keep the combined strength of the two ControlNets between 1.2 and 1.4

4.2 Semantic Segmentation + Depth Combination

Application: Scene editing, object replacement, style transfer.

ControlNet 1: Seg (semantic segmentation, strength 0.7)
  → Controls what content is in which region
ControlNet 2: Depth (strength 0.7)
  → Controls spatial structure
  ↓ Combined application
Prompt: "Add a running hound dog on the grass"
  ↓
Output: Semantically correct scene editing results

4.3 Pose + Canny Combination

Application: Character generation, outfit changes, action control.

ControlNet 1: Pose (strength 0.9)
  → Controls character pose
ControlNet 2: Canny (strength 0.5)
  → Controls outfit/appearance contours
  ↓ Combined application
Prompt: "Business person in red suit, standing in office"
  ↓
Output: Accurate pose + outfit matching description

5. Advanced Techniques

5.1 Multi-Resolution Strategy

# Two-stage high-resolution strategy
Stage 1: 512×512 + ControlNet (structure control)
  → Use Hires.Fix / Latent Upscale
Stage 2: 1024×1024 (no ControlNet, detail enhancement)
  → Use Z-Image Turbo native 8-step inference
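
The arithmetic behind the two stages can be sketched as follows. The helper `two_stage_plan` is hypothetical, and snapping the stage 1 size down to a multiple of 16 is an assumption made here for illustration, not a documented requirement of Z-Image Turbo.

```python
def two_stage_plan(target_width, target_height, scale=2, multiple=16):
    """Return ((stage1_w, stage1_h), (stage2_w, stage2_h)) for a two-stage run.

    Stage 1 is the target divided by `scale`, rounded down to a multiple of
    `multiple` (an assumed alignment). Stage 2 is stage 1 times `scale`, so
    the upscale factor stays exact.
    """
    stage1_w = (target_width // scale) // multiple * multiple
    stage1_h = (target_height // scale) // multiple * multiple
    return (stage1_w, stage1_h), (stage1_w * scale, stage1_h * scale)
```

With the default arguments, `two_stage_plan(1024, 1024)` gives `((512, 512), (1024, 1024))`, matching the sizes above. A target that is not evenly divisible, such as 1000×1000, is reduced to `((496, 496), (992, 992))`.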

5.2 Grayscale to Color Workflow

ControlNet Union's Gray mode excels at coloring black-and-white photos:

  1. Input black-and-white photo
  2. Control type: gray
  3. Prompt describes target color style
  4. Output: Colorized image

Tips:

  • Gray mode strength: 0.7-0.8
  • Combine with depth control to maintain structure
  • Suitable for old photo restoration, sketch coloring

5.3 ControlNet Strength Tuning Tips

Scenario | Recommended Strength | Notes
Structure (Depth) | 0.8-1.0 | High strength for spatial structure
Edge (Canny) | 0.5-0.8 | Medium strength for detail flexibility
Semantic (Seg) | 0.6-0.9 | Adjust based on edit scope
Pose | 0.8-1.0 | High strength for pose accuracy
Grayscale (Gray) | 0.6-0.8 | Balance light/dark and color
Multi-control combo | 0.4-0.7 each | Total ≤ 1.5
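
The ranges in this table can be turned into a quick check. The sketch below is illustrative only and the name `check_strengths` is hypothetical. For a single control it applies the per-type range. For combinations it enforces only the total cap of 1.5, because the combination examples in section 4 give the primary control up to 0.9, so the "0.4-0.7 each" figure is treated here as a starting point rather than a hard limit.

```python
# Single-control ranges transcribed from the table above.
RECOMMENDED_RANGES = {
    "depth": (0.8, 1.0),
    "canny": (0.5, 0.8),
    "seg": (0.6, 0.9),
    "pose": (0.8, 1.0),
    "gray": (0.6, 0.8),
}
MAX_TOTAL_STRENGTH = 1.5
EPSILON = 1e-9  # tolerance for floating-point sums


def check_strengths(strengths):
    """Return warnings for a {control_type: strength} mapping."""
    warnings = []
    if len(strengths) == 1:
        (control_type, value), = strengths.items()
        bounds = RECOMMENDED_RANGES.get(control_type)
        if bounds and not (bounds[0] - EPSILON <= value <= bounds[1] + EPSILON):
            warnings.append(
                f"{control_type} strength {value} is outside {bounds[0]}-{bounds[1]}"
            )
    else:
        total = sum(strengths.values())
        if total > MAX_TOTAL_STRENGTH + EPSILON:
            warnings.append(
                f"total strength {total:.2f} exceeds {MAX_TOTAL_STRENGTH}"
            )
    return warnings
```

The Depth + Canny pairing from section 4.1, `check_strengths({"depth": 0.8, "canny": 0.6})`, returns no warnings, while three controls at 0.9, 0.5, and 0.6 exceed the cap and produce one.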

6. Common Application Scenarios

6.1 Architectural Design

ControlNet combo: Depth + Normal
Prompt: "Modern glass facade building, sunset light, reflecting sky"
Effect: Maintain building structure, change materials and lighting

6.2 E-commerce Product Photography

ControlNet combo: Canny + Tile
Prompt: "Professional product photography, marble surface, soft studio lighting, white background"
Effect: Generate professional product images from product sketches

6.3 Character Design

ControlNet combo: Pose + Canny + Seg
Prompt: "Cyberpunk-styled character, neon-lit city background"
Effect: Precisely control character pose, outfit contours, and scene elements

6.4 Image Expansion (Uncrop)

ControlNet combo: Tile + Depth
Prompt: "Expand the frame, show a wider city skyline"
Effect: Intelligently expand image boundaries, maintaining style consistency

7. Troubleshooting

7.1 Insufficient VRAM

# Solution 1: Use FP8 quantized models
# Solution 2: Reduce resolution
# Solution 3: Enable --lowvram mode
python main.py --lowvram

7.2 Weak Control Effect

  • Check if ControlNet strength is too low
  • Confirm correct control type selection
  • Verify preprocessing map quality (depth, edges, etc.)
  • Make sure start_percent is 0.0 and end_percent is close to 1.0; raising start_percent delays control past the early steps and weakens it

7.3 Over-Control Causing Stiff Images

  • Reduce ControlNet strength
  • Set end_percent < 1.0 (so the model is unconstrained in the final steps)
  • Reduce the number of simultaneously used ControlNets
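
To see what start_percent and end_percent mean for an 8-step run, the sketch below maps the two values onto step indices. This is a simplified linear approximation for intuition only: ComfyUI resolves these percentages against the sampler's noise schedule rather than against step counts, so the real boundaries can differ. The function name `controlled_steps` is hypothetical.

```python
def controlled_steps(total_steps, start_percent, end_percent):
    """Return the 0-based step indices where control is active.

    A step counts as controlled when its progress (index / total_steps)
    lies in the half-open interval [start_percent, end_percent).
    """
    return [
        index
        for index in range(total_steps)
        if start_percent <= index / total_steps < end_percent
    ]
```

Under this approximation, `controlled_steps(8, 0.0, 0.8)` yields steps 0 through 6 and leaves only the last step free, while `controlled_steps(8, 0.2, 1.0)` skips steps 0 and 1, which is why a higher start_percent weakens structural control.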

8. Complete Workflow JSON Template

Key node configuration for ComfyUI multi-control combination workflow:

{
  "controlnet_depth": {
    "model": "Z-Image-Turbo-Fun-Controlnet-Union.safetensors",
    "type": "depth",
    "strength": 0.8,
    "start_percent": 0.0,
    "end_percent": 0.9
  },
  "controlnet_canny": {
    "model": "Z-Image-Turbo-Fun-Controlnet-Union.safetensors",
    "type": "canny",
    "strength": 0.6,
    "start_percent": 0.0,
    "end_percent": 0.8
  },
  "sampler": {
    "steps": 8,
    "cfg": 7.5,
    "scheduler": "euler",
    "model": "z-image-turbo-fp8-e4m3fn.safetensors"
  }
}
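
The template can be inspected programmatically before it is wired into a workflow. The sketch below treats it as plain configuration data, not as a file ComfyUI can load directly. The embedded JSON is an abridged copy of the template above, and the helper name `summarize_controls` is hypothetical.

```python
import json

# Abridged copy of the template above (model names omitted for brevity).
TEMPLATE = """
{
  "controlnet_depth": {"type": "depth", "strength": 0.8,
                       "start_percent": 0.0, "end_percent": 0.9},
  "controlnet_canny": {"type": "canny", "strength": 0.6,
                       "start_percent": 0.0, "end_percent": 0.8},
  "sampler": {"steps": 8}
}
"""


def summarize_controls(template_text):
    """Return ([(type, strength, start, end), ...], total_strength)."""
    config = json.loads(template_text)
    controls = [
        (entry["type"], entry["strength"], entry["start_percent"], entry["end_percent"])
        for key, entry in config.items()
        if key.startswith("controlnet_")  # skip the sampler section
    ]
    total = round(sum(strength for _, strength, _, _ in controls), 3)
    return controls, total
```

For the template above, the combined strength comes out at 1.4, which sits within the 1.2-1.4 range suggested for the Depth + Canny combination in section 4.1.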

9. Summary

Z-Image Turbo ControlNet Union 2.1 is one of the most powerful control tools in the Z-Image ecosystem. Its unified architecture, multi-control combination capabilities, and open-source nature make it the preferred choice for professional image generation workflows.

Key Takeaways:

  1. Unified Model: One model supports all control types
  2. Multi-Control Combination: Use multiple ControlNets simultaneously for precise control
  3. Strength Tuning: Properly distribute strength across ControlNets
  4. Scenario Adaptation: Select different control combinations for different scenarios
  5. FP8 Deployment: Runs even in low VRAM environments

Mastering ControlNet Union 2.1 means mastering Z-Image Turbo's core competitive advantage.


Related Articles:

  • ZI-072: Z-Image ComfyUI Power Nodes Advanced Workflow
  • ZI-058: Z-Image ControlNet Union 2.1 Multi-Control Practical Guide
  • ZI-080: Z-Image Batch Generation and Performance Tuning

Z-Image Team
