Z-Image ControlNet Advanced Application: Complete Multi-Control Workflow Guide

Summary: ControlNet is one of the most powerful image control tools in the Z-Image ecosystem. This article dives deep into the multi-control capabilities of the Z-Image Turbo ControlNet Union model, covering five control conditions — Canny edge detection, pose estimation, depth maps, normal maps, and segmentation — with complete ComfyUI workflow configurations and practical case studies.

ControlNet Union Model Overview
Five Control Conditions Explained
ComfyUI Workflow Setup
Multi-Control Practical Cases
Control Strength Tuning Guide
Troubleshooting

1. ControlNet Union Model Overview

The ControlNet Union model released by the Z-Image team is a multi-condition model: a single model file is trained to accept five types of control input (Canny, Pose, Depth, Normal, Segmentation) instead of shipping one ControlNet per condition. This means:

One model, multiple controls: No need to load multiple ControlNet models, reducing VRAM usage
Combined control: Use multiple control conditions simultaneously for precise image generation
Performance optimized: The Union model is specifically trained for stable performance under combined control conditions

Getting the Model

Model name: z-image-turbo-controlnet-union
File size: ~2.5 GB
Supported format: .safetensors

Comparison with Standard ControlNet

Feature	Standard ControlNet	Union ControlNet
Conditions per model	1	5 (Canny/Pose/Depth/Normal/Seg)
VRAM usage	~2GB per condition	~2.5GB total
Combined control	Requires parallel models	Natively supported
Training consistency	Independently trained	Unified training, more coherent

2. Five Control Conditions Explained

2.1 Canny Edge Detection

Canny detection extracts edge information from images — the most commonly used ControlNet input. Best for:

Line art coloring: Transform hand-drawn sketches into high-quality renders
Style transfer: Maintain compositional structure while changing style
Sketch refinement: Convert rough sketches into detailed images

Parameter suggestions:

Low threshold: 50-100
High threshold: 100-200
Lower thresholds capture more detail; higher thresholds keep only strong edges
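
To illustrate how the two thresholds interact, here is a minimal pure-Python sketch of the hysteresis step that Canny applies to gradient magnitudes. It is a simplified illustration rather than the OpenCV implementation; the function name `hysteresis` and the toy grid are made up for this example.

```python
from collections import deque

def hysteresis(mag, low, high):
    """Return a 0/1 edge mask from a 2D grid of gradient magnitudes."""
    rows, cols = len(mag), len(mag[0])
    keep = [[0] * cols for _ in range(rows)]
    queue = deque()
    # Seed with strong pixels: magnitude at or above the high threshold.
    for r in range(rows):
        for c in range(cols):
            if mag[r][c] >= high:
                keep[r][c] = 1
                queue.append((r, c))
    # Grow into weak pixels (magnitude >= low) that touch a kept pixel.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                if inside and not keep[nr][nc] and mag[nr][nc] >= low:
                    keep[nr][nc] = 1
                    queue.append((nr, nc))
    return keep

# Toy gradient magnitudes: one strong edge (150) with weak neighbours (60),
# plus an isolated weak response (70) in the bottom row.
grid = [
    [10, 60, 150, 60, 10],
    [10, 10, 10, 10, 10],
    [10, 70, 10, 10, 10],
]

for row in hysteresis(grid, low=50, high=100):
    print(row)
```

With low=50 and high=100, the weak pixels next to the strong edge survive, while the isolated 70 is dropped because it touches no strong edge. Raising the thresholds to 100/200 removes every edge in this grid, which is the "only strong edges" behavior described above.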

2.2 OpenPose Pose Estimation

OpenPose extracts skeletal pose information (keypoints), ideal for:

Character pose control: Precisely control character stance, sitting, or action poses
Character consistency: Maintain character poses across different scenes
Batch generation: Generate multiple variations of the same pose

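Output format: 2D coordinates plus a confidence value for each keypoint. OpenPose's COCO body model outputs 18 keypoints; COCO-style estimators without a neck point output 17.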
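
As a rough sketch of what that output looks like in code, the snippet below models a keypoint as (x, y, confidence) and drops low-confidence points before they are drawn into a pose image. The `Keypoint` class, the keypoint names, and the 0.3 cutoff are assumptions chosen for illustration, not part of any specific pose library.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Keypoint:
    name: str
    x: float           # pixel column
    y: float           # pixel row
    confidence: float  # detector score in [0, 1]

def reliable_keypoints(keypoints, min_confidence=0.3):
    """Keep only keypoints whose confidence meets the cutoff."""
    return [kp for kp in keypoints if kp.confidence >= min_confidence]

# Hypothetical detections for three joints.
pose = [
    Keypoint("nose", 256.0, 90.0, 0.95),
    Keypoint("left_wrist", 120.0, 310.0, 0.12),   # low score, e.g. occluded
    Keypoint("right_wrist", 390.0, 305.0, 0.81),
]

print([kp.name for kp in reliable_keypoints(pose)])
```

Filtering like this can help when a reference photo hides a limb: a missing joint usually constrains the generation less harmfully than a wrongly placed one.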

2.3 Depth Maps

Depth maps encode spatial depth information, best for:

Scene composition control: Maintain original scene foreground-background relationships
Photo style transfer: Change style while preserving spatial structure
3D consistency: Ensure generated images follow 3D spatial logic

Recommended depth estimation models:

MiDaS v3: Fast, high quality — recommended first choice
ZoeDepth: Higher precision for professional scenarios
Depth Anything V2: Open-source, excellent results
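
Whichever estimator is used, its raw output usually has to be rescaled into an 8-bit grayscale image before it can serve as a control input. The following is a minimal sketch of that min-max normalization in plain Python; the function name and the `invert` flag are assumptions for this example, and whether larger raw values mean "nearer" or "farther" depends on the estimator.

```python
def depth_to_control_image(depth, invert=False):
    """Min-max normalize a 2D depth grid to integers in 0..255."""
    lo = min(min(row) for row in depth)
    hi = max(max(row) for row in depth)
    span = hi - lo
    out = []
    for row in depth:
        if span == 0:
            # A constant depth map carries no structure; emit black.
            out.append([0 for _ in row])
            continue
        scaled = [int((v - lo) / span * 255 + 0.5) for v in row]
        out.append([255 - s for s in scaled] if invert else scaled)
    return out

raw = [
    [0.5, 1.0],
    [1.5, 2.5],
]
print(depth_to_control_image(raw))
print(depth_to_control_image(raw, invert=True))
```

The `invert` flag is there because a control model trained on "near is bright" maps will misread a "near is dark" map, so it is worth checking the convention of the estimator's output.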

2.4 Normal Maps

Normal maps encode surface orientation information, best for:

Material replacement: Maintain object geometry while changing surface materials
Lighting control: Control lighting effects through normal information
3D style rendering: Give 2D images a 3D quality
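
For readers unfamiliar with how orientation ends up in an image, the sketch below shows the common encoding in which each component of a unit normal in [-1, 1] is mapped to a color channel in [0, 255]. The function name is made up for this example, and axis conventions (for instance the direction of the green channel) vary between tools, so treat this as an illustration under that assumption.

```python
import math

def encode_normal(nx, ny, nz):
    """Map a surface normal to an (R, G, B) triple in 0..255."""
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    if length == 0:
        raise ValueError("normal vector must be non-zero")
    # Normalize to unit length, then shift [-1, 1] to [0, 255].
    return tuple(
        int((c / length + 1.0) / 2.0 * 255 + 0.5) for c in (nx, ny, nz)
    )

print(encode_normal(0, 0, 1))   # surface facing the camera
print(encode_normal(1, 0, 0))   # surface facing right
```

A surface facing the camera encodes to a light blue-violet, which is why flat regions of a normal map typically have that tint.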

2.5 Segmentation (Semantic)

Semantic segmentation maps divide images into semantic regions (person, sky, building), best for:

Regional style control: Apply different styles to different regions
Background replacement: Precisely separate foreground and background
Content editing: Perform local edits within specific regions
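
The background replacement use case boils down to turning a label map into a binary mask. Here is a minimal sketch, assuming the segmentation output is a 2D grid of integer class ids; the ids used here (0 = sky, 1 = building, 2 = person) are hypothetical and do not correspond to any particular dataset palette.

```python
def foreground_mask(label_map, foreground_ids):
    """Return a 0/1 mask marking pixels whose class id is foreground."""
    fg = set(foreground_ids)
    return [[1 if label in fg else 0 for label in row] for row in label_map]

def coverage(mask):
    """Fraction of pixels set in a 0/1 mask."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total

# Hypothetical class ids: 0 = sky, 1 = building, 2 = person.
labels = [
    [0, 0, 0, 0],
    [1, 2, 2, 1],
    [1, 2, 2, 1],
]

mask = foreground_mask(labels, foreground_ids=[2])
for row in mask:
    print(row)
print(f"foreground coverage: {coverage(mask):.2f}")
```

The same mask, inverted, selects the background region for replacement.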

3. ComfyUI Workflow Setup

3.1 Environment Preparation

# Update ComfyUI to latest version (ControlNet Union requires latest node support)
cd ComfyUI && git pull

# Install necessary dependencies
pip install controlnet_aux opencv-python

# Download ControlNet Union model
# Place in ComfyUI/models/controlnet/ directory

3.2 Core Workflow Nodes

The essential node chain for Z-Image Turbo ControlNet:

[Load Checkpoint] → Z-Image Turbo base model
[Load ControlNet Model] → ControlNet Union model
[Load Image] → Reference image input
[ControlNet Preprocessor] → Select Canny/Pose/Depth/Normal/Seg
[Model Patch for ControlNet] → Apply ControlNet to model
[KSampler] → Sampling generation (steps=8, cfg=1.0)
[VAE Decode] → Decode output image
[Save Image] → Save result

3.3 Key Parameters

Parameter	Recommended Value	Description
ControlNet strength	0.6-0.9	Impact intensity of control condition
start_at	0.0	Step ratio where ControlNet starts applying
end_at	1.0	Step ratio where ControlNet stops applying
steps	8	Turbo mode recommended steps
cfg	1.0	Turbo is a distilled model and needs no CFG; in ComfyUI's KSampler, cfg=1.0 turns guidance off
sampler_name	euler	Recommended sampler

3.4 Multi-Control Node Configuration

To use multiple control conditions simultaneously, add multiple Model Patch for ControlNet nodes, applying ControlNets sequentially:

[Model] → [Model Patch #1: Canny] → [Model Patch #2: Pose] → [KSampler]

Each Model Patch node has its own strength, start_at, and end_at parameters, so different control conditions can be applied with different weights at different sampling stages.
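
One way to keep such a chain organized outside the graph editor is to describe it as plain data first. The sketch below is only a planning aid under that assumption: `ControlPatch` and `describe_chain` are hypothetical names, and nothing here calls the ComfyUI API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ControlPatch:
    condition: str
    strength: float
    start_at: float = 0.0
    end_at: float = 1.0

    def __post_init__(self):
        # Reject ranges that would never apply the control.
        if not 0.0 <= self.start_at < self.end_at <= 1.0:
            raise ValueError("need 0.0 <= start_at < end_at <= 1.0")
        if self.strength < 0:
            raise ValueError("strength must be non-negative")

def describe_chain(patches):
    """Render the sequential patch order in the notation used above."""
    nodes = ["[Model]"]
    nodes += [
        f"[Model Patch #{i}: {p.condition}]"
        for i, p in enumerate(patches, start=1)
    ]
    nodes.append("[KSampler]")
    return " → ".join(nodes)

chain = [
    ControlPatch("Canny", strength=0.7),
    ControlPatch("Pose", strength=0.5, end_at=0.5),
]
print(describe_chain(chain))
```

Writing the chain down this way makes it easy to compare variants, for example the same two conditions with a different end_at on the second patch.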

4. Multi-Control Practical Cases

Case 1: Canny + Depth — Precise Composition Style Transfer

Scenario: Convert an architectural photo to cyberpunk style while maintaining structural accuracy.

Workflow:

Canny control (strength=0.7): Extract building edge lines
Depth control (strength=0.5): Maintain spatial depth relationships
Prompt: cyberpunk city, neon lights, rain, night scene, photorealistic

Result: Architectural style completely transformed while building outlines and spatial relationships are precisely preserved.

Case 2: Pose + Canny — Precise Character Pose Control

Scenario: Generate character images with specific poses while maintaining costume design details.

Workflow:

Pose control (strength=0.8): Extract character pose from reference
Canny control (strength=0.4): Maintain costume outline details
Prompt: fantasy warrior, detailed armor, dynamic pose, dramatic lighting

Case 3: Depth + Normal — 3D Quality Rendering

Scenario: Give 2D line art 3D quality and realistic lighting.

Workflow:

Depth control (strength=0.6): Encode spatial depth
Normal control (strength=0.3): Add surface normal information
Prompt: product photography, studio lighting, realistic materials, 4K

Case 4: All Five Conditions Combined — Maximum Control

Scenario: Use all five control conditions simultaneously for maximum precision.

Workflow:

Canny (0.5) + Depth (0.4) + Normal (0.3) + Pose (0.7) + Segmentation (0.3)
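Note: Keep the combined strength moderate so the output is not over-constrained by the input image. The values above sum to 2.2, slightly over the 2.0 guideline in Section 5.3, so lower one or two of them if the result looks rigid.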

Tuning tip: Start with single conditions, gradually add more, observing effect changes each time.

5. Control Strength Tuning Guide

5.1 Strength Parameter Matrix

ControlNet strength	Effect Description	Use Case
0.0-0.2	Almost no control, prompt dominates	Slight guidance only
0.3-0.5	Moderate control, prompt still has room	Style reference
0.6-0.8	Strong control, structure highly consistent	Precise composition
0.9-1.0	Very strong control, may suppress creativity	Line art coloring

5.2 start_at / end_at Tuning

Parameter Combination	Effect	Use Case
0.0 → 1.0	Full-range control	Standard usage
0.0 → 0.5	Early control, late freedom	Fixed structure + free details
0.3 → 1.0	Late-stage intervention	Free generation then correction
0.0 → 0.8	Avoid over-control in final steps	Prevent over-constraining the output
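
To make these ratios concrete for an 8-step Turbo run, the sketch below converts a start_at/end_at pair into the step indices during which the control is active. It assumes a simple linear mapping from ratio to step index; an actual sampler may map the ratios through the noise schedule instead, so the exact steps can differ.

```python
def active_steps(total_steps, start_at, end_at):
    """Return the 0-based step indices during which the control applies."""
    first = round(start_at * total_steps)
    last = round(end_at * total_steps)
    return list(range(first, last))

for start_at, end_at in [(0.0, 1.0), (0.0, 0.5), (0.3, 1.0), (0.0, 0.8)]:
    steps = active_steps(8, start_at, end_at)
    print(f"{start_at} -> {end_at}: steps {steps}")
```

With only 8 steps the granularity is coarse: under this mapping, 0.0 → 0.8 frees just the last two steps, so small changes to end_at may have no visible effect.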

5.3 Multi-Condition Strength Distribution Principles

Primary-secondary clarity: Designate one primary condition (strength 0.6-0.8), others as secondary (0.2-0.4)
Total under 2.0: Combined strength should ideally not exceed 2.0
Structure before detail: Structural conditions (Canny/Depth) first, detail conditions (Normal/Seg) added later
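
The first two principles are easy to check mechanically before running a workflow. The following is a small sketch of such a check; the function name and the way a "primary" condition is detected (any strength inside the 0.6-0.8 range) are assumptions made for this example.

```python
def check_strengths(strengths, max_total=2.0, primary_range=(0.6, 0.8)):
    """Return a list of warnings for a {condition: strength} mapping."""
    warnings = []
    total = sum(strengths.values())
    # Small tolerance so float rounding does not trigger a false warning.
    if total > max_total + 1e-9:
        warnings.append(
            f"total strength {total:.1f} exceeds {max_total:.1f}"
        )
    low, high = primary_range
    primaries = [name for name, s in strengths.items() if low <= s <= high]
    if len(primaries) != 1:
        warnings.append(
            f"expected 1 primary condition, found {len(primaries)}"
        )
    return warnings

case_1 = {"Canny": 0.7, "Depth": 0.5}
case_4 = {"Canny": 0.5, "Depth": 0.4, "Normal": 0.3, "Pose": 0.7, "Seg": 0.3}

print(check_strengths(case_1))
print(check_strengths(case_4))
```

Run against the cases from Section 4, Case 1 passes cleanly, while the five-condition setup of Case 4 is flagged because its strengths add up to 2.2.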

6. Troubleshooting

Problem	Possible Cause	Solution
Blurry output	ControlNet strength too high	Reduce strength to 0.6-0.7
Control not noticeable	Strength too low or model not loaded	Check model path, increase strength
Output unrelated to reference	Wrong preprocessor selected	Ensure input image matches preprocessor type
"Model patch not found" error	ComfyUI version too old	`git pull` to latest
Out of VRAM	Multi-condition parallel overhead	Reduce input resolution or number of active conditions
Jagged edges	Canny thresholds inappropriate	Adjust low/high thresholds, or use Depth instead

Final Thoughts

The Z-Image ControlNet Union model unifies multiple control conditions into a single model, greatly simplifying workflow configuration and reducing VRAM requirements. Mastering multi-control combination techniques enables complex control scenarios from precise composition to style transfer.

Start with single conditions, progress to dual-condition combinations, and ultimately master multi-condition synergy. Remember: ControlNet is a tool, not a constraint — appropriate strength settings make it an accelerator for creativity, not a shackle.

