Z-Image ControlNet Union 2.1 Multi-Control Practice: Next-Gen Unified Control Model
Keywords: z-image controlnet union multi-control
Table of Contents
- Introduction
- What Is ControlNet Union 2.1
- New Features in Union 2.1
- Fun Union ControlNet Capabilities
- Depth V3 Improvements
- Combined Control Strategies
- Single-Model Multi-Condition Workflow
- Weight Balancing
- ComfyUI Union 2.1 Nodes
- Practical Examples
- Troubleshooting
- Summary
Introduction
ControlNet has evolved from single-condition models (one model per condition type) to unified multi-condition models. Union 2.1 is a recent release in that line and handles multiple control conditions through a single model file. This guide covers practical implementation with Z-Image, including multi-condition workflows, weight balancing, and real-world use cases.
What Is ControlNet Union 2.1
Evolution from Single-Condition ControlNet
Traditional ControlNet required separate model files per condition:
controlnet_canny.safetensors → Canny edge control
controlnet_depth.safetensors → Depth map control
controlnet_pose.safetensors → OpenPose control
Each loaded model adds to VRAM usage, and the weights of the separate models must be balanced by hand.
Union 2.1 Architecture
Union 2.1 consolidates multiple conditions into one model:
controlnet_union21.safetensors → Canny + Depth + Pose + Tile +
Lineart + Normal + Segment + Depth V3 + ...
The model uses type embeddings to route condition-specific features internally, handling multiple simultaneous controls in a single forward pass.
Key Advantages
| Aspect | Single-Condition | Union 2.1 |
|---|---|---|
| Model files | One per condition | Single unified model |
| VRAM | Cumulative per model | Single model + conditions |
| Weight management | Manual, across separate models | Per-condition weights on one node |
| Condition conflicts | Possible | Reduced by internal routing; weights still need tuning |
New Features in Union 2.1
Expanded Condition Types
| Type ID | Condition | Description |
|---|---|---|
| 0 | Canny | Edge detection control |
| 1 | Depth | Depth map control |
| 2 | Pose | OpenPose skeleton |
| 3 | Tile | Regional tile control |
| 4 | Lineart | Line art / sketch |
| 5 | Normal | Surface normal map |
| 6 | Segment | Semantic segmentation |
| 7 | MLP | Multi-layer perceptron |
| 8 | Depth V3 | Enhanced depth (new) |
| 9 | Soft Edge | HED soft edges |
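The type IDs in the table can be kept in one place so that workflows refer to conditions by name rather than by bare number. The sketch below is illustrative only: the `ConditionType` enum and the `type_id_for` helper are hypothetical names, not part of any Union 2.1 package, and the values are copied from the table above.

```python
from enum import IntEnum


class ConditionType(IntEnum):
    """Type IDs copied from the table above; the enum itself is illustrative."""
    CANNY = 0
    DEPTH = 1
    POSE = 2
    TILE = 3
    LINEART = 4
    NORMAL = 5
    SEGMENT = 6
    MLP = 7
    DEPTH_V3 = 8
    SOFT_EDGE = 9


def type_id_for(name: str) -> int:
    """Look up a type ID from a readable name such as 'Depth V3'."""
    key = name.strip().upper().replace(" ", "_")
    return int(ConditionType[key])  # raises KeyError for unknown names
```

Calling `type_id_for("Depth V3")` returns 8, and an unknown name raises `KeyError` instead of silently selecting the wrong condition.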
Type Embedding Mechanism
Control Input → Type Embedding → Union Model → Condition Features
Learnable type embeddings distinguish between condition types internally, allowing shared weights without interference.
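As a rough illustration of the idea, a type embedding can be thought of as a per-type vector that is added to the condition features before they enter the shared layers. The sketch below is a toy in plain Python, not the Union 2.1 implementation: the table size follows the ten type IDs listed above, while `FEATURE_DIM`, the random values, and the function name `tag_with_type` are assumptions made for readability.

```python
import random

NUM_TYPES = 10   # matches the ten type IDs in the table above
FEATURE_DIM = 4  # tiny dimension chosen only for readability (assumption)

# Stand-in for a learned embedding table: one vector per condition type.
# A trained model would learn these values; here they are random.
_rng = random.Random(0)
TYPE_EMBEDDINGS = [
    [_rng.uniform(-1.0, 1.0) for _ in range(FEATURE_DIM)]
    for _ in range(NUM_TYPES)
]


def tag_with_type(features, type_id):
    """Add the embedding for `type_id` to a feature vector."""
    if not 0 <= type_id < NUM_TYPES:
        raise ValueError(f"unknown type id: {type_id}")
    if len(features) != FEATURE_DIM:
        raise ValueError("feature vector has the wrong length")
    embedding = TYPE_EMBEDDINGS[type_id]
    return [f + e for f, e in zip(features, embedding)]


# The same input features end up in different places depending on the type.
features = [0.0] * FEATURE_DIM
as_canny = tag_with_type(features, 0)
as_depth_v3 = tag_with_type(features, 8)
```

Because the offset differs per type, the shared layers can tell a Canny input from a Depth V3 input even when the raw features look similar.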
Multi-Condition Stacking
Canny (weight 0.8) + Depth (weight 0.6) + Pose (weight 0.9)
→ Single Union 2.1 ControlNet node → KSampler
Each condition maintains its own weight while sharing the model backbone.
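One common way to picture per-condition weights is as a weighted sum of the control signals each condition produces. Whether Union 2.1 combines conditions exactly this way internally is not stated here, so treat the following as a mental model under that assumption; `combine_residuals` and the toy vectors are hypothetical.

```python
def combine_residuals(residuals, weights):
    """Weighted element-wise sum of per-condition residual vectors."""
    if not residuals:
        raise ValueError("at least one condition is required")
    if residuals.keys() != weights.keys():
        raise ValueError("each condition needs exactly one weight")
    length = len(next(iter(residuals.values())))
    combined = [0.0] * length
    for name, vector in residuals.items():
        if len(vector) != length:
            raise ValueError(f"residual for {name!r} has the wrong length")
        for i, value in enumerate(vector):
            combined[i] += weights[name] * value
    return combined


# Toy residuals; the weights are the ones from the stacking example above.
residuals = {"canny": [1.0, 0.0], "depth": [0.0, 1.0], "pose": [1.0, 1.0]}
weights = {"canny": 0.8, "depth": 0.6, "pose": 0.9}
combined = combine_residuals(residuals, weights)
```

Under this model, raising one weight strengthens that condition's contribution without changing the others, which is why the weights can be tuned independently.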
Fun Union ControlNet Capabilities
Fun Union ControlNet is a community variant optimized for Z-Image compatibility:
- Z-Image fine-tuned: Optimized for Z-Image's DiT architecture
- Extended conditions: Additional types beyond standard Union
- Improved weight interpolation: Smoother transitions between condition types
- Better ComfyUI integration: Native node support
Supported Conditions
| Condition | Union 2.1 | Fun Union |
|---|---|---|
| Canny / Depth / Pose | Yes | Yes |
| Depth V3 | Yes | Yes |
| Lineart / Tile / Normal | Yes | Yes |
| Segment / Soft Edge | Yes | Yes |
| IP-Adapter | No | Yes (Fun Union extension) |
Depth V3 Improvements
Depth V3 (type ID 8), new in Union 2.1, provides:
- Higher precision: Finer depth granularity
- Better edge preservation: Depth boundaries align with visual edges
- Fewer artifacts: Reduced estimation errors in complex scenes
- Improved occlusion handling: Better overlapping object depth
Practical Use
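# Type ID 8 = Depth V3 (see the condition type table)
# preprocess_depth_v3 and prepare_union_condition are placeholders for the
# depth preprocessor and condition-packing helper used in your pipeline.
condition_type = 8
depth_map = preprocess_depth_v3(input_image)
condition = prepare_union_condition(depth_map, type_id=condition_type)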
Best for: Architectural visualization, product photography, portrait structure, landscape composition.
Combined Control Strategies
Strategy 1: Pose + Depth
Character generation with accurate spatial positioning:
OpenPose → Pose (type 2, weight 0.9)
Depth Map → Depth V3 (type 8, weight 0.7)
→ Union 2.1 → KSampler
Strategy 2: Canny + Depth + Pose
Triple-condition for maximum structural fidelity:
Canny → Edge (type 0, weight 0.8)
Depth → Depth V3 (type 8, weight 0.6)
OpenPose → Pose (type 2, weight 0.9)
→ Union 2.1 → KSampler
Strategy 3: Lineart + Tile
Sketch-guided generation with regional refinement:
Line Art → Lineart (type 4, weight 0.85)
Tile Map → Tile (type 3, weight 0.5)
→ Union 2.1 → KSampler
Strategy 4: Depth V3 + Segment
Semantic-aware scene generation:
Depth Map → Depth V3 (type 8, weight 0.7)
Seg Map → Segment (type 6, weight 0.6)
→ Union 2.1 → KSampler
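The four strategies can be stored as named presets so a workflow picks one by name. This is a sketch under assumptions: the preset keys and the `resolve_strategy` helper are hypothetical, the weights and type IDs come from the strategies above, and every depth input is written as Depth V3 because the strategies use type 8.

```python
# Type IDs for the conditions used by the strategies above.
TYPE_IDS = {
    "Canny": 0,
    "Pose": 2,
    "Tile": 3,
    "Lineart": 4,
    "Segment": 6,
    "Depth V3": 8,
}

# Preset names are illustrative; weights are the ones listed above.
STRATEGIES = {
    "pose_depth": [("Pose", 0.9), ("Depth V3", 0.7)],
    "canny_depth_pose": [("Canny", 0.8), ("Depth V3", 0.6), ("Pose", 0.9)],
    "lineart_tile": [("Lineart", 0.85), ("Tile", 0.5)],
    "depth_segment": [("Depth V3", 0.7), ("Segment", 0.6)],
}


def resolve_strategy(name):
    """Expand a preset into condition entries with explicit type IDs."""
    return [
        {"condition": condition, "type_id": TYPE_IDS[condition], "weight": weight}
        for condition, weight in STRATEGIES[name]
    ]
```

Resolving type IDs from a single table avoids the easy mistake of labelling a condition one way and passing a different ID.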
Single-Model Multi-Condition Workflow
Workflow Architecture
┌────────────┬───────┬───────────┐
│ Pose Image │ Depth │ Canny Map │
└─────┬──────┴───┬───┴─────┬─────┘
      ↓          ↓         ↓
    [Preprocess each condition]
      ↓          ↓         ↓
      └──────────┼─────────┘
                 ↓
      [Union Condition Merge]
                 ↓
   [ControlNet Union 2.1 Apply]
                 ↓
 [KSampler + VAE Decode → Output]
ComfyUI Multi-Condition JSON
{
  "2": {
    "class_type": "CheckpointLoaderSimple",
    "inputs": {"ckpt_name": "z-image-base.safetensors"}
  },
  "4": {
    "class_type": "ControlNetUnion21Loader",
    "inputs": {"union_model": "controlnet_union21_zimage.safetensors"}
  },
  "6": {"class_type": "LoadImage", "inputs": {"image": "pose.png"}},
  "8": {"class_type": "LoadImage", "inputs": {"image": "depth.png"}},
  "10": {"class_type": "LoadImage", "inputs": {"image": "canny.png"}},
  "12": {
    "class_type": "CLIPTextEncode",
    "inputs": {
      "text": "a woman in a red dress in a modern lobby, cinematic lighting",
      "clip": ["2", 1]
    }
  },
  "13": {
    "class_type": "CLIPTextEncode",
    "inputs": {"text": "blurry, low quality", "clip": ["2", 1]}
  },
  "14": {
    "class_type": "ControlNetUnion21Apply",
    "inputs": {
      "control_net": ["4", 0],
      "image_pose": ["6", 0], "image_depth": ["8", 0], "image_canny": ["10", 0],
      "strength_pose": 0.9, "strength_depth": 0.7, "strength_canny": 0.8,
      "conditioning": ["12", 0]
    }
  },
  "15": {
    "class_type": "EmptyLatentImage",
    "inputs": {"width": 1024, "height": 1024, "batch_size": 1}
  },
  "16": {
    "class_type": "KSampler",
    "inputs": {
      "model": ["2", 0], "positive": ["14", 0],
      "negative": ["13", 0], "latent_image": ["15", 0],
      "seed": 42, "steps": 30, "cfg": 7.5,
      "sampler_name": "euler_ancestral", "scheduler": "normal",
      "denoise": 1.0
    }
  },
  "18": {
    "class_type": "VAEDecode",
    "inputs": {"samples": ["16", 0], "vae": ["2", 2]}
  },
  "20": {
    "class_type": "SaveImage",
    "inputs": {"images": ["18", 0], "filename_prefix": "union21"}
  }
}
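Hand-edited workflow JSON is easy to break by pointing an input at a node that does not exist. The sketch below checks that every link in an API-format workflow refers to a node present in the file. It assumes, as in the JSON above, that a link is a two-item list of a node ID string and an output index. `find_dangling_links` is a hypothetical helper, not part of ComfyUI.

```python
import json


def find_dangling_links(workflow):
    """Return (node_id, input_name, target_id) for links to missing nodes."""
    problems = []
    for node_id, node in workflow.items():
        for input_name, value in node.get("inputs", {}).items():
            # A link is [source node id as a string, output index as an int];
            # anything else is treated as a literal value.
            is_link = (
                isinstance(value, list)
                and len(value) == 2
                and isinstance(value[0], str)
                and isinstance(value[1], int)
            )
            if is_link and value[0] not in workflow:
                problems.append((node_id, input_name, value[0]))
    return problems


# A deliberately incomplete fragment: node "16" is referenced but missing.
fragment = json.loads("""
{
  "2": {"class_type": "CheckpointLoaderSimple",
        "inputs": {"ckpt_name": "z-image-base.safetensors"}},
  "18": {"class_type": "VAEDecode",
         "inputs": {"samples": ["16", 0], "vae": ["2", 2]}}
}
""")
dangling = find_dangling_links(fragment)
```

The check only confirms that the target node exists. It does not verify that the linked output has the right type, so a latent input wired to a conditioning output would still pass.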
Weight Balancing
Weight Guidelines
| Condition | Recommended Range | Effect at High Weight |
|---|---|---|
| Pose | 0.7–1.0 | Strict skeleton adherence |
| Depth / Depth V3 | 0.5–0.8 | Strong depth fidelity |
| Canny | 0.5–0.9 | Precise edge following |
| Lineart | 0.6–0.9 | Faithful sketch rendering |
| Tile | 0.3–0.6 | Regional guidance |
| Segment | 0.4–0.7 | Semantic placement |
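The recommended ranges lend themselves to a quick sanity check before a render. This sketch uses the values from the table above; the lowercase keys and the `out_of_range` helper are naming assumptions.

```python
# Recommended (low, high) weight ranges, copied from the table above.
RECOMMENDED = {
    "pose": (0.7, 1.0),
    "depth": (0.5, 0.8),
    "depth_v3": (0.5, 0.8),
    "canny": (0.5, 0.9),
    "lineart": (0.6, 0.9),
    "tile": (0.3, 0.6),
    "segment": (0.4, 0.7),
}


def out_of_range(weights):
    """Map each condition whose weight is outside its range to that range."""
    flagged = {}
    for name, weight in weights.items():
        low, high = RECOMMENDED[name]
        if not low <= weight <= high:
            flagged[name] = (low, high)
    return flagged
```

An empty result means every weight sits inside its recommended range. A flagged entry is a hint rather than an error, since some images call for values outside these ranges.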
Resolving Conflicts
| Conflict | Resolution |
|---|---|
| Pose vs Depth | Favor Pose (0.9) over Depth (0.5) for characters |
| Canny vs Depth | Canny (0.8) + Depth (0.6) — edges dominate structure |
| Lineart vs Pose | Pose (0.9) + Lineart (0.7) — skeleton guides, lines refine |
| Segment vs Depth | Depth (0.7) + Segment (0.5) — depth dominates |
Adjustment Strategy
- Start with equal weights (0.7 for all)
- Generate a test image
- Identify over/under-influenced conditions
- Adjust weights ±0.1–0.2
- Repeat until balanced
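One round of the loop above can be written as a small function that nudges weights based on what you see in the test image. This is a sketch under assumptions: the "over" and "under" feedback labels, the default step of 0.1, and clamping to the 0.0 to 1.0 interval are illustrative choices.

```python
def adjust_weights(weights, feedback, step=0.1):
    """Return new weights after one round of visual feedback.

    `feedback` maps a condition name to "over" (too dominant) or "under"
    (too weak); conditions not mentioned keep their current weight.
    """
    adjusted = dict(weights)
    for name, verdict in feedback.items():
        if verdict == "over":
            delta = -step
        elif verdict == "under":
            delta = step
        else:
            raise ValueError(f"unknown verdict: {verdict!r}")
        # Clamp to [0.0, 1.0] and round away floating-point noise.
        adjusted[name] = round(min(1.0, max(0.0, adjusted[name] + delta)), 2)
    return adjusted


# Start with equal weights, then react to the first test image.
start = {"pose": 0.7, "depth": 0.7, "canny": 0.7}
second_round = adjust_weights(start, {"depth": "over", "pose": "under"})
```

Repeating the call with fresh feedback after each test render follows the same start, test, adjust cycle as the list above.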
ComfyUI Union 2.1 Nodes
Required Nodes
| Node | Purpose |
|---|---|
| ControlNetUnion21Loader | Load Union 2.1 model |
| ControlNetUnion21Apply | Apply multi-condition control |
| UnionConditioning | Merge multiple conditions |
Installation
cd ComfyUI/custom_nodes
git clone https://github.com/docssw/ComfyUI-Union.git
pip install -r ComfyUI-Union/requirements.txt
The ControlNetUnion21Apply node accepts condition images (pose, depth, canny, etc.) with per-condition strength parameters. Missing conditions are handled via zero-weight type embeddings.
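When workflows are generated from a script, the inputs for the apply node can be built from whichever condition images are available. The sketch below assumes the input names used in the workflow JSON earlier (`image_pose`, `strength_pose`, and so on) and assumes the node tolerates omitted image inputs. It sets the strength of absent conditions to 0.0 explicitly instead of relying on the node's internal handling. `build_apply_inputs` is a hypothetical helper.

```python
SUPPORTED = ("pose", "depth", "canny")  # conditions wired in the workflow JSON
DEFAULT_STRENGTH = 0.7                  # the suggested starting weight


def build_apply_inputs(images, strengths, control_net_link, conditioning_link):
    """Assemble the `inputs` dict for a ControlNetUnion21Apply node.

    `images` maps a condition name to its image link, e.g. {"pose": ["6", 0]}.
    Conditions without an image get strength 0.0 and no image input.
    """
    inputs = {"control_net": control_net_link, "conditioning": conditioning_link}
    for name in SUPPORTED:
        if name in images:
            inputs[f"image_{name}"] = images[name]
            inputs[f"strength_{name}"] = strengths.get(name, DEFAULT_STRENGTH)
        else:
            inputs[f"strength_{name}"] = 0.0
    return inputs


# Only a pose image is available in this example.
apply_inputs = build_apply_inputs(
    images={"pose": ["6", 0]},
    strengths={"pose": 0.9},
    control_net_link=["4", 0],
    conditioning_link=["12", 0],
)
```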
Practical Examples
Example 1: Pose + Depth + Canny Combination
Inputs: OpenPose skeleton + Depth map + Canny edges
Prompt: "a detective in a trench coat in a dimly lit office,
film noir atmosphere, dramatic shadows, photorealistic"
Weights: Pose 0.9 / Depth 0.6 / Canny 0.7
Params: steps=30, cfg=7.5
Result: The character matches the input pose and sits correctly within the room layout, with furniture structure preserved.
Example 2: Architectural Visualization
Inputs: Depth V3 + Canny edges + Semantic segment
Prompt: "modern minimalist living room, white walls, large windows,
natural wood furniture, warm sunlight, architectural photography"
Weights: Depth V3 0.7 / Canny 0.8 / Segment 0.5
Params: steps=35, cfg=8.0
Result: Architecturally accurate interior with correct proportions and material placement.
Example 3: Character Design with Pose + Depth
Inputs: OpenPose + Depth V3
Prompt: "cyberpunk warrior in futuristic armor, neon accents,
detailed mechanical parts, dramatic lighting, concept art"
Weights: Pose 0.9 / Depth V3 0.65
Params: steps=30, cfg=7.5
Result: The character follows the input pose with consistent 3D volume, suitable for design iteration.
Troubleshooting
| Issue | Solution |
|---|---|
| Conditions conflict | Adjust weights to prioritize one condition |
| No control effect | Check type IDs; increase weight to 0.7+ |
| Overly rigid output | Reduce individual weights; keep the combined weight below 2.0 |
| Wrong condition applied | Verify type ID mapping |
| VRAM OOM | Downscale condition inputs to 512 px; keep batch size at 1 |
| Depth V3 artifacts | Use a stronger depth estimator; clean up depth maps before use |
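Two of the fixes in the table are simple arithmetic and can be scripted. The sketch below scales weights down when their sum is too high and computes a downscaled size for condition images. The 1.9 target (a value just under the 2.0 limit), the snap to multiples of 8, and both function names are assumptions.

```python
def cap_total_weight(weights, target=1.9):
    """Scale all weights proportionally so their sum is about `target`."""
    total = sum(weights.values())
    if total <= target:
        return dict(weights)
    factor = target / total
    return {name: round(value * factor, 3) for name, value in weights.items()}


def fit_to_max_side(width, height, max_side=512, multiple=8):
    """Shrink a size so its longer side is at most `max_side` pixels."""
    longest = max(width, height)
    if longest > max_side:
        # Integer arithmetic avoids floating-point rounding surprises.
        width = width * max_side // longest
        height = height * max_side // longest

    def snap(value):
        # Round down to a multiple of `multiple`, never below one multiple.
        return max(multiple, value // multiple * multiple)

    return snap(width), snap(height)


capped = cap_total_weight({"pose": 0.9, "depth": 0.8, "canny": 0.9})
size = fit_to_max_side(1920, 1080)
```

Proportional scaling keeps the relative priority between conditions intact, which matters more than the absolute values when the output is overly rigid.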
Summary
ControlNet Union 2.1 consolidates multiple control conditions into a single model:
- Single model, multiple conditions: Type embeddings route pose, depth, canny, lineart, and more
- Depth V3 upgrade: Improved precision and edge handling over previous depth
- Weight balancing matters: Each condition needs individual tuning; start at 0.7
- Fun Union for Z-Image: Community-optimized variant with extended features
- ComfyUI support: Loader, apply, and conditioning-merge nodes cover Union 2.1 workflows
Union 2.1 enables complex multi-condition workflows that were previously cumbersome with single-condition ControlNet models.