Character Consistency LoRA Training for Z-Image: Keep Your Character Consistent Across Scenes

Train a LoRA so your character maintains facial features and style across any prompt and any scene.

Why Character Consistency?

The Problem

One of the biggest pain points in AI image generation: the same character looks different every time.

Use Case	Problem	LoRA Solution
Comic/Webtoon creation	Protagonist's face changes between panels	✅ Lock facial features
Social media accounts	AI influencers need consistent identity	✅ Cross-scene consistency
Game character design	Character concept art needs same face	✅ Multi-angle consistency
Novel covers	Protagonist needs uniform look across covers	✅ Style lock

Z-Image Turbo's Advantage

Z-Image Turbo is a 6B-parameter distilled model. Its advantages over full-size models:

Trains faster: LoRA training time reduced by 60-70%
Lower VRAM: Training possible on 8GB VRAM
Faster inference: Sub-second generation
No quality compromise: 95%+ quality retention after distillation

Data Preparation

Requirements

Metric	Minimum	Recommended	High Quality
Image count	5 images	15-20 images	30+ images
Resolution	512×512	1024×1024	1024×1024+
Face clarity	Recognizable	Clear, unobstructed	Multiple angles
Background variety	Some variation	Multiple scenes	Rich variety

Image Collection Guidelines

Mostly front-facing: At least 50% front/three-quarter view
Multiple angles: Include 3-4 profile/side shots
Different expressions: Smile, neutral, surprised
Different outfits: Reduce clothing bias in training
Different backgrounds: Prevent background features from being learned
Lighting variation: Natural light, indoor, side lighting
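
The composition rules above can be checked before training. The following is a minimal sketch, assuming each image has been tagged by hand with a view label. The label names (`front`, `three_quarter`, `profile`) and the function `check_composition` are hypothetical and not part of any Z-Image tooling.

```python
from collections import Counter

def check_composition(views, min_images=5, min_front_share=0.5, min_profiles=3):
    """Return a list of warnings for a dataset described by per-image view labels."""
    counts = Counter(views)
    total = len(views)
    warnings = []
    if total < min_images:
        warnings.append(f"only {total} images, need at least {min_images}")
    # Front and three-quarter views together should make up at least half the set.
    frontish = counts["front"] + counts["three_quarter"]
    if total and frontish / total < min_front_share:
        warnings.append(
            f"front/three-quarter share is {frontish / total:.0%}, below {min_front_share:.0%}"
        )
    if counts["profile"] < min_profiles:
        warnings.append(f"only {counts['profile']} profile shots, aim for {min_profiles}-4")
    return warnings

views = ["front"] * 8 + ["three_quarter"] * 4 + ["profile"] * 3
print(check_composition(views))  # an empty list means no warnings
```

Expressions, outfits, backgrounds, and lighting could be tracked the same way with extra labels, though judging variety there is more subjective.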

Image Preprocessing

# Preprocess images with a Python script
python prepare_training_data.py \
  --input ./raw_images/ \
  --output ./training_data/ \
  --size 1024 \
  --face-detect \
  --crop-face-ratio 0.6

Key steps:

Resize all images to 1024×1024
Face detection and center crop
Remove watermarks and irrelevant elements
Generate caption files (optional)
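
The face-centered crop step can be sketched as plain arithmetic. This assumes a face detector has already returned a bounding box, and that `--crop-face-ratio 0.6` means the face should span roughly 60% of the crop side. That reading of the flag is an assumption, and `face_crop_box` is a hypothetical helper.

```python
def face_crop_box(img_w, img_h, face_box, face_ratio=0.6):
    """Square crop centered on the face so the face spans ~face_ratio of the crop side."""
    left, top, right, bottom = face_box
    face_size = max(right - left, bottom - top)
    # The crop side grows as face_ratio shrinks; never larger than the image's short side.
    side = min(round(face_size / face_ratio), img_w, img_h)
    cx, cy = (left + right) / 2, (top + bottom) / 2
    # Clamp the crop so it stays inside the image.
    x0 = int(min(max(cx - side / 2, 0), img_w - side))
    y0 = int(min(max(cy - side / 2, 0), img_h - side))
    return x0, y0, x0 + side, y0 + side

print(face_crop_box(2000, 1500, (800, 300, 1100, 600)))  # (700, 200, 1200, 700)
```

The returned box can then be passed to an image library's crop call and the result resized to 1024×1024.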

LoRA Training Configuration

Using Kohya_ss WebUI

Base Parameters

Parameter	Recommended	Notes
Model	z_image_turbo_bf16.safetensors	Z-Image Turbo
Optimizer	Prodigy	Learning-rate-free; suits Z-Image Turbo
Learning rate	Auto (Prodigy)	Prodigy auto-scales
Network dim	32-64	Depends on character complexity
Network alpha	16-32	Typically dim/2
Epochs	10-20	More epochs for fewer images
Batch size	1-2	Adjust based on VRAM
Resolution	1024×1024	Z-Image native resolution
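
To see why alpha is usually set relative to dim: in the standard LoRA formulation the low-rank update is multiplied by alpha / dim, so alpha = dim/2 applies the update at half strength. Below is a minimal pure-Python sketch of that scaling. The helper names are illustrative, and real trainers do this on tensors rather than nested lists.

```python
def lora_scale(alpha, dim):
    """Multiplier applied to the low-rank update in standard LoRA: alpha / dim."""
    return alpha / dim

def apply_lora(weight, down, up, alpha):
    """Return weight + (alpha / dim) * (up @ down) using plain nested lists."""
    dim = len(down)  # rank = number of rows in the down-projection
    scale = lora_scale(alpha, dim)
    rows, cols = len(weight), len(weight[0])
    return [
        [
            weight[i][j] + scale * sum(up[i][r] * down[r][j] for r in range(dim))
            for j in range(cols)
        ]
        for i in range(rows)
    ]

print(lora_scale(16, 32))  # 0.5
print(lora_scale(32, 64))  # 0.5
```

Both recommended pairs (dim 32 with alpha 16, dim 64 with alpha 32) give the same scale, so raising dim adds capacity without changing how strongly the update is applied.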

Prodigy Optimizer Parameters

Prodigy is a general-purpose, learning-rate-free optimizer. It is not specific to Z-Image Turbo, but the following settings work well with it:

optimizer: prodigy
lr: 1.0  # Prodigy auto-scales this
d0: 0.05  # Initial step scale
weight_decay: 0.01

Why Prodigy?

Traditional AdamW requires manual learning rate tuning
Prodigy estimates the step size during training from gradient statistics
More stable training, lower overfitting risk
Especially suited for distilled models like Z-Image Turbo

Command-Line Training

accelerate launch train_text_to_image.py \
  --pretrained_model_name_or_path=./z_image_turbo \
  --train_data_dir=./training_data \
  --resolution=1024 \
  --train_batch_size=1 \
  --num_train_epochs=15 \
  --learning_rate=1.0 \
  --optimizer=prodigy \
  --optimizer_args="d0=0.05,weight_decay=0.01" \
  --lora_rank=32 \
  --lora_alpha=16 \
  --output_dir=./lora_output \
  --checkpointing_steps=500 \
  --mixed_precision=bf16
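
Before launching, it helps to estimate how many optimizer steps the run will take and whether the checkpoint interval will ever be reached. Here is a rough sketch, assuming one step per batch with no dataset repeats or gradient accumulation unless specified. The `repeats` and `grad_accum` parameters are hypothetical knobs, and your training script may count steps differently.

```python
import math

def training_steps(num_images, epochs, batch_size=1, repeats=1, grad_accum=1):
    """Optimizer steps for a run: batches per epoch (after accumulation) times epochs."""
    batches_per_epoch = math.ceil(num_images * repeats / batch_size)
    steps_per_epoch = math.ceil(batches_per_epoch / grad_accum)
    return steps_per_epoch * epochs

total = training_steps(num_images=15, epochs=15, batch_size=1)
checkpoints = total // 500  # intermediate checkpoints at --checkpointing_steps=500
print(total, checkpoints)  # 225 0
```

Under these assumptions a 15-image, 15-epoch run finishes in 225 steps, so a 500-step interval would never save an intermediate checkpoint. An interval of about 75 steps would line up with validating every 5 epochs.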

Training Monitoring

Key Metrics

Metric	Normal Range	Warning Sign
Loss	0.01-0.1, gradually decreasing	Not decreasing or volatile
Training time	~2-5 hours (15 images, 8GB)	Over 8 hours
VRAM usage	< 7GB (batch=1)	OOM error
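
The "not decreasing or volatile" warning sign can be turned into a simple automated check on logged loss values. This is a sketch only. The window size and the volatility threshold are arbitrary assumptions to tune for your own runs, and `loss_status` is a hypothetical helper.

```python
from statistics import mean, pstdev

def loss_status(losses, window=10, max_rel_std=0.5):
    """Compare the mean of the first and last `window` losses and flag volatility."""
    if len(losses) < 2 * window:
        return "not enough data"
    first, last = losses[:window], losses[-window:]
    if mean(last) >= mean(first):
        return "not decreasing"
    # Relative spread of the recent window; large values suggest unstable training.
    if pstdev(last) / mean(last) > max_rel_std:
        return "volatile"
    return "ok"

steady = [0.10 - 0.004 * i for i in range(20)]  # falls from 0.10 to 0.024
print(loss_status(steady))  # ok
```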

Validation Every 5 Epochs

Test with these prompts:

# Test 1: Base consistency
character_name, portrait photo, white background, studio lighting

# Test 2: Different scene
character_name, walking in a park, sunny day

# Test 3: Different style
character_name, anime style, watercolor painting

# Test 4: Extreme test
character_name, in a sci-fi spaceship, dramatic lighting
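
The validation schedule can be laid out as a small plan of (epoch, seed, prompt) entries. Reusing one seed across checkpoints is a suggestion added here rather than something the workflow requires. It makes differences between epochs easier to attribute to the LoRA instead of to sampling noise. The seed value and function name are placeholders.

```python
VALIDATION_PROMPTS = [
    "portrait photo, white background, studio lighting",
    "walking in a park, sunny day",
    "anime style, watercolor painting",
    "in a sci-fi spaceship, dramatic lighting",
]

def validation_plan(trigger, total_epochs, every=5, seed=1234):
    """Yield (epoch, seed, prompt) tuples; the seed is fixed so checkpoints are comparable."""
    for epoch in range(every, total_epochs + 1, every):
        for prompt in VALIDATION_PROMPTS:
            yield epoch, seed, f"{trigger}, {prompt}"

plan = list(validation_plan("character_name", total_epochs=15))
print(len(plan))  # 12 (3 checkpoints x 4 prompts)
print(plan[0])
```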

Overfitting Detection

Symptom	Cause	Fix
Training background appears	Background learned	Add background variety
Outfit stays fixed	Clothing learned	Train with different outfits
Face is blurry	Over-trained	Reduce epochs or alpha
No change at all	Under-trained	Increase epochs or alpha

Inference Usage

ComfyUI LoRA Loading

Load Checkpoint → z_image_turbo_bf16.safetensors
    ↓
Load LoRA → character_lora.safetensors
    ↓
[Set LoRA strength]
    ↓
KSampler
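
For scripted use, the loader part of this chain can be written in ComfyUI's API (JSON) workflow format. The sketch below builds it as a Python dict. `CheckpointLoaderSimple` and `LoraLoader` are ComfyUI's built-in node classes for "Load Checkpoint" and "Load LoRA". The node ids, the helper name, and the default strength are placeholders. Depending on how the Z-Image weights are packaged, your install may need different loader nodes.

```python
import json

def lora_chain(ckpt_name, lora_name, strength=0.8):
    """Build the loader part of a ComfyUI API-format workflow: checkpoint -> LoRA."""
    return {
        "1": {
            "class_type": "CheckpointLoaderSimple",
            "inputs": {"ckpt_name": ckpt_name},
        },
        "2": {
            "class_type": "LoraLoader",
            "inputs": {
                "model": ["1", 0],  # MODEL output of node 1
                "clip": ["1", 1],   # CLIP output of node 1
                "lora_name": lora_name,
                "strength_model": strength,
                "strength_clip": strength,
            },
        },
        # A KSampler node would take its model input from ["2", 0].
    }

workflow = lora_chain("z_image_turbo_bf16.safetensors", "character_lora.safetensors")
print(json.dumps(workflow, indent=2))
```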

LoRA Strength Tuning

Strength	Effect	Use Case
0.3-0.5	Subtle features	Style reference, loose consistency
0.6-0.8	Medium features	Recommended daily use range
0.9-1.0	Strong features	When high consistency needed
1.0-1.2	Over-strong	May show overfitting artifacts

Multi-LoRA Combinations

Load multiple LoRAs simultaneously:

LoRA 1 (Character face) — strength 0.8
LoRA 2 (Art style) — strength 0.6
LoRA 3 (Clothing style) — strength 0.5

Note: Total strength should not exceed ~2.0 to avoid artifacts.
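
The ~2.0 guideline is easy to check for any stack before generating. A minimal sketch, with hypothetical LoRA names:

```python
def check_lora_stack(stack, max_total=2.0):
    """Sum the strengths of a LoRA stack and report whether it stays within max_total."""
    total = sum(strength for _, strength in stack)
    return round(total, 3), total <= max_total

stack = [("character_face", 0.8), ("art_style", 0.6), ("clothing_style", 0.5)]
print(check_lora_stack(stack))  # (1.9, True)
```

The example stack sums to 1.9, which sits just under the limit. Adding a fourth LoRA would mean lowering one of the others.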

Advanced Techniques

Multi-Character Training

For maintaining consistency across multiple characters:

Approach A: Train separate LoRA for each character
Approach B: Train multi-character LoRA (with token differentiation)

Approach A is recommended: it is more flexible, and each character's strength can be adjusted independently.

Face Enhancement

At inference time, add a face restore node after generation:

Load LoRA → KSampler → Face Restore node → Output

Prompt Template

# Character consistency prompt template
[character trigger word], [age description], [outfit description],
[scene description], [action description],
[style modifiers], [quality words]

# Example
character_name, 25-year-old woman, red dress,
walking through a garden at sunset,
cinematic lighting, photorealistic, 8k, sharp focus
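
The template can be filled in programmatically so the trigger word always comes first and empty slots are skipped. A small sketch follows. `build_prompt` is a hypothetical helper, not part of ComfyUI.

```python
def build_prompt(trigger, *parts):
    """Join prompt parts with commas, keeping the trigger word first and skipping empties."""
    cleaned = [p.strip() for p in parts if p and p.strip()]
    return ", ".join([trigger, *cleaned])

prompt = build_prompt(
    "character_name",
    "25-year-old woman",
    "red dress",
    "walking through a garden at sunset",
    "",  # no separate action description in this example
    "cinematic lighting, photorealistic",
    "8k, sharp focus",
)
print(prompt)
```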

FAQ

Q: How many images are enough?

5 images: Bare minimum, simple scenes only
10-15 images: Recommended starting point, basic consistency
20+ images: High quality, cross-scene stability
30+ images: Professional grade, multi-angle, multi-expression

Q: How long does training take?

Images	VRAM	Optimizer	Estimated Time
10	8GB	Prodigy	~2 hours
15	8GB	Prodigy	~3 hours
20	12GB	Prodigy	~4 hours
30	16GB	Prodigy	~6 hours

Q: Character still not consistent after training?

Increase LoRA strength to 0.8-1.0
Check training image quality (faces clear?)
Increase training epochs
Ensure trigger word is at the beginning of prompt
Reduce other descriptive elements that might override character features

Q: Are LoRAs trained on Z-Image Turbo compatible with Z-Image Base?

No. Turbo and Base are different models, so LoRA weights trained on one are not compatible with the other. Turbo LoRAs work only with Turbo.

Summary

Z-Image Turbo Character Consistency LoRA Training Workflow:

Data prep: 15-20 clear face photos, multiple angles and scenes
Model: Z-Image Turbo + Prodigy optimizer
Config: dim=32, alpha=16, epochs=15
Monitoring: Validate every 5 epochs, detect overfitting
Inference: LoRA strength 0.6-0.8, with template prompts

Key advantages:

Fast training (8GB VRAM, 2-3 hours)
High quality (distilled model retains 95%+ quality)
Good consistency (cross-scene, cross-style feature retention)

This guide is based on ComfyUI + Z-Image Turbo + Prodigy optimizer.
