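Z-Image img2img Workflows: A Complete Guide to Restyling and Detail Enhancement

Keywords: z-image img2img image-to-image workflow

Introduction

Image-to-image (img2img) is one of the most important workflows in AI image generation. It takes an existing image as input and uses a prompt to steer generation toward a new variant. Unlike text-to-image (text2img), img2img preserves the structure and part of the content of the source image while changing its style or content according to the prompt.

This article is an advanced guide to Z-Image img2img. Unlike the basic introduction in ZI-008, it focuses on advanced workflows and parameter tuning.

Reference material includes the YouTube tutorial "Image-to-Image with Z Image Turbo" and community workflows.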

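img2img Fundamentals

How It Works

The core mechanism of img2img is to add an image condition on top of the text condition:

Image encoding: the input image is encoded by the VAE into a latent representation
Noise addition: noise is added to the latent, in an amount set by the strength parameter
Conditional generation: the model denoises under the joint guidance of the noised latent and the text prompt
Iterative denoising: the latent is denoised step by step into the final result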
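
The noise-addition step above can be illustrated with a toy sketch in pure Python. It assumes a linear, flow-matching style blend between the clean latent and Gaussian noise, with the blend weight equal to strength. That blend rule, the function name noise_latent, and the list standing in for a latent tensor are illustrative assumptions, not a description of how Z-Image's scheduler is actually implemented.

```python
import random

def noise_latent(latent, strength, seed=0):
    """Blend a clean latent with Gaussian noise.

    Assumption: linear interpolation with weight `strength`.
    A real pipeline works on VAE latent tensors, not Python lists.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    rng = random.Random(seed)
    return [(1.0 - strength) * x + strength * rng.gauss(0.0, 1.0) for x in latent]

latent = [0.5, -1.0, 2.0]
print(noise_latent(latent, 0.0))  # no noise: the latent is returned unchanged
print(noise_latent(latent, 0.5))  # equal parts source latent and noise
print(noise_latent(latent, 1.0))  # pure noise: nothing of the source remains
```

The two extremes show why a very low strength leaves the image almost untouched while a strength of 1.0 behaves like text2img.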

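img2img vs text2img vs inpainting

Workflow	Input	Control	Main use
text2img	Prompt only	Prompt + random seed	Generating from scratch
img2img	Image + prompt	Prompt + source image + strength	Style conversion / enhancement
inpainting	Image + mask + prompt	Regenerates the masked region	Local editing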

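The Strength Parameter in Detail

What Strength Does

strength is the most important img2img parameter. It controls how much the source image is changed:

strength = 0.0: keeps the source image exactly as it is, with no modification
strength = 1.0: ignores the source image entirely, equivalent to text2img
strength = 0.3-0.7: the most commonly used range, which keeps the structure while applying the style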
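
One way to see why the two extremes behave this way is to look at how many denoising steps are actually run. The sketch below follows the rule used by the Stable Diffusion img2img pipeline in diffusers, where only the last strength share of the schedule is executed. Whether the Z-Image pipeline applies the same rule is an assumption, and effective_steps is a hypothetical helper name.

```python
def effective_steps(num_inference_steps: int, strength: float) -> int:
    """Number of denoising steps that actually run for a given strength.

    Assumption: the first (1 - strength) share of the schedule is skipped,
    as in diffusers' Stable Diffusion img2img pipeline.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    return min(int(num_inference_steps * strength), num_inference_steps)

# Compare a few strength values for a 28-step schedule
for s in (0.0, 0.3, 0.5, 1.0):
    print(f"strength={s}: {effective_steps(28, s)} of 28 steps")
```

Under this rule a low strength is also cheaper to run, because fewer steps are executed.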

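Strength Comparison

img2img on the same portrait photo, with the prompt "oil painting style, warm colors":

strength	Effect	Suitable for
0.1-0.2	Slight stylization, change is barely visible	Fine-tuning, color adjustment
0.3-0.4	Clear stylization, most of the structure is kept	Style transfer, quality improvement
0.5-0.6	Significant change, roughly half of the structure is kept	Restyling, moderate rework
0.7-0.8	Major change, only the rough outline is kept	Creative conversion, heavy rework
0.9-1.0	Almost entirely new image that loosely follows the original composition	Generating a new image from a reference composition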

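Choosing a Strength Value

How much change do you need?
  ↓
  Fine-tuning (color/lighting) → strength 0.1-0.3
  Style transfer (keep the composition) → strength 0.3-0.5
  Creative conversion (change the style) → strength 0.5-0.7
  Reference composition (re-create the image) → strength 0.7-0.9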

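Denoising Strength Guide

The Denoising Strength Parameter

This value is often called denoising strength. diffusers exposes it as the strength argument, and ComfyUI's KSampler node exposes it as denoise:

import torch
# Class name and repo id are as given in this guide; check the installed
# diffusers version for the pipeline class that accepts image and strength
from diffusers import ZImagePipeline
from PIL import Image

pipe = ZImagePipeline.from_pretrained(
    "Tongyi-ZImage/Z-Image-Turbo",
    torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

# Load the source image; "input.jpg" is a placeholder path
input_image = Image.open("input.jpg").convert("RGB")

# img2img
result = pipe(
    prompt="a watercolor painting of a forest",
    image=input_image,
    strength=0.5,              # key parameter
    guidance_scale=7.5,
    num_inference_steps=28,
    width=1024,
    height=1024
)
output_image = result.images[0]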

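Recommended Parameter Combinations by Use Case

Use case	strength	guidance_scale	steps	Notes
Photo restoration	0.1-0.3	5-7	20-28	Keeps detail, light cleanup
Style transfer	0.4-0.6	7-9	28-40	Balances structure and style
Sketch refinement	0.6-0.8	7-10	30-40	From sketch to finished image
Low-resolution enhancement	0.3-0.5	5-7	28-35	Improves detail and sharpness
Artistic creation	0.7-0.9	8-12	30-50	Creative conversion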

ComfyUI img2img Workflow

Basic img2img Workflow

┌─────────────┐
│ Load Image  │──── input image ──┐
└─────────────┘                   │
                                  ↓
┌─────────────┐            ┌──────────────┐
│ Load Model  │── model ──→│  KSampler    │──── Latent
└─────────────┘            │  (img2img)   │
                           └──────────────┘
                                  ↑
┌─────────────┐                   │
│ Text Prompt │──── prompt ───────┘
└─────────────┘

┌─────────────┐
│ VAE Encode  │──── encoded image ──→ (initial latent for KSampler)
└─────────────┘

┌─────────────┐
│ VAE Decode  │──── decoded latent ──→ output image
└─────────────┘

ComfyUI JSON Workflow

{
  "2": {
    "class_type": "CheckpointLoaderSimple",
    "inputs": {
      "ckpt_name": "z-image-turbo.safetensors"
    }
  },
  "4": {
    "class_type": "LoadImage",
    "inputs": {
      "image": "input_photo.jpg",
      "upload": "image"
    }
  },
  "6": {
    "class_type": "CLIPTextEncode",
    "inputs": {
      "text": "professional oil painting, warm golden tones, textured brush strokes, museum quality",
      "clip": ["2", 1]
    }
  },
  "8": {
    "class_type": "CLIPTextEncode",
    "inputs": {
      "text": "blurry, low quality, watermark, text",
      "clip": ["2", 1]
    }
  },
  "10": {
    "class_type": "VAELoader",
    "inputs": {
      "vae_name": "zimage_vae.safetensors"
    }
  },
  "12": {
    "class_type": "VAEEncode",
    "inputs": {
      "pixels": ["4", 0],
      "vae": ["10", 0]
    }
  },
  "16": {
    "class_type": "KSampler",
    "inputs": {
      "model": ["2", 0],
      "positive": ["6", 0],
      "negative": ["8", 0],
      "latent_image": ["12", 0],
      "seed": 42,
      "steps": 28,
      "cfg": 7.5,
      "sampler_name": "euler_ancestral",
      "scheduler": "normal",
      "denoise": 0.5
    }
  },
  "18": {
    "class_type": "VAEDecode",
    "inputs": {
      "samples": ["16", 0],
      "vae": ["10", 0]
    }
  },
  "20": {
    "class_type": "SaveImage",
    "inputs": {
      "images": ["18", 0]
    }
  }
}
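
A workflow in this JSON form can also be adjusted and queued from a script. The sketch below is a minimal example under a few assumptions: the node IDs "16" (KSampler) and "6" (positive prompt) match the workflow above, a ComfyUI server is listening on its default local address, and the helper names patch_workflow, dangling_links, and queue_prompt are hypothetical. The demo uses a small stand-in dictionary rather than the full workflow so that it runs on its own.

```python
import copy
import json
import urllib.request

def patch_workflow(workflow, denoise, positive_text, sampler_id="16", prompt_id="6"):
    """Return a copy of the workflow with a new denoise value and positive prompt."""
    if not 0.0 <= denoise <= 1.0:
        raise ValueError("denoise must be in [0, 1]")
    patched = copy.deepcopy(workflow)
    patched[sampler_id]["inputs"]["denoise"] = denoise
    patched[prompt_id]["inputs"]["text"] = positive_text
    return patched

def dangling_links(workflow):
    """List (node_id, input_name) pairs whose link points at a missing node."""
    missing = []
    for node_id, node in workflow.items():
        for name, value in node["inputs"].items():
            # A link is written as [source_node_id, output_index]
            if isinstance(value, list) and len(value) == 2 and str(value[0]) not in workflow:
                missing.append((node_id, name))
    return missing

def queue_prompt(workflow, server="http://127.0.0.1:8188"):
    """POST the workflow to a running ComfyUI server (address is an assumption)."""
    data = json.dumps({"prompt": workflow}).encode("utf-8")
    req = urllib.request.Request(f"{server}/prompt", data=data,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Stand-in containing only the nodes this demo touches
demo = {
    "6": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "old prompt", "clip": ["2", 1]}},
    "16": {"class_type": "KSampler",
           "inputs": {"positive": ["6", 0], "denoise": 0.5}},
}
patched = patch_workflow(demo, 0.35, "watercolor painting, soft edges")
print(patched["16"]["inputs"]["denoise"])
print(dangling_links(patched))  # reports the link to node "2", absent from the stand-in
```

Running dangling_links before queue_prompt catches a broken connection locally instead of waiting for the server to reject the workflow.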

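Advanced img2img Workflow: Multi-Stage Processing

Stage 1: Style transfer
  Input image → strength 0.5 → stylized result
        ↓
Stage 2: Detail enhancement
  Stylized result → strength 0.2 → enhanced detail
        ↓
Stage 3: Super-resolution
  Enhanced result → upscale → high-resolution output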
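
The chaining of stages can be sketched independently of any particular backend. In the example below, Stage and run_stages are hypothetical names, and the img2img argument is any callable that takes an image, a prompt, and a strength and returns a new image. A stand-in callable that only records its calls is used so the sketch runs without a model; the upscale stage is left out because it is not an img2img pass.

```python
from dataclasses import dataclass
from typing import Any, Callable, List

@dataclass
class Stage:
    name: str
    prompt: str
    strength: float

def run_stages(image: Any, stages: List[Stage],
               img2img: Callable[[Any, str, float], Any]) -> Any:
    """Feed the output of each stage into the next one."""
    for stage in stages:
        image = img2img(image, stage.prompt, stage.strength)
    return image

# Hypothetical stand-in: records each call instead of running a model
def fake_img2img(image, prompt, strength):
    return image + [(prompt, strength)]

stages = [
    Stage("style transfer", "oil painting, thick brush strokes", 0.5),
    Stage("detail enhancement", "ultra detailed, sharp focus", 0.2),
]
history = run_stages([], stages, fake_img2img)
print(history)
```

In practice the stand-in would be replaced by a thin wrapper around a diffusers pipeline call or a queued ComfyUI workflow.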

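Style Transfer Workflow

How It Differs from IP-Adapter

Method	Principle	Control precision	Suitable for
img2img	Source image as the initial latent + noise	Keeps structure and composition	Overall style conversion
IP-Adapter	Extracts features from a reference image and injects them	Precise style copying	Transferring a specific style
LoRA	Trains a model for a specific style	Highest precision	Reusing a fixed style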

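img2img Style Transfer Steps

Example: photo to oil painting

Step 1: Prepare the input photo
  - Choose a high-quality photo
  - Resize it to the target resolution (1024x1024)

Step 2: Set the parameters
  - strength: 0.5 (moderate change)
  - guidance_scale: 8.0
  - steps: 30

Step 3: Write the prompt
  "masterpiece oil painting, impressionist style,
   thick brush strokes, vibrant colors,
   canvas texture, gallery quality"

Step 4: Generate
  - Use Z-Image Turbo for quick previews
  - Use Z-Image Base for the refined final render

Step 5: Iterate
  - Adjust strength for fine control
  - Try different seeds to get variants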
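
The iteration in step 5 is easier to compare when the runs are planned as a small grid. The sketch below only builds the list of settings; sweep_plan is a hypothetical helper, and the specific strengths and seeds are example values.

```python
from itertools import product

def sweep_plan(strengths, seeds):
    """All (strength, seed) combinations, ordered by strength and then seed."""
    return [{"strength": s, "seed": seed} for s, seed in product(strengths, seeds)]

plan = sweep_plan([0.4, 0.5, 0.6], [1, 2])
for run in plan:
    print(run)
```

With diffusers pipelines, a seed is typically applied by passing generator=torch.Generator("cuda").manual_seed(seed) to the pipeline call; in ComfyUI it is the seed input of KSampler.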

Recommended Parameters by Target Style

Target style	strength	guidance_scale	Key prompt terms
Oil painting	0.4-0.6	7-9	"oil painting, thick brush strokes"
Watercolor	0.5-0.7	7-8	"watercolor painting, soft edges, wet-on-wet"
Pencil sketch	0.4-0.6	8-10	"pencil sketch, graphite, detailed lines"
Pixel art	0.6-0.8	9-11	"pixel art, 16-bit, retro gaming style"
Cyberpunk	0.5-0.7	8-10	"cyberpunk, neon lights, futuristic city"
Anime	0.5-0.7	7-9	"anime style, cel shading, vibrant colors"
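
The table can be turned into starting values for a pipeline call. The sketch below copies three rows and picks the midpoint of each range as a first guess; STYLE_PRESETS and style_kwargs are hypothetical names, and starting from the midpoint is a convenience assumption rather than a recommendation from the table itself.

```python
# (strength range, guidance_scale range, key prompt terms), copied from the table
STYLE_PRESETS = {
    "oil painting": ((0.4, 0.6), (7, 9), "oil painting, thick brush strokes"),
    "watercolor": ((0.5, 0.7), (7, 8), "watercolor painting, soft edges, wet-on-wet"),
    "pixel art": ((0.6, 0.8), (9, 11), "pixel art, 16-bit, retro gaming style"),
}

def midpoint(low, high):
    """Middle of a range, rounded to two decimals."""
    return round((low + high) / 2, 2)

def style_kwargs(style):
    """Starting keyword arguments for an img2img call in the given style."""
    strength_range, guidance_range, prompt = STYLE_PRESETS[style]
    return {
        "prompt": prompt,
        "strength": midpoint(*strength_range),
        "guidance_scale": midpoint(*guidance_range),
    }

print(style_kwargs("oil painting"))
print(style_kwargs("watercolor"))
```

The returned dictionary can be unpacked into a call such as pipe(image=input_image, **style_kwargs("watercolor")), then adjusted within the table's range.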

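Detail Enhancement Workflow

Low-Resolution Image Enhancement

Scenario: convert a low-resolution image into a high-resolution, highly detailed version

# Two-stage enhancement workflow
import torch
from diffusers import ZImagePipeline
from PIL import Image

pipe = ZImagePipeline.from_pretrained(
    "Tongyi-ZImage/Z-Image-Base",
    torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

# Load the low-resolution source; "low_res.jpg" is a placeholder path
low_res_input = Image.open("low_res.jpg").convert("RGB")

# Stage 1: initial enhancement (low strength)
stage1 = pipe(
    # Adjacent string literals are joined into a single prompt
    prompt=("ultra detailed, sharp focus, high resolution, "
            "professional photography, 8K quality"),
    image=low_res_input,
    strength=0.3,
    guidance_scale=6.0,
    num_inference_steps=30,
    width=1024,
    height=1024
)

# Stage 2: fine enhancement (even lower strength), fed with the stage 1 output
stage2 = pipe(
    prompt=("extreme detail, sharp focus, crystal clear, "
            "high resolution, professional quality"),
    image=stage1.images[0],
    strength=0.2,
    guidance_scale=5.0,
    num_inference_steps=25,
    width=2048,
    height=2048
)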

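Suggested Parameters for Detail Enhancement

Stage	strength	guidance_scale	steps	Purpose
Initial enhancement	0.3	6.0	30	Fix basic quality problems
Fine enhancement	0.15	5.0	25	Add detail and texture
Final polish	0.1	4.0	20	Fine-tune lighting and color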

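Sketch to Image

Sketch-to-Image Workflow

Turning a hand-drawn sketch into a finished image is a classic use of img2img.

Basic Method

Step 1: Prepare the sketch
  - Use a sketch with clear lines (black-and-white line art works best)
  - Or use ControlNet Canny to extract edges from the sketch

Step 2: Set the parameters
  - strength: 0.7-0.85 (high change, so the model fills in content)
  - guidance_scale: 8-10 (strong prompt guidance)
  - steps: 30-40

Step 3: Write a descriptive prompt
  Describe in detail the target scene the sketch should become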

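Sketch to Image with ControlNet

The diagram and JSON fragment in this section are schematic. The fragment refers to nodes by class name for readability, whereas an exported ComfyUI workflow refers to them by node ID.

Sketch → ControlNet (Canny/Lineart) + img2img
         ↓
   KSampler (strength 0.75)
         ↓
   Finished image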

{
  "ControlNetApply": {
    "inputs": {
      "conditioning": ["CLIPTextEncode", 0],
      "control_net": ["ControlNetLoader", 0],
      "image": ["LoadImage", 0],
      "strength": 0.8
    }
  }
}

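Photo Enhancement

Old Photo Restoration

# Old photo restoration workflow
import torch
from diffusers import ZImagePipeline
from PIL import Image

pipe = ZImagePipeline.from_pretrained(
    "Tongyi-ZImage/Z-Image-Base",
    torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

# Load the scanned photo; "old_photo.jpg" is a placeholder path
old_photo = Image.open("old_photo.jpg").convert("RGB")

result = pipe(
    # Adjacent string literals are joined into a single prompt
    prompt=("professional portrait photography, sharp focus, "
            "high quality, detailed, color corrected, "
            "professional retouching"),
    image=old_photo,
    strength=0.25,             # low strength keeps the original content
    guidance_scale=5.0,
    num_inference_steps=30,
    width=1024,
    height=1024
)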

Photo to Illustration

Step 1: Prepare the photo
Step 2: Set strength to 0.5-0.6
Step 3: Use the prompt "vector illustration, flat design, clean lines"
Step 4: Run img2img
Step 5: Iterate (lower strength for fine adjustments)

Batch img2img Processing

ComfyUI Batch Workflow

┌───────────────────┐
│ Load Images       │──── image list
│ (Directory)       │
└───────────────────┘
         ↓
┌───────────────────┐
│ Batch Process     │──── batch processing
│ (img2img Loop)    │
└───────────────────┘
         ↓
┌───────────────────┐
│ Save Images       │──── batch save
│ (Directory)       │
└───────────────────┘

Python Batch Processing Script

import os
import torch  # needed for torch.float16 below
from diffusers import ZImagePipeline
from PIL import Image

pipe = ZImagePipeline.from_pretrained(
    "Tongyi-ZImage/Z-Image-Turbo",
    torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

input_dir = "./input_images"
output_dir = "./output_images"
os.makedirs(output_dir, exist_ok=True)

prompt = "professional photography, sharp focus, 8K quality"
strength = 0.4
guidance_scale = 7.5
steps = 28

for img_name in os.listdir(input_dir):
    if img_name.lower().endswith(('.jpg', '.jpeg', '.png')):
        img_path = os.path.join(input_dir, img_name)
        input_image = Image.open(img_path).convert("RGB")

        result = pipe(
            prompt=prompt,
            image=input_image,
            strength=strength,
            guidance_scale=guidance_scale,
            num_inference_steps=steps,
            width=1024,
            height=1024
        )

        out_path = os.path.join(output_dir, f"enhanced_{img_name}")
        result.images[0].save(out_path)
        print(f"Processed: {img_name}")
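
The script above requests 1024x1024 for every file, which may distort images that are not square. A minimal sketch of an aspect-preserving alternative follows. fit_size is a hypothetical helper, and snapping both sides to a multiple of 16 is an assumption about what the model accepts, so check the requirements of the pipeline you use.

```python
def fit_size(width, height, long_side=1024, multiple=16):
    """Scale (width, height) so the longer side is about `long_side`.

    The aspect ratio is kept, and both sides are rounded to the nearest
    multiple of `multiple` (an assumed model requirement).
    """
    scale = long_side / max(width, height)

    def snap(value):
        return max(multiple, int(round(value * scale / multiple)) * multiple)

    return snap(width), snap(height)

print(fit_size(4000, 3000))   # landscape photo
print(fit_size(1080, 1920))   # portrait phone photo
print(fit_size(512, 512))     # small square image
```

Inside the loop, the result could be used as w, h = fit_size(*input_image.size), followed by input_image.resize((w, h)) and width=w, height=h in the pipeline call.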

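Worked Examples

Example 1: Photo to Illustration

Input: photo of a city street
Prompt: vector illustration, flat design, vibrant colors, clean lines, modern style, graphic design
Parameters: strength=0.55, guidance=8.0, steps=30

Result: the photo becomes a vector-style illustration that keeps the building outlines and applies a flat design look

Example 2: Sketch to Photorealistic

Input: pencil sketch of a person
Prompt: photorealistic portrait, natural lighting, detailed skin texture, professional photography, 8K
Parameters: strength=0.75, guidance=9.0, steps=35
Combined with: ControlNet Canny (strength=0.8)

Result: the sketch becomes a portrait with a photorealistic look

Example 3: Low-Resolution Enhancement

Input: 512x512 low-resolution product photo
Prompt: ultra high resolution product photography, sharp details, studio lighting, professional quality
Parameters: strength=0.3, guidance=6.0, steps=30
Output: 1024x1024 high-definition product photo

Example 4: Style Transfer - Cyberpunk

Input: ordinary city street scene
Prompt: cyberpunk city, neon lights, rain, holographic advertisements, futuristic, moody atmosphere
Parameters: strength=0.6, guidance=8.5, steps=32

Result: the ordinary street scene becomes a cyberpunk-style future city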

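Common Problems and Solutions

Problem 1: The result differs too little from the source image

Cause: strength is too low
Solutions:

Raise strength to 0.5-0.7
Raise guidance_scale
Use a more descriptive prompt

Problem 2: The result differs too much from the source image

Cause: strength is too high
Solutions:

Lower strength to 0.3-0.5
Use ControlNet to help preserve structure
Lower guidance_scale

Problem 3: Loss of detail

Cause: strength is so high that noise overwrites the original detail
Solutions:

Use multi-stage processing (several low-strength passes that enhance gradually)
Combine with ControlNet to preserve structural information
Emphasize "detailed" and "sharp" in the prompt

Problem 4: Color shift

Cause: the model regenerates the color information
Solutions:

Lower strength
Include color descriptions in the prompt
Apply color transfer as a post-processing step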
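
Color transfer as post-processing can be sketched as matching the mean and standard deviation of each channel of the output to the source image. The version below works on one channel given as a plain list of 0-255 values. It is a simplified, per-channel variant of statistics-matching color transfer, applied in RGB as an assumption for brevity, and match_channel is a hypothetical helper name.

```python
from statistics import mean, pstdev

def match_channel(source, reference):
    """Shift and scale `source` so its mean and spread match `reference`.

    Values are clamped to the 0-255 range of an 8-bit channel.
    """
    s_mean, s_std = mean(source), pstdev(source)
    r_mean, r_std = mean(reference), pstdev(reference)
    gain = r_std / s_std if s_std > 0 else 1.0
    return [min(255.0, max(0.0, (v - s_mean) * gain + r_mean)) for v in source]

generated = [100, 120, 140]  # one channel of the img2img output
original = [50, 60, 70]      # the same channel of the input image
print(match_channel(generated, original))
```

For a full image the same function would be applied to the red, green, and blue channels separately, with the generated image as source and the input image as reference.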

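Problem 5: Inconsistent quality in batch processing

Cause: the input images differ in complexity
Solutions:

Adjust strength per image
Use a fixed seed for consistency
Preprocess the inputs so their quality is uniform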
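
The first two suggestions can be combined in a batch script by keeping shared defaults, including one fixed seed, and overriding strength only for the files that need it. In the sketch below, DEFAULTS, settings_for, and the file names are hypothetical, and the values are examples.

```python
# Shared settings for the whole batch; the seed stays fixed for consistency
DEFAULTS = {"strength": 0.4, "seed": 42}

def settings_for(filename, overrides=None):
    """Merge per-file overrides onto the shared defaults."""
    settings = dict(DEFAULTS)
    settings.update((overrides or {}).get(filename, {}))
    return settings

# Lower strength for a busy image that loses too much structure at 0.4
overrides = {"busy_street.jpg": {"strength": 0.3}}
print(settings_for("portrait.jpg", overrides))
print(settings_for("busy_street.jpg", overrides))
```

Keeping the overrides in one dictionary also documents which images needed special handling when the batch is rerun.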

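Summary

The Z-Image img2img workflow provides powerful image conversion capabilities. Compared with the basic introduction in ZI-008, this article covered in depth:

Strength tuning: the core technique for controlling how much the image changes
Multi-stage processing: refining the result step by step through repeated img2img passes
Style transfer techniques: recommended parameter combinations for different styles
Sketch to image: a creative workflow that combines img2img with ControlNet
Batch processing: ways to process large numbers of images efficiently

With these advanced techniques you can make full use of Z-Image for image conversion.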
