This is a LoRA training pipeline (virtual idol training, if you like) that is friendlier to developers than the automatic1111 webui.
Here is a demo of a LoRA trained on a small number of Dilraba Dilmurat photos: a Dilraba with mixed Western looks.
pip install -r requirements.txt
git lfs install
# BLIP model
wget https://storage.googleapis.com/sfr-vision-language-research/BLIP/models/model_base_caption_capfilt_large.pth -P ./pretrained_models
# bert-base-uncased
cd pretrained_models
git clone https://huggingface.co/bert-base-uncased
# diffusion base model
# I use chilloutmix_NiPrunedFp32Fix
git clone https://huggingface.co/naonovn/chilloutmix_NiPrunedFp32Fix
# convert the safetensors checkpoint to diffusers format
cd ..
python process/convert_original_stable_diffusion_to_diffusers.py \
--checkpoint_path ./pretrained_models/chilloutmix_NiPrunedFp32Fix/chilloutmix_NiPrunedFp32Fix.safetensors \
--dump_path ./pretrained_models/chilloutmixNiPruned_Tw1O --from_safetensors
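Once converted, the dump path is a standard diffusers folder and can be loaded directly. A minimal smoke test, assuming the conversion above wrote to ./pretrained_models/chilloutmixNiPruned_Tw1O:

import torch
from diffusers import StableDiffusionPipeline

# Load the converted base model and generate one image as a quick check
pipe = StableDiffusionPipeline.from_pretrained(
    "./pretrained_models/chilloutmixNiPruned_Tw1O", torch_dtype=torch.float16
).to("cuda")
pipe("a portrait photo, best quality").images[0].save("smoke_test.png")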
- Hugging Face data [optional]
Using the pokemon dataset as an example
# download the data
mkdir -p dataset
cd dataset
git clone https://huggingface.co/datasets/lambdalabs/pokemon-blip-captions/
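If you prefer not to clone the repo, the same captions can be pulled with the datasets library instead; a sketch:

from datasets import load_dataset

# Stream the BLIP-captioned pokemon set straight from the Hub
ds = load_dataset("lambdalabs/pokemon-blip-captions", split="train")
print(ds[0]["text"])          # BLIP caption
ds[0]["image"].save("sample.png")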
- Custom user data [optional]
LoRA training from as little as a single image
# generate captions for the images
python process/run_caption.py --img_base ./dataset/custom
# replace 'a woman' with <dlrb>
python process/change_txt.py --img_base ./dataset/custom --ori_txt 'a woman' --new_txt "<dlrb>"
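The replacement itself is simple: every image has a sibling .txt caption, and the script swaps the phrase so the rare token <dlrb> binds to the subject. A minimal sketch of that logic (the actual process/change_txt.py may differ):

from pathlib import Path

# Rewrite each per-image caption file, binding the rare token to the subject
img_base, ori_txt, new_txt = "./dataset/custom", "a woman", "<dlrb>"
for txt in Path(img_base).glob("*.txt"):
    txt.write_text(txt.read_text().replace(ori_txt, new_txt))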
Adjust the parameter self.custom: True uses your own data, False uses the Hugging Face data
--train_text_encoder # also train a LoRA on the text encoder
--dist # turn off the DDP multi-machine multi-GPU training mode
--batch_size 1 # set the batch size
# training script
python train.py --batch_size 1 --dist --train_text_encoder
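For intuition: LoRA freezes the pretrained weights and learns a low-rank residual on top of selected linear layers (in the UNet, plus the text encoder when --train_text_encoder is set). A minimal sketch of such a layer, not this repo's exact implementation:

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base weight plus a trainable low-rank update (illustrative)."""
    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 4.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)        # freeze the pretrained weight
        self.down = nn.Linear(base.in_features, rank, bias=False)
        self.up = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.up.weight)     # start as a no-op
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * self.up(self.down(x))

# inference script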
python inference.py \
--mode 'lora' \
--lora_path checkpoint/Lora/000-00000600.pth \
--prompt "<dlrb>,solo, long hair, black hair, choker, breasts, earrings, blue eyes, jewelry, lipstick, makeup, dark, bare shoulders, mountain, night, upper body, dress, large breasts, ((masterpiece))" \
--outpath results/1.png \
--num_images_per_prompt 2
The fewer training images you have, the earlier the checkpoint you should pick: around iteration 1000 when training on a single image, around 2500 for 10 images.
ControlNet conversion has been added, see Here
- Download the original models v1-5-pruned.ckpt and control_sd15_openpose.pth into pretrained_models
- Convert your own base model into ControlNet form
python process/tool_transfer_control.py \
--path_input pretrained_models/chilloutmix_NiPrunedFp32Fix/chilloutmix_NiPrunedFp32Fix.safetensors \
--path_output pretrained_models/chilloutmix_control.pth
- Convert the ControlNet into diffusers form
python process/convert_controlnet_to_diffusers.py \
--checkpoint_path pretrained_models/chilloutmix_control.pth \
--original_config_file model/third/cldm_v15.yaml \
--dump_path pretrained_models/chilloutmix_control --device cuda
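The dump path is a diffusers-format ControlNet, so it should also load with the stock diffusers pipeline. A sketch, assuming both conversions above succeeded (inference.py --mode 'control' wraps similar logic):

import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

# Load the converted ControlNet (depending on the conversion script, the
# weights may live in a controlnet/ subfolder of the dump path)
controlnet = ControlNetModel.from_pretrained(
    "pretrained_models/chilloutmix_control", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "./pretrained_models/chilloutmixNiPruned_Tw1O",
    controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")
pipe("<dlrb>, upper body, night", image=Image.open("assets/pose.png")).images[0]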
- Download the openpose models body_pose_model.pth and hand_pose_model.pth into pretrained_models/openpose
- Inference
python inference.py \
--mode 'control' \
--lora_path checkpoint/Lora/000-00000600.pth \
--control_path pretrained_models/chilloutmix_control \
--pose_img assets/pose.png \
--prompt "<dlrb>,solo, long hair, black hair, choker, breasts, earrings, blue eyes, jewelry, lipstick, makeup, dark, bare shoulders, mountain, night, upper body, dress, large breasts, ((masterpiece))" \
--outpath results/1.png \
--num_images_per_prompt 2
- Download the model
cd pretrained_models
git clone https://huggingface.co/runwayml/stable-diffusion-inpainting
# download the face parsing model
wget https://github.com/LeslieZhoa/LVT/releases/download/v0.0/face_parsing.pt -P pretrained_models
- Inference
python inference.py \
--mode 'inpait' \
--inpait_path pretrained_models/stable-diffusion-inpainting \
--mask_area all \
--ref_img assets/ref.png \
--prompt "green hair,short hair,curly hair, green hair,beach,seaside" \
--outpath results/1.png \
--num_images_per_prompt 2
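For reference, the heavy lifting here is diffusers' inpainting pipeline: the repo derives the mask from face_parsing.pt according to --mask_area and repaints only that region. A sketch with a precomputed mask (mask.png is hypothetical):

import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "pretrained_models/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")
ref = Image.open("assets/ref.png").convert("RGB")
mask = Image.open("mask.png")    # white = region to repaint (hypothetical file)
pipe(prompt="green hair, beach, seaside", image=ref, mask_image=mask).images[0]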
Makes inpainting even smoother
- Download the adapter model
wget https://huggingface.co/TencentARC/T2I-Adapter/resolve/main/models/t2iadapter_seg_sd14v1.pth -P pretrained_models
- Inference
python inference.py \
--mode 't2iinpait' \
--ref_img assets/t2i-input.png \
--mask assets/t2i-mask.png \
--adapter_mask assets/t2i-adapter.png \
--prompt "green hair,curly hair, green hair,beach,seaside" \
--outpath results/1.png \
--num_images_per_prompt 2
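This repo routes the adapter through its own pipeline code (after T2I-Adapter-for-Diffusers); newer diffusers releases also ship native support. An alternative sketch, where the TencentARC/t2iadapter_seg_sd14v1 Hub repo and the SD 1.4 base are assumptions, not what inference.py uses:

import torch
from PIL import Image
from diffusers import T2IAdapter, StableDiffusionAdapterPipeline

# Segmentation-conditioned generation with diffusers' built-in adapter pipeline
adapter = T2IAdapter.from_pretrained(
    "TencentARC/t2iadapter_seg_sd14v1", torch_dtype=torch.float16
)
pipe = StableDiffusionAdapterPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", adapter=adapter, torch_dtype=torch.float16
).to("cuda")
seg = Image.open("assets/t2i-adapter.png")   # segmentation condition map
pipe("green hair, curly hair, beach, seaside", image=seg).images[0]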
- Download the model
cd pretrained_models
git clone https://huggingface.co/timbrooks/instruct-pix2pix
- Inference
python inference.py \
--mode 'instruct' \
--ref_img assets/t2i-input.png \
--prompt "turn her face to comic style" \
--neg_prompt None \
--image_guidance_scale 1 \
--outpath results/1.png \
--num_images_per_prompt 1
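Equivalently, the edit can be run straight through diffusers' InstructPix2Pix pipeline; a sketch using the model cloned above:

import torch
from PIL import Image
from diffusers import StableDiffusionInstructPix2PixPipeline

pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    "pretrained_models/instruct-pix2pix", torch_dtype=torch.float16
).to("cuda")
img = Image.open("assets/t2i-input.png").convert("RGB")
# image_guidance_scale balances faithfulness to the input image vs. the edit
pipe("turn her face to comic style", image=img, image_guidance_scale=1.0).images[0]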
The model comes mainly from FaceVid2Vid, with the output upgraded to 512 high-definition resolution
wget https://github.com/LeslieZhoa/Simple-Lora/releases/download/v0.0/script.zip
unzip script.zip && rm -rf script.zip
python script/run.py --input assets/6.png
ffmpeg -r 25 -f image2 -i results/%06d.png -vcodec libx264 11.mp4
- https://github.com/huggingface/diffusers
- https://github.com/AUTOMATIC1111/stable-diffusion-webui
- https://github.com/salesforce/BLIP
- https://github.com/haofanwang/Lora-for-Diffusers
- https://github.com/lllyasviel/ControlNet
- https://github.com/haofanwang/ControlNet-for-Diffusers
- https://github.com/haofanwang/T2I-Adapter-for-Diffusers
- https://github.com/TencentARC/T2I-Adapter
- https://github.com/HimariO/diffusers-t2i-adapter
- https://github.com/zhanglonghao1992/One-Shot_Free-View_Neural_Talking_Head_Synthesis