Fast artistic style transfer using a feed-forward network.
- input image size: 1024x768
- process time (CPU): 17.78 sec (Core i7-5930K)
- process time (GPU): 0.994 sec (TitanX)
- Default `--image_size` is set to 512 (the original uses 256). It's slower, but the extra time pays off in quality.
- Ability to switch off dataset cropping with the `--fullsize` option. Images are cropped by default to preserve the aspect ratio.
- The cropping implementation uses `ImageOps.fit`, which always scales and crops, whereas the original uses a custom solution that upscales the image if it is smaller than `--image_size` and otherwise just crops without scaling (see the cropping sketch after this list).
- Bicubic and Lanczos resampling are used when scaling dataset and input style images respectively, which gives sharper shrinking; the original uses nearest neighbour.
- Ability to specify multiple input files, which avoids reloading the model on every iteration and saves about 0.5 sec per image. The format follows standard Unix path expansion rules, e.g. `file*` or `file?.png`. Don't forget to quote the pattern, otherwise the shell will expand it first.
- The output argument is a path prefix when multiple files are used as input, otherwise an explicit filename.
- The `-x` option sets the content image scaling factor applied before transformation.
- Preserve the original content colors with the `--original_colors` flag (see the color sketch after this list). More info: Transfer style but not the colors
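The cropping behaviour described above roughly corresponds to the following PIL sketch (an illustration under the stated assumptions, not the repository's actual code; the function name and defaults are made up for the example):

```python
# Minimal sketch of the default dataset cropping, assuming PIL/Pillow.
# `crop_for_training` is illustrative, not the repo's actual API.
from PIL import Image, ImageOps

def crop_for_training(path, image_size=512):
    image = Image.open(path).convert('RGB')
    # ImageOps.fit always scales and crops to the requested square while
    # preserving the aspect ratio; bicubic resampling shrinks more sharply
    # than nearest neighbour.
    return ImageOps.fit(image, (image_size, image_size), Image.BICUBIC)
```

For `--original_colors`, one common way to preserve content colors is the luminance-transfer trick: keep the stylized luminance but the original chrominance. The sketch below shows that idea; it may differ from the flag's actual implementation:

```python
# Minimal sketch of the luminance-transfer color-preservation trick, assuming
# PIL/Pillow; the --original_colors flag may be implemented differently.
from PIL import Image

def keep_original_colors(content, stylized):
    # Both images are assumed to have the same size.
    # Take the luminance (Y) from the stylized result and the color
    # channels (Cb, Cr) from the original content image.
    y, _, _ = stylized.convert('YCbCr').split()
    _, cb, cr = content.convert('YCbCr').split()
    return Image.merge('YCbCr', (y, cb, cr)).convert('RGB')
```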
The repo includes a bash script to transform your videos. It depends on ffmpeg (Compilation instructions).
./genvid.sh input_video output_video model start_time duration
The first three arguments are mandatory and should contain paths to files.
The last two are optional and indicate the starting position and duration in seconds.
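For example (illustrative file names), `./genvid.sh input.mp4 output.mp4 models/composition.model 10 5` would stylize 5 seconds of the video starting at the 10-second mark, using one of the bundled pretrained models.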
I integrated the Optical Flow implementation by @larspars to produce more consistent output for a sequence of images by smoothing out the differences between frames. It requires `opencv-python`. Separate thanks to @genekogan for providing a thorough explanation of the remarkably simple yet efficient steps to put this together.
To use it, append the `-flow` option followed by the amount of alpha blending, like so:
python generate.py 'frames/image*.png' -m models/any.model -o dir/prefix_ -flow 0.02
I find values between 0.02 and 0.05 to work best. It calculates the motion vectors between the previous and current source frames, applies that distortion to the previously transformed frame, overlays the result on top of the current source frame with `-flow` opacity, and finally transforms it. This helps the network reveal the same features in the current frame that were discovered in the previous one. It only affects sequences of images, so if there is a single image in the list you won't see any difference.
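A rough sketch of that blending step, assuming OpenCV's Farnebäck dense optical flow (the function below is illustrative, not the repository's actual code):

```python
# Minimal sketch of flow-based blending between consecutive frames, assuming
# opencv-python and NumPy; the function name and parameters are illustrative.
import cv2
import numpy as np

def flow_blend(prev_source, curr_source, prev_styled, alpha=0.02):
    prev_gray = cv2.cvtColor(prev_source, cv2.COLOR_BGR2GRAY)
    curr_gray = cv2.cvtColor(curr_source, cv2.COLOR_BGR2GRAY)

    # Dense flow from the current frame back to the previous one, so the
    # result can be used directly for backward warping with cv2.remap.
    flow = cv2.calcOpticalFlowFarneback(curr_gray, prev_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)

    h, w = flow.shape[:2]
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)

    # Distort the previously stylized frame so it follows the motion.
    warped = cv2.remap(prev_styled, map_x, map_y, cv2.INTER_LINEAR)

    # Overlay it on the current source frame with `alpha` opacity; the
    # blended image is what gets fed to the network next.
    return cv2.addWeighted(curr_source, 1.0 - alpha, warped, alpha, 0)
```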
Install Chainer:
$ pip install chainer
Download the VGG16 model and convert it into a smaller file that contains only the convolutional layers (10% of the entire model).
sh setup_model.sh
You need to train one image transformation network model per style target. According to the paper, the models are trained on the Microsoft COCO dataset.
python train.py -s <style_image_path> -d <training_dataset_path> -g 0
python generate.py <input_image_path> -m <model_path> -o <output_image_path>
This repo includes pretrained models as examples.
- example:
python generate.py sample_images/tubingen.jpg -m models/composition.model -o sample_images/output.jpg
or
python generate.py sample_images/tubingen.jpg -m models/seurat.model -o sample_images/output.jpg
A few things differ from the paper:
- Convolution kernel size 4 instead of 3.
- Training with a batch size of n >= 2 causes unstable results.
This version is not compatible with the previous versions. You can't use models trained by the previous implementation. Sorry for the inconvenience!
License: MIT
The code in this repository is based on the following nice works; thanks to their authors.
- chainer-gogh: a Chainer implementation of neural-style, which I heavily referenced.
- chainer-cifar10: the residual block implementation is referenced from it.