Real-Time Style Transfer in TensorFlow
This repository is a TensorFlow implementation of Johnson's Perceptual Losses for Real-Time Style Transfer and Super-Resolution.
Stylizing a 1024x680 image of the MIT Stata Center takes 385 ms on a GTX 1080Ti.
- tensorflow 1.8.0
- python 3.5.3
- numpy 1.14.2
- scipy 0.19.0
- moviepy 0.2.3.2
Here we transformed every frame of a video using various style images, then combined the results. Click to go to the full demo on YouTube!
A photo of Chicago was stylized with various style paintings. See the ./examples/style folder for the full style images. More results can be found here.
This implementation uses TensorFlow to train a real-time style transfer network. The transformation network is the same as the one described in Johnson, except that batch normalization is replaced with Ulyanov's instance normalization, zero padding is replaced with reflection padding to reduce boundary artifacts, and the scaling/offset of the output tanh layer is slightly different.
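The two substitutions named above can be illustrated outside TensorFlow. The following is a minimal NumPy sketch, not the repository's code: the function names `instance_norm` and `reflect_pad` are hypothetical, the tensor layout (batch, height, width, channels) is an assumption, and the learned scale and offset that usually follow instance normalization are omitted.

```python
import numpy as np


def instance_norm(x, eps=1e-5):
    """Normalize every image and channel over its own spatial dimensions.

    x has shape (batch, height, width, channels). Unlike batch normalization,
    no statistics are shared between the images of a batch.
    """
    mean = x.mean(axis=(1, 2), keepdims=True)
    var = x.var(axis=(1, 2), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def reflect_pad(x, pad):
    """Pad height and width by mirroring pixels instead of inserting zeros."""
    return np.pad(x, ((0, 0), (pad, pad), (pad, pad), (0, 0)), mode="reflect")


x = np.random.RandomState(0).rand(2, 8, 8, 3)
y = instance_norm(x)   # same shape as x, zero mean per image and channel
p = reflect_pad(x, 2)  # height and width each grow by 2 * pad
```

Because the border is filled with mirrored image content rather than zeros, a convolution applied after `reflect_pad` does not see an artificial dark frame, which is the usual explanation for the reduced boundary artifacts.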
We follow Logan Engstrom in using a loss function close to the one described in Gatys, with VGG19 instead of VGG16 and typically "shallower" layers than in Johnson's implementation (e.g. relu1_1 is used rather than relu1_2).
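As a rough illustration of a Gatys-style objective, the sketch below combines a content term, a Gram-matrix style term, and a total variation term in NumPy. It is a sketch under stated assumptions rather than the repository's loss: all function names are hypothetical, the normalization constants are one common choice, the feature maps stand in for VGG19 activations, and the default weights are simply the `main.py` defaults listed in this README.

```python
import numpy as np


def gram_matrix(features):
    """Channel-by-channel correlations of one feature map of shape (H, W, C)."""
    h, w, c = features.shape
    flat = features.reshape(h * w, c)
    return flat.T @ flat / (h * w * c)  # normalization is an assumption


def content_loss(gen_feat, content_feat):
    """Mean squared difference between two feature maps of equal shape."""
    return np.mean((gen_feat - content_feat) ** 2)


def style_loss(gen_feats, style_feats):
    """Sum of Gram-matrix differences over a list of layers."""
    return sum(np.mean((gram_matrix(g) - gram_matrix(s)) ** 2)
               for g, s in zip(gen_feats, style_feats))


def total_variation(img):
    """Squared differences between neighboring pixels of an (H, W, C) image."""
    dy = img[1:, :, :] - img[:-1, :, :]
    dx = img[:, 1:, :] - img[:, :-1, :]
    return np.sum(dy ** 2) + np.sum(dx ** 2)


def total_loss(img, gen_content, content, gen_styles, styles,
               content_weight=7.5, style_weight=100.0, tv_weight=200.0):
    """Weighted sum of the three terms; weights mirror the README defaults."""
    return (content_weight * content_loss(gen_content, content)
            + style_weight * style_loss(gen_styles, styles)
            + tv_weight * total_variation(img))
```

Raising the style weight relative to the content weight pushes the output toward the painting's textures, while the total variation term penalizes high-frequency noise in the generated image.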
Use main.py to train a new style transfer network. Training takes 6-8 hours on a GTX 1080Ti. Run setup.sh before training. Example usage:
python main.py --style_img path/to/style/img.jpg \
--train_path path/to/training/data/fold \
--test_path path/to/test/data/fold \
--vgg_path path/to/vgg19/imagenet-vgg-verydeep-19.mat
- --gpu_index: GPU index, default: 0
- --checkpoint_dir: directory to save checkpoints in, default: ./checkpoints
- --style_img: style image path, default: ./examples/style/la_muse.jpg
- --train_path: path to the training images folder, default: ../Data/coco/img/train2014
- --test_path: test image path, default: ./examples/content
- --test_dir: directory to save test images in, default: ./examples/temp
- --epochs: number of epochs over the training data, default: 2
- --batch_size: batch size for a single feed-forward pass, default: 4
- --vgg_path: path to the VGG19 network, default: ../Models_zoo/imagenet-vgg-verydeep-19.mat
- --content_weight: content weight, default: 7.5
- --style_weight: style weight, default: 100.
- --tv_weight: total variation regularization weight, default: 200.
- --print_freq: loss printing frequency, default: 100
- --sample_freq: sampling frequency, default: 2000
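For readers who want to script around these options, the flags and defaults listed above can be mirrored with the standard library's `argparse`. This is only an illustrative sketch: the repository may define its flags through a different mechanism, the function name `build_parser` is hypothetical, and the argument types are assumptions inferred from the listed defaults.

```python
import argparse


def build_parser():
    """Mirror the main.py flags listed in this README (types are assumed)."""
    p = argparse.ArgumentParser(description="Train a style transfer network (sketch)")
    p.add_argument("--gpu_index", type=int, default=0)
    p.add_argument("--checkpoint_dir", default="./checkpoints")
    p.add_argument("--style_img", default="./examples/style/la_muse.jpg")
    p.add_argument("--train_path", default="../Data/coco/img/train2014")
    p.add_argument("--test_path", default="./examples/content")
    p.add_argument("--test_dir", default="./examples/temp")
    p.add_argument("--epochs", type=int, default=2)
    p.add_argument("--batch_size", type=int, default=4)
    p.add_argument("--vgg_path", default="../Models_zoo/imagenet-vgg-verydeep-19.mat")
    p.add_argument("--content_weight", type=float, default=7.5)
    p.add_argument("--style_weight", type=float, default=100.0)
    p.add_argument("--tv_weight", type=float, default=200.0)
    p.add_argument("--print_freq", type=int, default=100)
    p.add_argument("--sample_freq", type=int, default=2000)
    return p


# Parsing an empty argument list returns the defaults from the list above.
defaults = build_parser().parse_args([])
# Individual flags can be overridden exactly as on the command line.
custom = build_parser().parse_args(["--style_weight", "50", "--epochs", "1"])
```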
Use evaluate.py to evaluate a style transfer network. Evaluation takes 300 ms per frame on a GTX 1080Ti and several seconds per frame on a CPU. Models for evaluation are located here. Example usage:
python evaluate.py --checkpoint_dir path/to/checkpoint \
--in_path path/to/test/image/folder
- --gpu_index: GPU index, default: 0
- --checkpoint_dir: directory to read the checkpoint from, default: ./checkpoints/la_muse
- --in_path: test image path, default: ./examples/test
- --out_path: destination directory for transformed files, default: ./examples/results
Use transform_video.py to transfer style to a video. Requires moviepy. Example usage:
python transform_video.py --checkpoint_dir path/to/checkpoint \
--in_path path/to/input/video.mp4 \
--out_path path/to/write/predicted_video.mp4
- --gpu_index: GPU index, default: 0
- --checkpoint_dir: directory to read the checkpoint from, default: ./checkpoints/la_muse
- --in_path: input video path, default: None
- --out_path: path to save the processed video to, default: None
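Video stylization amounts to running the trained network on every frame and reassembling the results. The sketch below shows only that control flow in plain Python. It is not how transform_video.py is necessarily written: `iter_batches` and `stylize_frames` are hypothetical helpers, `stylize_batch` stands in for a forward pass of the trained network, and feeding frames in batches is an assumption.

```python
def iter_batches(frames, batch_size):
    """Yield consecutive lists of at most batch_size frames."""
    batch = []
    for frame in frames:
        batch.append(frame)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # final, possibly shorter batch
        yield batch


def stylize_frames(frames, stylize_batch, batch_size=4):
    """Apply stylize_batch to every frame, preserving the frame order."""
    styled = []
    for batch in iter_batches(frames, batch_size):
        styled.extend(stylize_batch(batch))
    return styled


# Stand-in for the network: tag each "frame" so the ordering is easy to check.
fake_network = lambda batch: ["styled-" + f for f in batch]
result = stylize_frames(["f0", "f1", "f2", "f3", "f4"], fake_network, batch_size=2)
```

Reading the frames from the input file and writing the stylized frames back out as a video is left to moviepy and is not shown here.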
@misc{chengbinjin2018realtimestyletransfer,
author = {Cheng-Bin Jin},
title = {Real-Time Style Transfer},
year = {2018},
howpublished = {\url{https://github.com/ChengBinJin/Real-time-style-transfer/}},
note = {commit xxxxxxx}
}
- This project borrowed some code from Logan Engstrom and Anish's Neural Style
- Some readme formatting was borrowed from Logan Engstrom
- The image of the MIT Stata Center at the very beginning of the README was taken by Juan Paulo
Copyright (c) 2018 Cheng-Bin Jin. Contact me for commercial use (or rather any use that is not academic research) (email: [email protected]). Free for research use, as long as proper attribution is given and this copyright notice is retained.