
Commit f5ee2d5

Merge pull request #21 from albarji/feauture/multiresolution
Gatys multiresolution and automatic tile adjustment based on GPU model
2 parents a78399a + 427ebdd commit f5ee2d5

9 files changed: 188 additions, 40 deletions

Dockerfile

Lines changed: 10 additions & 1 deletion
@@ -29,6 +29,13 @@ RUN curl -o Miniconda3-latest-Linux-x86_64.sh https://repo.continuum.io/minicond
     && chmod +x Miniconda3-latest-Linux-x86_64.sh \
     && ./Miniconda3-latest-Linux-x86_64.sh -b -p "${MINICONDA_HOME}" \
     && rm Miniconda3-latest-Linux-x86_64.sh
+COPY conda.txt conda.txt
+RUN conda install -y --file=conda.txt
+RUN conda clean -y -i -l -p -t && \
+    rm -f conda.txt
+COPY pip.txt pip.txt
+RUN pip install -r pip.txt && \
+    rm -f pip.txt

 # Clone neural-style app
 WORKDIR /app
@@ -58,9 +65,11 @@ RUN ln -s /app/neural-style/models /app/style-swap/models
 # Add precomputed inverse network model
 ADD models/dec-tconv-sigmoid.t7 /app/style-swap/models/dec-tconv-sigmoid.t7

-# Copy wrapper scripts
+# Copy wrapper scripts and config files
 COPY ["entrypoint.py" ,"/app/entrypoint/"]
 COPY ["/neuralstyle/*.py", "/app/entrypoint/neuralstyle/"]
+COPY ["gpuconfig.json", "/app/entrypoint/"]

+WORKDIR /app/entrypoint
 ENTRYPOINT ["python", "/app/entrypoint/entrypoint.py"]

README.md

Lines changed: 25 additions & 12 deletions
@@ -22,7 +22,7 @@ A dockerized version of neural style transfer algorithms.

 * [docker](https://www.docker.com/)
 * [nvidia-docker](https://github.com/NVIDIA/nvidia-docker)
-* Appropriate nvidia drivers for your GPU
+* Appropriate [nvidia drivers](http://www.nvidia.es/Download/index.aspx) for your GPU

 ### Installation

@@ -79,14 +79,16 @@ Better results can be attained by modifying some of the transfer parameters.
 The --alg parameter allows changing the neural style transfer algorithm to use.

 * **gatys**: highly detailed transfer, slow processing times (default)
+* **gatys-multiresolution**: multipass version of Gatys method, provides even better quality, but is also much slower
 * **chen-schmidt**: fast patch-based style transfer
 * **chen-schmidt-inverse**: even faster aproximation to chen-schmidt through the use of an inverse network

 The following example illustrates kind of results to be expected by these different algorithms

 | Content image | Algorithm | Style image |
 | ------------- | --------- | ----------- |
-| ![Content](./doc/avila-walls.jpg) | Gatys ![Gatys](./doc/avila-walls_broca_gatys_ss1.0_sw10.0.jpg) | ![Style](./doc/broca.jpg) |
+| ![Content](./doc/avila-walls.jpg) | Gatys ![Gatys](./doc/avila-walls_broca_gatys_ss1.0_sw10.0.jpg) | ![Style](./doc/broca.jpg) |
+| ![Content](./doc/avila-walls.jpg) | Gatys Multiresolution ![Gatys-Multiresolution](./doc/avila-walls_broca_gatys-multiresolution_ss1.0_sw3.0.jpg) | ![Style](./doc/broca.jpg) |
 | ![Content](./doc/avila-walls.jpg) | Chen-Schmidt ![Chen-Schmidt](./doc/avila-walls_broca_chen-schmidt_ss1.0.jpg) | ![Style](./doc/broca.jpg) |
 | ![Content](./doc/avila-walls.jpg) | Chen-Schmidt Inverse ![Chen-Schmidt Inverse](./doc/avila-walls_broca_chen-schmidt-inverse_ss1.0.jpg) | ![Style](./doc/broca.jpg) |

@@ -102,28 +104,37 @@ of the target image, the height being scaled accordingly to keep proportion.

 If the image to be generated is large, a tiling strategy will be used, applying the neural style transfer method
 to small tiles of the image and stitching them together. Tiles overlap to provide some guarantees on overall
-consistency.
+consistency, though results might vary depending on the algorithm used.

 ![Tiling](./doc/tiling.png)

-You can control the size of these tiles through the --tilesize parameter.
-Higher values will generally produce better quality results and faster rendering times, but they will also incur in
-larger memory consumption.
-Note also that since the full style image is applied to each tile, as a result the style features will appear
+The size of these tiles is defined through the configuration file **gpuconfig.json** inside the container.
+This file contains dictionary keys for different GPU models and each neural style algorithm. Your GPU will be
+automatically checked against the registered configurations and the appropriate tile size will be selected. These values
+have been chosen to maximize the use of the available GPU memory, asumming the whole GPU is available for the style
+transfer task.
+
+If your GPU is not included in the configuration file, the *default* values will we used instead, though to obtain
+better performance you might want to edit this file and rebuild the docker images.
+
+Note also that since the full style image is applied to each tile separately, as a result the style features will appear
 as smaller in the rendered image.

 #### Style weight

-Gatys algorithm allows to adjust the amount of style imposed over the content image, by means of the --sw parameter.
-By default a value of **5** is used, meaning the importance of the style is 5 times the importance of the content.
-Smaller weight values result in the transfer of colors, while higher values transfer textures and even objects of the
-style.
+Gatys and Gatys Multiresolution algorithms allow to adjust the amount of style imposed over the content image, by means
+of the --sw parameter. By default a value of **5** is used, meaning the importance of the style is 5 times the
+importance of the content. Smaller weight values result in the transfer of colors, while higher values transfer textures
+and even objects of the style.

 If several weight values are provided, all combinations will be generated. For instance, to generate the same
 style transfer with three different weights, use

     nvidia-docker run --rm -v $(pwd):/images albarji/neural-style --content contents/docker.png --style styles/vangogh.png --sw 5 10 20
-
+
+Note also that they Gatys Multiresolution algorithm tends to produce a stronger style imprint, and this you might want
+to use weight values smaller than the default (e.g. 3).
+
 #### Style scale

 If the transferred style results in too large or too small features, the scaling can be modified through the --ss
@@ -145,5 +156,7 @@ logo example above the transparent background is not transformed.
 * [Gatys et al method](https://arxiv.org/abs/1508.06576), [implementation by jcjohnson](https://github.com/jcjohnson/neural-style)
 * [Chen-Schmidt method](https://arxiv.org/pdf/1612.04337.pdf), [implementation](https://github.com/rtqichen/style-swap)
 * [A review on style transfer methods](https://arxiv.org/pdf/1705.04058.pdf)
+* [Controlling Perceptual Factors in Neural Style Transfer](https://arxiv.org/abs/1611.07865)
 * [Neural-tiling method](https://github.com/ProGamerGov/Neural-Tile)
+* [Multiresolution strategy](https://gist.github.com/jcjohnson/ca1f29057a187bc7721a3a8c418cc7db)
 * [The Wikipedia logo](https://en.wikipedia.org/wiki/Wikipedia_logo)
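
Note on the README's "all combinations will be generated" statement: a minimal sketch of how the combinations expand, assuming the same `itertools.product` call that the loop in neuralstyle/algorithms.py uses. The paths are the placeholder ones from the README example, and the name `jobs` is introduced here only for illustration.

```python
from itertools import product

# Placeholder inputs mirroring the README example (--sw 5 10 20)
contents = ["contents/docker.png"]
styles = ["styles/vangogh.png"]
weights = [5.0, 10.0, 20.0]
stylescales = [1.0]

# One job per (content, style, weight, scale) combination
jobs = list(product(contents, styles, weights, stylescales))
for content, style, weight, scale in jobs:
    print(content, style, weight, scale)
```

With one content image, one style image, three weights and one scale, this yields three jobs; adding a second style scale would double that.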

conda.txt

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+numpy

entrypoint.py

Lines changed: 4 additions & 10 deletions
@@ -19,11 +19,11 @@
     --ss STYLE_SCALE (default 1.0): scaling or list of scaling factors for the style images
     --alg ALGORITHM: style-transfer algorithm to use. Must be one of the following:
         gatys                   Highly detailed transfer, slow processing times (default)
+        gatys-multiresolution   Multipass version of Gatys method, provides even better quality
         chen-schmidt            Fast patch-based style transfer
         chen-schmidt-inverse    Even faster aproximation to chen-schmidt through the use of an inverse network
-    --tilesize TILE_SIZE: maximum size of each tile in the style transfer.
-        If your GPU runs out of memory you should try reducing this value. Default: 400
-    --tileoverlap TILE_OVERLAP: overlap of tiles in the style transfer, measured in pixels. Default: 100
+    --tileoverlap TILE_OVERLAP: overlap of tiles in the style transfer, measured in pixels. If you experience
+        artifacts in the image you should try increasing this. Default: 100

 Additionally provided parameters are carried on to the underlying algorithm.

@@ -42,7 +42,6 @@ def main(argv=None):
         alg = "gatys"
         weights = None
         stylescales = None
-        tilesize = None
         tileoverlap = None
         otherparams = []

@@ -72,9 +71,6 @@ def main(argv=None):
             elif argv[i] == "--ss":
                 stylescales = [float(x) for x in sublist(argv[i+1:], stopper="-")]
                 i += len(stylescales) + 1
-            elif argv[i] == "--tilesize":
-                tilesize = int(argv[i+1])
-                i += 2
             elif argv[i] == "--tileoverlap":
                 tileoverlap = int(argv[i+1])
                 i += 2
@@ -100,10 +96,8 @@ def main(argv=None):
         LOGGER.info("\tStyle weights = %s" % str(weights))
         LOGGER.info("\tStyle scales = %s" % str(stylescales))
         LOGGER.info("\tSize = %s" % str(size))
-        LOGGER.info("\tTile size = %s" % str(tilesize))
         LOGGER.info("\tTile overlap = %s" % str(tileoverlap))
-        styletransfer(contents, styles, savefolder, size, alg, weights, stylescales, tilesize, tileoverlap,
-                      algparams=otherparams)
+        styletransfer(contents, styles, savefolder, size, alg, weights, stylescales, tileoverlap, algparams=otherparams)
         return 1

     except Exception:

gpuconfig.json

Lines changed: 26 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,26 @@
1+
{
2+
"GeForce GTX 970M": {
3+
"gatys": 512,
4+
"gatys-multiresolution": 750,
5+
"chen-schmidt": 750,
6+
"chen-schmidt-inverse": 400
7+
},
8+
"Tesla K80": {
9+
"gatys": 1300,
10+
"gatys-multiresolution": 1300,
11+
"chen-schmidt": 1500,
12+
"chen-schmidt-inverse": 800
13+
},
14+
"Tesla P100-PCIE-16GB": {
15+
"gatys": 1300,
16+
"gatys-multiresolution": 1300,
17+
"chen-schmidt": 2048,
18+
"chen-schmidt-inverse": 900
19+
},
20+
"default": {
21+
"gatys": 512,
22+
"gatys-multiresolution": 750,
23+
"chen-schmidt": 750,
24+
"chen-schmidt-inverse": 400
25+
}
26+
}
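
Note on how this file is consumed: a minimal sketch of the lookup with fallback to the `default` entry, assuming the behaviour of `maxtile` in neuralstyle/algorithms.py. The helper name `lookup_tilesize` is hypothetical, the GPU model is passed in directly instead of being detected, and only two entries of the file are inlined to keep the example self-contained.

```python
import json

# Inline copy of two entries from gpuconfig.json (the real file is read from disk)
GPUCONFIG = json.loads("""
{
    "Tesla K80": {
        "gatys": 1300,
        "gatys-multiresolution": 1300,
        "chen-schmidt": 1500,
        "chen-schmidt-inverse": 800
    },
    "default": {
        "gatys": 512,
        "gatys-multiresolution": 750,
        "chen-schmidt": 750,
        "chen-schmidt-inverse": 400
    }
}
""")


def lookup_tilesize(gpu_model, alg, config=GPUCONFIG):
    """Hypothetical helper: tile size for a GPU model, falling back to 'default'."""
    entry = config.get(gpu_model, config["default"])
    return entry[alg]


print(lookup_tilesize("Tesla K80", "chen-schmidt"))
print(lookup_tilesize("Some Unlisted GPU", "gatys"))
```

A GPU model that is not a key of the file gets the `default` values, which match the GeForce GTX 970M entry in this commit.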

neuralstyle/algorithms.py

Lines changed: 110 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,9 @@
 from shutil import copyfile
 import logging
 from math import ceil
+import numpy as np
+import json
+import GPUtil
 from neuralstyle.utils import filename, fileext
 from neuralstyle.imagemagick import (convert, resize, shape, assertshape, choptiles, feather, smush, composite,
                                      extractalpha, mergealpha)
@@ -29,6 +32,7 @@
             "-num_iterations", "500"
         ]
     },
+    "gatys-multiresolution": {},
     "chen-schmidt": {
         "folder": "/app/style-swap",
         "command": "th style-swap.lua",
@@ -46,16 +50,20 @@
     }
 }

+# Load file with GPU configuration
+with open("gpuconfig.json", "r") as f:
+    GPUCONFIG = json.load(f)
+

 def styletransfer(contents, styles, savefolder, size=None, alg="gatys", weights=None, stylescales=None,
-                  maxtilesize=400, tileoverlap=100, algparams=None):
+                  tileoverlap=100, algparams=None):
     """General style transfer routine over multiple sets of options"""
     # Check arguments
     if alg not in ALGORITHMS.keys():
         raise ValueError("Unrecognized algorithm %s, must be one of %s" % (alg, str(list(ALGORITHMS.keys()))))

     # Plug default options
-    if alg != "gatys":
+    if alg != "gatys" and alg != "gatys-multiresolution":
         if weights is not None:
             LOGGER.warning("Only gatys algorithm accepts style weights. Ignoring style weight parameters")
         weights = [None]
@@ -64,8 +72,6 @@ def styletransfer(contents, styles, savefolder, size=None, alg="gatys", weights=
         weights = [5.0]
     if stylescales is None:
         stylescales = [1.0]
-    if maxtilesize is None:
-        maxtilesize = 400
     if tileoverlap is None:
         tileoverlap = 100
     if algparams is None:
@@ -75,13 +81,13 @@ def styletransfer(contents, styles, savefolder, size=None, alg="gatys", weights=
     for content, style, weight, scale in product(contents, styles, weights, stylescales):
         outfile = outname(savefolder, content, style, alg, scale, weight)
         # If the desired size is smaller than the maximum tile size, use a direct neural style
-        if fitsingletile(targetshape(content, size), maxtilesize):
+        if fitsingletile(targetshape(content, size), alg):
             styletransfer_single(content=content, style=style, outfile=outfile, size=size, alg=alg, weight=weight,
                                  stylescale=scale, algparams=algparams)
         # Else use a tiling strategy
         else:
-            neuraltile(content=content, style=style, outfile=outfile, size=size, maxtilesize=maxtilesize,
-                       overlap=tileoverlap, alg=alg, weight=weight, stylescale=scale, algparams=algparams)
+            neuraltile(content=content, style=style, outfile=outfile, size=size, overlap=tileoverlap, alg=alg,
+                       weight=weight, stylescale=scale, algparams=algparams)


 def styletransfer_single(content, style, outfile, size=None, alg="gatys", weight=5.0, stylescale=1.0, algparams=None):
@@ -101,6 +107,8 @@ def styletransfer_single(content, style, outfile, size=None, alg="gatys", weight
     algfile = workdir.name + "/" + "algoutput.png"
     if alg == "gatys":
         gatys(rgbfile, stylepng, algfile, size, weight, stylescale, algparams)
+    elif alg == "gatys-multiresolution":
+        gatys_multiresolution(rgbfile, stylepng, algfile, size, weight, stylescale, algparams)
     elif alg in ["chen-schmidt", "chen-schmidt-inverse"]:
         chenschmidt(alg, rgbfile, stylepng, algfile, size, stylescale, algparams)
     # Enforce correct size
@@ -111,8 +119,8 @@ def styletransfer_single(content, style, outfile, size=None, alg="gatys", weight
     mergealpha(algfile, alphafile, outfile)


-def neuraltile(content, style, outfile, size=None, maxtilesize=400, overlap=100, alg="gatys", weight=5.0,
-               stylescale=1.0, algparams=None):
+def neuraltile(content, style, outfile, size=None, overlap=100, alg="gatys", weight=5.0, stylescale=1.0,
+               algparams=None):
     """Strategy to generate a high resolution image by running style transfer on overlapping image tiles"""
     LOGGER.info("Starting tiling strategy")
     if algparams is None:
@@ -123,7 +131,7 @@ def neuraltile(content, style, outfile, size=None, maxtilesize=400, overlap=100,
     fullshape = targetshape(content, size)

     # Compute number of tiles required to map all the image
-    xtiles, ytiles = tilegeometry(fullshape, maxtilesize, overlap)
+    xtiles, ytiles = tilegeometry(fullshape, alg, overlap)

     # First scale image to target resolution
     firstpass = workdir.name + "/" + "lowres.png"
@@ -187,6 +195,69 @@ def gatys(content, style, outfile, size, weight, stylescale, algparams):
     tmpout.close()


+def gatys_multiresolution(content, style, outfile, size, weight, stylescale, algparams, startres=256):
+    """Runs a multiresolution version of Gatys et al method
+
+    The multiresolution strategy starts by generating a small image, then using that image as initializer
+    for higher resolution images. This procedure is repeated up to the tilesize.
+
+    Once the maximum tile size attainable by L-BFGS is reached, more iterations are run by using Adam. This allows
+    to produce larger images using this method than the basic Gatys.
+
+    References:
+        * Gatys et al - Controlling Perceptual Factors in Neural Style Transfer (https://arxiv.org/abs/1611.07865)
+        * https://gist.github.com/jcjohnson/ca1f29057a187bc7721a3a8c418cc7db
+    """
+    # Multiresolution strategy: list of rounds, each round composed of a optimization method and a number of
+    #   upresolution steps.
+    # Using "adam" as optimizer means that Adam will be used when necessary to attain higher resolutions
+    strategy = [
+        ["lbfgs", 7],
+        ["lbfgs", 7],
+        ["lbfgs", 7],
+        ["lbfgs", 7],
+        ["lbfgs", 7]
+    ]
+    LOGGER.info("Starting gatys-multiresolution with strategy " + str(strategy))
+
+    # Initialization
+    workdir = TemporaryDirectory()
+    maxres = targetshape(content, size)[0]
+    if maxres < startres:
+        LOGGER.warning("Target resolution (%d) might too small for the multiresolution method to work well" % maxres)
+        startres = maxres / 2.0
+    seed = None
+    tmpout = workdir.name + "/tmpout.png"
+
+    # Iterate over rounds
+    for roundnumber, (optimizer, steps) in enumerate(strategy):
+        LOGGER.info("gatys-multiresolution round %d with %s optimizer and %d steps" % (roundnumber, optimizer, steps))
+        roundmax = min(maxtile("gatys"), maxres) if optimizer == "lbfgs" else maxres
+        resolutions = np.linspace(startres, roundmax, steps, dtype=int)
+        iters = 1000
+        for stepnumber, res in enumerate(resolutions):
+            stepopt = "adam" if res > maxtile("gatys") else "lbfgs"
+            LOGGER.info("Step %d, resolution %d, optimizer %s" % (stepnumber, res, stepopt))
+            passparams = algparams[:]
+            passparams.extend([
+                "-num_iterations", iters,
+                "-tv_weight", "0",
+                "-print_iter", "0",
+                "-optimizer", stepopt
+            ])
+            if seed is not None:
+                passparams.extend([
+                    "-init", "image",
+                    "-init_image", seed
+                ])
+            gatys(content, style, tmpout, res, weight, stylescale, passparams)
+            seed = workdir.name + "/seed.png"
+            copyfile(tmpout, seed)
+            iters = max(iters/2.0, 100)
+
+    convert(tmpout, outfile)
+
+
 def chenschmidt(alg, content, style, outfile, size, stylescale, algparams):
     """Runs Chen and Schmidt fast style-transfer algorithm

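
Note on the schedule inside each round of `gatys_multiresolution`: a minimal sketch of the (resolution, iterations) pairs for one round, assuming an all-L-BFGS strategy as in this commit. The helper name `resolution_schedule` is hypothetical, and the tile limit is passed in as a plain number instead of being read from the GPU configuration. The evenly spaced resolutions are computed in pure Python with truncation to int, standing in for `np.linspace(..., dtype=int)`. The sketch halves the iteration count with integer division so it stays an integer, whereas the commit divides by 2.0.

```python
def resolution_schedule(startres, maxres, maxtilesize, steps=7):
    """Hypothetical helper: (resolution, iterations) pairs for one L-BFGS round.

    Assumes steps >= 2.
    """
    # An L-BFGS round never goes beyond the maximum tile size
    roundmax = min(maxtilesize, maxres)
    # Evenly spaced resolutions from startres to roundmax, truncated to int
    resolutions = [int(startres + i * (roundmax - startres) / (steps - 1)) for i in range(steps)]
    schedule = []
    iters = 1000
    for res in resolutions:
        schedule.append((res, iters))
        # Halve the iterations at every step, never going below 100
        iters = max(iters // 2, 100)
    return schedule


schedule = resolution_schedule(startres=256, maxres=2000, maxtilesize=512)
for res, iters in schedule:
    print(res, iters)
```

In the committed function every round repeats this ramp from the start resolution, seeded with the image produced by the previous step.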
@@ -250,16 +321,20 @@ def correctshape(result, original, size=None):
     assertshape(result, targetshape(original, size))


-def tilegeometry(imshape, maxtilesize=400, overlap=50):
+def tilegeometry(imshape, alg, overlap=50):
     """Given the shape of an image, computes the number of X and Y tiles to cover it"""
+    maxtilesize = maxtile(alg)
     xtiles = ceil(float(imshape[0] - maxtilesize) / float(maxtilesize - overlap) + 1)
     ytiles = ceil(float(imshape[1] - maxtilesize) / float(maxtilesize - overlap) + 1)
     return xtiles, ytiles


-def fitsingletile(imshape, maxtilesize):
-    """Returns whether a given image shape will fit in a single tile or not"""
-    return all([x <= maxtilesize for x in imshape])
+def fitsingletile(imshape, alg):
+    """Returns whether a given image shape will fit in a single tile or not.
+
+    This depends on the algorithm used and the GPU available in the system"""
+    mx = maxtile(alg)
+    return mx*mx >= np.prod(imshape)


 def targetshape(content, size=None):
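
Note on the two changes in this hunk: a minimal sketch of the tile-count formula and of the old per-side versus new per-area single-tile check. The tile size is passed in as a plain number here, while the commit obtains it from `maxtile(alg)`, and the names `fits_by_side` and `fits_by_area` are introduced only for this comparison.

```python
from math import ceil


def tilegeometry(imshape, maxtilesize, overlap):
    """Number of X and Y tiles needed to cover the image with overlapping tiles."""
    stride = maxtilesize - overlap  # each extra tile advances by this many pixels
    xtiles = ceil((imshape[0] - maxtilesize) / stride + 1)
    ytiles = ceil((imshape[1] - maxtilesize) / stride + 1)
    return xtiles, ytiles


def fits_by_side(imshape, maxtilesize):
    """Old check: every side must be no larger than the tile side."""
    return all(x <= maxtilesize for x in imshape)


def fits_by_area(imshape, maxtilesize):
    """New check: the pixel count must be no larger than that of a square tile."""
    return maxtilesize * maxtilesize >= imshape[0] * imshape[1]


print(tilegeometry((2000, 1500), maxtilesize=512, overlap=100))
print(fits_by_side((700, 300), 512), fits_by_area((700, 300), 512))
```

A 700x300 image fails the per-side check against a 512 tile but passes the per-area check, so under the new rule it is processed in a single pass rather than tiled.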
@@ -272,3 +347,24 @@ def targetshape(content, size=None):
         return contentshape
     else:
         return [size, int(size * contentshape[1] / contentshape[0])]
+
+
+def gpuname():
+    """Returns the model name of the first available GPU"""
+    gpus = GPUtil.getGPUs()
+    if len(gpus) == 0:
+        raise ValueError("No GPUs detected in the system")
+    return gpus[0].name
+
+
+def maxtile(alg="gatys"):
+    """Returns the recommended configuration maximum tile size, based on the available GPU and algorithm to be run
+
+    The size returned should be understood as the maximum tile size for a square tile. If non-square tiles are used,
+    a maximum tile of the same number of pixels should be used.
+    """
+    gname = gpuname()
+    if gname not in GPUCONFIG:
+        LOGGER.warning("Unknown GPU model %s, will use default tiling parameters")
+        gname = "default"
+    return GPUCONFIG[gname][alg]
