There has been a lot of improvement in GANs 1 in recent years. One of their many uses has been upscaling blurry images.

The GAN 1 generates a realistic face based on its training on all the faces it has seen. This, which looks like an “enhance” CSI joke, is of course not reconstructing the original image (that information is not there in the low-resolution image). It is extrapolating what an image could look like.

## A quick example

We use the image of Lena 2 and downscale it with a script. Check the end of the article for all the code.
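If you want a quick stand-alone way to reproduce the downscaling before setting up the full repo, a minimal sketch with Pillow could look like the following. The `lena.png` file name and the 16×16 target size are assumptions; the article itself uses the repo's align_face.py for this step.

```python
# downscale.py — stand-alone sketch of the downscaling step using Pillow.
# The input file name and target size are illustrative assumptions.
from PIL import Image


def downscale(path, size=16):
    """Open an image and shrink it to a size x size thumbnail."""
    img = Image.open(path).convert("RGB")
    return img.resize((size, size), Image.BICUBIC)
```

Saving the result of `downscale("lena.png", 16)` gives the kind of low-resolution input the GAN extrapolates from.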

We know this has decent results — like in the example above — with downscaled photos. How good would it be with stylized characters? Will it interpret the features correctly?

We have run the generator on all the emojis. For all the ones recognized as a face, the generation has been performed and included in the image below.

## The people behind the emojis

We ran every emoji, for different platforms, through the GAN 1. These are the results for all the emojis recognized as a face 3. The emojis that were not recognized are omitted. Below is a selection of interesting examples.

(For more detailed examples, check the Per category section; for a selection of failures, see Failures.)

### Per category

#### Bald emojis

As with other examples, not a single bald face was generated while producing the images for this article. We can see in the following images how the head shape is similar, but it is always — in all the images we generated — covered with hair.

#### Bearded faces

Beards also do not seem to be generated. Below we can see examples of how they are interpreted as shadows or double chins.

#### Cooks

Caps and accessories do not seem to be often present in the training dataset. A cook’s hat has been interpreted here as white hair, with pretty convincing results.

We can also notice how the moustaches are interpreted as “reserved space”, or as a slightly hairier area, but not as a full moustache or beard — an indication that they were probably also absent in training. The same space is reserved in the toque’s white area. This produces an ample forehead.

#### Curly hair

Curly hair from emojis is not correctly reproduced. It could be because the curly hair in an emoji is overly stylized and unrealistic, or because of a lack of training examples. We can only speculate. Below are some examples:

The results are similar for different hair and skin colours:

#### Elves

A few more elf examples below. We can see the lower ear lobes have been reinterpreted as hair or long earrings. The reptile-like pattern of the shirt is surprisingly consistent. The ears are — as would be expected — pointy.

#### Massaging hands

In the following examples the hands around the head seem to have been interpreted as part of the face, producing a long forehead. The relaxed faces produced reflect the emojis’ expressions very well.

#### Glasses

In 2 out of 3 graduate emojis the neural network has generated glasses that were not there in the emoji (!). I find that startling, though it might be that we are reading too much into it.

#### Male teachers (and more glasses)

Here we have the opposite case from the previous example: glasses present in the emoji have disappeared in the generated faces.

In addition, we can notice how the blackboard in the background shapes the haircut of the generated faces.

#### Old age

This is a selection of some of the best images generated from an image representing an old person.

#### Young age

Some of the emojis generated images with younger characters. One specific style of emoji produced all the younger-looking images. I do not know whether this is due to the random seed, or whether something in the colours makes it more favourable.

#### Steamy pictures

To my surprise, the GAN 1 very consistently recognized images where there was a glow or steam, and accordingly generated diffuse images.

#### Crowns

Crowns are probably not part of the training set — my wild guess. Instead of appearing in the resulting image, they seem to worsen the lighting conditions of the generated image.

Diadems have a similar result to crowns. They are also not generated.

#### Hats

The GAN never generates a hair accessory — including headscarves. The rest of the generated image is still believable.

#### Turbans

Most turban-wearing emojis seem to generate more skin surface. This translates into big foreheads and receding hairlines.

#### Left and right weight

The original image can influence whether the generated image will be frontal or at a three-quarter angle. We can see a good example of this in asymmetric source images, such as the emojis in the tipping-hand and raising-hand categories. The generated images are more likely — though not always — to lean towards the side where the raised hand was.

#### Vampires

With the vampire emojis as source we observe an interesting effect: ears and teeth seem to get shaped to match the source image, while other attributes — such as eye colour — do not adapt as easily.

### Failures

Here is a summary of the main failures we ran into when feeding emojis as input. Some of these probably would not occur when using photographic images, while others would.

## Limitations and biases

There are always biases based on the data used for training; this is a limitation of learning by example. While not a criticism, it is useful to be aware of what the biases of a specific model are. The authors of PULSE very wisely acknowledge these biases 5. In this article this has less relevance, since we are not even using photographs but stylized icons — emojis — as an investigation, and those will have their own biases in how they are stylized.

In this case we can speculate that the training did not include many cases of:

• dark-skinned people
• different ages

The training set does not seem to include very young people or babies.

• hair
  • hair other than wavy or straight
  • bald people
  • red hair
  • beards, moustaches

Moustaches produce a bigger space and some hair, but nothing too dense. I would guess the training data consists mostly of clean-shaven or hairless subjects. In addition, we are using emojis as input — not photographs — and it is very likely that in most emojis the stylized version of hair bears only a slight resemblance to real hair.

• accessories, except earrings and glasses

• non-smiling faces

The generated faces are trying very hard to smile. Sometimes they end up with a quirky half smile. This might be a cultural phenomenon in the training data — smiling when a photo is about to be taken.

## The code

If you would like to reproduce the examples above or play with new ones, here is how.

### Cloning the repo

git clone https://github.com/adamian98/pulse


### Installing dependencies

The pulse repository 6 has instructions to install dependencies with conda. If you prefer virtualenv the following might be useful.

cd pulse

# install dependencies
sudo apt-get install python3.8-dev
sudo apt install libpython3.8-dev
virtualenv -p /usr/bin/python3.8 newenv3
./newenv3/bin/pip install certifi cffi chardet cryptography \
cycler idna intel-openmp kiwisolver matplotlib mkl numpy \
olefile pandas pillow pycparser pyopenssl pyparsing pysocks \
python-dateutil torch pytz readline requests scipy tk torchvision

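After installing, a quick way to confirm the environment is usable is a small import check, run with `./newenv3/bin/python`. The module list below mirrors the pip install above and is only illustrative:

```python
# sanity_check.py — verify the environment's key dependencies are importable.
# The module list is an assumption based on the pip install above.
import importlib


def check_imports(modules):
    """Return a dict mapping each module name to True/False (importable or not)."""
    status = {}
    for name in modules:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status


if __name__ == "__main__":
    for mod, ok in check_imports(["torch", "torchvision", "numpy", "scipy", "PIL"]).items():
        print(f"{mod}: {'ok' if ok else 'MISSING'}")
```

Any `MISSING` line points at a package that did not install correctly.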

### Running

There are two main scripts we need to use:

• align_face.py sets all the images in the input folder in the right format and downscales them to the desired size. The lower the resolution — i.e. the more downscaling — the more room the GAN 1 has to reconstruct the high-resolution image.

• run.py generates the prediction and — optionally — saves the intermediate steps. This can be useful for generating animations such as the one picturing Lena at the beginning of this article.

### Running (16px downscale)

# align face in image and downscale to resolution (16px)
./newenv3/bin/python align_face.py -input_dir 'lena_input_folder_16px' -output_size=16

# by default, downscaled images go to the pulse/input folder
# you may want to clear it of other images

# make the prediction and output intermediate steps
./newenv3/bin/python run.py -output_dir='output_16' -save_intermediate -steps=200


### Running (32px downscale)

# align face in image and downscale to resolution (32px)
./newenv3/bin/python align_face.py -input_dir 'lena_input_folder_32px' -output_size=32

# by default, downscaled images go to the pulse/input folder
# you may want to clear it of other images

# make the prediction and output intermediate steps
./newenv3/bin/python run.py -output_dir='output_32' -save_intermediate -steps=200

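The intermediate steps saved by run.py can be stitched into an animation. Here is a minimal sketch with Pillow; the folder name `output_16` and the `*.png` glob are assumptions about how the intermediate files end up laid out on disk:

```python
# make_gif.py — stitch a folder of intermediate-step frames into a looping GIF.
from pathlib import Path

from PIL import Image


def make_gif(frames_dir, out_path, ms_per_frame=80):
    """Collect PNG frames (sorted by file name) and save them as a looping GIF."""
    frames = [Image.open(p) for p in sorted(Path(frames_dir).glob("*.png"))]
    if not frames:
        raise FileNotFoundError(f"no PNG frames found in {frames_dir}")
    frames[0].save(out_path, save_all=True, append_images=frames[1:],
                   duration=ms_per_frame, loop=0)
```

For example, `make_gif("output_16", "lena_steps.gif")` would produce an animation like the Lena one at the beginning of the article.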

### Default folders and options

As stated above, some folders — like pulse/input — are set by default.

We can see a list of all the default options and folders in the original source code:

For align_face.py:

For run.py:

## Coda

I hope you liked it. This was a selection of the generated images. You can check all the emoji source / generated image pairs in the GitHub repository:

• each pair, individually, here
• in a big image with all the pairs, here

3. By align_face.py; see Running the code.