Generative Adversarial Networks (GANs) have revolutionized image synthesis. In this post, we walk through the implementation of a Deep Convolutional GAN (DCGAN) using Keras and TensorFlow, trained to generate 64×64 anime-style faces.
Dataset Preparation
The dataset consists of preprocessed anime faces resized to 64×64 pixels. Each image is normalized to the range [-1, 1] using the formula:
image = (image - 127.5) / 127.5
Images are loaded with Keras's ImageDataGenerator, using a preprocessing function that applies the normalization above (a plain rescale=1./255 would map pixels to [0, 1], which would not match the tanh output of the generator):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    preprocessing_function=lambda x: (x - 127.5) / 127.5)
dataset = datagen.flow_from_directory(
    data_path,
    target_size=(64, 64),
    batch_size=batch_size,
    class_mode=None)
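As a quick sanity check, not part of the original pipeline, one batch can be pulled to confirm that pixels land in [-1, 1]:

batch = next(dataset)            # shape: (batch_size, 64, 64, 3)
print(batch.min(), batch.max())  # should lie within [-1, 1]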
Model Architecture
Generator
The generator maps a 100-dimensional noise vector to a 64×64 RGB image using a series of transposed convolutions.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Reshape, BatchNormalization, Conv2DTranspose

model = Sequential()
# Project and reshape the 100-d noise vector to an 8x8x256 feature map
model.add(Dense(8*8*256, activation='relu', input_dim=100))
model.add(Reshape((8, 8, 256)))
model.add(BatchNormalization())
# 8x8 -> 16x16
model.add(Conv2DTranspose(128, kernel_size=4, strides=2, padding='same', activation='relu'))
model.add(BatchNormalization())
# 16x16 -> 32x32
model.add(Conv2DTranspose(64, kernel_size=4, strides=2, padding='same', activation='relu'))
model.add(BatchNormalization())
# 32x32 -> 64x64; tanh matches the [-1, 1] input normalization
model.add(Conv2DTranspose(3, kernel_size=4, strides=2, padding='same', activation='tanh'))
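A one-line shape check (my addition; model is the Sequential defined above) confirms the 8→16→32→64 upsampling path:

print(model.output_shape)  # expected: (None, 64, 64, 3)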
Discriminator
The discriminator uses Conv2D layers to downsample images and classify them as real or fake.
from tensorflow.keras.layers import Conv2D, LeakyReLU, Flatten

model = Sequential()
# 64x64 -> 32x32
model.add(Conv2D(64, kernel_size=4, strides=2, padding='same', input_shape=(64, 64, 3)))
model.add(LeakyReLU(alpha=0.2))
# 32x32 -> 16x16
model.add(Conv2D(128, kernel_size=4, strides=2, padding='same'))
model.add(LeakyReLU(alpha=0.2))
# Collapse the feature maps into a single real/fake probability
model.add(Flatten())
model.add(Dense(1, activation='sigmoid'))
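Before wiring the two networks together, a quick forward pass (my addition; generator and discriminator refer to the two models defined above) checks that the shapes line up:

import numpy as np

noise = np.random.normal(0, 1, (1, 100))
score = discriminator.predict(generator.predict(noise))
print(score.shape)  # (1, 1): one real/fake probability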
GAN Training
The discriminator is compiled on its own; the generator is never compiled directly, but is trained through a combined model in which the discriminator is frozen:
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input
from tensorflow.keras.optimizers import Adam

discriminator.compile(loss='binary_crossentropy', optimizer=Adam(0.0002, 0.5), metrics=['accuracy'])  # lr=2e-4, beta_1=0.5

# Freeze discriminator weights during generator training. The discriminator
# was compiled above while still trainable, so it continues to learn when
# trained directly; only the combined model below treats it as fixed.
discriminator.trainable = False

# Combined GAN model: noise -> generator -> discriminator
gan_input = Input(shape=(100,))
generated_image = generator(gan_input)
gan_output = discriminator(generated_image)
gan = Model(gan_input, gan_output)
gan.compile(loss='binary_crossentropy', optimizer=Adam(0.0002, 0.5))
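A small check (my addition) confirms the freeze: once the discriminator is frozen inside the combined model, the only trainable weights left in gan are the generator's:

# After freezing, the combined model trains only the generator
assert len(gan.trainable_weights) == len(generator.trainable_weights)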
Training Loop
import numpy as np

# Label vectors: ones for real images, zeros for fakes
real_labels = np.ones((batch_size, 1))
fake_labels = np.zeros((batch_size, 1))

for epoch in range(epochs):
    # Sample a batch of real images (each "epoch" here is one batch step)
    real_imgs = next(dataset)

    # Generate fake images from random noise
    noise = np.random.normal(0, 1, (batch_size, 100))
    fake_imgs = generator.predict(noise)

    # Train the discriminator on real and fake batches separately
    d_loss_real = discriminator.train_on_batch(real_imgs, real_labels)
    d_loss_fake = discriminator.train_on_batch(fake_imgs, fake_labels)

    # Train the generator: fresh noise labeled "real", so the generator
    # is rewarded for fooling the discriminator
    noise = np.random.normal(0, 1, (batch_size, 100))
    g_loss = gan.train_on_batch(noise, real_labels)

    # Save generated images for monitoring
    if epoch % sample_interval == 0:
        save_generated_images(generator, epoch)
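The save_generated_images helper is called above but not defined in the post. A minimal sketch, assuming matplotlib and a 4×4 preview grid (both my choices), could look like this:

import matplotlib.pyplot as plt

def save_generated_images(generator, epoch, grid=4):
    # Sample grid*grid fakes and map tanh output from [-1, 1] back to [0, 1]
    noise = np.random.normal(0, 1, (grid * grid, 100))
    imgs = (generator.predict(noise) + 1) / 2.0
    fig, axes = plt.subplots(grid, grid, figsize=(4, 4))
    for img, ax in zip(imgs, axes.flat):
        ax.imshow(img)
        ax.axis('off')
    fig.savefig(f'generated_{epoch}.png')
    plt.close(fig)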
Results
- Sample images are saved every 10 epochs, showing gradual improvement.
- Early outputs are noisy; mid-to-late training shows better-defined anime facial structures.
- Even with this deliberately small architecture, the DCGAN learns to generate recognizable anime faces.
This project demonstrates how a DCGAN built with Keras and TensorFlow can generate anime-style faces from random noise. Transposed convolutions in the generator and strided convolutions in the discriminator let the model produce increasingly detailed images over the course of training. The architecture is deliberately simple, yet the results highlight the potential of GANs in creative AI applications; improvements such as more advanced loss functions, deeper networks, and richer datasets could further raise the quality and diversity of the generated outputs.