This is the second tutorial in the Stable Diffusion series. In this tutorial, we will learn how to fine-tune the Stable Diffusion model on new images (also known as DreamBooth).

Setup

First, let's install all the dependencies we need to train the model. To avoid losing the model and the data, we will save them to Google Drive. Make sure you have enough space left in your Google Drive before continuing.

import pathlib
from google.colab import drive
drive.mount('/content/drive')

!pip install -Uqq git+https://github.com/huggingface/diffusers.git transformers ftfy gradio accelerate bitsandbytes fastai fastbook

if not pathlib.Path('/content/diffusers').exists():
  !git clone https://github.com/huggingface/diffusers

import huggingface_hub
if not pathlib.Path('/root/.huggingface/token').exists():
  huggingface_hub.notebook_login()
Token is valid.
Your token has been saved in your configured git credential helpers (store).
Your token has been saved to /root/.huggingface/token
Login successful

Preparing Images

Getting Images

For the purpose of this tutorial, let's download some images from DuckDuckGo. Alternatively you can prepare your own images and put them inside a Google Drive Folder (referred to as images_src_dir in the next sections).

import pathlib
from fastbook import search_images_ddg, download_images, verify_images, get_image_files

def download_images_for_keyword(keyword: str, download_dir: str, max_images: int):
  dest = pathlib.Path(download_dir)
  dest.mkdir(parents=True, exist_ok=True)  # create intermediate dirs too
  results = search_images_ddg(keyword, max_images=max_images)
  download_images(dest, urls=results)
  downloaded_files = get_image_files(dest)
  failed = verify_images(downloaded_files)
  num_failed = len(failed)
  if num_failed > 0:
    print(f'Removing {num_failed} images')
    failed.map(pathlib.Path.unlink)

keyword = 'Saito Asuka 2022' #@param {type: "string"}
download_dir = '/content/drive/MyDrive/sd/images_download' #@param {type: "string"}
max_images = 60 #@param {type: "slider", min:20, max:100, step:1}
force_redownload = False #@param {type:"boolean"}

if not pathlib.Path(download_dir).exists() or force_redownload:
  download_images_for_keyword(keyword=keyword, download_dir=download_dir, max_images=max_images)
Removing 8 images
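Under the hood, verify_images does something close to the following plain-PIL sketch: try to open each file and drop the ones PIL can't read. The helper name clean_broken_images is mine, for illustration only:

```python
import pathlib
from PIL import Image

def clean_broken_images(image_dir: str) -> int:
  """Delete files that PIL cannot open as images; return how many were removed."""
  removed = 0
  for path in pathlib.Path(image_dir).iterdir():
    try:
      with Image.open(path) as im:
        im.verify()  # cheap integrity check, does not decode the full image
    except Exception:
      path.unlink()  # not a readable image, drop it
      removed += 1
  return removed
```

This is why the output above says "Removing 8 images": 8 of the 60 downloaded files failed verification.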

Cropping and Resizing

Although SD doesn't restrict image sizes (other than requiring the width and height to be divisible by 8), we perform a center crop and resize on all images to make them the same square shape, since images in a training batch need to have the same dimensions.

Cropping might result in bad images, but no worries, we will clean them up in the next section.

import PIL
import pathlib
from fastai.vision.widgets import ImagesCleaner
from fastai.vision.all import PILImage
from fastbook import get_image_files

images_src_dir = '/content/drive/MyDrive/sd/images_download' #@param {type: "string"}
images_train_dir = '/content/drive/MyDrive/sd/images_training' #@param {type: "string"}
size = 768 #@param {type: "slider", min:256, max:768, step:128}
images_dir = pathlib.Path(images_train_dir)
images_dir.mkdir(parents=True, exist_ok=True)

for idx, image_file in enumerate(get_image_files(images_src_dir)):
  im = PILImage.create(image_file)
  im = im.crop_pad(min(im.size)).resize_max(max_h=size)
  im.save(images_dir / image_file.name)
!ls "{images_src_dir}" |wc -l
!ls "{images_train_dir}" |wc -l
50
50
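The crop_pad / resize_max combination above is fastai convenience; the same center-crop-to-square-then-resize can be written with plain PIL. This is just a sketch, and the function name center_crop_resize is my own:

```python
from PIL import Image

def center_crop_resize(im: Image.Image, size: int = 768) -> Image.Image:
  """Center-crop to a square, then resize to size x size (keep size divisible by 8 for SD)."""
  w, h = im.size
  side = min(w, h)  # largest square that fits
  left = (w - side) // 2
  top = (h - side) // 2
  im = im.crop((left, top, left + side, top + side))
  return im.resize((size, size), Image.LANCZOS)
```

For example, center_crop_resize(Image.open(p), 512) yields a 512x512 image regardless of the input aspect ratio.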

Cleaning Up Images

One of the most important aspects of any ML application, if not THE most important, is data quality. To get the best results, let's check our training images and remove the "bad" ones (especially those that don't contain complete faces after cropping - we don't want the final model to learn to generate half faces!)

FastAI provides an ImagesCleaner class, a handy widget for removing images from within a Jupyter notebook. Just select "Delete" for the images you want to remove, then run the following cells to delete them.

fns = get_image_files(images_train_dir)
w = ImagesCleaner(max_n=100)
w.set_fns(fns)
w
import PIL; PIL.Image.open('raw.png')
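The deletion cell follows the standard fastai pattern: w.delete() returns the indices of images marked "Delete", and we unlink the corresponding files. Wrapping it in a small helper (my own, not part of fastai) works with anything that exposes a delete() method:

```python
import pathlib

def delete_marked(cleaner, fns) -> int:
  """Unlink every file whose index the cleaner reports as marked for deletion."""
  deleted = 0
  for idx in cleaner.delete():
    pathlib.Path(fns[idx]).unlink()
    deleted += 1
  return deleted

# In the notebook, after marking images in the widget:
# delete_marked(w, fns)
```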