Data Augmentation
Data augmentation is the process of increasing the size of a dataset by transforming its samples in ways that a neural network is unlikely to learn to be invariant to on its own.
After reading this post, you will know:
- Common data augmentation methods.
- Image augmentation with imgaug.
- Popular tools for data augmentation.
Methods
In this section, I will introduce these augmentation methods:
- cropping
- shifting
- rotating
- flipping
- shearing
- color jittering
- brightness
- contrast
- fancy PCA
- salt and pepper
The original images taken from CIFAR-10 look like this:
Code for this:
from keras.datasets import cifar10
import matplotlib.pyplot as plt
import numpy as np
from numpy import linalg as la
from sklearn.decomposition import PCA
from imgaug import augmenters as iaa
import imgaug as ia
%matplotlib inline
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
X_train = X_train.astype('float32')
y_train = y_train.astype('float32')
def show9(imgs):
    '''Create a grid of 3x3 images.'''
    if imgs.max() > 1:
        imgs = imgs / 255.
    for i in range(0, 9):
        plt.subplot(330 + 1 + i)
        plt.imshow(imgs[i])
    plt.show()

imgs = X_train[0:9]
show9(imgs)
cropping
Code for this:
# helper: apply an augmentation sequence to the images and show the results
def show_img_augs(imgs, imgaug_seq):
    ia.seed(1)
    imgs_aug = imgaug_seq.augment_images(imgs)
    show9(imgs_aug)
seq = iaa.Sequential([
    iaa.Crop(percent=(0, 0.3))  # crop each side by 0% to 30% of the image size
])
show_img_augs(imgs, seq)
shifting
Code for this:
seq = iaa.Sequential([
    iaa.Affine(translate_percent={'x': (-0.1, 0.1), 'y': (-0.1, 0.1)})  # shift by -10% to +10% of width/height
])
show_img_augs(imgs, seq)
rotating
Code for this:
seq = iaa.Sequential([
    iaa.Affine(rotate=(-25, 25))  # rotate by -25 to +25 degrees
])
show_img_augs(imgs, seq)
flipping
Also called reflection or mirroring.
Code for this:
seq = iaa.Sequential([
    iaa.Fliplr(0.5)  # horizontally flip 50% of the images
])
show_img_augs(imgs, seq)
Code for vertical flipping:
seq = iaa.Sequential([
    iaa.Flipud(0.5)  # vertically flip 50% of the images
])
show_img_augs(imgs, seq)
shearing
Code for this:
seq = iaa.Sequential([
    iaa.Affine(shear=(-20, 20))  # shear by -20 to +20 degrees
])
show_img_augs(imgs, seq)
color jitter
For color jitter, some knowledge of color spaces helps.
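For example, hue and saturation can be jittered by converting images to HSV and perturbing individual channels. The following is a minimal sketch using imgaug's WithColorspace and WithChannels wrappers; the parameter values are my own choice, not from the original examples:
seq = iaa.Sequential([
    iaa.WithColorspace(
        to_colorspace='HSV',
        from_colorspace='RGB',
        children=iaa.WithChannels(0, iaa.Add((10, 50)))  # shift the hue channel
    )
])
# color-space conversion is safest on uint8 images
show_img_augs(imgs.astype(np.uint8), seq)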
brightness
Code for this:
seq = iaa.Sequential([
    iaa.Multiply((0.1, 1.5))  # multiply pixel values by 0.1 to 1.5 (darker to brighter)
])
show_img_augs(imgs, seq)
contrast
Code for this:
seq = iaa.Sequential([
    iaa.ContrastNormalization((0.75, 1.5))  # decrease or increase contrast
])
show_img_augs(imgs, seq)
fancy PCA
Fancy PCA, or PCA Color Augmentation, is a data augmentation technique first described in the AlexNet paper by Krizhevsky et al., ImageNet Classification with Deep Convolutional Neural Networks.
The original words in the paper:
We perform PCA on the set of RGB pixel values throughout the ImageNet training set.
To each training image, we add multiples of the found principal components, with magnitudes proportional to the corresponding eigenvalues times a random variable drawn from a Gaussian with mean zero and standard deviation 0.1.
Therefore to each RGB image pixel \(I_{xy} = [I_{xy}^R, I_{xy}^G, I_{xy}^B]^T\) we add the following quantity: $$ [p_1, p_2, p_3][\alpha_1 \lambda_1, \alpha_2 \lambda_2, \alpha_3 \lambda_3]^T $$ where \(p_i\) and \(\lambda_i\) are the \(i\)th eigenvector and eigenvalue of the \(3 \times 3\) covariance matrix of RGB pixel values, respectively, and \(\alpha_i\) is the aforementioned random variable.
Each \(\alpha_i\) is drawn only once for all the pixels of a particular training image until that image is used for training again, at which point it is re-drawn. This scheme approximately captures an important property of natural images, namely, that object identity is invariant to changes in the intensity and color of illumination. This scheme reduces the top-1 error rate by over 1%.
So the full set of steps for fancy PCA is:
- Reshape the data into (n * width * height, 3).
- Standardize the data to unit scale (mean=0, variance=1).
- Compute the eigenvalues and eigenvectors, by either:
  - decomposition of the covariance matrix,
  - decomposition of the correlation matrix, or
  - scikit-learn's PCA tool (used below; a direct covariance-matrix sketch follows the code).
- Augment the images with the scaled principal components.
Code for this:
# 1. Reshape
res = np.array([]).reshape([0, 3])
for img in imgs:
    img = img / 255.
    arr = img.reshape(img.shape[0] * img.shape[1], 3)
    res = np.vstack([res, arr])
# 2. Standardize
mean = res.mean(axis=0)
std = res.std(axis=0)
res_std = (res - mean) / std
# 3. Eigendecomposition
pca = PCA()
rgb_pca = pca.fit(res_std)
eigen_values = rgb_pca.explained_variance_
eigen_vectors = rgb_pca.components_.T
print('Eigen Values \n%s' % eigen_values)
print('Eigen Vectors \n%s' % eigen_vectors)
# 4. Image augmentation
def data_aug(img, eig_vals, eig_vecs):
    if len(eig_vals.shape) == 1:
        eig_vals = eig_vals[np.newaxis, :]
    mu = 0
    sigma = 0.1
    # eigenvalues scaled by alpha_i ~ N(0, 0.1), one draw per channel
    w = np.random.normal(mu, sigma, (1, 3)) * eig_vals
    noise = eig_vecs.dot(w.T).reshape([1, 1, 3])
    # perturb the image and keep pixel values in [0, 1]
    img_aug = np.clip(img + noise, 0., 1.)
    return img_aug

def data_augs(imgs, eig_vals, eig_vecs):
    img_augs = imgs.copy()
    for i in range(img_augs.shape[0]):
        img_augs[i] = data_aug(img_augs[i], eig_vals, eig_vecs)
    return img_augs

img_augs = data_augs(imgs / 255., eigen_values, eigen_vectors)
show9(img_augs)
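As noted in the steps above, scikit-learn's PCA is only one way to obtain the eigenvalues and eigenvectors; decomposing the 3x3 covariance matrix directly gives the same result. A minimal sketch, reusing res_std from the code above (eigenvector signs may differ from scikit-learn's, which does not affect the augmentation):
cov = np.cov(res_std, rowvar=False)  # 3x3 covariance matrix of the standardized RGB values
eig_vals_direct, eig_vecs_direct = la.eigh(cov)  # eigh, since the covariance matrix is symmetric
# sort from largest to smallest eigenvalue to match scikit-learn's ordering
order = np.argsort(eig_vals_direct)[::-1]
eig_vals_direct = eig_vals_direct[order]
eig_vecs_direct = eig_vecs_direct[:, order]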
The entire code, as a notebook, is here, or you can find it in my notebooks repo.
See also: Fancy PCA (Data Augmentation) with Scikit-Image
salt and pepper
Add salt (white points) and pepper (black points) to images.
Code for this:
seq = iaa.Sequential([
    iaa.SaltAndPepper(p=(0, 0.1))  # replace 0% to 10% of the pixels with salt/pepper noise
])
show_img_augs(imgs, seq)
distortion
Code for this:
seq = iaa.Sequential([
    iaa.PiecewiseAffine(scale=(0.01, 0.07))  # move local grid points around to distort the image
])
show_img_augs(imgs, seq)
You can read the README.md of imgaug for more augmentation methods.
Tools
imgaug
imgaug is a library for image augmentation in machine learning experiments. It supports a wide range of augmentation techniques, allows them to be combined easily, has a simple yet powerful stochastic interface, can augment images and keypoints/landmarks on them, and offers augmentation in background processes for improved performance.
A standard use case
The following example shows a standard use case. An augmentation sequence (crop + horizontal flips + Gaussian blur) is defined once at the start of the script. Then many batches are loaded and augmented before being used for training.
from imgaug import augmenters as iaa
seq = iaa.Sequential([
    iaa.Crop(px=(0, 16)),  # crop images from each side by 0 to 16px (randomly chosen)
    iaa.Fliplr(0.5),  # horizontally flip 50% of the images
    iaa.GaussianBlur(sigma=(0, 3.0))  # blur images with a sigma of 0 to 3.0
])
for batch_idx in range(1000):
    # 'images' should be either a 4D numpy array of shape (N, height, width, channels)
    # or a list of 3D numpy arrays, each having shape (height, width, channels).
    # Grayscale images must have shape (height, width, 1) each.
    # All images must have numpy's dtype uint8. Values are expected to be in
    # range 0-255.
    images = load_batch(batch_idx)
    images_aug = seq.augment_images(images)
    train_on_images(images_aug)
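Beyond images, imgaug can also augment keypoints/landmarks consistently with the image: the idea is to make a sequence deterministic so that the image and its keypoints receive exactly the same random transformation. A minimal sketch using the older KeypointsOnImage interface; the image and keypoint coordinates below are placeholders for illustration:
import numpy as np
import imgaug as ia
from imgaug import augmenters as iaa

ia.seed(1)
image = np.zeros((128, 128, 3), dtype=np.uint8)  # placeholder image
keypoints = ia.KeypointsOnImage([
    ia.Keypoint(x=30, y=40),   # hypothetical landmark positions
    ia.Keypoint(x=90, y=100),
], shape=image.shape)

seq = iaa.Sequential([
    iaa.Affine(rotate=(-25, 25)),
    iaa.Fliplr(0.5)
])

seq_det = seq.to_deterministic()  # freeze the random state for this batch
image_aug = seq_det.augment_image(image)  # transform the image ...
keypoints_aug = seq_det.augment_keypoints([keypoints])[0]  # ... and its keypoints identically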