A comparative study of data augmentation techniques for image classification: generative models vs. classical transformations


Bibliographic Details
Main Author: Gonçalves, Guilherme Marques (author)
Format: Master's thesis
Language: English
Published: 2021
Online Access: http://hdl.handle.net/10773/30759
Country: Portugal
OAI: oai:ria.ua.pt:10773/30759
Description
Summary: Advances in deep convolutional neural networks and efficient parallel processing are showing great promise when applied to image classification, object detection, image restoration and image segmentation. However, deep models require large amounts of annotated training data, which are not always accessible. In this context, data augmentation has emerged as an effective technique by which the original dataset is expanded to cope with imbalanced datasets, avoid overfitting, and increase classification performance. This dissertation compares the effectiveness of data augmentation techniques applied to image classification problems, focusing on basic image manipulations and generative modelling. On the one hand, basic image manipulations include classical transformations of the original samples, such as rotations, translations, flips and crops. On the other hand, generative adversarial networks (GANs) synthesize artificial samples from the original dataset through adversarial training. This comparative study considers two distinct classification problems - handwritten digit recognition and melanoma skin cancer diagnosis - both addressed with convolutional neural network models. A baseline multiclass classifier was developed from scratch for handwritten digit recognition on the MNIST dataset, while binary melanoma classification uses pre-trained models, namely VGG16 and DenseNet201, on the ISIC 2019 dataset. For generating handwritten digits, GAN-based data augmentation relies on Deep Convolutional GANs (DCGANs) and Conditional GANs (cGANs); more advanced architectures, namely Progressive GANs (PGANs) and StyleGANs, are used to synthesize melanoma dermoscopic images. The results demonstrate that basic image manipulations perform remarkably well in classification tasks.
Further, GAN-based data augmentation does not yet compete with classical techniques, especially in problems that demand high-quality, realistic images, as is the case in medical applications. Nevertheless, StyleGAN2-ADA is shown to improve balanced accuracy by 2.1% compared with the CNN model trained without any augmentation. Combining classical and synthetic augmentations may be the best option in the near future.
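The classical transformations discussed in the abstract (flips, rotations, translations and crops) are straightforward to sketch. The snippet below is a minimal NumPy illustration of how such an augmentation pipeline can expand a dataset; it is not code from the dissertation, and the function names (`augment`, `expand_dataset`) are hypothetical.

```python
import numpy as np

def augment(image, rng):
    """Apply one randomly chosen classical transformation to a 2-D image array."""
    choice = rng.integers(4)
    if choice == 0:                          # horizontal flip
        return np.fliplr(image)
    if choice == 1:                          # 90-degree rotation
        return np.rot90(image)
    if choice == 2:                          # horizontal translation by up to 2 pixels
        shift = rng.integers(-2, 3)
        return np.roll(image, shift, axis=1)
    # random crop of (h-2, w-2), zero-padded back to the original size
    h, w = image.shape
    top, left = rng.integers(0, 3, size=2)
    crop = image[top:h - 2 + top, left:w - 2 + left]
    return np.pad(crop, ((top, 2 - top), (left, 2 - left)))

def expand_dataset(images, factor, seed=0):
    """Grow a stack of images to `factor` times its size with random augmentations."""
    rng = np.random.default_rng(seed)
    out = list(images)
    for img in images:
        for _ in range(factor - 1):
            out.append(augment(img, rng))
    return np.stack(out)
```

In practice, frameworks such as torchvision or Keras provide equivalent transformations applied on the fly during training, which avoids materializing the expanded dataset in memory.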