Data augmentation and deep classification with generative adversarial networks

Bibliographic details
Main author: Silva, Gabriel Augusto Santos (author)
Format: masterThesis
Language: eng
Published: 2021
Full text: http://hdl.handle.net/10773/32283
Country: Portugal
OAI: oai:ria.ua.pt:10773/32283
Description
Abstract: Machine learning has seen many advances in recent years. One type of model that has evolved considerably is the Generative Adversarial Network (GAN). These models can create fake data that resembles the data on which they were trained. Interest in these models has grown steadily since their creation in 2014. The ability to create fake data has also proven quite useful, especially in data-starved areas such as medical imaging. GANs have been used in such areas, with positive results, to increase the size of the available datasets as a way to improve the quality of classifiers. This dissertation studies a specific type of GAN, the Auxiliary Classifier GAN (AC-GAN), to understand whether there may be new ways in which GANs can improve classification tasks. For this, a three-part experiment was designed, with each part referred to as a Scenario. In Scenario 1, a standalone classifier was trained; in Scenario 2, that same classifier was trained after data augmentation with a GAN; and in Scenario 3, an AC-GAN was used instead of the classifier. Two distinct problems were considered. The first was CIFAR-10, a well-known and well-structured problem often used as a benchmark in GAN-related work. The second was a skin lesion problem. This served two purposes: significantly increasing the difficulty of the problem at hand, and bringing the work closer to what is possibly the biggest practical use of GANs, namely data augmentation for medical imaging problems. The models developed were based on the original version of the AC-GAN and on BigGAN, which, when presented, was the best-performing GAN known, able to produce high-quality images at resolutions of up to 512x512. Adapting BigGAN into an AC-GAN resulted in the best-performing known AC-GAN on the CIFAR-10 dataset.
The study carried out in this dissertation can serve as a solid basis for further studies on this matter, since the results obtained here strongly suggest that the use of AC-GANs can be an effective way to achieve superior classifiers.