Towards a Scalable Dataset Construction for Facial Recognition: A guided data selection approach for diversity stimulation


Bibliographic details
Main author: Vilaça, Luís Miguel Salgado Nunes (author)
Format: masterThesis
Language: eng
Published: 2020
Subjects:
Full text: http://hdl.handle.net/10400.22/18170
Country: Portugal
OAI: oai:recipp.ipp.pt:10400.22/18170
Description
Abstract: Facial recognition is one of the most studied challenges in computer vision and remains a complex problem. This is mainly due to variations in image-capturing conditions, such as object-camera relative motion or poor lighting, and to the great diversity of faces in the world. For classification with data-driven techniques, the training dataset should reflect the diversity of characteristics of every target class (person). In an ideal scenario, a classification algorithm would then distinguish correctly between those classes, maximising its performance thanks to the fair representation of each class in the dataset.

Most approaches to facial recognition use large amounts of data to develop models for extracting facial features, which makes them infeasible for several application scenarios. For this reason, ensuring the variability of the representations for each person is an important requirement. Achieving this goal can also help eliminate redundant and irrelevant information, reducing the number of images used for training and, consequently, the computational requirements.

The work developed in this dissertation investigates the impact of selecting a reduced number of images in a facial recognition problem when using a deep learning approach. The driving force behind this idea is to cope with scenarios where data is scarce or, although plentiful, of poor quality. The main questions to answer are: How many training samples do we need to select? How long will it take to train with those samples? How do we select the best samples for the training dataset? The proposed solution uses a feature-engineering pipeline to discriminate the diversity of faces by increasing the amount of information. One of our contributions is the identification of a subgroup of metrics capable of representing diversity.
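The abstract does not specify how its diversity metrics are computed. As an illustration only, one simple metric of this kind is the mean pairwise distance between per-image feature vectors; the sketch below assumes features have already been extracted (e.g. face embeddings) and is not the dissertation's actual pipeline.

```python
import numpy as np

def mean_pairwise_distance(features: np.ndarray) -> float:
    """Average Euclidean distance between all pairs of feature vectors.

    A simple proxy for the 'diversity' of a set of face images, where
    each row of `features` is one image's feature vector. A redundant
    set (near-duplicate images) scores low; a varied set scores high.
    """
    n = len(features)
    if n < 2:
        return 0.0
    total = 0.0
    for i in range(n):
        # Distances from item i to every later item (each pair counted once).
        diffs = features[i + 1:] - features[i]
        total += np.linalg.norm(diffs, axis=1).sum()
    return total / (n * (n - 1) / 2)
```

For example, three mutually distinct one-hot vectors score higher than three identical vectors, which score zero.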
As a further step, we also propose two methods that use these metrics to guarantee an increase in the amount of information: a cluster-based approach, which maximises the distance between selected items and hence their diversity, and an approach based on Determinantal Point Processes, a statistical modelling method that assigns higher probabilities to more diverse subsets using the dot product between their feature vectors. Experimental tests confirm the gain of the proposed methodology over a standard random-selection baseline, proving it effective in reducing the size of the dataset while maintaining performance similar to that obtained with the full dataset.
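To make the Determinantal Point Process idea concrete, the following is a minimal sketch of greedy MAP selection under a linear-kernel DPP (kernel L = X·Xᵀ, i.e. dot products between feature vectors, as the abstract describes). This is a generic textbook greedy scheme written for illustration, not the dissertation's implementation; the function name and the choice of greedy search are assumptions.

```python
import numpy as np

def greedy_dpp_select(X: np.ndarray, k: int) -> list:
    """Greedily pick k diverse rows of X under a linear-kernel DPP.

    The DPP kernel is L = X @ X.T. At each step we add the item that
    most increases the log-determinant of L restricted to the selected
    indices; larger determinants correspond to less collinear, i.e.
    more diverse, feature subsets (near-duplicates make L singular).
    """
    L = X @ X.T
    selected = []
    for _ in range(k):
        best_i, best_gain = -1, -np.inf
        for i in range(len(X)):
            if i in selected:
                continue
            idx = selected + [i]
            # slogdet returns (sign, log|det|); sign <= 0 means the
            # candidate subset is degenerate (redundant items).
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            gain = logdet if sign > 0 else -np.inf
            if gain > best_gain:
                best_i, best_gain = i, gain
        selected.append(best_i)
    return selected
```

On a toy set with two duplicate feature vectors and one distinct vector, selecting k=2 items picks one duplicate and the distinct vector, never both duplicates, which is exactly the redundancy-avoiding behaviour the abstract attributes to the DPP approach.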