Resumo: | This thesis presents a novel integrated cortical architecture with significant emphasis on low-level attentional mechanisms, based on retinal nonstandard cells and pathways, that can group non-attentional, bottom-up features present in V1/V2 into “proto-object” shapes. These shapes are first extracted using combinations of specific cell types for detecting corners, bars/edges and curves, which work extremely well for geometrically shaped objects. Later, in the parietal pathway (probably in LIP), arbitrary shapes can be extracted from population codes of V2 (or even dorsal V3) oriented cells that encode the outlines of objects as “proto-objects”. Object shapes obtained at both cortical levels play an important role in bottom-up local object gist vision, which tries to understand scene context in less than 70 ms and is thought to use both global and local scene features. Edge conspicuity maps detect the borders/edges of objects and assign them a weight based on their perceptual salience, using readily available retinal ganglion cell colour-opponency coding. Conspicuity maps are fundamental in building subsequent saliency maps, which are important both for bottom-up attention schemes and for Focus-of-Attention mechanisms that control eye gaze and object recognition. Disparity maps are also a main focus of this thesis. They are built upon binocular simple and complex cells in quadrature, using a Disparity-Energy Model. These maps are fundamental for perceiving distance within a scene and close/far object relationships when segregating foreground from background. The role of cortical disparity in 3D facial recognition was also explored by processing faces with very different facial expressions (even extreme ones), yielding state-of-the-art results when compared to other, non-biological, computer vision algorithms.
|