Abstract: | Gastric cancer is the fifth most common cancer in the world and, when diagnosed at an advanced stage, its survival rate is only 5%-25%, which makes early detection essential. However, physicians specialized in this diagnosis have difficulty detecting early lesions during the diagnostic examination, esophagogastroduodenoscopy (EGD). Early lesions on the walls of the digestive system are barely perceptible and easily confused with the stomach mucosa, which makes them difficult to detect. In addition, physicians risk not covering all areas of the stomach during the examination, including areas that may contain lesions. Introducing artificial intelligence into this diagnostic method may help to detect gastric cancer at an earlier stage. A system capable of monitoring all areas of the digestive system during EGD would help prevent gastric cancer from being diagnosed only at advanced stages. This work focuses on monitoring upper gastrointestinal (GI) landmarks, anatomical areas of the digestive system that are more prone to lesions and that allow better control of missed areas during the EGD exam. The use of convolutional neural networks (CNNs) for GI landmark monitoring has been widely studied by the scientific community, since such networks are good at extracting features that characterize EGD images. The aim of this work was to test new automatic algorithms, specifically CNN-based systems able to detect upper GI landmarks, in order to avoid blind spots during EGD and increase the quality of endoscopic exams. In contrast with related works in the literature, we used upper GI landmark images closer to real-world conditions. In particular, the images for each anatomical landmark class include both tissue affected by pathologies and healthy tissue.
We tested several pre-trained architectures, namely ResNet-50, DenseNet-121, and VGG-16. For each pre-trained architecture, we tested different learning approaches, including the use of class weights (CW), the use of batch normalization and dropout layers, and the use of data augmentation to train the network. The CW ResNet-50 achieved an accuracy of 71.79% and a Matthews Correlation Coefficient (MCC) of 65.06%. In current state-of-the-art studies, only supervised learning approaches have been used to classify EGD images. In our work, by contrast, we tested the use of unsupervised learning to increase classification performance. In particular, we used convolutional autoencoder architectures to extract representative features from unlabeled GI images and concatenated their outputs with those of the CW ResNet-50 architecture. With this combined approach, we achieved an accuracy of 72.45% and an MCC of 65.08%.
|
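Two quantities mentioned in the abstract, class weights and the MCC, can be illustrated with a minimal sketch. The abstract does not say how the class weights were computed or which MCC formulation was used, so the inverse-frequency ("balanced") weighting and the multiclass generalization of the MCC shown here are assumptions. The function names and the toy landmark labels are hypothetical and not taken from the work.

```python
from collections import Counter
from math import sqrt


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    Assumption: the source does not specify its weighting scheme.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def multiclass_mcc(y_true, y_pred):
    """Multiclass Matthews Correlation Coefficient, in [-1, 1]."""
    s = len(y_true)                                   # total samples
    c = sum(t == p for t, p in zip(y_true, y_pred))   # correct predictions
    t_counts = Counter(y_true)                        # true occurrences per class
    p_counts = Counter(y_pred)                        # predictions per class
    classes = set(t_counts) | set(p_counts)
    cov_tp = c * s - sum(t_counts[k] * p_counts[k] for k in classes)
    cov_tt = s * s - sum(v * v for v in t_counts.values())
    cov_pp = s * s - sum(v * v for v in p_counts.values())
    denom = sqrt(cov_tt * cov_pp)
    # By convention, return 0.0 when the coefficient is undefined.
    return cov_tp / denom if denom else 0.0


# Toy, imbalanced example with hypothetical landmark labels.
y_true = ["antrum", "antrum", "antrum", "antrum", "pylorus", "pylorus"]
y_pred = ["antrum", "antrum", "antrum", "pylorus", "pylorus", "pylorus"]

weights = balanced_class_weights(y_true)  # the rarer class gets the larger weight
mcc = multiclass_mcc(y_true, y_pred)
```

Unlike accuracy, the MCC takes the whole confusion matrix into account, which is why it is often reported alongside accuracy when classes are imbalanced.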