Looking for atypical groups of distributions in the context of genomic data


Bibliographic Details
Main Author: Tavares, Ana Helena (author)
Other Authors: Afreixo, Vera (author), Brito, Paula (author)
Format: conferenceObject
Language: eng
Published: 2022
Subjects:
Online Access: http://hdl.handle.net/10773/35317
Country: Portugal
Oai: oai:ria.ua.pt:10773/35317
Description
Summary: This work addresses the problem of detecting groups of observations (distributions) and flagging those that differ abnormally from the majority of the groups, termed atypical groups. The proposed method combines a hierarchical classification technique, to identify groups of similar distributions, with a functional outlier detection method, to identify those groups that contain outliers. Groups with outlying observations are forwarded for subclustering. Once the final partition is obtained, each cluster is represented by a class prototype, whose outlyingness is evaluated according to a functional approach. Clusters with atypical class labels are flagged as atypical groups. The method is applied to the detection of groups of atypical genomic words, based on their distance distributions.
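
The pipeline outlined in the summary can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the record: distributions are discretised on a common grid, the hierarchical step is average-linkage clustering on L1 distances, the functional outlier detector is replaced by a crude stand-in rule (distance to the pointwise median curve greater than `k` times the median distance), prototypes are pointwise means, and the names `agglomerate`, `outlier_flags`, `atypical_groups`, `cut`, `shrink` and `k` are hypothetical.

```python
from statistics import median

def l1(a, b):
    """L1 distance between two curves sampled on the same grid."""
    return sum(abs(x - y) for x, y in zip(a, b))

def agglomerate(curves, idx, cut):
    """Average-linkage agglomerative clustering of curves[i], i in idx.
    Merging stops once the closest pair of clusters is farther than `cut`."""
    clusters = [[i] for i in idx]

    def dist(a, b):
        total = sum(l1(curves[i], curves[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        d, p, q = min((dist(a, b), p, q)
                      for p, a in enumerate(clusters)
                      for q, b in enumerate(clusters) if p < q)
        if d > cut:
            break
        merged = clusters.pop(q)      # q > p, so index p stays valid
        clusters[p].extend(merged)
    return clusters

def outlier_flags(curves, k=3.0):
    """Stand-in for a functional outlier detector (assumption): flag a curve
    whose L1 distance to the pointwise median curve exceeds k times the
    median of those distances. Fewer than 3 curves are never flagged."""
    if len(curves) < 3:
        return [False] * len(curves)
    med_curve = [median(col) for col in zip(*curves)]
    d = [l1(c, med_curve) for c in curves]
    threshold = k * median(d) + 1e-12
    return [x > threshold for x in d]

def prototype(curves):
    """Class prototype taken as the pointwise mean curve (assumption)."""
    return [sum(col) / len(curves) for col in zip(*curves)]

def atypical_groups(curves, cut, shrink=0.5, k=3.0, min_cut=1e-6):
    """Cluster, sub-cluster groups that contain outliers, then flag clusters
    whose prototype is itself an outlier among the prototypes."""
    pending = [(c, cut) for c in agglomerate(curves, range(len(curves)), cut)]
    final = []
    while pending:
        members, c = pending.pop()
        flags = outlier_flags([curves[i] for i in members], k)
        if any(flags) and c * shrink > min_cut:
            # Re-cluster this group with a tighter cut.
            subs = agglomerate(curves, members, c * shrink)
            pending.extend((s, c * shrink) for s in subs)
        else:
            final.append(sorted(members))
    protos = [prototype([curves[i] for i in m]) for m in final]
    return final, outlier_flags(protos, k)

# Toy data: two-bin distributions (p, 1 - p); values are made up.
ps = [0.10, 0.11, 0.12, 0.20, 0.21, 0.22, 0.30, 0.31, 0.32,
      0.40, 0.41, 0.42, 0.45, 0.90, 0.91, 0.92]
curves = [[p, 1 - p] for p in ps]
clusters, flags = atypical_groups(curves, cut=0.1)
atypical = [c for c, f in zip(clusters, flags) if f]
print(atypical)
```

In this toy run the curve with p = 0.45 is first merged into a neighbouring group, detected as an outlier there, and split off by the sub-clustering step; only the group around p = 0.9 is then flagged as atypical. A real application would substitute a proper functional outlier detection method and distances suited to the distributions at hand.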