Clustering genomic words in human DNA using peaks and trends of distributions

In this work we seek clusters of genomic words in human DNA by studying their inter-word lag distributions. Due to the particularly spiked nature of these histograms, a clustering procedure is proposed that first decomposes each distribution into a baseline and a peak distribution. An outlier-robust...

Bibliographic Details
Main Author: Tavares, Ana Helena (author)
Other Authors: Raymaekers, Jakob (author), Rousseeuw, Peter J. (author), Brito, Paula (author), Afreixo, Vera (author)
Format: article
Language: eng
Published: 2021
Online Access: http://hdl.handle.net/10773/30267
Country: Portugal
OAI: oai:ria.ua.pt:10773/30267