Mixed-integer programming model for profiling disease biomarkers from gene expression studies


Bibliographic details
Main author: Santiago, André M. (author)
Other authors: Rocha, Miguel (author), Dourado, António (author), Arrais, Joel P. (author)
Format: conferencePaper
Language: eng
Published in: 2017
Subjects:
Full text: http://hdl.handle.net/1822/56427
Country: Portugal
OAI: oai:repositorium.sdum.uminho.pt:1822/56427
Description
Abstract: Biomedical research has seen great advances in recent years, in large part owing to the ability to identify biological or genetic markers that uniquely match a given disease. Despite several success stories, the reality is that most diseases still lack an effective means of treatment, and even of diagnosis. While the emergence of omic technologies has enabled the screening of a whole cell at the molecular level, the large quantities of data produced have restricted the capability to extract valid outcomes. In this paper, we propose an optimization model, based on mixed-integer linear programming, capable of identifying a combination of biomarkers for distinguishing between healthy and diseased samples. The model achieves this by taking the gene expression profiles of several individuals, identifying the most relevant genes for differentiation, and discovering the optimal combination of biomarkers that best explains the difference between the two states. The model was validated on two different datasets through sampling analysis, achieving an out-of-sample accuracy of up to 93%.
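
Illustrative note: this record does not reproduce the paper's formulation, so the following is only a generic sketch of how biomarker selection can be posed as a mixed-integer linear program, not the authors' model. All symbols are assumptions introduced here: x_{ij} is the expression of gene j in sample i, y_i in {-1, +1} labels a sample as healthy or diseased, z_j is a binary variable that selects gene j, w_j and b define a linear separator, e_i is a binary variable that flags a misclassified sample, k caps the number of selected genes, and M and W are sufficiently large constants.

```latex
\begin{align}
\min_{w,\,b,\,z,\,e} \quad & \sum_{i=1}^{n} e_i \\
\text{s.t.} \quad
  & y_i \Bigl( \sum_{j=1}^{p} w_j x_{ij} + b \Bigr) \ge 1 - M e_i,
      && i = 1, \dots, n, \\
  & -W z_j \le w_j \le W z_j,
      && j = 1, \dots, p, \\
  & \sum_{j=1}^{p} z_j \le k, \\
  & z_j \in \{0, 1\}, \quad e_i \in \{0, 1\}.
\end{align}
```

Under these assumptions, the objective counts misclassified samples, the second constraint forces the weight of any unselected gene to zero, and the third limits the size of the biomarker panel, so the genes with z_j = 1 in an optimal solution form the selected combination.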