Resumo: | Biomedical research has advanced considerably in recent years, in large part thanks to the ability to identify biological or genetic markers that uniquely match a given disease. Despite several success stories, most diseases still lack an effective means of treatment, or even of diagnosis. While the emergence of omics technologies has enabled the screening of a whole cell at the molecular level, the large quantities of data produced have limited the ability to extract valid conclusions. In this paper, we propose an optimization model, based on mixed-integer linear programming, capable of identifying a combination of biomarkers that distinguishes between healthy and diseased samples. The model takes the gene expression profiles of several individuals, identifies the genes most relevant for differentiation, and finds the optimal combination of biomarkers that best explains the difference between the two states. The model was validated on two different datasets through sampling analysis, achieving an out-of-sample accuracy of up to 93%.
|