Evaluation of Genotype-Based Gene Expression Model Performance: A cross-framework and cross-dataset study

Predicting gene expression from genotyped data is valuable for studying inaccessible tissues such as the brain. Herein we present eGenScore, a polygenic/poly-variation method, and compare it with PrediXcan, a method based on regularized linear regression using elastic nets. While both methods have t...

ver descrição completa

Detalhes bibliográficos
Autor principal: Tavares, V. (author)
Outros Autores: Monteiro, J. (author), Vassos, E. (author), Coleman, J. (author), Prata, D. (author)
Formato: article
Idioma:eng
Publicado em: 2021
Assuntos:
Texto completo:http://hdl.handle.net/10071/23620
País:Portugal
Oai:oai:repositorio.iscte-iul.pt:10071/23620
Descrição
Resumo:Predicting gene expression from genotyped data is valuable for studying inaccessible tissues such as the brain. Herein we present eGenScore, a polygenic/poly-variation method, and compare it with PrediXcan, a method based on regularized linear regression using elastic nets. While both methods have the same purpose of predicting gene expression based on genotype, they carry important methodological differences. We compared the performance of expression quantitative trait loci (eQTL) models to predict gene expression in the frontal cortex, comparing across these frameworks (eGenScore vs. PrediXcan) and training datasets (BrainEAC, which is brain-specific, vs. GTEx, which has data across multiple tissues). In addition to internal five-fold cross-validation, we externally validated the gene expression models using the CommonMind Consortium database. Our results showed that (1) PrediXcan outperforms eGenScore regardless of the training database used; and (2) when using PrediXcan, the performance of the eQTL models in frontal cortex is higher when trained with GTEx than with BrainEAC.