Verification of Uncurated Protein Annotations


Bibliographic details
Main author: Rebholz-Schuhmann, Dietrich (author)
Other authors: Kirsch, Harald (author), Apweiler, Rolf (author), Camon, Evelyn (author), Dimmer, Emily (author), Lee, Vivian (author), Silva, Mário J (author), Couto, Francisco M (author)
Format: bookPart
Language: eng
Published: 2010
Subjects:
Full text: http://hdl.handle.net/10451/14649
Country: Portugal
OAI: oai:repositorio.ul.pt:10451/14649
Description
Abstract: Molecular Biology research projects have produced vast amounts of data, part of which has been preserved in a variety of public databases. However, a large portion of the data contains a significant number of errors and therefore requires careful verification by curators, a painful and costly task, before it is reliable enough to derive valid conclusions from. On the other hand, research in biomedical information retrieval and information extraction is now delivering Text Mining solutions that can help curators work more efficiently and deliver better data resources. Over the past decades, automatic text processing systems have successfully exploited biomedical scientific literature to reduce the researchers’ efforts to keep up to date, but many of these systems still rely on domain knowledge that is integrated manually, leading to unnecessary overheads and restrictions in their use. A more efficient approach would acquire the domain knowledge automatically from publicly available biological sources, such as BioOntologies, rather than using manually inserted domain knowledge. An example of this approach is GOAnnotator, a tool that assists the verification of uncurated protein annotations. It provided correct evidence text to the curators at 93% precision and thus achieved promising results. GOAnnotator was implemented as a web tool that is freely available at http://xldb.di.fc.ul.pt/rebil/tools/goa/.