GOAnnotator: linking protein GO annotations to evidence text


Bibliographic Details
Main Author: Couto, Francisco M. (author)
Other Authors: Silva, Mário J. (author), Lee, Vivian (author), Dimmer, Emily (author), Camon, Evelyn (author), Apweiler, Rolf (author), Kirsch, Harald (author), Rebholz-Schuhmann, Dietrich (author)
Format: report
Language: por
Published: 2009
Online Access: http://hdl.handle.net/10451/14234
Country: Portugal
Oai: oai:repositorio.ul.pt:10451/14234
Description
Summary: Annotating proteins with Gene Ontology (GO) terms is an ongoing and complex task. Manual GO annotation is precise and valuable, but it is time-consuming. As a result, most proteins carry uncurated, automatically generated annotations rather than curated ones. Text-mining systems that use the literature for automatic annotation have been proposed, but they do not meet the high quality expectations of curators. In this paper we describe an approach that links uncurated annotations to text extracted from the literature. The text is selected based on its similarity to the term from the uncurated annotation. Besides substantiating the uncurated annotations, the extracted texts also lead to novel annotations. In addition, the approach uses the GO hierarchy to achieve high precision. Our approach is integrated into GOAnnotator, a tool that assists the curation process for GO annotation of UniProt proteins. The GO curators assessed GOAnnotator with a set of 66 distinct UniProt/SwissProt proteins with uncurated annotations. GOAnnotator provided correct evidence text at 93% precision. This high precision results from using the GO hierarchy to select only GO terms similar to the GO terms from uncurated annotations in GOA. Our approach is the first to achieve high precision, which is crucial for efficiently supporting GO curators. GOAnnotator is available at: http://xldb.fc.ul.pt/rebil/tools/goa/
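
The two ideas in the summary, selecting text by its similarity to a GO term and keeping only GO terms that are close in the hierarchy to an uncurated annotation, can be illustrated with a minimal Python sketch. This is not GOAnnotator's implementation: the word-overlap score, the ancestor/descendant test for "similar" terms, the 0.5 threshold, and all names and term identifiers (`T:kinase`, `find_evidence`, and so on) are assumptions made up for illustration.

```python
import re


def tokens(text):
    """Lowercased set of alphanumeric words in a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def term_similarity(sentence, term_name):
    """Toy measure (assumption): fraction of the term's words found in the sentence."""
    term_words = tokens(term_name)
    if not term_words:
        return 0.0
    return len(term_words & tokens(sentence)) / len(term_words)


def ancestors(term, parents):
    """The term plus all its ancestors in a hypothetical child -> parents map."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def related(term_a, term_b, parents):
    """Assumption: two terms count as similar if one is an ancestor of the other."""
    return term_a in ancestors(term_b, parents) or term_b in ancestors(term_a, parents)


def find_evidence(sentences, candidate_terms, uncurated_ids, parents, threshold=0.5):
    """Return (score, term_id, sentence) hits, best first.

    A candidate term is considered only if it is related in the hierarchy
    to at least one uncurated annotation; a sentence is kept as evidence
    only if its similarity to the term name reaches the threshold.
    """
    hits = []
    for term_id, name in candidate_terms.items():
        if not any(related(term_id, u, parents) for u in uncurated_ids):
            continue  # hierarchy filter: term is unrelated to every uncurated annotation
        for sentence in sentences:
            score = term_similarity(sentence, name)
            if score >= threshold:
                hits.append((score, term_id, sentence))
    return sorted(hits, reverse=True)


# Made-up hierarchy and text; the identifiers are placeholders, not real GO IDs.
parents = {
    "T:kinase": ["T:catalytic"],
    "T:catalytic": ["T:function"],
    "T:binding": ["T:function"],
}
candidate_terms = {"T:kinase": "kinase activity", "T:binding": "protein binding"}
sentences = [
    "The protein shows kinase activity in vitro.",
    "It also exhibits protein binding.",
]
hits = find_evidence(sentences, candidate_terms, ["T:catalytic"], parents)
```

In this toy run the only uncurated annotation is `T:catalytic`, so `T:binding` is dropped by the hierarchy filter even though a sentence mentions it, while the sentence mentioning kinase activity is kept as evidence for `T:kinase`. Restricting candidates in this way trades coverage for precision, which matches the emphasis of the summary.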