Lexical semantics annotation for enriched Portuguese corpora

Bibliographic details
Main author: Neale, Steven (author)
Other authors: Valadas, Rita (author), Silva, João (author), Branco, António (author)
Format: bookPart
Language: eng
Published: 2018
Subjects:
Full text: http://hdl.handle.net/10451/33110
Country: Portugal
OAI: oai:repositorio.ul.pt:10451/33110
Description
Abstract: The semantic annotation of corpora has an important role to play in ensuring that sentences occurring in natural language texts are correctly understood based on their intended context. Two examples of lexical semantic units that contribute to this knowledge are word senses – which allow words with multiple meanings to be understood based on the context in which they are used – and named entities – which can be disambiguated and linked back to the specific encyclopedic resources that describe them. In this paper, we describe the construction of lexical semantically-annotated corpora for Portuguese, annotated with both word senses linked to senses in a Portuguese wordnet and named entities linked to Portuguese Wikipedia entries using DBpedia. The result is a gold-standard lexical semantically-annotated resource that is useful in supporting the training and evaluation of tools for the disambiguation of these lexical units in Portuguese.