Dictionary alignment by rewrite-based entry translation

Bibliographic details
Main author: Simões, Alberto (author)
Other authors: Gómez Guinovart, Xavier (author)
Format: conferencePaper
Language: eng
Published in: 2013
Subjects:
Full text: https://hdl.handle.net/1822/30685
Country: Portugal
OAI: oai:repositorium.sdum.uminho.pt:1822/30685
Description
Abstract: In this document we describe the process of aligning two standard monolingual dictionaries: a Portuguese language dictionary and a Galician synonym dictionary. The main goal of the project is to provide an online dictionary that can show, in parallel, definitions and synonyms in Portuguese and Galician for a specific word written in either language. The two languages are very close to each other, which is the main reason we expect this idea to be viable. The main drawback is the lack of a good, free translation dictionary between them, namely one that can cover lexicons of more than one hundred thousand different words. To solve this issue we defined a translation function, based on substitutions, that achieves an F1 score of 0.88 on a manually verified dictionary of nine thousand words. Using this same translation function to align a Portuguese–Galician dictionary, we aligned almost 50% of the dictionary lexicon (more than eighty thousand words).
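
The idea of a substitution-based translation function evaluated by F1 can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the rule list `RULES`, the helper names `translate`, `align` and `f1_score`, and the toy word lists are hypothetical and are not the authors' rule set, code or data. The rules only encode a few well-known Portuguese-to-Galician spelling correspondences (for example "nh" to "ñ").

```python
import re

# Hypothetical rewrite rules as (regex pattern, replacement) pairs.
# Illustrative only; NOT the rule set used in the paper.
RULES = [
    (r"ções$", "cións"),  # plural suffix, e.g. nações -> nacións
    (r"ção$", "ción"),    # e.g. nação -> nación
    (r"nh", "ñ"),         # e.g. caminho -> camiño
    (r"lh", "ll"),        # e.g. filho -> fillo
]


def translate(word, rules=RULES):
    """Rewrite a Portuguese word by applying each rule in order."""
    for pattern, replacement in rules:
        word = re.sub(pattern, replacement, word)
    return word


def align(source_words, target_lexicon, rules=RULES):
    """Pair each source word with its rewritten form, but only when
    that form is attested in the target lexicon."""
    pairs = set()
    for word in source_words:
        candidate = translate(word, rules)
        if candidate in target_lexicon:
            pairs.add((word, candidate))
    return pairs


def f1_score(predicted, gold):
    """F1 over sets of (source, target) pairs."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy data, assumed for the example only.
portuguese = ["nação", "caminho", "casa", "cão"]
galician = {"nación", "camiño", "casa", "can"}
gold = {("nação", "nación"), ("caminho", "camiño"),
        ("casa", "casa"), ("cão", "can")}

predicted = align(portuguese, galician)
score = f1_score(predicted, gold)
```

On this toy data the rules recover three of the four gold pairs; "cão" to "can" is missed because no rule covers it, which lowers recall while precision stays perfect. Filtering candidates against the target lexicon, as `align` does, is one plausible way to keep such a rewrite approach from proposing unattested words.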