A large Portuguese corpus on-line: cleaning and preprocessing

We present a newly available on-line resource for Portuguese: a corpus of 310 million words, a new version of the Reference Corpus of Contemporary Portuguese, now searchable via a user-friendly web interface. Here we report on work carried out on the corpus prior to its publication on-line. We focu...

Bibliographic details
Main author: Généreux, Michel (author)
Other authors: Hendrickx, Iris (author), Mendes, Amália (author)
Format: conferenceObject
Language: eng
Published: 2019
Subjects:
Full text: http://hdl.handle.net/10451/37430
Country: Portugal
OAI: oai:repositorio.ul.pt:10451/37430