Collecting Statistics about the Portuguese Web

This report presents a characterization of text documents from the Portuguese Web. This characterization was produced from a crawl of over 4 million URLs and 131 thousand sites in 2003. We describe rules that we established for defining the boundaries of the Portuguese Web and the methodology used to gather statistics....

Bibliographic details
Main author: Gomes, Daniel (author)
Other authors: Silva, Mário J. (author)
Format: report
Language: por
Published: 2009
Subjects:
Full text: http://hdl.handle.net/10451/14211
Country: Portugal
OAI: oai:repositorio.ul.pt:10451/14211