Collecting Statistics about the Portuguese Web

This report presents a characterization of text documents from the Portuguese Web. The characterization was produced from a 2003 crawl of over 4 million URLs and 131 thousand sites. We describe the rules that we established for defining the boundaries of the Portuguese Web and the methodology used to gather statistics....

Bibliographic Details
Main Author: Gomes, Daniel (author)
Other Authors: Silva, Mário J. (author)
Format: report
Language: por
Published: 2009
Subjects:
Online Access: http://hdl.handle.net/10451/14211
Country: Portugal
Oai: oai:repositorio.ul.pt:10451/14211