Collecting Statistics about the Portuguese Web
This report presents a characterization of text documents from the Portuguese Web. This characterization was produced from a crawl of over 4 million URLs and 131 thousand sites in 2003. We describe rules that we established for defining its boundaries and the methodology used to gather statistics....
Main Author: | |
---|---|
Other Authors: | |
Format: | report |
Language: | por |
Published: | 2009 |
Subjects: | |
Online Access: | http://hdl.handle.net/10451/14211 |
Country: | Portugal |
Oai: | oai:repositorio.ul.pt:10451/14211 |