Grabbing parallel corpora from the web
Main Author: | |
---|---|
Format: | article |
Language: | eng |
Published: | 2002 |
Online Access: | http://hdl.handle.net/1822/599 |
Country: | Portugal |
Oai: | oai:repositorium.sdum.uminho.pt:1822/599 |
Summary: | Multilingual resources are useful for linguistic studies, translation, and many other tasks. Unfortunately, these resources are difficult to obtain and organize. In this document we describe a set of tools designed to help in the task of mining bilingual resources from the web, from a specific site, from a file system, from a list of URLs, or from a translation memory. As a design goal, we intend to build tools that can be used both cooperatively (in a pipeline) and independently. |