Searching dynamic Web pages with semi-structured contents

Bibliographic details
Main author: Filipe Silva (author)
Other authors: Armando Oliveira (author), Lígia M. Ribeiro (author), Gabriel David (author)
Format: book
Language: eng
Published in: 2003
Subjects:
Full text: https://repositorio-aberto.up.pt/handle/10216/621
Country: Portugal
OAI: oai:repositorio-aberto.up.pt:10216/621
Description
Abstract: At present, information systems (IS) in higher education are usually supported by databases (DB) and accessed through a Web interface. This is the case with SiFEUP, the IS of the Engineering Faculty of the University of Porto (FEUP). The typical SiFEUP user sees the system as a collection of Web pages and is not aware that most of them do not exist as actual HTML files stored on a server, but correspond to HTML code generated on the fly by a designated program that accesses the DB and brings the most up-to-date information to the user's desktop. Typical search engines either do not index dynamically generated Web pages, or index only those that are explicitly linked from a static page, without following the links the dynamic page may itself contain. In this paper we describe the development of a search facility for SiFEUP, how the limitations on indexing dynamic Web pages were circumvented, and an evaluation of the results obtained. The solution involves a locally developed crawler and the Oracle Text full-text indexer, plus meta-information automatically drawn from the DB or manually added to improve the relevance factor calculation.
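
The abstract does not describe how the SiFEUP crawler works internally. As a rough illustration of the general idea only (following the links found inside dynamically generated pages, rather than stopping at the first one), here is a minimal breadth-first sketch in Python. The names `LinkExtractor`, `crawl`, `fetch` and `max_pages` are hypothetical and are not taken from the paper. Page retrieval is passed in as a function so the sketch makes no assumption about how pages are actually downloaded.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl restricted to the host of start_url.

    fetch is a hypothetical callable: it takes a URL and returns the
    page HTML as a string, or None if the page cannot be retrieved.
    Returns a dict mapping each visited URL to its HTML.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment, so that
            # query-string URLs of dynamic pages are kept intact.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return pages
```

In a setup like the one the abstract outlines, the text of each collected page would then be handed to a full-text indexer. That step is not shown here, since the abstract gives no detail about it.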