A discrete particle swarm algorithm for OLAP data cube selection

Bibliographic Details
Main Author: Loureiro, Jorge (author)
Other Authors: Belo, Orlando (author)
Format: conferencePaper
Language: eng
Published: 2006
Subjects:
Online Access: http://hdl.handle.net/1822/71942
Country: Portugal
Oai: oai:repositorium.sdum.uminho.pt:1822/71942
Description
Summary: Multidimensional analysis supported by Online Analytical Processing (OLAP) systems demands many aggregation functions over enormous data volumes. To achieve query answering times acceptable to OLAP users while supporting all the required business analytical views, OLAP data is organized as a multidimensional model, known as a data cube. Materializing all the subcubes that decision makers require would allow fast and consistent answering times for OLAP queries. However, this also implies intolerable costs in storage space and time, even when the data warehouse has only medium size and dimensionality; this is especially critical during refresh operations. On the other hand, given a query profile, only some of the subcubes are really of interest. Thus, cube selection must aim to minimize query (and maintenance) costs, with the materialization space as a constraint. This is a complex problem: it is NP-hard. Many algorithms and several heuristics, especially greedy and evolutionary approaches, have been used to provide approximate solutions. This paper proposes a new algorithm for this problem, based on particle swarm optimization (PSO). According to our experimental results, the PSO algorithm showed an execution speed, convergence capacity and consistency that make it suitable for use in data warehouse systems of medium dimensionality.
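
The selection problem described in the summary can be illustrated with a small sketch. The code below is not the authors' algorithm: it assumes the standard binary PSO formulation (sigmoid-mapped velocities), a simplified linear cost model in which each query scans its smallest materialized ancestor subcube, and a penalty for exceeding the space limit. Maintenance costs are ignored, and all names (`build_lattice`, `fitness`, `binary_pso`), parameter values, and the example query profile are hypothetical.

```python
import math
import random
from itertools import combinations


def build_lattice(dim_cards, base_rows):
    """Map each subcube (frozenset of dimension indices) to an estimated row count."""
    sizes = {}
    for k in range(len(dim_cards) + 1):
        for combo in combinations(range(len(dim_cards)), k):
            rows = math.prod(dim_cards[d] for d in combo)
            sizes[frozenset(combo)] = min(rows, base_rows)
    return sizes


def fitness(selection, cubes, sizes, base, query_freq, space_limit):
    """Frequency-weighted query cost, penalized when the space limit is exceeded."""
    chosen = [c for c, bit in zip(cubes, selection) if bit]
    used = sum(sizes[c] for c in chosen)
    available = chosen + [base]  # assumption: the base cube is always stored
    cost = 0.0
    for q, freq in query_freq.items():
        # A query is answered from its smallest available ancestor (superset).
        cost += freq * min(sizes[c] for c in available if q <= c)
    if used > space_limit:
        # Penalty makes any infeasible selection worse than every feasible one.
        cost = sum(query_freq.values()) * sizes[base] + (used - space_limit)
    return cost


def binary_pso(cubes, sizes, base, query_freq, space_limit, n_particles=20,
               n_iters=100, w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Binary PSO: each bit says whether the matching subcube is materialized."""
    rng = random.Random(seed)
    n = len(cubes)

    def score(x):
        return fitness(x, cubes, sizes, base, query_freq, space_limit)

    xs = [[rng.randint(0, 1) for _ in range(n)] for _ in range(n_particles)]
    xs[0] = [0] * n  # empty selection: always feasible, serves as a baseline
    vs = [[0.0] * n for _ in range(n_particles)]
    pbest = [x[:] for x in xs]
    pbest_f = [score(x) for x in xs]
    g = min(range(n_particles), key=pbest_f.__getitem__)
    gbest, gbest_f = pbest[g][:], pbest_f[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n):
                v = (w * vs[i][d]
                     + c1 * rng.random() * (pbest[i][d] - xs[i][d])
                     + c2 * rng.random() * (gbest[d] - xs[i][d]))
                v = max(-v_max, min(v_max, v))  # clamp the velocity
                vs[i][d] = v
                # The sigmoid of the velocity is the probability of setting the bit.
                xs[i][d] = 1 if rng.random() < 1.0 / (1.0 + math.exp(-v)) else 0
            f = score(xs[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = xs[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = xs[i][:], f
    return gbest, gbest_f


if __name__ == "__main__":
    # Hypothetical 3-dimension warehouse and query profile.
    sizes = build_lattice([10, 20, 5], base_rows=1000)
    base = frozenset({0, 1, 2})
    cubes = [c for c in sizes if c != base]
    queries = {frozenset({0}): 5, frozenset({1, 2}): 3, frozenset(): 2}
    best, cost = binary_pso(cubes, sizes, base, queries, space_limit=300)
    print([sorted(c) for c, bit in zip(cubes, best) if bit], cost)
```

Seeding one particle with the empty selection guarantees that the returned solution respects the space limit, because the penalty ranks every infeasible selection below that baseline. A fuller model would also add maintenance cost to the fitness function, as the summary indicates.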