Data warehousing in big data: from multidimensional to tabular data models

Bibliographic details
Main author: Santos, Maribel Yasmina (author)
Other authors: Costa, Carlos (author)
Format: conferencePaper
Language: eng
Published in: 2016
Subjects:
Full text: http://hdl.handle.net/1822/42352
Country: Portugal
OAI: oai:repositorium.sdum.uminho.pt:1822/42352
Description
Abstract: Data warehouses are central pieces in business intelligence and analytics, as these repositories ensure proper data storage and querying and are supported by data models that allow data to be analyzed from different perspectives. Those perspectives support users and organizations in the decision-making process. In Big Data environments, Hive is used as a distributed storage mechanism that provides data warehousing capabilities. Its data schemas are defined according to the analytical requirements specified by the users. In this work, multidimensional data models are used as the source of those requirements, allowing the automatic transformation of a multidimensional schema into a tabular schema suitable for implementation in Hive. To achieve this objective, a set of rules is proposed and tested in a demonstration case, showing the applicability and usefulness of the proposed approach.