Abstract: | The design and development of a data warehousing system (DWS) tends to be an exceptionally resource-consuming project, which in turn makes it a high-risk, high-reward undertaking. To minimize the risk, design methodologies and tools are used throughout the phases of the project. The Extract-Transform-Load (ETL) component is normally one of the most critical components of a DWS, since it gathers, corrects and conforms data so that it can be loaded into the Data Warehouse (DW). The data conciliation task tends to be a dull, manually intensive job that often deals with several heterogeneous sources, yet it is critical to the correct representation of the enterprise’s information. The manual nature of this task makes it prone to errors and subject to intensive and repeated monitoring. In this paper, we analyse some of the most common ETL tasks for data conciliation using a Relational Algebra approach, as an effort to standardize them for future use in a generic ETL environment. A slowly changing dimension scenario is used to support the data conciliation modelling process designed for this work.
|