In the good old days of "data processing", database experts and programmers focused their attention on storing and retrieving data in the best way to do real-time transactions. Being able to store and retrieve data to record a sale, to record a bank deposit and so on. Relational databases became the best thing since sliced-bread. These are built through a complex process of entity relationship modelling and are unintelligible to all but the technically inspired.
As relational databases became established and systems became more stable and reliable, management of organisations running these huge and complex systems began to realise that the databases contained more than data - they contained information. "Data Processing" became "Information Technology". The demand for information grew. IT departments became inundated with requests for management information reports of various types. They couldn't cope. Delivery of these reports was always a week too late. The extraction process was typically very complex and slow requiring data from a wide variety of tables. To make matters worse, the extraction of information for urgent reports often impacted negatively on the performance of the entire system.
A few bright sparks in the industry began to realise that accessing information for management reports required something other than a relational database. Something much simpler and more accessible was needed. Something that reflected the business rather than the technology. The operational database was not the ideal place from which to compile executive reports.
The data warehouse was born. It would contain the kind of information sought by management, organised in a way that was simple to understand and to easy to access.
One of the simplest and most amenable data warehouse model is the "Star Schema". "Facts" are kept in a Fact Table and "Dimensions" - the characteristics by which the facts are accessed - in a Dimension Table. For instance, Customer may be a fact file, while region, age, gender would be dimensions. The popular multi-dimensional cube works in a similar way.
Data for the warehouse is extracted from the operational database and undergoes a number of processes - extraction, cleaning, aggregating and loading. The extraction takes place at specific times - daily, weekly or monthly - and provides a "snapshot" of the organisation at a particular time and date.
Data for each Fact Table may be aggregated (or summarised)at a number of different levels. For example,
Below are the top articles rated and ranked by Helium members on:
by Barry Marcus
In the good old days of "data processing", database experts and programmers focused their attention on storing and retrieving
by Niamh Brown
What is it?
A data warehouse (DWH) put simply is a store of data, extracted at regular intervals from a live data source,
by Leigh Goessl
If you hear the phrase "data warehouse" does it conjure up visions of a large room filled with rows of computers spitting
by Anjan Roy
Data Warehouses, as the name suggests are designed for easy storage and retrieval of large amount of data. Imagine a warehouse
Summary
The business intelligence arena is one of the fastest growing technologies in the IT realm. However, it is a unique
View All Articles on:
A guide to data warehousing
Add your voice
Know something about A guide to data warehousing?
We want to hear your view.
Write now!
Cast your vote!
Click for your side.
Featured Partner
Universal Giving is a social entrepreneurship nonprofit whose vision is to create a world where giving and volunteeri...more
hide