Data Warehouses, as the name suggests are designed for easy storage and retrieval of large amount of data. Imagine a warehouse of a large retailer. What are the main purposes of the warehouse? It is created to store goods in bulk that are needed to support the retail stores. The ways the goods are stored and kept in the warehouse are based on two guiding principles. First the goods should be stored in an efficient manner so that large amount of goods can be accommodated and the second is that the goods could easily be retrieved depending on the demand from the stores.
Data warehouses are also designed on the similar guiding principles efficiency of data storage and ease of retrieval. The design of data warehouses differ from the traditional online transactional processing (OLTP) systems. OLTP systems were designed to support faster ACD (Add, Change and Delete) operations whereas the data warehouses are designed to support ease of data retrieval. In a Data Warehouse, changes do not happen as frequently as in an OLTP system. The changes are mostly in a batch mode.
In a typical organization, data warehouses are built to move data from the OLTP system (or multiple OLTP systems) into one centralized database. This centralized database is often known as the CDW (corporate data warehouse) or the EDW (Enterprise data warehouse). These data warehouses are designed to support executive reporting and provide single view of the enterprise information.
There are three pieces of data warehouse design. The first piece is Data modeling, i.e. designing the structure of the data warehouse. Most of the organizations these days follow multi-layer data warehouses. These various layers are used to stage the OLTP data, incorporate business transformational rules, perform data cleansing if needed and integrate the data before transforming it into the format that is easy for reporting. Data Modeling involves, designing the staging layer and the reporting layer.
The second piece of data warehouse design is designing the ETL (Extract, Transform and Load). As discussed earlier, the data is extracted from various OLTP systems. This data is in the raw format and in the format that was designed for the OLTP system. Various business rules need to be applied in order to integrate the data into a common format. This data then gets loaded into the reporting layer. ETL is often considered the back-bone of a data warehouse. The ETL should be designed in such a way that the reporting layer gets refreshed timely and accurately. As the data grows, and more OLTP systems are integrated into the data warehouse, the ETL MUST scale to be able to support the growth.
The third piece is the reporting layer or often known as the BI (Business Intelligence) layer. This is the presentation layer or the reporting front end of the data warehouse. This typically comprises of self-service reports, ad-hoc query tools, portals and dashboards. This allows the end users to access the data warehouse.
Learn more about this author, Anjan Roy.
Click here to send this author comments or questions.
Below are the top articles rated and ranked by Helium members on:
by Barry Marcus
In the good old days of "data processing", database experts and programmers focused their attention on storing and retrieving
by Niamh Brown
What is it?
A data warehouse (DWH) put simply is a store of data, extracted at regular intervals from a live data source,
by Leigh Goessl
If you hear the phrase "data warehouse" does it conjure up visions of a large room filled with rows of computers spitting
by Anjan Roy
Data Warehouses, as the name suggests are designed for easy storage and retrieval of large amount of data. Imagine a warehouse
Summary
The business intelligence arena is one of the fastest growing technologies in the IT realm. However, it is a unique
View All Articles on:
A guide to data warehousing
Add your voice
Know something about A guide to data warehousing?
We want to hear your view.
Write now!
Featured Partner
House Rabbit Society is a volunteer-based international non-profit organization with two primary goals: 1) To r...more
hide