Search Helium

Home > Computers & Technology > Computers & Technology (Other)

A guide to data warehousing

by Anjan Roy

Created on: November 20, 2008

Data Warehouses, as the name suggests are designed for easy storage and retrieval of large amount of data. Imagine a warehouse of a large retailer. What are the main purposes of the warehouse? It is created to store goods in bulk that are needed to support the retail stores. The ways the goods are stored and kept in the warehouse are based on two guiding principles. First the goods should be stored in an efficient manner so that large amount of goods can be accommodated and the second is that the goods could easily be retrieved depending on the demand from the stores.

Data warehouses are also designed on the similar guiding principles efficiency of data storage and ease of retrieval. The design of data warehouses differ from the traditional online transactional processing (OLTP) systems. OLTP systems were designed to support faster ACD (Add, Change and Delete) operations whereas the data warehouses are designed to support ease of data retrieval. In a Data Warehouse, changes do not happen as frequently as in an OLTP system. The changes are mostly in a batch mode.

In a typical organization, data warehouses are built to move data from the OLTP system (or multiple OLTP systems) into one centralized database. This centralized database is often known as the CDW (corporate data warehouse) or the EDW (Enterprise data warehouse). These data warehouses are designed to support executive reporting and provide single view of the enterprise information.

There are three pieces of data warehouse design. The first piece is Data modeling, i.e. designing the structure of the data warehouse. Most of the organizations these days follow multi-layer data warehouses. These various layers are used to stage the OLTP data, incorporate business transformational rules, perform data cleansing if needed and integrate the data before transforming it into the format that is easy for reporting. Data Modeling involves, designing the staging layer and the reporting layer.

The second piece of data warehouse design is designing the ETL (Extract, Transform and Load). As discussed earlier, the data is extracted from various OLTP systems. This data is in the raw format and in the format that was designed for the OLTP system. Various business rules need to be applied in order to integrate the data into a common format. This data then gets loaded into the reporting layer. ETL is often considered the back-bone of a data warehouse. The ETL should be designed in such a way that the reporting layer gets refreshed timely and accurately. As the data grows, and more OLTP systems are integrated into the data warehouse, the ETL MUST scale to be able to support the growth.

The third piece is the reporting layer or often known as the BI (Business Intelligence) layer. This is the presentation layer or the reporting front end of the data warehouse. This typically comprises of self-service reports, ad-hoc query tools, portals and dashboards. This allows the end users to access the data warehouse.

Learn more about this author, Anjan Roy.
Click here to send this author comments or questions.

171854

Featured Partner

One Note At A Time (ONAAT)

One Note At A Time has partnered with Helium, giving you the chance to write for a cause. Browse One Note At A Time's featured titles, pick an issue and write! You can also learn new perspectives on issues that you care about.more


CONNECT WITH US

Read
our blog
Helum for writers

Write and get published
Share with other writers
Polish your freelancing skills

Join our active writing community
Helium Content Source for Publishers

Quality articles from proven freelancers
Exclusive rights, fast turnaround
Brand engagement, business blogging -- our writers do it all

Get custom content today!

INFORMATION


Helium, Inc.
200 Brickstone Square Andover, MA 01810 USA
#