DATA WAREHOUSING BASICS



10:14 AM

Introduction

A data warehouse is a non-volatile, time-variant repository of an organization's electronically stored data, designed to facilitate reporting and analysis. It is a copy of transaction data specifically structured for query and analysis. In Bill Inmon's widely cited definition, "a data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision making process."










[Figure: Data Warehouse Structure]

[Figure: Data Warehouse Architecture]


This definition of the data warehouse focuses on data storage. However, the means to retrieve and analyze data, to extract, transform and load data, and to manage the data dictionary are also considered essential components of a data warehousing system. Many references to data warehousing use this broader context. Thus, an expanded definition for data warehousing includes business intelligence tools, tools to extract, transform and load data into the repository, and tools to manage and retrieve metadata.
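To make the extract, transform and load steps concrete, here is a minimal sketch in Python using the standard sqlite3 module as a stand-in for the warehouse. The sample rows, the table name fact_orders and all function names are assumptions made up for illustration; a real warehouse would read from operational databases and use dedicated tooling.

```python
import sqlite3

# Hypothetical raw records, as they might arrive from an operational system.
SOURCE_ROWS = [
    {"order_id": 1, "customer": " alice ", "amount": "19.99", "order_date": "2024-01-05"},
    {"order_id": 2, "customer": "BOB", "amount": "5.00", "order_date": "2024-01-06"},
    {"order_id": 3, "customer": "alice", "amount": "12.50", "order_date": "2024-02-01"},
]


def extract():
    """Extract: read raw records from the source (here, an in-memory list)."""
    return list(SOURCE_ROWS)


def transform(rows):
    """Transform: normalize customer names and convert amounts to integer cents."""
    return [
        (
            r["order_id"],
            r["customer"].strip().lower(),
            round(float(r["amount"]) * 100),
            r["order_date"],
        )
        for r in rows
    ]


def load(conn, rows):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, "
        "amount_cents INTEGER, order_date TEXT)"
    )
    conn.executemany("INSERT INTO fact_orders VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def column_names(conn):
    """Metadata: list the table's columns, a tiny stand-in for a data dictionary."""
    return [row[1] for row in conn.execute("PRAGMA table_info(fact_orders)")]


def run_etl():
    """Run the three steps in order against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract()))
    return conn
```

Calling run_etl() returns a connection whose fact_orders table holds the cleaned rows, and column_names(conn) shows the kind of metadata a data dictionary would record.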

Data warehousing arises from an organisation's need for reliable, consolidated, unique and integrated reporting and analysis of its data, at different levels of aggregation.

Operational System and Data Warehouse
As described by Ralph Kimball in The Data Warehouse Toolkit

Of all the explanations I have seen, the following one by Ralph Kimball is the simplest and clearest:

The users of an operational system turn the wheels of the organization. They take orders, sign up new customers, and log complaints. Users of an operational system almost always deal with one record at a time. They repeatedly perform the same operational tasks over and over.

The users of a data warehouse, on the other hand, watch the wheels of the organization turn. They count the new orders and compare them with last week’s orders and ask why the new customers signed up and what the customers complained about. Users of a data warehouse almost never deal with one row at a time. Rather, their questions often require that hundreds or thousands of rows be searched and compressed into an answer set. To further complicate matters, users of a data warehouse continuously change the kinds of questions they ask.
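The contrast between the two kinds of users can be sketched with two queries against the same data. This is only an illustration: the orders table, its columns and the sample rows are invented here, and SQLite stands in for both the operational system and the warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, customer TEXT, week INTEGER, amount INTEGER)"
)
# Hypothetical sample data: two orders in week 1, three in week 2.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "alice", 1, 20),
        (2, "bob", 1, 35),
        (3, "carol", 2, 15),
        (4, "alice", 2, 40),
        (5, "dave", 2, 10),
    ],
)


def lookup_order(order_id):
    """Operational style: fetch exactly one record by its key."""
    return conn.execute(
        "SELECT customer, amount FROM orders WHERE order_id = ?", (order_id,)
    ).fetchone()


def orders_per_week():
    """Warehouse style: scan many rows and compress them into an answer set."""
    return conn.execute(
        "SELECT week, COUNT(*), SUM(amount) FROM orders GROUP BY week ORDER BY week"
    ).fetchall()
```

Here lookup_order(3) touches a single row, while orders_per_week() reads every row to compare this week's orders with last week's. This is the pattern Kimball describes for warehouse users.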