DATA WAREHOUSING BASICS

Best online resources for Data Warehousing Basics tutorials

Big Data Analytics

As found on learndatamodelling.com



A great listing found on it.toolbox.com

A data warehouse architecture is based primarily on the business processes of an enterprise. It takes into account data consolidation across the enterprise with adequate security, data modeling and organization, the extent of query requirements, metadata management and application, planning of the warehouse staging area for optimum bandwidth utilization, and full technology implementation.

The Data Warehouse Architecture includes many facets. Some of these are listed as follows:

1 Process Architecture
2 Data Model Architecture
3 Technology Architecture
4 Information Architecture
5 Resource Architecture
6 Various Architectures
7 More Resources

Process Architecture
Process architecture describes the number of stages and how data is processed to convert raw or transactional data into information for end users.
The data staging process includes three main areas of concern, or sub-processes, for planning the data warehouse architecture: “Extract”, “Transform” and “Load”.

These interrelated sub-processes are sometimes referred to as an “ETL” process.

1) Extract - Data for the data warehouse can come from different sources and may be of different types, so the plan for extracting the data, along with appropriate compression and encryption techniques, is an important requirement to consider.

2) Transform - Transforming the data with appropriate conversion, aggregation and cleaning, as well as de-normalization and surrogate key management, is also an important process to plan when building a data warehouse.

3) Load - The steps for loading data efficiently, taking into account the multiple areas where the data is to be loaded and retrieved, are also an important part of the data warehouse architecture plan.
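
The three sub-processes above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a production pipeline: the sample rows, the table names (dim_customer, fact_sales) and the function names are all hypothetical, SQLite in memory stands in for the warehouse, and the compression, encryption and de-normalization mentioned above are left out.

```python
import sqlite3

# Hypothetical raw rows standing in for data pulled from two source systems.
RAW_ORDERS = [
    {"customer": " Alice ", "amount": "10.50", "source": "web"},
    {"customer": "bob", "amount": "4.00", "source": "store"},
    {"customer": "ALICE", "amount": "2.25", "source": "store"},
    {"customer": "", "amount": "9.99", "source": "web"},  # dirty row: no customer
]


def extract(rows):
    """Extract: yield raw records one at a time from a source."""
    for row in rows:
        yield dict(row)


def transform(records):
    """Transform: clean, convert types, assign surrogate keys, aggregate."""
    surrogate_keys = {}  # natural key (cleaned customer name) -> surrogate key
    totals = {}          # surrogate key -> total amount in cents
    for rec in records:
        name = rec["customer"].strip().lower()  # cleaning
        if not name:
            continue  # drop rows that have no customer
        key = surrogate_keys.setdefault(name, len(surrogate_keys) + 1)
        cents = round(float(rec["amount"]) * 100)  # conversion
        totals[key] = totals.get(key, 0) + cents   # aggregation
    customers = [(key, name) for name, key in surrogate_keys.items()]
    sales = sorted(totals.items())
    return customers, sales


def load(conn, customers, sales):
    """Load: write the transformed rows into the target tables."""
    conn.execute(
        "CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT)"
    )
    conn.execute(
        "CREATE TABLE fact_sales (customer_key INTEGER, total_cents INTEGER)"
    )
    conn.executemany("INSERT INTO dim_customer VALUES (?, ?)", customers)
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?)", sales)
    conn.commit()


def run_etl(rows):
    """Run extract, transform and load against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    customers, sales = transform(extract(rows))
    load(conn, customers, sales)
    return conn
```

Calling run_etl(RAW_ORDERS) returns a connection whose fact_sales table holds one aggregated row per customer, keyed by the surrogate key rather than by the raw customer name.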

Data Model Architecture
In Data Model Architecture (also known as Dimensional Data Model), there are three main data modeling styles for enterprise warehouses:
1. 3rd Normal Form - Top Down Architecture, Top Down Implementation
2. Federated Star Schemas - Bottom Up Architecture, Bottom Up Implementation
3. Data Vault - Top Down Architecture, Bottom Up Implementation
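
To make the second style more concrete, here is a small sketch of a single star schema: one fact table surrounded by dimension tables. It assumes SQLite in memory, and every table name, column name and sample row is hypothetical; a federated design would combine several such stars, which this sketch does not attempt.

```python
import sqlite3


def build_star_schema():
    """Create a tiny star schema with two dimensions and one fact table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_product (
            product_key INTEGER PRIMARY KEY, name TEXT, category TEXT
        );
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER
        );
        -- The fact table holds measures plus a key into each dimension.
        CREATE TABLE fact_sales (
            product_key INTEGER REFERENCES dim_product(product_key),
            date_key INTEGER REFERENCES dim_date(date_key),
            quantity INTEGER
        );
        INSERT INTO dim_product VALUES
            (1, 'pen', 'stationery'), (2, 'ink', 'stationery'), (3, 'mug', 'kitchen');
        INSERT INTO dim_date VALUES (20240101, 2024, 1), (20240201, 2024, 2);
        INSERT INTO fact_sales VALUES
            (1, 20240101, 5), (2, 20240101, 3), (3, 20240201, 2), (1, 20240201, 4);
        """
    )
    return conn


def quantity_by_category(conn):
    """Typical star-schema query: join the fact to a dimension and aggregate."""
    return conn.execute(
        """
        SELECT p.category, SUM(f.quantity)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.category
        ORDER BY p.category
        """
    ).fetchall()
```

The query pattern is the point of the design: measures live in the fact table, descriptive attributes live in the dimensions, and reports join the two on the dimension keys.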


Technology Architecture
Scalability and flexibility are required in all facets. The extent of these features depends largely on organizational size, business requirements, the nature of the business, and so on.
Technology (or technical) architecture is derived primarily from the process architecture, from metadata management requirements based on business rules and security-level implementations, and from tool-specific technology evaluation.
Beyond these, the technology architecture also covers the technology implementation standards for database management, database connectivity protocols (ODBC, JDBC, OLE DB, etc.), middleware (based on ORB, RMI, COM/DCOM, etc.), network protocols (DNS, LDAP, etc.) and other related technologies.


Information Architecture
Information Architecture is the process of translating information from one form to another in a step-by-step sequence so as to manage the storage, retrieval, modification and deletion of the data in the data warehouse.


Resource Architecture
Resource architecture is related to software architecture in that many resources are software resources. Resources are important because they help determine performance; workload is the other part of the equation. If there are enough resources to complete the workload in the required amount of time, performance will be high. If there are not enough resources for the workload, performance will be low.
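
The resource and workload relationship can be put into rough numbers. The sketch below is a simplification for illustration only: it assumes the workload is a count of rows to load, the resources are summarized as a single sustained throughput figure, and the "right amount of time" is a fixed batch window. The function name and parameters are hypothetical.

```python
def fits_batch_window(rows_to_load, rows_per_second, window_seconds):
    """Return (fits, required_seconds) for a load job.

    rows_to_load    -- size of the workload (assumed unit: rows)
    rows_per_second -- throughput the available resources can sustain
    window_seconds  -- time allowed for the job to finish
    """
    required_seconds = rows_to_load / rows_per_second
    return required_seconds <= window_seconds, required_seconds
```

For example, one million rows at 500 rows per second need 2,000 seconds and fit in a one-hour window, while the same workload at 200 rows per second needs 5,000 seconds and does not.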


Various Architectures
Note that among the different architectures one stands out: Data Model Architecture. Across the integration industry at large, the ability to integrate information across the enterprise is becoming dependent on the quality of the underlying data model architecture.
The ability to be compliant, consistent and repeatable depends on how the data model is built under the covers.
There are three main data modeling styles for enterprise warehouses:
1. 3rd Normal Form - Top Down Architecture, Top Down Implementation
2. Federated Star Schemas - Bottom Up Architecture, Bottom Up Implementation
3. Data Vault - Top Down Architecture, Bottom Up Implementation
You can read more about the Data Vault by searching for "Data Vault Data Model" on the web.
The point of Data Warehousing Architecture is that it is not JUST a data warehouse anymore. It is now a full-scale data integration platform, including right-time (real-time) data and batch or strategic data sets, in a single, auditable (and integrated) data store.
