Data Warehouse Concepts



index
Disabled back button Next Section
printable version

Section 0: Module Objectives or Competencies
Course Objective or Competency Module Objectives or Competency
The student will be able to list and explain basic data warehouse concepts. The student will be able to explain the purpose of a data warehouse and how it differs from a database.
The student will be able to explain the difference between operational and analytical data.
The student will be able to list and explain the components that make up a data warehouse.
The student will be able to explain related concepts like data lakes and data marts.
The student will be able to list and explain the steps involved in the development of a data warehouse.
Section 1: Overview

Data warehouses are central repositories of integrated data from one or more disparate sources, and they store current and historical data in one single place that are used for creating analytical reports for workers throughout the enterprise.



Here is a brief Introduction to Data Warehouses



Here is a very general Explanation of Data Warehouses



This is a more thorough discussion of What is a Data Warehouse?





The Data Warehouse is a structured repository of integrated, subject-oriented, enterprise-wide, historical, time-variant data that provides support for decision making.

comparison


Retrieval of analytical information




Detailed and/or summarized data

Section 2: Justification

What Is A Data Warehouse And Why Do We Build Them?

A typical organization maintains and utilizes a number of operational data sources.

A data warehouse is created within an organization as a separate data store whose primary purpose is data analysis.

Often, the same fact can have both an operational and an analytical purpose.

  • For example, data describing that customer X bought product Y in store Z can be stored in an operational database for business-process support purposes, such as inventory monitoring or financial transaction record keeping.
  • That same fact can also be stored in a data warehouse where, combined with a vast number of similar facts accumulated over a period of time, it serves to reveal important trends, such as sale patterns or customer behavior.

There are two main reasons for the creation of a data warehouse as a separate analytical database.

Section 3: Data Warehouse Components

Data warehouse components

Data warehouse components.


Source Systems

Data warehouse components.


Extract - Transform - Load infrastructure

ETL Process.


Data warehouse




Data warehouse front-end (BI) applications

Business intelligence applications.
Section 4: Data Marts

A data mart is a data store based on the same principles as a data warehouse, but with a more limited scope.

There are two major categories of data marts: independent data marts and dependent data marts.

Data Marts




Difference between Data mart and Data Warehouse:.


While we are muddying the analogy waters, let's see some other buzzwords that describe variations of a data warehouse.

And then there is this outlier, which has no reference to water but is still related to data warehouses. They could have at least stayed with the theme by calling it a data ice cube!

Section 5: Steps in the Development of Data Warehouses
Steps in the Development of Data Warehouses.


Requirements collection, definition, and visualization

Step 1 in the Development of Data Warehouses.

This phase results in the requirements specifying the desired capabilities and functionalities of the future data warehouse.

Requirements collection is the most critical step in the development of the data warehouse.



Data warehouse modeling (logical data warehouse modeling)

Step 2 in the Development of Data Warehouses.

This phase involves creation of the data warehouse data model that is implementable by the DBMS software.



Creating the data warehouse

Step 3 in the Development of Data Warehouses.

This phase involves using a DBMS to implement the data warehouse data model as an actual data warehouse.



Creating ETL infrastructure

Step 4 in the Development of Data Warehouses.

This phase involves creating necessary procedures and code for:

Due to the amount of details that have to be considered, creating ETL infrastructure is often the most time- and resource-consuming part of the data warehouse development process.

ETL infrastructure has to account for and reconcile all of the differences in the metadata and the data between the operational sources and the target data warehouses.



Developing front-end (BI) applications

Step 5 in the Development of Data Warehouses.

This phase involves designing and creating applications for indirect use by the end-users.

The design and creation of data warehouse front-end applications can take place in parallel with data warehouse creation.



Data warehouse deployment

Step 6 in the Development of Data Warehouses.

This phase involves releasing the data warehouse and its front-end (BI) applications for use by the end users

Typically, prior to this step, the initial load populating the created data warehouse with an initial set of data from the operational data sources via the ETL infrastructure is executed.



Data warehouse use

Step 7 in the Development of Data Warehouses.

This phase involves the retrieval of the data in the data warehouse.



Data warehouse administration and maintenance

Step 8 in the Development of Data Warehouses.

This phase involves performing activities that support the data warehouse end user, including dealing with technical issues, such as:

Business intelligence (BI) is a process for gathering usable knowledge about the external business environment and turning it into the intelligence required for tactical or strategic decisions.