.png)
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Data Warehousing basics
Description
This video explains the basics of Data Warehousing for Compose for Data Warehouses.
Video
Transcript
I'm going to talk about
Compose for Data Warehousing. But first
I'm going to do an introduction about the
How's and Why's of data warehousing
Business Intelligence.
It's a process for analyzing
enterprise data to present actionable information.
In other words
BI helps management make good decisions. Good
business intelligence
is of strategic importance. Now the link between
business intelligence and data warehousing.
To have good BI
we need data warehousing as an instrument.
First of all, it's about getting the data
in
integration of data from different sources.
You must present it
in a palatable way to the end-users. Now the
integration of data
we need to collect the data from this different storage
back into centralized storage. External
data also have different formats
Another aspect is the data quality. The end-users want to be able to rely
on what the data tells us. It shouldn't be
stale. It should be up to date. A certain
level of accuracy
is of course required. And fast delivery is
also essential.
For some companies, external requirements
also come into play
The data must be auditable. And they must
confine to Basel,
Sarbanes Oxley, whatever is imposed on them.
You have hackers
You need to be sure the data you have, especially
private oriented data
will not become public. How do you want to
do the reports
Data visualization. A data warehouse cannot
be compared to classical
database management systems. In the early days
of data warehousing,
two schools of thought emerged. One is about
data storage.
data consolidation. He proposed a normalized
collection of entities
that made up the data warehouse. The other
school of thought was more
focused on reporting. It was a central fact
table associated with a fact table where
dimension tables which were only one join
condition away
from the fact table. Compose for Data Warehouses
unites these schools of thought.
You have one data warehouse. One enterprise
storage and two types of tables for each entity.
And satellite tables where the satellite tables
hold the history of the object's
data marts, fact tables, and dimension tables.
So we have both.
Modeling a data warehouse, we work with entities.
We have for example
a customer entity. So for a customer we have
a hub table with some attributes
we are not interested in the history, and
a satellite table in which we keep track
of the history.
Suppose we have the address information
of a customer in the satellite tables. When
a customer moves, another row
is added to the satellite table. There is
a 'from' and a 'to' date.
so we know for every instance in time, what
the address of the customer was
The data marts...I always used to say that
the data
in the data, the warehouse should be correct under
any circumstances.
And the data marts, you can drop them and
re-create them because these always
are a subset of the existing data in the data
warehouse. You can say that
I only want data in the data mart that is
current.
So you can make all kinds of sub-selections.
For example, you can create
a data mart only targeted for one particular
country. Or perhaps two countries
so you can filter data out from the data warehouse,
which contains
all the tables. A data mart is disposable.
You can throw it away
and regenerate it again or you can create
a second one.
to get the data into the data warehouse, the
term often used is ETL
it's about extracting, transforming, and loading
Compose for Data Warehouse has one magnificent
candidate for
the extraction, and that's Replicate. The
Transform is about transforming,
repairing, or rejecting data. That is done by
Compose and after the transformation
The data ends up in the Data Warehouse. Information
out, we discussed that a little bit
That data marts are the presumed data interfaces
to the end-users. That data interface is in
fact the data mart. And of course,
the name to mention in exporting tools today
is Qlik. Here I have an overview
You have here external data. At least external
with respect to Compose
And Replicate will collect all of this data
from different formats, different sources
and
have this data end up in the Landing Data
Base. That green arrow
shows the data flow from the landing database
to the data warehouse. Also, the filtering
and the rejecting the checking is done here.
This is the extraction, transformation
, and loading. We have the data in the data
warehouse.
On top of the data warehouse, users may define
the data marts they need and then reporting
tools can pick up the data
in these data marts. And that about what I
wanted to tell you.