Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
cancel
Showing results for 
Search instead for 
Did you mean: 
bgerchikov
Partner - Creator III
Partner - Creator III

Concatenate big tables

Hello,

I'm trying to concatenate two very big tables to implement star schema in Data model. Having these tables linked makes chart performance very slow.

The first one has 50 fields and 440 millions records

The second one has about 160 fields (80% of them are flags) and about 100 million records.

The problem is the Concatenate operation uses one core only, so this process might take couple days to be accomplished.

Has anybody met such problem?

Thanks!

Boris

16 Replies
martinpohl
Partner - Master
Partner - Master

I've got the same problem.

800 Mio datas from table 1.

then concatenate datas from table 2.

There are common fields and different fields.

While concatenate load the memory grows by 8 Gb - why, because there are no datas in, only a null value....

So I load a fact table at the beginning, including all fields from all tables I want to concatenate.

The script is not freezing to generate the null values but the memory size is still too large.

to join the facts is not the answer.

Maybe the material number is the field you want to join.

You would load for example 10 datas for one material with sales information.

When now joining the facts with storage information (also 10 datasets) you would have 100 records at all, each record is duplicated 10 times.

Even if you join by material number and year/month, it could be possible to duplicate datas if there are more than one record in one case.

When using all dimesion fields as join fields there are no datas to join, the outer join would create the same result as a concatenate load.

To load the datas in different fact tables and using a link table is also a problem.

One fact is using material customer year month,

the other material, supplier, year month,

the next material workplace, year month

and mybe another workplace and yearmonth.

So a multible number of link fields what also generates a huge link table.

I'm thinking there is not THE answer at all

vvvvvvizard
Partner - Specialist
Partner - Specialist

Concatenate will be processed via a single  CPU core  and therefore is slower 

Use  Outer Join which is faster , which will use all available CPU Cores of the server

martinpohl
Partner - Master
Partner - Master

I could not detect that only one CPU was busy. All CPU were working with 100 %.

vvvvvvizard
Partner - Specialist
Partner - Specialist

I did not test myself but this is where i got my info from

Q-Tip #9 – CONCATENATE or JOIN ? | Natural Synergies

http://www.naturalsynergies.com/q-tip-9-concatenate-or-join/

rbecher
MVP
MVP

For concatenation I suggest to first load a (non existing) record containting all the fields from all the tables you want to concatenate with autogenerate(0):

LOAD

field1, field2, ... , fieldn

autogenerate(0);

This creates the data structure upfront and it doesn't needs to be rearranged on the second table..

- Ralf

Astrato.io Head of R&D
martinpohl
Partner - Master
Partner - Master

That's what I've done by creating the fact table at the beginning of the load statement.

rbecher
MVP
MVP

This OUTER JOIN approach works only if both fact tables have the same granularity..

Astrato.io Head of R&D