I'm trying to concatenate two large tables. I was trying to do this in several ways:
Both tables are residential
One residential, second loaded from QVD (optimized load)
Both tables are from QVD (optimized load)
The problem is this process is extremely slow because only one CPU thread is enabled, When I join these tables QV is using several cores. It looks like Concatenate operation has the same problem as Group By or Inner Join. So, the question is has anybody mitigated similar issue in some way.
What is wrong with the standard outer join in this case?
"Both tables have common key field used to link them. All other fields are different."
If your data are solid it should work?
Load * Inline
[key, testfield, flag1, flag2, flag3
A, 125, 1,0,1
B, 123, 1,0,1
C, 155, 0,1,0
D, 110, 0,0,1];
Load * Inline
[key, testfield2 ,flag4, flag5, flag6
A, 125, 0,1,0
B, What, 1,0,1
C, Why, 0,1,0
D, How, 1,1,1];
Then you have:
With concatenate activated instead you have:
So why do you need the concatenate version?
I've got the same problem.
800 Mio datas from table 1.
then concatenate datas from table 2.
There are common fields and different fields.
While concatenate load the memory grows by 8 Gb - why, because there are no datas in, only a null value....
So I load a fact table at the beginning, including all fields from all tables I want to concatenate.
The script is not freezing to generate the null values but the memory size is still too large.
to join the facts is not the answer.
Maybe the material number is the field you want to join.
You would load for example 10 datas for one material with sales information.
When now joining the facts with storage information (also 10 datasets) you would have 100 records at all, each record is duplicated 10 times.
Even if you join by material number and year/month, it could be possible to duplicate datas if there are more than one record in one case.
When using all dimesion fields as join fields there are no datas to join, the outer join would create the same result as a concatenate load.
To load the datas in different fact tables and using a link table is also a problem.
One fact is using material customer year month,
the other material, supplier, year month,
the next material workplace, year month
and mybe another workplace and yearmonth.
So a multible number of link fields what also generates a huge link table.
I'm thinking there is not THE answer at all
For concatenation I suggest to first load a (non existing) record containting all the fields from all the tables you want to concatenate with autogenerate(0):
field1, field2, ... , fieldn
This creates the data structure upfront and it doesn't needs to be rearranged on the second table..