Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I have such code in script as is below (I've changed names and remove some dimensions). When I'm loading data as it is in the script - everything is fine. But this script it is not efficient on the large data (there are many syncs key so loading is very slow), so I wanted to add some keys or use "Concatenate". If I add keys then some values doesn't show on some dimensions:
my keys according to used dimensions in table:
CAST(Date AS varchar)+'|'+Product_id+'|'+Client_id AS [%KP_key]
or
CAST(Date AS varchar)+'||'+Client_id AS [%KP_key]
or
CAST(Date AS varchar)+'|'+Product_id+'|' AS [%KP_key]
and then I'm creating link table and I'm removing Client_id and Product_id fields.
When I add this and I want see what is Sum_c on Products, there are just zeros and after products there is a row with "-" id and whole sum. How it should look like -> Sum_c value should be repeated on every product.
Any ideas how can I do that or how to optimize my data model?
SumsC:
SELECT
TIME Date,
ID Client_id,
Value Sum_c
FROM Sums
where ID in (select ID from Clients);
SumsP:
SELECT
TIME Date,
ID Product_id,
Value Sum_p
FROM Sums
where ID in (select ID from Products);
Clients_data:
SELECT
TIME AS Date,
sth1 Sth1_id, //some dimensions
sth2 Sth2_id,
sth3 Sth3_id, //there are a few more - 8
Value Client_cost,
ID Client_id
FROM
Clients_data;
Products_data:
SELECT
TIME AS Date,
sth1 Sth1_id, //there are also 8 more dimensions as in Clients_data
sth2 Sth2_id,
sth3 Sth3_id,
Value Product_cost,
ID Product_id
FROM
Products_data;
Incomes:
SELECT
TIME AS Date,
ID Client_id,
ID Product_id,
Value Income,
Amount Amount,
Client_counter,
Product_counter
FROM
Incomes;
======
Updates:
I have some progress. I found similar problem here: database - Handling multiple fact tables in Qlikview - Stack Overflow. So:
a) I did temp tables with whole data from database.
Temp_SumsC:
SELECT
TIME Date,
ID Client_id,
Value Sum_c
FROM Sums
where ID in (select ID from Clients);
...
b) I loaded temp tables to right tables using key
SumsC:
Load Value,
Date & '|' & Client_id AS %C_key
Resident Temp_SumsC
...
c) I added data from all tables to TempLinkTable
TempLinkTable:
LOAD Distinct
Date, Client_id
Resident Temp_SumsC
Concatenate(TempLinkTable)
...
d) I created LinkTable with all keys and dimension fields
LinkTable:
LOAD Distinct
Date, Client_id , Product_id,
Sth1_id, Sth2_id, Sth3_id,
Date & '|' & Client_id %C_key,
Date & '|' & Product_id %P_key,
Date & '|' & Client_id & '|' & Sth1_id & '|' & Sth2_id & '|' & Sth3_id %CA_key,
Date & '|' & Product_id & '|' & Sth1_id & '|' & Sth2_id & '|' & Sth3_id %PA_key,
Date & '|' & Client_id & '|' & Product_id %CP_key
Resident TempLinkTable
e) I removed all unnecessary tables
DROP Table Temp_SumsC;
...
After script reloaded, the Sum_c and Sum_p started work, but Client_cost and Product_cost still doesn't work.
I solved this problem in the following way: I've changed subsection c). Instead of this:
TempLinkTable:
LOAD Distinct
Date, Client_id
Resident Temp_SumsC
Concatenate(TempLinkTable)
...
I used this:
TempLinkTableP:
LOAD Distinct Date, Client_id, Product_id
Resident Incomes
outer join (TempLinkTableP)
LOAD Distinct Date, Product_id, Sth1_id, Sth2_id, Sth3_id
Resident Products_data;
NoConcatenate
TempLinkTableC:
LOAD Distinct Date, Client_id, Product_id
Resident Incomes
outer join (TempLinkTableC)
LOAD Distinct Date, Client_id, Sth1_id, Sth2_id, Sth3_id
Resident Clients_data;
NoConcatenate
TempLinkTable:
LOAD Distinct Date, Product_id, Client_id, Sth1_id, Sth2_id, Sth3_id
Resident TempLinkTableP;
Concatenate
LOAD Distinct Date, Product_id, Client_id, Sth1_id, Sth2_id, Sth3_idResident TempLinkTableC;
Drop Table TempLinkTableC;
Drop Table TempLinkTableP;
And it works - the values are correct. Unfortunately, instead of fast loading data, the app still works slow, but this problem is for another question.
Thanks for all help Marcus.
You need to avoid any synthetic keys (unless you knows excactly what you are doing) and for this you might need to build composite keys and/or create a link-table and/or put tables per concatenating/joining/mapping together. Here you will find a collection of links how do you could change your script: Get started with developing qlik datamodels.
- Marcus
It' difficult to say what could be wrong - maybe there is inspite of the linktable no or not the right association between the tables to show or calculate the correct results. You could build a small tablebox with only a few fields around Client_cost and Product_cost with a date-, categorie-field or something similar and if this didn't returned "normal" data-rows something with the association don't worked properly.
Maybe your approach to put everything in the linktable is not completely suitable (and often not the easiest way) and a mix from different approaches from join/map tableparts together, adds other tables per concatenate (unlike sql asymetric tables are not a problem - it worked great) and use linktables (by larger datasets it's useful to use an autonumber(ForTheCompositeKey) - the performance will be better) for the rest which isn't covered by the other approaches.
- Marcus
For me it looked if there aren't any changes within this tread - updates from postings are (mostly) not helpful, please use the answer-button.
- Marcus
Thanks for answer!
I'll look at your link - I see that there are a lots of interesting things, but I think I tried everything from there and it didn't solve my problem to the end.
What I found out so far - when I remove all other sthX_id fields it works... Maybe because Client_cost and Product_cost are connected with these dimensions. So now I am trying to separate these fileds between Clients_data and Product_data - they should be connected with dimension definition, but not between fact tables. I know that for this I could use QUALIFY, but I'm not quite sure how to use it. Of course when dimension definition will have only id and name, then I'll use mapping to get dimension name.
Another overcome, which I found is to use QUALIFY and when I have two the same dimensions I use trigger to change this second when first is changed.
PS. I've edited my main post, because I couldn't add any reply for your answers.
I think that your datamodel isn't suitable yet - that there are still missing associations and/or a too complex structure and qualifying will be probably not helpful rather the opposite. Qualifying is only useful to separate data-areas from each other which is most not the aim within qlikview else the opposite again the association from (all) data is the aim. People with a sql-background where qualifying and normalization is quite common think often too complicated by building a datamodel in qlikview, many things (not all) a quite easy .
The easiest data-model is the star-sheme with one fact-table (if you have several fact-tables they will be joined/mapped/concatenated) and various dimension-tables which are direct linked to the fact-table (chained dimension-tables will be de-normalized by joining/mapping/concatenating the tables together). Thes description is a bit black and white colored and the reality is more grey but it should give you a direction and most often other datamodels needs more efforts to build them.
I suggest that you start with a simplified approach by not loading all tables and all fields and checked then within tableviewer and with one or two simple table-charts if the loaded data are like you expect them and then you could increase the complexity again step by step.
- Marcus
The point is that once there was one big table with all measures and dimensions, but it was splitted to increase performance of reports using SSAS (SQL Server Analysis Services), because many informations were duplicated in this one table.
Now I think that I should use this splitted table in QV - but maybe here I'm wrong? Does QlikView really like big tables with milions of records? I'm not quite sure what I should thinking about it.
You said about data-model which is the star-scheme. I have now such scheme, but the main table is not fact-table but link-table with connections products-clients-data. And something like this is not too good, right?
Quite often worked such big table very well in qlikview. It couldn't be compared with big tables in normal sql-databases then qlikview stored only the distinct values of a field and used then bit-stuffed pointer, see: Symbol Tables and Bit-Stuffed Pointers.
Nevertheless it will depend on various factors especially in which area will you have more likely a bottleneck which model-approach will fit best - see here what is meant: TableView Model - Star Or Snowflake - Does It Matter?.
Therefore I suggest you start with a simple model which worked and only there are bigger performance problems or you have enough time you begin to optimize this and that.
- Marcus
I followed your advice and I started from loading two problematic tables. Then I saw that already at this point I had additional row with whole sum (what I didn't want). When I added tables Sum_c and Sum_p (there were $syn keys) - everything was fine again...
I saw that table with all $syn keys was like full outer join of tables, whereas I just could do Concatenate or plain joins.
Is it possible to do full outer join of my tables (clients_data and products_data)?
*of course without numeric values, only dimensions to do Link Table
It's quite hard stuff here Should We Stop Worrying and Love the Synthetic Key? and I agree synthetic keys aren't a general problem. From them I recommend the comment from mov:
My two cents (no time for much more...)
If you are a beginner, the recomandation stays true - avoid synthetic keys. As a rule it's a result of poor design.
If you're an expert, it's a different story. You know the risks and understand what you're doing.
(Compare to car driving, you may by chance run on two left wheels, but better leave it for the experts :-))
- Marcus