Solved: Re: Problem with load script - Qlik Community

Report Inappropriate Content · ‎2015-07-06

I have such code in script as is below (I've changed names and remove some dimensions). When I'm loading data as it is in the script - everything is fine. But this script it is not efficient on the large data (there are many syncs key so loading is very slow), so I wanted to add some keys or use "Concatenate". If I add keys then some values doesn't show on some dimensions:

my keys according to used dimensions in table:

CAST(Date AS varchar)+'|'+Product_id+'|'+Client_id AS [%KP_key]

or

CAST(Date AS varchar)+'||'+Client_id AS [%KP_key]

or

CAST(Date AS varchar)+'|'+Product_id+'|' AS [%KP_key]

and then I'm creating link table and I'm removing Client_id and Product_id fields.

When I add this and I want see what is Sum_c on Products, there are just zeros and after products there is a row with "-" id and whole sum. How it should look like -> Sum_c value should be repeated on every product.

Any ideas how can I do that or how to optimize my data model?

SumsC:

SELECT

TIME Date,

ID Client_id,

Value Sum_c

FROM Sums

where ID in (select ID from Clients);

SumsP:

SELECT

TIME Date,

ID Product_id,

Value Sum_p

FROM Sums

where ID in (select ID from Products);

Clients_data:

SELECT

TIME AS Date,

sth1 Sth1_id, //some dimensions

sth2 Sth2_id,

sth3 Sth3_id, //there are a few more - 8

Value Client_cost,

ID Client_id

FROM

Clients_data;

Products_data:

SELECT

TIME AS Date,

sth1 Sth1_id, //there are also 8 more dimensions as in Clients_data

sth2 Sth2_id,

sth3 Sth3_id,

Value Product_cost,

ID Product_id

FROM

Products_data;

Incomes:

SELECT

TIME AS Date,

ID Client_id,

ID Product_id,

Value Income,

Amount Amount,

Client_counter,

Product_counter

FROM

Incomes;

======

Updates:
I have some progress. I found similar problem here: database - Handling multiple fact tables in Qlikview - Stack Overflow. So:
a) I did temp tables with whole data from database.

Temp_SumsC:
SELECT
  TIME Date,
  ID Client_id,
  Value Sum_c
FROM Sums
where ID in (select ID from Clients);

...

b) I loaded temp tables to right tables using key

SumsC:

Load Value,

Date & '|' & Client_id AS %C_key

Resident Temp_SumsC

...

c) I added data from all tables to TempLinkTable

TempLinkTable:

LOAD Distinct

Date, Client_id

Resident Temp_SumsC

Concatenate(TempLinkTable)

...

d) I created LinkTable with all keys and dimension fields

LinkTable:
LOAD Distinct
     Date, Client_id , Product_id,
     Sth1_id, Sth2_id, Sth3_id,
     Date & '|' & Client_id %C_key, 
     Date & '|' & Product_id %P_key,      
     Date & '|' & Client_id & '|' & Sth1_id & '|' & Sth2_id & '|' & Sth3_id %CA_key, 
     Date & '|' & Product_id & '|' & Sth1_id & '|' & Sth2_id & '|' & Sth3_id %PA_key, 
     Date & '|' & Client_id & '|' & Product_id %CP_key
Resident TempLinkTable

e) I removed all unnecessary tables

DROP Table Temp_SumsC;
...

After script reloaded, the Sum_c and Sum_p started work, but Client_cost and Product_cost still doesn't work.

Report Inappropriate Content · ‎2015-08-20

I solved this problem in the following way: I've changed subsection c). Instead of this:

TempLinkTable: 
LOAD Distinct 
    Date, Client_id 
Resident Temp_SumsC 

Concatenate(TempLinkTable) 
...

I used this:

TempLinkTableP:
LOAD Distinct      Date, Client_id, Product_id
Resident           Incomes
outer join (TempLinkTableP)
LOAD Distinct      Date, Product_id, Sth1_id, Sth2_id, Sth3_id
Resident           Products_data;

NoConcatenate

TempLinkTableC:
LOAD Distinct      Date, Client_id, Product_id
Resident           Incomes
outer join (TempLinkTableC)
LOAD Distinct      Date, Client_id, Sth1_id, Sth2_id, Sth3_id
Resident           Clients_data;


NoConcatenate

TempLinkTable:
LOAD Distinct      Date, Product_id, Client_id, Sth1_id, Sth2_id, Sth3_id
Resident TempLinkTableP;
Concatenate
LOAD Distinct      Date, Product_id, Client_id, Sth1_id, Sth2_id, Sth3_id
Resident TempLinkTableC;


Drop Table TempLinkTableC;
Drop Table TempLinkTableP;

And it works - the values are correct. Unfortunately, instead of fast loading data, the app still works slow, but this problem is for another question.

Thanks for all help Marcus.

View solution in original post

marcus_sommer · ‎2015-07-06

You need to avoid any synthetic keys (unless you knows excactly what you are doing) and for this you might need to build composite keys and/or create a link-table and/or put tables per concatenating/joining/mapping together. Here you will find a collection of links how do you could change your script: Get started with developing qlik datamodels.

- Marcus

marcus_sommer · ‎2015-07-06

It' difficult to say what could be wrong - maybe there is inspite of the linktable no or not the right association between the tables to show or calculate the correct results. You could build a small tablebox with only a few fields around Client_cost and Product_cost with a date-, categorie-field or something similar and if this didn't returned "normal" data-rows something with the association don't worked properly.

Maybe your approach to put everything in the linktable is not completely suitable (and often not the easiest way) and a mix from different approaches from join/map tableparts together, adds other tables per concatenate (unlike sql asymetric tables are not a problem - it worked great) and use linktables (by larger datasets it's useful to use an autonumber(ForTheCompositeKey) - the performance will be better) for the rest which isn't covered by the other approaches.

- Marcus

marcus_sommer · ‎2015-07-07

For me it looked if there aren't any changes within this tread - updates from postings are (mostly) not helpful, please use the answer-button.

- Marcus

Report Inappropriate Content · ‎2015-07-08

Thanks for answer!
I'll look at your link - I see that there are a lots of interesting things, but I think I tried everything from there and it didn't solve my problem to the end.

What I found out so far - when I remove all other sthX_id fields it works... Maybe because Client_cost and Product_cost are connected with these dimensions. So now I am trying to separate these fileds between Clients_data and Product_data - they should be connected with dimension definition, but not between fact tables. I know that for this I could use QUALIFY, but I'm not quite sure how to use it. Of course when dimension definition will have only id and name, then I'll use mapping to get dimension name.

Another overcome, which I found is to use QUALIFY and when I have two the same dimensions I use trigger to change this second when first is changed.

PS. I've edited my main post, because I couldn't add any reply for your answers.

marcus_sommer · ‎2015-07-08

I think that your datamodel isn't suitable yet - that there are still missing associations and/or a too complex structure and qualifying will be probably not helpful rather the opposite. Qualifying is only useful to separate data-areas from each other which is most not the aim within qlikview else the opposite again the association from (all) data is the aim. People with a sql-background where qualifying and normalization is quite common think often too complicated by building a datamodel in qlikview, many things (not all) a quite easy .

The easiest data-model is the star-sheme with one fact-table (if you have several fact-tables they will be joined/mapped/concatenated) and various dimension-tables which are direct linked to the fact-table (chained dimension-tables will be de-normalized by joining/mapping/concatenating the tables together). Thes description is a bit black and white colored and the reality is more grey but it should give you a direction and most often other datamodels needs more efforts to build them.

I suggest that you start with a simplified approach by not loading all tables and all fields and checked then within tableviewer and with one or two simple table-charts if the loaded data are like you expect them and then you could increase the complexity again step by step.

- Marcus

Report Inappropriate Content · ‎2015-07-08

The point is that once there was one big table with all measures and dimensions, but it was splitted to increase performance of reports using SSAS (SQL Server Analysis Services), because many informations were duplicated in this one table.

Now I think that I should use this splitted table in QV - but maybe here I'm wrong? Does QlikView really like big tables with milions of records? I'm not quite sure what I should thinking about it.

You said about data-model which is the star-scheme. I have now such scheme, but the main table is not fact-table but link-table with connections products-clients-data. And something like this is not too good, right?

marcus_sommer · ‎2015-07-08

Quite often worked such big table very well in qlikview. It couldn't be compared with big tables in normal sql-databases then qlikview stored only the distinct values of a field and used then bit-stuffed pointer, see: Symbol Tables and Bit-Stuffed Pointers.

Nevertheless it will depend on various factors especially in which area will you have more likely a bottleneck which model-approach will fit best - see here what is meant: TableView Model - Star Or Snowflake - Does It Matter?.

Therefore I suggest you start with a simple model which worked and only there are bigger performance problems or you have enough time you begin to optimize this and that.

- Marcus

Report Inappropriate Content · ‎2015-07-09

I followed your advice and I started from loading two problematic tables. Then I saw that already at this point I had additional row with whole sum (what I didn't want). When I added tables Sum_c and Sum_p (there were $syn keys) - everything was fine again...

I saw that table with all $syn keys was like full outer join of tables, whereas I just could do Concatenate or plain joins.

Is it possible to do full outer join of my tables (clients_data and products_data)?

*of course without numeric values, only dimensions to do Link Table

marcus_sommer · ‎2015-07-09

It's quite hard stuff here Should We Stop Worrying and Love the Synthetic Key? and I agree synthetic keys aren't a general problem. From them I recommend the comment from mov‌:

My two cents (no time for much more...)

If you are a beginner, the recomandation stays true - avoid synthetic keys. As a rule it's a result of poor design.
If you're an expert, it's a different story. You know the risks and understand what you're doing.
(Compare to car driving, you may by chance run on two left wheels, but better leave it for the experts :-))

- Marcus