ioannagr
Creator III

Loop through tables by 1000 and store result in a single table field

Hello all,

I have come up with a question that seems too advanced for me to solve on my own, so I am asking for your insight.

Let's say I have the following three tables: Table1, Table2, Table3,

and Table1's id1 has a 1:n relationship with id2 of Table2 (so Table2 has id1_fk in it),

while Table2's id2 has a 1:n relationship with id3 of Table3 (so Table3 has id2_fk in it and also a field called size_field).

I want to create an entirely new field in Table1 called SIZE2 that will be populated like this:

For the first batch, rows 1 to 1000 of Table1:

- take the 1st id1 and find the corresponding id2s

- for these id2s, find the corresponding id3s and their size_field

- sum the size_fields found

- store the result in SIZE2 in Table1

...

- take the 1000th id1 and find the corresponding id2s

- for these id2s, find the corresponding id3s and their size_field

- sum the size_fields found

- store the result in SIZE2 in Table1

Then go on to the second batch, rows 1001 to 2000, and repeat the above in a loop until all rows in Table1 (id1) are finished.
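
Roughly, this is the kind of loop I picture in the script (just a sketch to illustrate the idea, assuming Table1 is already loaded under that name; the middle part is exactly what I don't know how to write):

// hypothetical sketch of the batching only; the real work is still missing
For vStart = 0 to NoOfRows('Table1') - 1 step 1000

    BatchIds:
    Load id1
    Resident Table1
    Where RecNo() > $(vStart) and RecNo() <= $(vStart) + 1000;   // the next 1000 id1 values

    // somehow: follow id1 -> id2 -> id3, sum size_field per id1
    // and store the result into a new field SIZE2 of Table1

    Drop Table BatchIds;

Next vStart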

 

Is there a way to do so?

 

Thanks in advance,

Ioanna

9 Replies
Dalton_Ruer
Support

Why are you trying to do this in blocks of 1000 instead of just doing it all at one time?

Is there literally a field called SIZE_FIELD in Tables 2 and 3 or are you looking for a count of how many rows matched the key value?

ioannagr
Creator III
Author

No, SIZE_FIELD exists only in Table3. I don't want to load Table3 all at once, because it's huge, and I only want sum(size_field) for all the different id1s; the associations between the tables are the 1:n ones described above.

So, by going from Table1 to Table3 the way I described in my initial post, I want to create this new field SIZE2 with the corresponding information.

My thought was to do that in batches of 1000.

What do you think, @Dalton_Ruer? Can you help me out?

Dalton_Ruer
Support

Gotcha. 

If you create a simple Table object within your user interface, add any Table 1 dimension values that you want (and any Table 2 dimension values that you want), and then create a MEASURE in the table that is SUM(SIZE_FIELD), it will do that calculation for you on the fly.
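
For example, in a straight table (field names taken from your description):

Dimensions:  id1  (plus any other Table 1 / Table 2 fields you want)
Measure:     Sum(size_field)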

Qlik's Associative Model isn't worried about 1:1 or 1:n relationships. All values for Table 1 remain independent, as do the others.

As long as the tables are associated (the fields are named the same), the SUM(Size_Field) will work and will honor any filters you may provide the end user. Meaning I could show you how to sum the value and store it in a row for Table 2 and then get it for Table 1. But if you filtered to some value(s) in Table 2 or 3 the value you showed the user would be wrong, because it was pre-aggregated.

Give it a try and I think you will like what you see. If you have problems I can build a simple version that demonstrates.

ioannagr
Creator III
Author

But my end goal is to create that field and keep it populated on the back end. The table I'm speaking about is really huge, so huge that the Qlik engine crashes, and it gets bigger by the day.

So I don't want to create it in the UI; that's why I also need the batch logic in the script.

I am kind of lost. I think you can give me some insight and a way forward 🙂 @Dalton_Ruer

ioannagr
Creator III
Author


@Dalton_Ruer wrote:

Meaning I could show you how to sum the value and store it in a row for Table 2 and then get it for Table 1. But if you filtered to some value(s) in Table 2 or 3 the value you showed the user would be wrong, because it was pre-aggregated. 


This is what I am looking for, and I would be grateful if you shared your knowledge with me! I just want to learn how I can do this and how my idea would unfold, but my knowledge isn't that deep. I'm counting on yours! @Dalton_Ruer

ioannagr
Creator III
Author

Does anyone have an idea? 

marcus_sommer

I think I would try a different approach: not loops, but rather the more native Qlik features of filtering with exists() and using mappings, maybe something like:

a: load id1 as id2 from t1;                                                   // the t1 keys, renamed for the exists() filters below

b: load id2, id3 from t2 where exists(id2);                                   // only t2 records whose key occurs in t1

c: load id3, file_size from t3 where exists(id3);                             // only t3 records whose key occurs in the filtered t2

d: mapping load id3, sum(file_size) as file_size resident c group by id3;     // aggregated size per id3

e: mapping load id2, id3 resident b;                                          // lookup id2 -> id3

f: load *, applymap('d', id3, 0) as file_size;                                // preceding load: add the aggregated size
   load *, applymap('e', id2, '#NV') as id3 resident a;                       // map each id2 to its id3
just as a simplified approach.

Besides this, it may be sensible to implement it within some incremental logic.

- Marcus

ioannagr
Creator III
Author

@marcus_sommer Sounds very interesting. Trying to implement it, I came across two problems.

First of all, Table2 does not have id3 in it.

Table1 has id1

Table2 has id2, id1_fk

Table3 has id3, id2_fk, file_size.

Can you support me with this setup?

Also, I'm retrieving from a MySQL database, and where exists does not work well here. I tried to do:

a:
load id1 as id1_fk;
SQL SELECT id1
FROM db.table1;

b:
load id1_fk, id2;
SQL SELECT  id2, `id1_fk`
FROM db.table2
where exists (select id1_fk from a where a.id1_fk=b.id1_fk);   //looked this up, that's a suggested way, not sure if correct

but it doesn't work.

 

marcus_sommer

Exists() is a Qlik feature and can't be used directly within the SQL.

It could be applied within the preceding load part, but quite often that's not useful, because it means that all records are first pulled from the db and then filtered in Qlik. In general you can't simply mix Qlik loads and data with the SQL, because Qlik doesn't execute the SQL itself; it just transfers the statement to the db and receives the results. The only possibility is to put some information into a variable and use it within the SQL.
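
For example, something like this (only a rough sketch; it assumes the id1 values are already loaded in a Qlik table t1, the keys are numeric and the list is small enough for an in() clause):

KeyList: load concat(distinct id1, ',') as IdList resident t1;   // all id1 values as one comma-separated string
let vIdList = peek('IdList', 0, 'KeyList');
drop table KeyList;

b:
SQL SELECT id2, id1_fk
FROM db.table2
WHERE id1_fk IN ($(vIdList));                                    // the variable is expanded before the statement goes to the db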

But with larger filter logics it's often not very helpful, because most SQL functions are limited to a maximum number of items/parameters; in(), for example, usually accepts only a few hundred parameters. I assume that this was your original plan with the loop, but IMO it's rather the worst case instead of a sensible solution.

Better than this is to export the Qlik field within a table to a csv, which is then imported by the db and used, for example, as an inner-join filter within the SQL. Another approach would be to pull all needed tables and records from the db into Qlik within an incremental approach and store them as qvd's. Afterwards you will be able to apply the Qlik features again.
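
The Qlik side of the csv idea could look roughly like this (again only a sketch; the lib:// path is just an example and the import into a db-side staging table happens outside of Qlik):

Keys: load distinct id1 resident t1;                             // the key values the db should filter on
store Keys into [lib://DataFiles/id1_filter.csv] (txt);          // csv which the db side then imports, e.g. into a temp table
drop table Keys;

// afterwards the SQL could inner join against that imported table instead of using a huge in() list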

It may look like a disadvantage to pull large datasets into Qlik, because it needs some storage space and some run-time for the initial data load, but afterwards you get the full power and flexibility of the Qlik features. Applied within a multi-level data architecture with multi-level incremental logics, it's usually much faster than running complex queries against the db.
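
Such an incremental pull could look roughly like this for table3 (only a sketch; it assumes the source is db.table3, that id3 only ever grows, and the qvd path is just an example):

let vQvd = 'lib://DataFiles/table3.qvd';
let vMaxId = 0;

if not isnull(filesize('$(vQvd)')) then                          // qvd from earlier runs exists -> find the last loaded key
    MaxId: load max(id3) as MaxId from [$(vQvd)] (qvd);
    let vMaxId = peek('MaxId', 0, 'MaxId');
    drop table MaxId;
end if

Table3:
SQL SELECT id3, id2_fk, file_size
FROM db.table3
WHERE id3 > $(vMaxId);                                           // only the records not stored yet

if not isnull(filesize('$(vQvd)')) then
    concatenate(Table3) load * from [$(vQvd)] (qvd);             // append the records from earlier runs
end if

store Table3 into [$(vQvd)] (qvd);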

- Marcus