OmarBenSalem

Problem joining 2 tables using 2 composite keys!

Hi all @sunny_talwar , @hic  , @Gysbert_Wassenaar , @swuehl , @jagan , @Clever_Anjos , @marcus_sommer , @tresesco , @rwunderlich ..

I once faced a tough problem when trying to join 2 tables using 2 composite keys, where the key values were not always available.

I found a solution using an "index searching" generic script and shared it with the rest of the community (please READ IT to further understand my request): https://community.qlik.com/t5/Qlik-Sense-Documents/Search-for-occurances-Generic-Script-for-Qlik-by-...

To summarize:

- Table 1 has a composite Key1 (e.g. |xx|02|)

- Table 2 has another composite Key2 (e.g. |xx|1919|15|02|)

The goal is to join these 2 values, since the xx and 02 from Key1 exist in Key2: |xx|1919|15|02|.

While my method really works, the table that searches for the occurrences will contain ALL OF THE POSSIBLE COMBINATIONS (maybe BILLIONS of lines), which could hurt the app's performance.
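
To make the matching step concrete, here is a minimal sketch of one way to express it in the load script. All names (Table1, Table2, Key1, Key2) are placeholders, and the cross join is exactly the step that can explode:

// Build every candidate pair of keys, then keep only the pairs where
// each segment of Key1 also appears as a segment of Key2.
Pairs:
LOAD Distinct Key1 Resident Table1;
Join (Pairs)
LOAD Distinct Key2 Resident Table2;    // no common field -> cross product

KeyMap:
NoConcatenate
LOAD Key1, Key2
Resident Pairs
// Key1 is assumed to hold two segments, e.g. |xx|02|; the surrounding
// pipes prevent partial matches such as 02 matching inside 1902
Where Index(Key2, '|' & SubField(Key1, '|', 2) & '|') > 0
  and Index(Key2, '|' & SubField(Key1, '|', 3) & '|') > 0;

DROP Table Pairs;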

The question is: does ANYONE have another/BETTER way to do this without hurting the app's performance?

Thank you very much!

Omar BEN SALEM

26 Replies
OmarBenSalem
Author

The problem is that Key2 is not static. Depending on which values are filled in, the key changes (it's a combination of 4 fields that are not always populated).

marcus_sommer

As already hinted above, I think the key is not the real problem, but rather the missing/wrong field values within the data. This may be solvable with multiple stages of cleaning/preparing the data, but with really poor data quality you might not be able to restore everything.

Of course this could become quite difficult and complex and may cost a lot of effort, so I suggest re-thinking your entire concept and considering whether a concatenation of both tables (see also above, and the sketch below) might be more suitable. Of course, poor data quality remains poor, up to the point of being useless, but you would be able to load and show all available records, and you could use flags to show which data is valid and which is not, i.e. the degree of data quality. In my experience such an approach is much easier to realize than any join logic.
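
A minimal sketch of that concatenation-with-flags idea, assuming hypothetical field names (FieldA/FieldB for table 1, F1..F4 for table 2):

// Load both tables into one fact table; Source tells the rows apart and
// KeyComplete flags rows whose key fields are fully populated.
Facts:
LOAD
    'Table1' as Source,
    FieldA,
    FieldB,
    '|' & FieldA & '|' & FieldB & '|' as Key1,
    If(Len(Trim(FieldA)) > 0 and Len(Trim(FieldB)) > 0, 1, 0) as KeyComplete
Resident Table1;

Concatenate (Facts)
LOAD
    'Table2' as Source,
    F1, F2, F3, F4,
    '|' & F1 & '|' & F2 & '|' & F3 & '|' & F4 & '|' as Key2,
    If(Len(Trim(F1)) > 0 and Len(Trim(F2)) > 0
       and Len(Trim(F3)) > 0 and Len(Trim(F4)) > 0, 1, 0) as KeyComplete
Resident Table2;

DROP Tables Table1, Table2;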

- Marcus 

OmarBenSalem
Author

I'm sorry Marcus, but I still can't see how concatenating the 2 tables will resolve the issue?

Plus, I have no control over the data source. The functional strategy has been defined this way: we link lines of table 1 to any combination we find of the 4 fields of table 2.

Furthermore, if we find 5 possible links for one line (combinations of 3 fields, 2 fields, etc.), there is a priority order: we take the most refined one.

marcus_sommer

Of course concatenating the tables won't link them and doesn't result in a joined/associated table. But that's not the target. The aim is to make the required views in the UI possible - at this point the look and name of the data model is secondary.

The essential point is that tables without a controlled relationship can't be joined - at least not with simple methods. Even more important are missing key values, which can never be joined. Quite common in such cases is the use of a link table - which is mainly based on a concatenation approach; see the sketch below. Another method is to check both tables against each other and then populate all missing keys.
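
A rough sketch of such a link table, reusing the pipe-delimited keys from the original post (all names illustrative):

// The fact tables keep their composite key renamed to %Key; the link
// table collects all distinct keys plus the split-out dimensions, so
// the tables associate through %Key instead of a direct join.
LinkTable:
LOAD Distinct
    Key1                   as %Key,
    SubField(Key1, '|', 2) as Dim1,
    SubField(Key1, '|', 3) as Dim2
Resident Table1;

Concatenate (LinkTable)
LOAD Distinct
    Key2                   as %Key,
    SubField(Key2, '|', 2) as Dim1,
    SubField(Key2, '|', 5) as Dim2
Resident Table2;

// in Table1/Table2, rename Key1/Key2 to %Key and drop the raw dimension
// fields there, so the association runs only through the link table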

None of this is simple and it may need a lot of effort. Therefore, again, my suggestion is to start with the simplest method and concatenate these tables, because all data will be there / will be available without the need to use direct keys or even to create complex ones. Even if in the end you are able to apply another data model, it will be very helpful to validate the data - especially regarding data quality: how many wrong/missing values are there, and what do they look like - in order to estimate the measures and effort needed to clean/prepare them (with poor data quality you will need to do that with every kind of data model).

- Marcus

OmarBenSalem
Author

Thanks Marcus, but the functional team does not "judge" their data quality as poor; they are rather convinced that this is the only way they can do it.

Besides, we do need to join the tables since our expressions depend on it (we use conditions in table 2 to calculate measures in table 1).

Thank you for your time, but I really do not see any other way to do it other than the solution I've come up with, which is VERY resource heavy.

marcus_sommer

With regard to how the data is created within their productive databases, the data quality might not be poor but rather working as intended. But from the point of view of matching several data sources without a common unique key, the judgement of the data quality could be viewed differently.

I think I have a quite similar case in which I need to match sales data with provision data. There is no unique key; instead, depending on the context (categories, types), I need to use different keys on both sides, which often don't exist or aren't valid. Therefore I apply multiple stages beforehand to check them and to clean/fill/prepare the IDs - from several other sources but also from the origin tables. Afterwards I can combine the appropriate fields into keys - simplified to:

ID & Category & Period

whereby in the end I have around two dozen key values for a single record. Those keys, together with a bunch of other fields (string-concatenated), are then concatenated and loaded into a mapping table, whereby the load order ensures the priority in which matches happen later.

In the target table, multiple heavily nested ApplyMap() calls are then used to match the data. Records without a match are emulated in further steps.

I could imagine that you could apply a similar method with nested mappings. Of course these are heavy transformations, but mappings are much faster than joins, especially if there are multiple keys, which would require repeating the joins n times because joins can't be nested.
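
A condensed sketch of that priority pattern applied to this thread's keys (FieldA/FieldB in table 1; F1..F4 and a hypothetical TargetValue in table 2). When a mapping table contains duplicate keys, the first loaded value wins, which is what gives the load order its priority:

// One mapping table per key variant, from most refined to coarsest.
Map_F1F4:
Mapping LOAD
    '|' & F1 & '|' & F4 & '|' as LookupKey,
    TargetValue as LookupValue
Resident Table2;

Map_F1:
Mapping LOAD
    '|' & F1 & '|' as LookupKey,
    TargetValue as LookupValue
Resident Table2;

// Nested ApplyMap(): try the most refined key first, fall back to the
// coarser one, and flag records that match nothing.
Result:
LOAD *,
    ApplyMap('Map_F1F4', '|' & FieldA & '|' & FieldB & '|',
        ApplyMap('Map_F1', '|' & FieldA & '|',
            'NO MATCH')) as MatchedValue
Resident Table1;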

- Marcus

 

OmarBenSalem
Author

Thanks for your time, Marcus; I'll try to implement such a model and see how it works!