Skip to main content
Announcements
Have questions about Qlik Connect? Join us live on April 10th, at 11 AM ET: SIGN UP NOW
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Synthetic keys and concatenation

I have a rather basic concept question. The way our data is structured is we pull alot of different types of data (bookings vs invoice) from the exact same fields. So I have to pull the same data in multiple times. When it comes to avoiding synthetic keys, am I suppose to concat the fields that i want to be linked in the tables and rename the similar ones I don't? Its similar to what i do now. however, i dont' concat i just leave the field names the same names. Thoughts on organizing such large data for best performance?

Thanks,

1 Solution

Accepted Solutions
Oleg_Troyansky
Partner Ambassador/MVP
Partner Ambassador/MVP

Anthony,

if you are getting synthetic keys, than you probably do something wrong... (skip if you don't get synthetic keys).

The idea of concatenation is to concatenate tables (not fields!) when more than 1 table have more than key in common. For example:

load Item, Customer, Date, Order Quantity from Orders;

concatenate load Item, Customer, Date, Invoiced Quantity from Invoices;

In this example, the resulting table will store both Orders and Invoices, with the common fields Item, Customer and Date. The other two fields - Order Qiantity and Invoice Quantity will only have a value when available.

This way, you avoid synthetic keys and avoid the need in creating a Link Table. The nice thing about concatenation is that now you can also load tables with other repeating fields, for example:

Customer, Date, Budget Amount

Customer and Date will be shared fields, but not the Item.

regards,

Oleg

View solution in original post

4 Replies
Anonymous
Not applicable
Author

Anthony,

From what I understand from your description, you're doing it the right way. You have to concatenate fields only if you want to link tables by more than one field, that's probably not the case here.

Regards,
Michael

Oleg_Troyansky
Partner Ambassador/MVP
Partner Ambassador/MVP

Anthony,

if you are getting synthetic keys, than you probably do something wrong... (skip if you don't get synthetic keys).

The idea of concatenation is to concatenate tables (not fields!) when more than 1 table have more than key in common. For example:

load Item, Customer, Date, Order Quantity from Orders;

concatenate load Item, Customer, Date, Invoiced Quantity from Invoices;

In this example, the resulting table will store both Orders and Invoices, with the common fields Item, Customer and Date. The other two fields - Order Qiantity and Invoice Quantity will only have a value when available.

This way, you avoid synthetic keys and avoid the need in creating a Link Table. The nice thing about concatenation is that now you can also load tables with other repeating fields, for example:

Customer, Date, Budget Amount

Customer and Date will be shared fields, but not the Item.

regards,

Oleg

Not applicable
Author

Oleg,

What do you recommend when you have to join 2 HUGE qvd's file through 3 fields? I have been thinking this over but I currently have the following problems with my options:

1. I can create a common compound key, but this kills the optimized QVD load.

2. I've tried concatenate loading the 2nd table, but this also kills the qvd super-fast load.

3. Synthetic keys (obviously not the best option but it is the fastest)

johnw
Champion III
Champion III

I suspect that synthetic keys ARE the best option. I'm guessing that your ideal data layout for reporting speed would be three tables - your two source tables with an autonumbered or hashed key, and then another table that contained your three real fields linked to that key. And that seems to be what QlikView is automatically generating for you with the synthetic key. And it's also the fastest way to load the data. So as best I can tell, it should be the fastest to load, the fastest to process, the simplest script, and so on. Seems like the best option to me.

That said, if one of the two main tables was a master table with ALL of the keys, the third table is a bit of a waste. In that case, if chart performance were a serious issue compared to load performance, I'd at least test a two table alternative. On the master table, I'd add the hash key. On the other table, I'd replace the three fields with the hash key. It would load more slowly, but it seems at least plausible that it might be a little faster to process in your charts. Only testing would show for sure, though.