4 Replies Latest reply: Sep 11, 2009 10:25 PM by John Witherspoon RSS

    Synthetic keys and concatenation

      I have a rather basic concept question. The way our data is structured is we pull alot of different types of data (bookings vs invoice) from the exact same fields. So I have to pull the same data in multiple times. When it comes to avoiding synthetic keys, am I suppose to concat the fields that i want to be linked in the tables and rename the similar ones I don't? Its similar to what i do now. however, i dont' concat i just leave the field names the same names. Thoughts on organizing such large data for best performance?




        • Synthetic keys and concatenation
          Michael Solomovich


          From what I understand from your description, you're doing it the right way. You have to concatenate fields only if you want to link tables by more than one field, that's probably not the case here.


          • Synthetic keys and concatenation
            Oleg Troyansky


            if you are getting synthetic keys, than you probably do something wrong... (skip if you don't get synthetic keys).

            The idea of concatenation is to concatenate tables (not fields!) when more than 1 table have more than key in common. For example:


            load Item, Customer, Date, Order Quantity from Orders;

            concatenate load Item, Customer, Date, Invoiced Quantity from Invoices;


            In this example, the resulting table will store both Orders and Invoices, with the common fields Item, Customer and Date. The other two fields - Order Qiantity and Invoice Quantity will only have a value when available.


            This way, you avoid synthetic keys and avoid the need in creating a Link Table. The nice thing about concatenation is that now you can also load tables with other repeating fields, for example:

            Customer, Date, Budget Amount

            Customer and Date will be shared fields, but not the Item.



              • Synthetic keys and concatenation


                What do you recommend when you have to join 2 HUGE qvd's file through 3 fields? I have been thinking this over but I currently have the following problems with my options:

                1. I can create a common compound key, but this kills the optimized QVD load.

                2. I've tried concatenate loading the 2nd table, but this also kills the qvd super-fast load.

                3. Synthetic keys (obviously not the best option but it is the fastest)

                  • Synthetic keys and concatenation
                    John Witherspoon

                    I suspect that synthetic keys ARE the best option. I'm guessing that your ideal data layout for reporting speed would be three tables - your two source tables with an autonumbered or hashed key, and then another table that contained your three real fields linked to that key. And that seems to be what QlikView is automatically generating for you with the synthetic key. And it's also the fastest way to load the data. So as best I can tell, it should be the fastest to load, the fastest to process, the simplest script, and so on. Seems like the best option to me.

                    That said, if one of the two main tables was a master table with ALL of the keys, the third table is a bit of a waste. In that case, if chart performance were a serious issue compared to load performance, I'd at least test a two table alternative. On the master table, I'd add the hash key. On the other table, I'd replace the three fields with the hash key. It would load more slowly, but it seems at least plausible that it might be a little faster to process in your charts. Only testing would show for sure, though.