Skip to main content
Announcements
Introducing Qlik Answers: A plug-and-play, Generative AI powered RAG solution. READ ALL ABOUT IT!
cancel
Showing results for 
Search instead for 
Did you mean: 
therealdees
Creator III
Creator III

Help with data model with link table and intervalmatch tables

Hi,

I'm having a hard time trying to build a data model that relates Inventory and Sales fact tables. The problem is I don't want to concatenate everything in a single fact.

In both Sales fact and Inventory fact there are 4 fields that could or should be considered as a key: deposit_key, date_key, store_key and product_key.

Each store could have more than 1 deposit and the sum of these deposits will be the available quantity for a product in that store, thus the sum of every store deposit is equal to the total available quantity.

The inventory data is based on a snapshot by the end of each month. What I did, was build a link table with a composite key (date_key|store_key|deposit_key|product_key), but considering stock date_key will be limited to the last day of each month, I used MonthEnd(sale_date) to relate both facts to the same month. Finally, the link table will have each key loaded from each table (deposit_key, date_key, store_key, product_key and composite key) and I'm able to link the dimensions to the link table, but the facts will only hold the composite key. So far the model seems to be working fine and look like this:

bbbb.png

 

The problem begins when I try to bring some other tables to the model. I have a Price History and a Cost History (probably a SCD, not a fact) qvds that hold the product_key and a Start Date and End Date for each product price/cost. When I IntervalMatch these tables, it will create a synthetic key as I'm using the calendar_key and product_key to match the intervals. It gets even more messy when I bring a "Goal Table" (that is, the sales goals per store) as it uses store_key and calendar_key too.

It's funny because when analyzing the data in a table it seems to work correctly (with some very few exceptions that I'm not sure yet if it's related to the model), although the model looks quite complex and the synthetic table carries too many rows (260kk in this case).

After loading every table needed the model looks like this:

ccc.png

 

Maybe the best approach would be having different apps with different analyzing purposes, but in this case the final user wants a dynamic table that he can combine different dimensions and measures (e.g inventory quantity + sold quantity + price ranges + costs ranges + store segmentation, etc.):

 

ddd.png

 

SKU = product_key

Faturamento = revenue

Qtd. Vendida = sold amount

Preço Tabela = unit selling price (should correspond to the filter period)

Qtd. Estoque (Total) = in stock quantity

Custo Unitário = unit buying cost

 

 

What would be the best approach here? Am I in the right path?

Any suggestion is appreciated. Thanks.

 

 

Labels (3)
1 Solution

Accepted Solutions
Oleg_Troyansky
Partner Ambassador/MVP
Partner Ambassador/MVP

Hi,

These are all good questions, and this is what Data Modeling is for. I can't explain everything in this post, but a few words, in a nutshell:

- Technically, every data table that is driven by date, can be treated as a Fact Table. For slowly changing dimensions, you could integrate (join) them into corresponding fact(s), but then you will find more common fields - for example, if Cost is a slowly changing dimension (it usually is), then you'd need to integrate Cost into Inventory and into Sales, and then it becomes a common field, unless you keep them in two separate fields (that's also an acceptable solution)

- I'd concatenate Inventories, Sales, and Goals as fact tables, and join Costs and Prices into the corresponding facts, to eliminate slowly changing dimensions and additional links. That being said, if your data size allows, you could fit those three facts into a Link Table as well. Pick your poison...

- In order to restore missing associations in a Concatenated Fact scenario, we use either Generic Link Tables, or a Set Analysis solution. Both are pretty advanced, and both are taught at the Masters Summit for Qlik. Considering the complexity of your data models, you should really make an effort to come to Orlando. Convince your employer that this is money well spent 🙂

Good luck!

 

View solution in original post

5 Replies
Oleg_Troyansky
Partner Ambassador/MVP
Partner Ambassador/MVP

Hi there!

This is a very complex problem indeed, and it requires solid knowledge of advanced Data Modeling. While we won't be able to solve the whole problem in the boundaries of a forum post, here are some pointers about the problem itself, and about potential sources of the needed knowledge:

- IF this data model has to remain a Link Table data model, then you should analyze ALL of your various facts, including Goals, Prices, etc., and identify ALL combinations of common keys - not just those from Sales and Inventory. Then, the various combinations of common keys should be placed in the Link table, to avoid these 7 levels of synthetic keys (those just won't fly...)

- Keep in mind, that the more Fact tables need to be added to the Link Table structure, the more complex it will get. This is why I believe that Concatenated Fact might be a better idea for you, even with the complexities that you will face there.

- Another consideration is performance - the Link Table data models don't perform very well with large data sets, so if you are expecting your data to get big, you should try and avoid the Link Table.

Now, to the sources of knowledge that you should look into:

- Check if you can join one of the upcoming sessions of the Masters Summit for Qlik - we will be in Dublin in October and in Orlando in November. One of our sessions is Advanced Data Modeling with Barry Harmsen, and he will teach you both the Link Tables and the Concatenated Facts techniques, and more. Another useful session is the Performance Tuning session that I teach, and that will teach you why Link Table may not be the best structure for you.

- Also, look at my book QlikView Your Business. Even though it  was written primarily for QlikView, it describes the universal Qlik Data Modeling principles that you need, with a lot of hands-on practice - you will learn in detail how to build your data structure as a Link Table, and as a Concatenated Fact. And, Inventory Analysis is one of the business scenarios that I describe in the book. It's perhaps the most advanced Qlik Data Modeling tutorial you could find anywhere. Check it out. 

 

therealdees
Creator III
Creator III
Author

Hi Oleg!! Thank you very much for the reply and links provided. Will sure look into your book! Can't promise to attend the Masters Summit as I'm in Brazil, but I'd be glad to participate if I could to.

 

As for concatenating the facts. If I go that way, what should be concatatenated? The Sales and Inventory only? Or every other fact?

Actually, are price and cost tables indeed fact tables? Arent them slowly changing dimensions?

Given every table in the example above, what would you concatenate and what would you not?

And finally, if I concatenate the facts will I be able to see the inventory level and quantity sold and other attributes without being limited to the fact itself? (e.g will I be able to see data for skus that havent been sold, just like in the table printed above?)

Oleg_Troyansky
Partner Ambassador/MVP
Partner Ambassador/MVP

Hi,

These are all good questions, and this is what Data Modeling is for. I can't explain everything in this post, but a few words, in a nutshell:

- Technically, every data table that is driven by date, can be treated as a Fact Table. For slowly changing dimensions, you could integrate (join) them into corresponding fact(s), but then you will find more common fields - for example, if Cost is a slowly changing dimension (it usually is), then you'd need to integrate Cost into Inventory and into Sales, and then it becomes a common field, unless you keep them in two separate fields (that's also an acceptable solution)

- I'd concatenate Inventories, Sales, and Goals as fact tables, and join Costs and Prices into the corresponding facts, to eliminate slowly changing dimensions and additional links. That being said, if your data size allows, you could fit those three facts into a Link Table as well. Pick your poison...

- In order to restore missing associations in a Concatenated Fact scenario, we use either Generic Link Tables, or a Set Analysis solution. Both are pretty advanced, and both are taught at the Masters Summit for Qlik. Considering the complexity of your data models, you should really make an effort to come to Orlando. Convince your employer that this is money well spent 🙂

Good luck!

 

therealdees
Creator III
Creator III
Author

Oleg, thank you very much for the insights.

 

I will try to achieve the result by concatenating the facts. I was trying to avoid it since it results in a big and confusing table, but I'm convinced now.

And I'll def try to convince my employer to send me to Orlando 😎

therealdees
Creator III
Creator III
Author

Hey, Oleg

 

I concatenated inventory and sales, then also concatenated the current price for each SKU (just the price and the sku and a flag to identify the source). It's working almost perfectly, except when I add a dimension that doesn't relate directly to the current price (as it has no values for stores, for example).

alalala.png

As you can see, it shows the price even for items that had no sales and have no available stock (which is exactly what I want, except I need it to show it for every SKU record, even if it's duplicated if I open another dimension, e.g stores)

 

I tried to tweak with {1<>}, Total, All, Distinct, everything, but I can't make it to work. The closest I got was with this expression:

Aggr(Only({<fato_origem = {'Preço Atual'}, mes = , ano = , data = >} produto_preco_atual), key_produto_sku)

 

But it only shows the value for a single record, as shown in the print below:

kkk.png

 

 

Any idea how could I achieve this?