Our application has millions of records, around 5 GB of fact data, which is severely impacting dashboard performance and increasing the response time for any selection in the report. We tried to analyze the entire data set with different data cuts at once by using an ODAG app.
We have implemented the ODAG (On Demand App Generation) functionality to give users an aggregate view of big data and allow them to identify and load relevant subsets of data for detailed analysis. The ODAG template application (the detailed analysis app) can be generated using 2 methods:
We are dealing with multiple fact tables (these are relational facts) and dimensions with different where conditions. Of the two methods, which would be the better design approach for optimizing the performance of the app? Do let us know your thoughts.
Thanks in advance!
Hi, interesting topic
My PoV is that a direct database query from an ODAG app should be used only when:
1. You have really big data, so it would be a massive overhead to replicate it to QVDs. Something in the 100s of GBs / billions of rows.
2. You have a high-performing (MPP) database like Teradata or HP Vertica.
For any other scenario (without knowing the specifics of yours) I have found QVDs to be faster.
Of course, it matters a lot whether the QVD load is optimized and whether transformations happen in the ODAG app load script.
I have a similar issue: the data is large (though not in the 100s of GBs), and I am planning to use an ODAG application.
In my case, the data that I need to pull from Teradata is scattered across different fact tables, and I need to use multiple where conditions in order to pull only the required set of data into the ODAG application.
However, the ODAG script always works with one single where condition pulled from the selection dashboard, so I am unable to point the ODAG app directly at the database.
- To avoid this, I thought of doing all the transformations in a QVD generator, generating QVDs at the required level with filtered data, and using the ODAG app to connect to the QVDs instead of the Teradata tables.
Could you please advise whether this approach/design follows the standard way of using the ODAG feature? Please share your feedback.
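To make the "multiple where conditions" problem concrete, here is a minimal sketch in Python (not Qlik script) of how one set of user selections could be turned into a separate filter for each fact table. All table names, field names and the `build_where` helper are hypothetical, invented purely for illustration; it only shows the idea of applying a selection to the tables that actually contain the field, plus an optional table-specific condition.

```python
def quote(value):
    """Quote a value as a SQL string literal, doubling embedded single quotes."""
    return "'" + str(value).replace("'", "''") + "'"


def build_where(selections, table_fields, static_filter=None):
    """Build a WHERE clause for one fact table (hypothetical helper).

    selections:    field name -> list of values picked in the selection app
    table_fields:  fields that exist in this particular fact table
    static_filter: optional fixed condition that applies only to this table
    """
    conditions = []
    for field in table_fields:
        values = selections.get(field)
        if values:
            value_list = ", ".join(quote(v) for v in values)
            conditions.append(f"{field} IN ({value_list})")
    if static_filter:
        conditions.append(static_filter)
    return "WHERE " + " AND ".join(conditions) if conditions else ""


# Hypothetical selections and fact tables, for illustration only.
selections = {"Region": ["EMEA", "APAC"], "Year": ["2018"]}
fact_tables = {
    "SALES_FACT": (["Region", "Year"], "STATUS = 'BOOKED'"),
    "RETURNS_FACT": (["Region"], None),
}

clauses = {
    name: build_where(selections, fields, static)
    for name, (fields, static) in fact_tables.items()
}

for name, clause in clauses.items():
    print(name, clause)
```

With pre-built QVDs, the table-specific conditions (such as the fixed status filter in this sketch) would already be applied in the QVD generator, so the ODAG app would only need the part driven by the user's selections.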
Your statement of:
We are dealing with multiple fact tables (these are relational facts) and dimensions with different where conditions ...
gives me the impression that a SQL database has simply been transferred into Qlik as-is. Purely functionally this often works, but from a performance point of view it is usually a poor approach. If this is the case, it could be that with some optimizations and/or a somewhat adjusted data model you don't need the ODAG approach anymore. Therefore I suggest starting at the beginning, which is the data model.
Yes, I believe that you should first build QVD layers and connect your ODAG app to these.
QVDs allow for much better data remix, reuse and reload speed, especially if you need to combine multiple fact tables and create snowflake schemas. So yes, it is a preferred approach.
ODAG should really be about slicing a Big Data source into user-relevant datasets which can be handled in-memory.
It is a bit like a traditional query-based approach. This means that users will lose some of the potential of the associative data model, so it should be used carefully. (This is also a reason why Qlik is working on the Associative Big Data Index.)
So if you really don't have 100s of GBs of data, I would suggest that it is more effective to invest in building a solid QVD structure and adding RAM to your servers so that users can work with all of their data in one application.