I have a number of questions about best practices when architecting a solution built on a ledger so as to incorporate incremental loads; any help would be much appreciated!
Our database is a ledger, to which rows are only ever added. A reduction to a previous entry is applied as a new row with a negative value, so that aggregating the rows yields the current values. This makes the table ideal for incremental extracts.
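To illustrate what I mean, here is a rough sketch of the incremental extract I have in mind (all field, table, and file names are placeholders for our actual schema). Because the ledger is append-only, the script only ever needs rows newer than the high-water mark already stored in the QVD, with no update or delete handling:

```
// Hypothetical names throughout -- adjust to the real schema.
LET vQvdPath = 'Ledger.qvd';

// Establish the high-water mark from the existing QVD, if there is one.
IF NOT IsNull(QvdCreateTime('$(vQvdPath)')) THEN
    MaxDate:
    LOAD Max(EntryDate) AS MaxEntryDate FROM [$(vQvdPath)] (qvd);
    LET vLastLoad = Peek('MaxEntryDate', 0, 'MaxDate');
    DROP TABLE MaxDate;
ELSE
    LET vLastLoad = Num(MakeDate(1900, 1, 1));
END IF

// Append-only ledger: fetch only rows past the high-water mark.
Ledger:
SQL SELECT EntryID, Region, EntryDate, Amount
FROM LedgerTable
WHERE EntryDate > $(vLastLoad);

// Stitch the stored history back on, then persist the full set.
IF NOT IsNull(QvdCreateTime('$(vQvdPath)')) THEN
    Concatenate (Ledger)
    LOAD * FROM [$(vQvdPath)] (qvd);
END IF

STORE Ledger INTO [$(vQvdPath)] (qvd);
```

The exact date comparison in the SQL WHERE clause would of course need adapting to the source database's date handling.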
We want to do extracts broken down by various criteria, some overlapping (e.g. region, year).
Once the data has been extracted and split into the various QVDs, we will have a suite of dashboards consuming different selections of these QVDs. A large part of this exercise is to ensure that data is consistent across the dashboards, where currently it might be interpreted in different ways.
Would the best approach to the initial extract be multiple QVWs running much the same code with different WHERE clauses, generating multiple QVDs, or one initial QVW that produces a single large QVD which multiple QVWs then consume?
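For the second option, I imagine the split step would look something like this loop over a distinct field list (again, names are placeholders):

```
// "One big QVD, many slices" sketch -- slice the master extract by Region.
RegionList:
LOAD Distinct Region FROM [Ledger.qvd] (qvd);

FOR i = 0 TO NoOfRows('RegionList') - 1
    LET vRegion = Peek('Region', $(i), 'RegionList');

    Slice:
    LOAD * FROM [Ledger.qvd] (qvd)
    WHERE Region = '$(vRegion)';   // a plain WHERE breaks the optimised load

    STORE Slice INTO [Ledger_$(vRegion).qvd] (qvd);
    DROP TABLE Slice;
NEXT i

DROP TABLE RegionList;
```

My understanding is that a single-field WHERE Exists() would keep the load optimised, whereas the comparison above forces an unoptimised load, which might matter at our volumes.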
I can see advantages both to multiple initial loads and to having just one initial extract.
Am I obviously missing anything here?
Once we have the QVDs in place, split by region / year etc., we plan to add another aggregation layer on top. While some of our dashboards will be able to consume this data directly, for others we will want to aggregate it to various levels. There will therefore be a set of QVWs that take this data, apply aggregation, and create a new selection of tailored QVDs. This starts to get into the territory of ad-hoc logic, which we want to move away from, so ideally these aggregation QVWs will all be copied from a template, with a user interface allowing the backend script to be configured via variables (levels of aggregation, fields excluded, etc.). Considerations at this stage:
Any other common abstractions that I might need to take into account? I'm surely missing something here.
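The template idea above is that every aggregation QVW runs the same script, with only the variables changing. Something along these lines, where the grouping fields and source/target QVDs are all assumptions to be set per deployment:

```
// Variable-driven aggregation template (hypothetical names).
// Each copy of the template only changes these three variables,
// typically via an input box on the QVW's front end.
SET vGroupBy   = 'Region, Year';
SET vSourceQvd = 'Ledger.qvd';
SET vTargetQvd = 'Ledger_ByRegionYear.qvd';

Agg:
LOAD
    $(vGroupBy),
    Sum(Amount) AS Amount
FROM [$(vSourceQvd)] (qvd)
GROUP BY $(vGroupBy);

STORE Agg INTO [$(vTargetQvd)] (qvd);
DROP TABLE Agg;
```

Keeping all the logic behind dollar-sign expansion like this is what I mean by avoiding ad-hoc script in each QVW.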
Ideally, this means that for every dashboard we should be able to load the section access from the database and then load the data from the QVD(s), both fact and relevant dimension data. Given that the QVDs are incrementally loaded, what is the best approach to refreshing the dashboards themselves? I have seen in the past that when a dashboard reloads, it can either kick current users off if the reload takes too long, or reset all their selections. Is there a way to maintain a user's session through a reload? Would incrementally loading the dashboard from the QVD make the most sense, or simply doing an optimised load of the entire QVD? I imagine this differs on a case-by-case basis and would need testing?
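For reference, the dashboard-side script I am picturing is roughly this (security table and field names are assumptions; the reduction field must be uppercase in both the section access table and the data model for reduction to work):

```
// Section access pulled from the database, then the data layer.
Section Access;
LOAD ACCESS, NTNAME, REGION;
SQL SELECT ACCESS, NTNAME, REGION FROM SecurityTable;
Section Application;

// A bare LOAD * with no WHERE clause and no transformations
// keeps this an optimised (fast) QVD load.
Facts:
LOAD * FROM [Ledger_UK.qvd] (qvd);

// The fact field used for reduction must match the section
// access field name exactly, hence the rename.
RENAME Field Region TO REGION;
```

Whether the optimised full load beats an incremental dashboard load presumably depends on QVD size versus the delta, which is the part I expect needs testing.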
Thanks for any and all help, I've obviously got quite a lot to investigate.