Not applicable

Huge Workload

Hi all, can anybody help me improve or change the logic of the following scenario? Our customer owns terabytes of data. They have, in turn, customers (with QlikView installed) who only need the data aggregated in the right way to build their own reports. The problem is that reloading the data takes roughly all night, and moving the QVDs that are created generates a huge workload.

We analysed the option of implementing incremental loads, but our customer would need the full "insert, update, delete" case, and adding a trigger on the DB, adding a ModificationTime field, and selecting the deleted rows in order to do an inner join with the QVDs may not be feasible since the DB is already overburdened.

We also looked at the QlikView Real Time module, but it seems performance is not so good when working with more than a few fields whose values change frequently. Furthermore, it seems the module is no longer available as a separate module/license in QlikView 11.2.

Another option is to implement Direct Discovery on a subset of the data, but the customers should not necessarily have direct access to the DB.

Have you ever worked with QlikView and message queuing to improve performance? Thanks in advance for your feedback.

Have a nice day,

Eros

6 Replies
vgutkovsky
Master II

Eros, I wouldn't write off the idea of incremental reloads if I were you. I think that approach may be necessary in your case. Yes, you would need to implement delete and update handling, but I don't see any other way around it.
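
To make it concrete, here is a minimal sketch of the standard insert/update/delete incremental pattern in QlikView script, assuming the source can expose a ModificationTime audit field; the table, field, and variable names (Orders, OrderID, vLastReloadTime) are illustrative only, not from your scenario:

// 1. Pull only rows inserted or updated since the last run.
Orders:
LOAD OrderID, CustomerID, Amount, ModificationTime;
SQL SELECT OrderID, CustomerID, Amount, ModificationTime
FROM Orders
WHERE ModificationTime >= '$(vLastReloadTime)';

// 2. Append unchanged history from the previous QVD; NOT Exists() drops
//    the old versions of rows that were just reloaded as updates.
Concatenate (Orders)
LOAD OrderID, CustomerID, Amount, ModificationTime
FROM Orders.qvd (qvd)
WHERE NOT Exists(OrderID);

// 3. Handle deletes: an inner join against the current key list removes
//    rows that no longer exist in the source.
Inner Join (Orders)
LOAD OrderID;
SQL SELECT OrderID FROM Orders;

// 4. Persist the refreshed table for the next cycle.
STORE Orders INTO Orders.qvd (qvd);

Note that delete detection does require a second, keys-only pass against the source table, which is usually a much cheaper query than the full extract.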

Regards,

Vlad

Not applicable
Author

Hi Vlad,

I know that the correct approach is to use incremental reloads, and this is the way we usually proceed in these cases. The problem is that, in this particular case, acting on the DB side seems prohibitive for the customer.

I am wondering whether there is some other option to pursue.

Thank you so much for your advice.

Regards,

Eros

vgutkovsky
Master II

On the DB side, couldn't they just create views (not tables) of the changed data? If you're reloading weekly, for example, they could create a stored procedure that builds a weekly view containing only the rows that have changed or are new. This wouldn't require any additional storage on their part. Would it really be that hard on the DB to implement? If so, then I think you have other problems (like maybe needing to switch to a DB that can actually handle big data) that definitely can't be solved with QlikView.
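
On the QlikView side, consuming such a view is then straightforward. A sketch, assuming the DBA publishes a view of changed rows (the name ORDERS_CHANGED_WEEKLY and all field and file names below are hypothetical):

// Load the delta exposed by the DB-side view; only new or changed rows
// come across the wire.
Orders:
LOAD OrderID, CustomerID, Amount;
SQL SELECT OrderID, CustomerID, Amount
FROM ORDERS_CHANGED_WEEKLY;

// Merge the delta over the previous snapshot: keys already loaded above
// are excluded, so updated rows keep only their fresh version.
Concatenate (Orders)
LOAD OrderID, CustomerID, Amount
FROM Orders.qvd (qvd)
WHERE NOT Exists(OrderID);

STORE Orders INTO Orders.qvd (qvd);

Deletes would still need separate handling (e.g. a second view of deleted keys), but the heavy nightly extract shrinks to just the delta.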

Vlad

Not applicable
Author

Thanks again Vlad,

we will try to check whether it is feasible, but I have a feeling that the answer will be negative.

Have a nice day,

Eros

DavidFoster1
Specialist

Sounds like a fairly typical data load from a poorly-thought-out source system.

At terabyte data volumes you HAVE to consider incremental loading (really, anything over a couple of gigabytes does).

If it is difficult to identify CRUD transactions in the data, then you will need to consider:

  1. getting audit-trail fields added to the source data, or
  2. creating a QVD file specifically for matching against the data, holding the primary key of the source table along with a checksum of all the fields you want to watch for updates. A record with a matching key and checksum can be ignored, and so on (see the sketch below). By the way, DON'T delete your deletions; just add a deleted timestamp to them.
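
A sketch of option 2, assuming a source table Orders with primary key OrderID; Hash128() over the watched fields plays the role of the checksum, and all names below are illustrative:

// Previous snapshot of key+checksum pairs, combined into one field so the
// Exists() test below is a simple single-field lookup.
PreviousKeys:
LOAD OrderID & '|' & RowHash AS KeyHash
FROM KeyHash.qvd (qvd);

// Fetch key and checksum for every current row (preceding load), keeping
// only rows whose key/checksum pair is not in the snapshot, i.e. inserts
// and updates that need a full fetch.
ChangedRows:
LOAD OrderID, RowHash
WHERE NOT Exists(KeyHash, OrderID & '|' & RowHash);
LOAD OrderID,
     Hash128(CustomerID, Amount, Status) AS RowHash;
SQL SELECT OrderID, CustomerID, Amount, Status FROM Orders;

DROP Table PreviousKeys;

// Keys present in the snapshot but absent from the source are the deletes;
// as noted above, flag them with a deleted timestamp instead of physically
// removing them from the QVD.
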
Not applicable
Author

Hi David,

thank you very much; I will take it into account for our scenario.

Have a nice day,

Eros