Hello,
I am currently working on an application for Qlik Sense server; however, I have some technical and architectural issues to figure out.
We have an application today where our customers can create a new project, set different parameters, calculate the project, and get a large amount of statistical data as output. The output consists of X datasets.
The total output for each project can be between 2 and 10 million rows of extremely detailed data.
After a project has run to completion in the application, its data has to be available in Qlik Sense within minutes.
The application has approximately 1,500 projects today, and it grows by 10-15 new projects each day.
Each customer can "recalculate" any project: the old data in Qlik has to be replaced by the new data.
Each customer can delete any project at any time: the project then has to be deleted in Qlik Sense as well.
New project data must be in Qlik sense within minutes.
I have created a service for extracting all the data from our application; the problem is how to load this data into Qlik Sense, taking into account that projects can receive new data or be deleted.
I have tried MongoDB, and it works fine with the whole incremental process, including deletion of changed/deleted projects, but as the size of the collections grows, the load time becomes very slow. I have tried both the Qlik Mongo connector and the official connector; the official connector was a bit faster, loading 200-300k rows/sec.
I have also tried flat files, which are fast (loading about 4 million rows/sec), but I have no way to delete projects there.
What is the best strategy to implement? The data can easily grow into billions of rows, and any of it can be changed or deleted by the customer at any time.
Would it be better to split the data into separate apps for each customer, so that each load handles less data?
The fastest way to get data into Qlik is to use optimized QVD loads. This means you will need to store the data as QVD files, so that only the new/deleted data/projects are queried from the database and then stored as QVD as well. For this, each project needs a unique ID. I think the various links on incremental loads, optimized loads, and Exists() here will be quite useful for you:
Advanced topics for creating a qlik datamodel
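To make the approach above concrete, here is a minimal load-script sketch of an incremental QVD reload that handles inserts, updates, and deletes by project. All table, field, and path names (Facts, Projects, ProjectID, LastModified, lib://Data/Facts.qvd, vLastExecTime) are illustrative assumptions, not part of the original post:

```qlik
// 1. Pull only projects changed since the last successful reload.
Changed:
SQL SELECT ProjectID, Metric1, Metric2
FROM Facts
WHERE LastModified >= '$(vLastExecTime)';

// 2. Concatenate the historical QVD, keeping only rows whose project
//    was NOT just reloaded. A Where Not Exists() on a single field
//    keeps this an optimized QVD load.
Concatenate (Changed)
LOAD ProjectID, Metric1, Metric2
FROM [lib://Data/Facts.qvd] (qvd)
Where Not Exists(ProjectID);

// 3. Remove deleted projects: inner join against the list of projects
//    that still exist in the source system.
LiveProjects:
SQL SELECT ProjectID FROM Projects;

Inner Join (Changed)
LOAD ProjectID Resident LiveProjects;
Drop Table LiveProjects;

// 4. Write the refreshed dataset back for the next cycle.
STORE Changed INTO [lib://Data/Facts.qvd] (qvd);
```

Note that the delete step (3) runs after the concatenation, so projects removed by the customer disappear from both the fresh and the historical rows in one pass; only step 1 and step 3 touch the database, so the run time scales with the amount of changed data rather than the total size.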
- Marcus