Could you please help me with an approach for loading billions of records into QVD files from a table that has more than 8 billion records? After I applied a date condition to keep only data since 2015, the record count dropped to 4.5 billion.
Does anyone have an idea how to load such a huge dataset into QVDs? What methods are available to compress the data? It has 37 columns, 6 of which are related to date, month and year.
Further, I want to know whether this dataset needs to be partitioned into multiple QVDs at the stage level, say with 1 billion records in each QVD, with an incremental load then applied on top.
Personally I would tend to split the data into multiple QVDs on a yearly or maybe even a monthly level. Splitting the data adds some overhead for creating the QVDs and reading them again, but with a few dozen QVDs this shouldn't be significant for the load times (with thousands of files it should be measured).
One benefit of splitting the data is that different datasets can be loaded directly into various applications without the need to filter the data first. Probably even more important is the possibility of implementing incremental logic.
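To make the monthly split and the incremental idea a bit more concrete, here is a minimal sketch in Python (not Qlik script, it only illustrates the bookkeeping). The function names, the `YYYY-MM` key format and the "always refresh the latest month" rule are assumptions for illustration, not something prescribed by Qlik.

```python
from datetime import date

def month_partitions(start, end):
    """Return 'YYYY-MM' keys for every month from start to end, inclusive."""
    keys = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        keys.append(f"{year:04d}-{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return keys

def partitions_to_reload(all_keys, existing_keys, refresh_last=1):
    """Pick partitions that are missing, plus the newest `refresh_last` ones.

    Assumption: older months never change, so only missing months and the
    most recent month(s) have to be pulled from the database again.
    """
    missing = [k for k in all_keys if k not in existing_keys]
    recent = all_keys[-refresh_last:] if refresh_last > 0 else []
    return sorted(set(missing) | set(recent))

# Hypothetical situation: data from January to mid-June 2015,
# with the January to April partitions already stored.
all_keys = month_partitions(date(2015, 1, 1), date(2015, 6, 15))
existing = set(all_keys[:4])
todo = partitions_to_reload(all_keys, existing)
print(todo)
```

In a real setup each key would correspond to one QVD file, and only the partitions in `todo` would be queried and stored again on each run.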
Besides this, there is a limit of 2 billion unique field values, and if your data contains a timestamp or some kind of record ID from the database you may hit it. Neither of these types of data should normally be included in Qlik anyway. A record ID isn't very useful: it helps to validate data but has no place in reports. Timestamps are better split into dates, times (hh:mm:ss) and milliseconds (if available). Further, there is no need for additional period fields like month and year, because they can easily be derived from the date within the target application by using a master calendar. Other fields might also be optimized in a similar way, for example by removing formats.
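The effect of splitting timestamps can be shown with a little arithmetic. The following Python sketch (again only an illustration, with made-up random data and hypothetical names) compares the number of distinct values in a full millisecond timestamp field with the number of distinct values after splitting it into date, time and millisecond parts.

```python
import random
from datetime import datetime, timedelta

def max_distinct(days):
    """Upper bounds on distinct values for `days` days of millisecond timestamps."""
    full = days * 86_400 * 1000      # one combined timestamp field
    split = days + 86_400 + 1000     # date field + time field + millisecond field
    return full, split

random.seed(0)
start = datetime(2015, 1, 1)
span_ms = 365 * 86_400 * 1000        # one year expressed in milliseconds

# 200,000 synthetic timestamps (made-up data, for illustration only)
stamps = [start + timedelta(milliseconds=random.randrange(span_ms))
          for _ in range(200_000)]

distinct_full = len(set(stamps))
distinct_date = len({t.date() for t in stamps})
distinct_time = len({t.replace(microsecond=0).time() for t in stamps})
distinct_ms = len({t.microsecond // 1000 for t in stamps})

print("full timestamp:", distinct_full)
print("date + time + ms:", distinct_date + distinct_time + distinct_ms)
print("upper bounds for 10 years:", max_distinct(3650))
```

Almost every full timestamp is unique, so that field grows with the number of records, while the split fields are capped at the number of days, 86,400 seconds and 1,000 milliseconds. In a Qlik load script the split itself is commonly done with `Floor()` and `Frac()` on the timestamp.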