Vajid
Contributor III

Converting into CSV file

Hi Team,

I have a QVD of about 2.5 GB. When I load the data from that QVD and store it as CSV using the Store command (into test.csv (txt);), the CSV file comes out at around 13 GB.

A QVD is not a compressed file, so why does the size increase several times over when converting it to CSV?

I also tried storing with just into [test.csv]; and then the file size matches the QVD, but when I open that CSV file I only see file metadata such as field names, nothing related to the data.

Does anyone know the solution for this?

Thank you

7 Replies
tresesco
MVP

@Vajid, the statement

@Vajid wrote:

QVD is not a compressed

is not entirely correct. A QVD is compressed when it is created, using the same algorithm that Qlik uses to store the data in memory. So the QVD will usually be smaller than the same data stored as plain text, especially for large data sets.

Or
MVP

And yet, the data structure is different.

In a CSV, every value is written out in full in every row.

In a QVD, the distinct values of each field are saved only once, and each row stores just a pointer to that value (a simplification, but good enough).

https://community.qlik.com/t5/Design/Symbol-Tables-and-Bit-Stuffed-Pointers/ba-p/1475369

This means that a data model with a lot of repetition of longer field values will be more compact as a QVD than as a CSV, whereas data with a lot of short, unique values will be more compact as a CSV.
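To make that trade-off concrete, here is a rough Python sketch that estimates both layouts for a single column. It is only an illustration, not Qlik's actual file format: `csv_bytes` and `symbol_table_bytes` are hypothetical helpers, the one-byte-per-symbol overhead is an assumption, and the pointer width of ceil(log2(distinct values)) bits follows the simplified description in the linked article.

```python
import math


def csv_bytes(values):
    """Bytes needed to write every value in full, one newline per row (UTF-8)."""
    return sum(len(v.encode("utf-8")) + 1 for v in values)


def symbol_table_bytes(values):
    """Approximate size when each distinct value is stored once plus one pointer per row.

    Assumptions: each symbol costs its UTF-8 length plus one terminator byte,
    and pointers are packed using ceil(log2(n_distinct)) bits each (at least 1).
    """
    if not values:
        return 0
    distinct = set(values)
    symbols = sum(len(v.encode("utf-8")) + 1 for v in distinct)
    bits_per_pointer = max(1, math.ceil(math.log2(len(distinct))))
    pointers = math.ceil(len(values) * bits_per_pointer / 8)
    return symbols + pointers


# Long value repeated many times: one symbol, many tiny pointers.
repetitive = ["a fairly long product description " * 4] * 10_000
# Short values that are all unique: every row needs a symbol AND a pointer.
unique = [str(i) for i in range(10_000)]

for name, column in [("repetitive", repetitive), ("unique", unique)]:
    print(name, "csv:", csv_bytes(column), "symbol table:", symbol_table_bytes(column))
```

In this toy case the repetitive column shrinks dramatically in the symbol table layout, while the all-unique column comes out larger than its CSV form because it pays for the pointers on top of the values.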

Vajid
Contributor III
Author

@Or so you are saying that QVDs are compressed?

Or
MVP

I am telling you how data is stored by Qlik. This isn't compression, but compression uses similar mechanisms. Consider:

Row# - Pointer
1 - A
2 - A
3 - A
4 - B

This is very efficient, for example, when you just want to count the number of A values. You don't need to know that behind the pointer 'A' is, perhaps, a string like:

"Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum."

 

Of course, if behind the pointer 'A' is just the value 'Hello' and B is 'World', then this isn't very efficient, but Qlik will do it anyway.
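The four-row example above can be sketched in Python as follows. This is a minimal illustration of the idea only: the `symbols` list and `rows` list are hypothetical stand-ins for a symbol table and its per-row pointers, not Qlik's internal structures, and the stored strings are placeholders.

```python
from collections import Counter

# Hypothetical symbol table: each distinct value is stored exactly once.
symbols = [
    "Lorem ipsum dolor sit amet (stands in for the long text)",  # pointer 0, i.e. 'A'
    "some other value",                                          # pointer 1, i.e. 'B'
]

# One small pointer per row, mirroring rows 1 to 4 above.
rows = [0, 0, 0, 1]

# Counting works on the pointers alone; the long string is never read.
counts = Counter(rows)
print("rows pointing at A:", counts[0])
print("rows pointing at B:", counts[1])

# The symbol is looked up only when the value itself has to be shown.
print("value in row 4:", symbols[rows[3]])
```

The same structure is used whether the symbols are long or short, which is why short values like 'Hello' and 'World' gain little from it.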

Vajid
Contributor III
Author

@Or I understand the pointer part. I have observed that QVFs and QVWs get compressed.

So, to summarize: compared to CSV or XLS, a QVD might be smaller, and compared to a QVD, a QVW or QVF will be smaller still.

In decreasing order of size:

CSV, XLS ---> QVD ---> QVF, QVW

Please correct me if I am wrong.

Thank you

Or
MVP

You can't draw general conclusions about the size of a CSV or XLS compared to Qlik's files, because it depends on the data they contain. If a file contains mostly distinct, short values in each column, you might find that a CSV is smaller than a QVD.

A QVD will match the in-memory size, while a QVF/QVW may be compressed (in QlikView, at least, this depends on your settings; I don't remember whether Qlik Sense has the same setting). A fully compressed QVF/QVW will typically be roughly 10-20% of the in-memory size of the same app (and of the corresponding QVD), but again, this depends on the actual data it contains.