Hi,
I have built a generic CSV-to-QVX converter in Java, and it runs very well. To keep it simple, I'm using the following settings for every field:
QvxFieldType.QVX_TEXT
QvxFieldExtent.QVX_ZERO_TERMINATED
QvxNullRepresentation.QVX_NULL_ZERO_LENGTH
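To illustrate what these three settings mean at the byte level, here is a minimal sketch of how one record could be encoded. It is not the converter's actual code: the class and method names are hypothetical, and UTF-8 is an assumption (the real encoding depends on the code page declared in the QVX header). It also assumes, as I read the QVX format description, that a zero-terminated text field is just the text bytes followed by a 0x00 byte, and that zero-length data stands for NULL.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not part of any Qlik library.
public class ZeroTerminatedTextSketch {

    // Encodes one record: every value is written as its UTF-8 bytes
    // (assumed encoding) followed by a single 0x00 terminator.
    // A null or empty value is written as a lone terminator, i.e. zero-length data.
    public static byte[] encodeRecord(String[] values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String value : values) {
            if (value != null && !value.isEmpty()) {
                byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
                out.write(bytes, 0, bytes.length);
            }
            out.write(0); // terminator for this field
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] record = encodeRecord(new String[] {"42", "", "abc"});
        System.out.println(record.length + " bytes");
    }
}
```

Note that under these assumptions an empty string and a missing value produce the same bytes, so the reader cannot tell them apart.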
But then I ran some tests on the QlikView side. Loading the QVX file takes around 30% longer than loading the original CSV file (15.8M rows by 21 columns; size is 1.1 GB). I also noticed that the QVX load uses less CPU than the CSV load. Maybe the QVX load is single-threaded?
I wonder whether the QVX load performance depends on the data types (QvxFieldType). Most of the values in my test file are integers and were converted to numeric values during the QlikView load. So maybe this performance issue does not occur when most of the values are strings.
Does anybody have similar or different experiences to share?
- Ralf
Some more details on this. It seems that this performance issue only occurs in QV11; in QV10 the load times are nearly equal:
My measurements:
QV11
CSV load: 210 sec (80% of the QVX load time)
QVX load: 262 sec
QV10
CSV load: 195 sec
QVX load: 188 sec (96% of the CSV load time)
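For anyone who wants to check the percentages, here is a small sketch of the arithmetic. The class and method names are made up for illustration; the timings are the ones listed above.

```java
// Hypothetical helper to recompute the ratios from the measured load times.
public class LoadTimeRatio {

    // Returns `part` as a rounded percentage of `whole`.
    public static long percentOf(double part, double whole) {
        return Math.round(100.0 * part / whole);
    }

    // Returns how much longer `slow` takes than `fast`, as a rounded percentage.
    public static long percentSlower(double slow, double fast) {
        return Math.round(100.0 * (slow - fast) / fast);
    }

    public static void main(String[] args) {
        // QV11: CSV 210 sec, QVX 262 sec
        System.out.println("QV11 CSV/QVX: " + percentOf(210, 262) + "%");
        System.out.println("QV11 QVX slower than CSV by: " + percentSlower(262, 210) + "%");
        // QV10: CSV 195 sec, QVX 188 sec
        System.out.println("QV10 QVX/CSV: " + percentOf(188, 195) + "%");
    }
}
```

With these numbers the QV11 QVX load comes out roughly a quarter slower than the CSV load.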
I also want to mention that a lot of NULL values are loaded from the QVX file where the CSV load produces empty strings. I would expect this to make the QVX load much faster.
- Ralf
Loading is very slow if field type QVX_TEXT is used! A problem also arises when you have mixed data types in one field. Therefore I've written a Java class, QVXWriter, which creates QVX files identical to the files created by QlikView's STORE command (which uses field type QVX_QV_DUAL, "for internal use").
My test with 15.8M records and 21 fields (mostly numeric) and a size of 1.1 GB shows that this kind of QVX file loads approx. 3x faster than the CSV file, although the QVX file is much bigger in this case because of the additional textual representation of numbers and dates.
I would recommend creating QVX files with this Java class in Unix-based big data environments like Hadoop, Google Bigtable, etc.:
http://community.qlik.com/docs/DOC-2796
- Ralf