Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello,
I have an issue with creating a large hierarchy in QlikView.
My script loads data from Oracle db into a temporary table and then using Hierarchy function to create a tree.
The data has grown to about 6 000 000 rows and now QV stops responding when I try to load the data.
Via debug I found out it's the Hierarchy function that causes QV to crash.
Any tips how can I actually load all the data without leaving anything out?
Is there a limitation somewhere inside QV that causes the application to crash?
Thanks,
Miro
Hi Miro
It's difficult to say without seeing the model. Can you share? Or at least list the code?
I would suggest you first try loading only a 1000 records into each table (or some other way of limiting your dataset) and keeping an eye on the Performance via Task Manager). Have you got log files switched on?
If it loads, check your table connections and specifically what the hierarchy table is generating.
I hope that helps. Let me know how you get on.
Thanks
Steve
Hi Steve,
I have added the "where rownum <= 10000;" limit into the SQL Select and the load worked correctly and took about 15 minutes.
The Qv.exe process in Windows used around 2GB of RAM.
When there is no limit it goes up to 5 GB of RAM. The reason of the crash might be using too much memory.
I cannot share the full model, but the code that loads the data looks like this:
TABLEtmp:
LOAD
if(isnull(PARENTNODEID),-1,PARENTNODEID) as PARENTNODEID,
AGE_,
AUID,
CLOSED,
CODE,
CREATIONDATE,
CREATOR,
FULLQUALIFIEDCODE,
ID,
DESCRIPTION;
SQL SELECT
PARENTNODEID,
AGE_,
AUID,
CLOSED,
CODE,
CREATIONDATE,
CREATOR,
FULLQUALIFIEDCODE,
ID,
DESCRIPTION
FROM table1
where rownum <= 10000;
TABLE:
Hierarchy(NODEID, PARENTNODEID, DESCRIPTION, NAME, DESCRIPTION, PATH, '#', LEVEL) LOAD
PARENTNODEID,
AGE_,
AUID,
CLOSED,
CODE,
CREATIONDATE,
CREATOR,
FULLQUALIFIEDCODE,
ID as NODEID,
DESCRIPTION
Resident TABLEtmp;
Thanks
Miro
Hi Miro
Do you drop the TABLEtmp after the resident load?
Steve
Yes, the TABLEtmp is dropped after the resident load.
So, there is no synthetic key? The table joins on NODEID?
How long does the load take (with the 10000 limit) without loading the hierarchy? How long does the hierarchy load on it's own take?
For hierarchy, is there any duplication in data? A 'Load Distinct' would help here. Are there redundant, date delimited nodes which can be removed?
The code if(isnull(PARENTNODEID),-1,PARENTNODEID) as PARENTNODEID, interests me. You may be loading a considerable amount of useless data where the PARENTNODEID is null. On your hierarchy load try putting the qualifier;
Resident TABLEtmp where NOT PARENTNODEID = -1;
Regards
Steve
Yes, the table joins on NODEID.
With the 10000 limit the select takes about 20 seconds and the hierarchy load about 40 seconds.
There should be no duplication in the data. However I need to confirm this with someone who understands the DB/application structure.
I have tried "Resident TABLEtmp where NOT PARENTNODEID = -1;", but there is no difference.
It looks like there only about 60 rows where the PARENTNODEID is null.
All the other rows have a valid ID.
Regards
Miro
Hi
How many records are there in TABLE in total and how many of them are PARENTNODEID -1? How long does it take to load the data and hierarchy without the 10000 limit?
One thing is certain, 6000000 records with a hierarchy should be feasible. Of course, this depends upon how many fields, uniqueness of fields, hardware, etc. But we've got some heavy tables, lots of fields with 10,000,000 records attached to a hierarchy and we don't have any issues (at least none that bring the model down).
Steve
Hey Miro,
I had the same problem you are experiencing now. The problem was strictly related to the hardware that was not sufficent for my hierarchy to complete .
If anyone has some other tips, and i already tried all you suggested Steve Matthews, would be much appreciated since i still got the loadscript for that hierarchy.
Cheers
Juan Pedro
Yes, Juan is correct. It could simply be that your hardware can't cope. Have you checked the task manager when running see how the hardware is being hit as the load progresses.
If this is so, then you may want to revisit your model in terms of it's design and remove unnecessary fields and try to normalise some of the data.