Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Learn how to migrate to Qlik Cloud Analytics™: On-Demand Briefing!
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Join large tables

Hi all,

I'm trying to join two large table from qvd files (each one has 40 millions of records) because I need to filter on two fields which are in the two different tables. Every time I try that QlikView starts loading the data, it waits a long time and at the end it simply says that it failed to load the data. I' running the software on a server with 4 cores and 10 GB of ram and during the loading I see the all the memory is used. When I try to run the same script in debug mode with a limited load of, for example, 10K records everything works well then there are no errors in the script. How can I solve this issue? Thanks a lot for your help.

13 Replies
boorgura
Specialist
Specialist

The original data need not be modified... But you can have them altered just before storing into QVDs.

If you have a situation where you should not even do that. Then this would not help you.

johnw
Champion III
Champion III


b wrote:qualify *;
unqualify A_field;
LOAD A_field, B_field FROM X_table.qvd (qvd);
join LOAD A_field, C_field FROM Y_table.qvd (qvd);


Is A_field a unique identifier on each table? Each value of A_field occurs only once on each table? Or can a value of A_field appear multiple times on each table? If multiple times, you're getting a many to many join which can vastly inflate the number of rows. Here's a quick example:

X_table:
A_field, B_field
A, B
A, C
B, D

Y_table:
A_field, C_field
A, X
A, Y
B, Z

Your join would produce this:

A_field, X_table.B_field, Y_table.C_field
A, B, X
A, B, Y
A, C, X
A, C, Y
B, D, Z

Not applicable
Author

Thanks for your suggestion, but I can't do that because those QVDs are automatically created and need for other projects.

Not applicable
Author

Thanks a lot for all your answers John.

I have just tried a left join (A_field occurs only once in Y_table) without any good result. Again, if I try a limited load (let's say 10000 rows) everything works well. Then is adding more RAM the only solution or can I try something else?