Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Join us in Bucharest on Sept 18th for Qlik's AI Reality Tour! Register Now
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Left Join of 3 tables stored in multiple files

Hi,

I am loading 4 excel files (in From script I put *.xlsx, so that i could load any number of them) containing the same types of data for different companies.

There are basically 3 tables in them:

Cost Center table

CC,

User,

Reference,

... other

Manager table:

User,

Reference,

Manager

Limit Table:

User,

Reference

Limit

I load Cost Center table and then use Left Join (Cost Center) to load other 2 tables (I do this to be able to display in one row the CC, its user, the manager of this user and user limit)

Everything works very fine, the only problem is that the two tables loaded with Left Join  (Manager and Limit) are loaded only from the first excel file in the folder. Data for those tables from the remaining 3 files are not appearing...

I was trying to change orders of the files, or even to multiplicate the Left Join statements, entering the proper file names after FROM... the rule is - data are always loaded from the first file in the folder (or first loaded file I inticate in the script)

Can you please help me understand why, and how to correct this..??

This is probably with no meaning, but I add as well one variable for all the scripts, being subfield(FileBasename(),'_',1) as COMPANY - not having the company name in the files, I take it from the file name...

1 Solution

Accepted Solutions
kuba_michalik
Partner - Specialist
Partner - Specialist

I'd suggest just loading all the excels first into a temporary table, and then doing the join from this table using LOAD ... Resident:

Manager:

Load Distinct [User Name] & Reference as USER ,

  subfield(FileBasename(),'_',1) as COMPANY,

     Manager & [Mngr. Ref.] as MANAGER

    FROM

*.xlsx

(ooxml, embedded labels, header is 2 lines, table is Manager);

Left Join ([Cost Centre])

LOAD *

Resident Manager;

DROP Table Manager; //the data at this point is joined into Cost Centre, so we don't need to keep the original table.

View solution in original post

9 Replies
Anonymous
Not applicable
Author

Hi

Use the script like this

Cost Center table

CC,

User,

Reference,

... other

Left Join(Cost Center table)

Load Distinct

User,

Reference,

Manager

Left Join(Cost Center table)

Load Distinct

User,

Reference

Limit

Not applicable
Author

Thanks ... but it unfortunatelly does not work

I attach my script, in case the above description was too simplistic:

CostCentre:

LOAD [Chart of Account],

     [GL Account],

     [Controlling Area],

     [Cost Center],

     if(mid([Cost Center],4,2)= '01', 'PLASTIC', if(mid([Cost Center],4,2)= '02','CAN','OTHER')) as Business,

     mid([Cost Center],6,3) as Form,

     [Valid From],

     Responsible & Reference as USER ,

     Reference,

     subfield(FileBasename(),'_',1) as COMPANY

FROM

*.xlsx

(ooxml, embedded labels, header is 9 lines, table is [Cost Center]);

//--------------------------------------------------------------------------------------------

Manager:

left join (CostCentre)

Load Distinct [User Name] & Reference as USER ,

  subfield(FileBasename(),'_',1) as COMPANY,

     Manager & [Mngr. Ref.] as MANAGER

    

FROM

*.xlsx

(ooxml, embedded labels, header is 2 lines, table is Manager);

//--------------------------------------------------------------------------------

Limit:

left join (CostCentre)

Load Distinct [User Name] & Reference as USER,

subfield(FileBasename(),'_',1) as COMPANY,

     Limit as Limit_User

FROM

*.xlsx

(ooxml, embedded labels, header is 2 lines, table is Limit);

//-------------------------------------------------------------------------------------

Limit_Mgrs:

left join (CostCentre)

Load Distinct [User Name] & Reference as MANAGER,

subfield(FileBasename(),'_',1) as COMPANY,

     Limit as Limit_Manager

FROM

*.xlsx

(ooxml, embedded labels, header is 2 lines, table is Limit);

Anonymous
Not applicable
Author

Hi,

Post the sample data then only easily understand and replies will come soon........

Anonymous
Not applicable
Author

Unlike in SQL, where we can specify what fields to use to join, QV script joins by all common fields.  So, after you join the tables from the first file, you have all fields: User, Reference, Manager, Limit.  Left join eliminates load of the rows where the combination of common field values is not the same, hence you get data from the first file only.
Try to use a different approach.  Maybe not using "left" - but it could bring dirty data.  Or maybe load in a loop using "for each file in filelist".

Not applicable
Author

I see Michael,

What is the most elegant apprach in such case? I want strongly to avoid overcomplications in the script (it is quite simple now...)

thanks for any hints

Anonymous
Not applicable
Author

A simple way is to use join instead of left join.  See if it gives you the expected result.

kuba_michalik
Partner - Specialist
Partner - Specialist

I'd suggest just loading all the excels first into a temporary table, and then doing the join from this table using LOAD ... Resident:

Manager:

Load Distinct [User Name] & Reference as USER ,

  subfield(FileBasename(),'_',1) as COMPANY,

     Manager & [Mngr. Ref.] as MANAGER

    FROM

*.xlsx

(ooxml, embedded labels, header is 2 lines, table is Manager);

Left Join ([Cost Centre])

LOAD *

Resident Manager;

DROP Table Manager; //the data at this point is joined into Cost Centre, so we don't need to keep the original table.

Anonymous
Not applicable
Author

Yes, this is simple enough and reliable.

Not applicable
Author

thanks a lot Jakub,

It worked perfectly, and the script is still not so complex..