Solved: Re: Load table with same content but different col... - Qlik Community

danielcapp · ‎2018-11-28

I have a folder with hundreds of Excel files (with same root name. Eg.: FileName_YYYYMMDD) that contain similar information, but whose columns may be named differently. For example, one file contains column name 'DealName' and the other 'DealID'.

Important to mention that columns may also be in different sequence. However, there are few possible different 'layouts', so I would like to know if there is a way to load all of them at the same time, creating conditional fields.

In a logical way, I would like to do something like:

LOAD

If (Exists(DealName), DealName, DealID) as DealRef

If (Exists(USDAmount), USDAmount, Amount) as DealAmount

...

FROM [TableName]

Is there any way to doing so?

Thank you all in advance!

rwunderlich · ‎2018-11-29

I think the most practical way would be to load all fields and then drop the fields you don't want. Something like:

ALIAS Adresse as Address;
ALIAS Dirección as Address;
ALIAS Estado as Status;

Clients:
// Load dummy field for Concatenate.
LOAD 0 as DummyField AutoGenerate 0;
Concatenate (Clients)
LOAD *
FROM addr*.csv (ansi, txt, delimiter is ',', embedded labels, msq)
;

DROP Field DummyField; // Drop dummy field

MapFieldsToKeep:
Mapping
LOAD *, 1 Inline [
Fieldname
Address
Status
Client
]
;

// Drop any fields that are not in the MapFieldsToKeep table.
For idx = NoOfFields('Clients') to 1 step -1
let fname = FieldName($(idx), 'Clients');
if not ApplyMap('MapFieldsToKeep', '$(fname)', 0) THEN
Drop Field [$(fname)];
EndIf
Next idx

-Rob
http://masterssummit.com
http://qlikviewcookbook.com
http://www.easyqlik.com

View solution in original post

juraj_misina · ‎2018-11-28

Hi,

define a subroutine or a variable with field list for those few possible "layouts". Then load only the first line of excel file to decide which layout to use. Something like this:

//let's call this "pseudo code", just to give you an idea
SET vLayout1= DealName as DealRef, USDAmound as DealAmount;
SET vLayout2 = DealID as DealRef, Amount as DealAmount;
For Each vFile in FileList('lib://sourceDir/*.xlsx')
FirstLine:
First 1 LOAD A from [vFile] (ooxml, no labels, table is 'Sheet1');
LET vFirstFieldName = Peek('A', 0, 'FirstLine');
If ('$(vFirstFieldName)'='DealName') then
Data:
LOAD $(vLayout1) from  [vFile] (ooxml, embedded labels, table is 'Sheet1');
ElseIf('$(vFirstFieldName)'='DealID') then
Data:
LOAD $(vLayout2) from  [vFile] (ooxml, embedded labels, table is 'Sheet1');
EndIf
Drop Table FirstLine;
LET vFirstFieldName=;
Next vFile
LET vFile=;

Hope this helps.

Juraj

rwunderlich · ‎2018-11-28

Use a set of ALIAS statements before the LOAD.

ALIAS DealName as DealRef;
ALIAS DealID as DealRef;
ALIAS USDAmount as DealAmount;

You only need to ALIAS the fieldnames that require changing. Then load your files with

LOAD * FROM myfile*.csv ...;

-Rob
http://masterssummit.com
http://qlikviewcookbook.com
http://www.easyqlik.com

danielcapp · ‎2018-11-29

Rob, this seems to be in the right direction. However, there another constrain that I thought wouldn't be relevant: some files have additional fields that shouldn't be considered.
Is there any way to 'lock' the fields to be loaded, using the ALIAS and wildcard LOAD?

Thanks again!

rwunderlich · ‎2018-11-29

I think the most practical way would be to load all fields and then drop the fields you don't want. Something like:

ALIAS Adresse as Address;
ALIAS Dirección as Address;
ALIAS Estado as Status;

Clients:
// Load dummy field for Concatenate.
LOAD 0 as DummyField AutoGenerate 0;
Concatenate (Clients)
LOAD *
FROM addr*.csv (ansi, txt, delimiter is ',', embedded labels, msq)
;

DROP Field DummyField; // Drop dummy field

MapFieldsToKeep:
Mapping
LOAD *, 1 Inline [
Fieldname
Address
Status
Client
]
;

// Drop any fields that are not in the MapFieldsToKeep table.
For idx = NoOfFields('Clients') to 1 step -1
let fname = FieldName($(idx), 'Clients');
if not ApplyMap('MapFieldsToKeep', '$(fname)', 0) THEN
Drop Field [$(fname)];
EndIf
Next idx

-Rob
http://masterssummit.com
http://qlikviewcookbook.com
http://www.easyqlik.com

danielcapp · ‎2018-11-29

Rob, the problem is that when doing the wildcard load, as files structures are different, different tables are created (Clients-1, Clients-2, ...), right? How to manage this?

danielcapp · ‎2018-11-29

Oh, got your solution now! Very clever...
However, the DROP part is not working, it tries to drop empty field names and generate several errors. Any clue?

rwunderlich · ‎2018-11-29

Take a look at your document log to see what the steps in the loop look like.

-Rob

danielcapp · ‎2018-11-29

Execution looks weird. Below is my code:

ALIAS DealID as FXAthenaDealName;
ALIAS InstrumentType as FXAthenaInstrumentType;
ALIAS TradeDate as FXAthenaTradeDate;
ALIAS ExpirationDate as FXAthenaExpirationDate;
ALIAS SettlementDate as FXAthenaSettlementDate;

FXData:
LOAD 0 as DummyField AutoGenerate 0; // Load dummy field for Concatenate. 
Concatenate (FXData)
LOAD *
FROM
[filename*.xlsx]
(ooxml, embedded labels);


DROP Field DummyField; // Drop dummy field

MapFieldsToKeep:
Mapping
LOAD *, 1 Inline [
FXAthenaDealName
FXAthenaInstrumentType
FXAthenaTradeDate
FXAthenaExpirationDate
FXAthenaSettlementDate
]
;

// Drop any fields that are not in the MapFieldsToKeep table. 
For idx = 1 to NoOfFields('FXData')
	let fname = FieldName($(idx), 'FXData');
	if Len('$(fname)')>0 THEN
		if not ApplyMap('MapFieldsToKeep', '$(fname)', 0) THEN 
			Drop Field [$(fname)] From FXData;
		EndIf
	ENDIF
Next idx

See that I included the 'Len' check as there I suspected some files contained columns with trash and were being loaded with empty field names. However, what I see in log file (attached here) is that some fields are not being dropped because during the loop their field names are being treated as an empty string (check the final lines of log file). However, when the execution ends, the fields are available in their original names, as follows:

Look for these fields (AprilID, CorpMV) in the log file. They are being loaded, but not dropped.

Am I missing something?

rwunderlich · ‎2018-11-29

My bad for doing limited testing. Change your for loop to process fields from end of the list backwards:

For idx = NoOfFields('Clients') to 1 step -1

I've updated my sample above. Reason being that when you drop fields you are changing the remaining field numbers.

-Rob
http://masterssummit.com
http://qlikviewcookbook.com
http://www.easyqlik.com

Load table with same content but different column names

conditional

load