Skip to main content
Announcements
Join us at Qlik Connect for 3 magical days of learning, networking,and inspiration! REGISTER TODAY and save!
cancel
Showing results for 
Search instead for 
Did you mean: 
danielcapp
Contributor III
Contributor III

Load table with same content but different column names

I have a folder with hundreds of Excel files (with same root name. Eg.: FileName_YYYYMMDD) that contain similar information, but whose columns may be named differently. For example, one file contains column name 'DealName' and the other 'DealID'.

Important to mention that columns may also be in different sequence. However, there are few possible different 'layouts', so I would like to know if there is a way to load all of them at the same time, creating conditional fields.

In a logical way, I would like to do something like:

 

LOAD

If (Exists(DealName), DealName, DealID) as DealRef

If (Exists(USDAmount), USDAmount, Amount) as DealAmount

...

FROM [TableName]

Is there any way to doing so?

 

Thank you all in advance!

Labels (2)
1 Solution

Accepted Solutions
rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

I think the most practical way would be to load all fields and then drop the fields you don't want. Something like:

ALIAS Adresse as Address;
ALIAS Dirección as Address;
ALIAS Estado as Status;

Clients:
// Load dummy field for Concatenate.

LOAD 0 as DummyField AutoGenerate 0;  
Concatenate (Clients)
LOAD *
FROM addr*.csv (ansi, txt, delimiter is ',', embedded labels, msq)
;

DROP Field DummyField; // Drop dummy field

MapFieldsToKeep:
Mapping
LOAD *, 1 Inline [
Fieldname
Address
Status
Client
]
;

// Drop any fields that are not in the MapFieldsToKeep table.
For idx = NoOfFields('Clients') to 1 step -1
  let fname = FieldName($(idx), 'Clients');
  if not ApplyMap('MapFieldsToKeep', '$(fname)', 0) THEN
    Drop Field [$(fname)];
  EndIf
Next idx

 

 

-Rob
http://masterssummit.com
http://qlikviewcookbook.com
http://www.easyqlik.com

View solution in original post

11 Replies
juraj_misina
Luminary Alumni
Luminary Alumni

Hi,

define a subroutine or a variable with field list for those few possible "layouts".  Then load only the first line of excel file to decide which layout to use. Something like this:

//let's call this "pseudo code", just to give you an idea
SET vLayout1= DealName as DealRef, USDAmound as DealAmount;
SET vLayout2 = DealID as DealRef, Amount as DealAmount;
For Each vFile in FileList('lib://sourceDir/*.xlsx')
FirstLine:
First 1 LOAD A from [vFile] (ooxml, no labels, table is 'Sheet1');
LET vFirstFieldName = Peek('A', 0, 'FirstLine');
If ('$(vFirstFieldName)'='DealName') then
Data:
LOAD $(vLayout1) from  [vFile] (ooxml, embedded labels, table is 'Sheet1');
ElseIf('$(vFirstFieldName)'='DealID') then
Data:
LOAD $(vLayout2) from  [vFile] (ooxml, embedded labels, table is 'Sheet1');
EndIf
Drop Table FirstLine;
LET vFirstFieldName=; Next vFile
LET vFile=;

Hope this helps.

Juraj

rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

Use a set of ALIAS statements before the LOAD. 

ALIAS DealName as DealRef;
ALIAS DealID as DealRef;
ALIAS USDAmount as DealAmount;

 

You only need to ALIAS the fieldnames that require changing.  Then load your files with 

LOAD * FROM myfile*.csv ...;

 

-Rob
http://masterssummit.com
http://qlikviewcookbook.com
http://www.easyqlik.com

danielcapp
Contributor III
Contributor III
Author

Rob, this seems to be in the right direction. However, there another constrain that I thought wouldn't be relevant: some files have additional fields that shouldn't be considered.
Is there any way to 'lock' the fields to be loaded, using the ALIAS and wildcard LOAD?

Thanks again!
rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

I think the most practical way would be to load all fields and then drop the fields you don't want. Something like:

ALIAS Adresse as Address;
ALIAS Dirección as Address;
ALIAS Estado as Status;

Clients:
// Load dummy field for Concatenate.

LOAD 0 as DummyField AutoGenerate 0;  
Concatenate (Clients)
LOAD *
FROM addr*.csv (ansi, txt, delimiter is ',', embedded labels, msq)
;

DROP Field DummyField; // Drop dummy field

MapFieldsToKeep:
Mapping
LOAD *, 1 Inline [
Fieldname
Address
Status
Client
]
;

// Drop any fields that are not in the MapFieldsToKeep table.
For idx = NoOfFields('Clients') to 1 step -1
  let fname = FieldName($(idx), 'Clients');
  if not ApplyMap('MapFieldsToKeep', '$(fname)', 0) THEN
    Drop Field [$(fname)];
  EndIf
Next idx

 

 

-Rob
http://masterssummit.com
http://qlikviewcookbook.com
http://www.easyqlik.com

danielcapp
Contributor III
Contributor III
Author

Rob, the problem is that when doing the wildcard load, as files structures are different, different tables are created (Clients-1, Clients-2, ...), right? How to manage this?
danielcapp
Contributor III
Contributor III
Author

Oh, got your solution now! Very clever...
However, the DROP part is not working, it tries to drop empty field names and generate several errors. Any clue?
rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

Take a look at your document log to see what the steps in the loop look like.

 

-Rob

danielcapp
Contributor III
Contributor III
Author

Execution looks weird. Below is my code:

ALIAS DealID as FXAthenaDealName;
ALIAS InstrumentType as FXAthenaInstrumentType;
ALIAS TradeDate as FXAthenaTradeDate;
ALIAS ExpirationDate as FXAthenaExpirationDate;
ALIAS SettlementDate as FXAthenaSettlementDate;

FXData:
LOAD 0 as DummyField AutoGenerate 0; // Load dummy field for Concatenate. 
Concatenate (FXData)
LOAD *
FROM
[filename*.xlsx]
(ooxml, embedded labels);


DROP Field DummyField; // Drop dummy field

MapFieldsToKeep:
Mapping
LOAD *, 1 Inline [
FXAthenaDealName
FXAthenaInstrumentType
FXAthenaTradeDate
FXAthenaExpirationDate
FXAthenaSettlementDate
]
;

// Drop any fields that are not in the MapFieldsToKeep table. 
For idx = 1 to NoOfFields('FXData')
	let fname = FieldName($(idx), 'FXData');
	if Len('$(fname)')>0 THEN
		if not ApplyMap('MapFieldsToKeep', '$(fname)', 0) THEN 
			Drop Field [$(fname)] From FXData;
		EndIf
	ENDIF
Next idx

See that I included the 'Len' check as there I suspected some files contained columns with trash and were being loaded with empty field names. However, what I see in log file (attached here) is that some fields are not being dropped because during the loop their field names are being treated as an empty string (check the final lines of log file). However, when the execution ends, the fields are available in their original names, as follows:

 

Table.PNGLook for these fields (AprilID, CorpMV) in the log file. They are being loaded, but not dropped.

Am I missing something?

rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

My bad for doing limited testing.  Change your for loop to process fields from end of the list backwards:

 

For idx = NoOfFields('Clients') to 1 step -1

 

I've updated my sample above.  Reason being that when you drop fields you are changing the remaining field numbers.

 

-Rob
http://masterssummit.com
http://qlikviewcookbook.com
http://www.easyqlik.com