Re: Loading from source where duplicate field name... - Qlik Community

gerhardl · ‎2017-05-12

Hello,

I need to load data from a CSV file - this file has 2 fields with the same name: "Business Owner".

I only need to load one of them, but it's a specific one - the first one in the file, however, when I have the field name in my load script it for some reason automatically picks up the second column (which contains no data).

Is there any way to tell QV in the load script that it must load the first instance of the duplicate field names?

Note that I do NOT want to edit the CSV file to change or delete the duplicate - I need to resolve this in the load script.

ramasaisaksoft · ‎2017-05-12

Hi,

while you are taking CSV file you can cancel the second field .Remove second field from here just like click X symbol here(which field is not having data). so that it will give only first field.

gerhardl · ‎2017-05-12

That doesn't work - all that does is create the script for you, and it will still create the script with the "Business Owner" as the field name - it's no different that typing out the load script. It still loads the wrong field - even if it did work, I am reloading the same script multiple times daily (CSV gets updated) so cannot go through the load wizard each time.

marcus_sommer · ‎2017-05-12

AFAIK if this happened within an excel-file the fields with identically field-names will get an automatic counter by loading the data per file-wizard and with them you could identify and/or remove them. But this won't work with a text-file. I think the only way will be to load the file without labels and then to rename them per AS statement or per mapping: How to Rename Fields

- Marcus

morganstake · ‎2019-12-26

It's multi-step process. (I have attached the QVF)

He is my Default Load script (fails):

Tickets:
LOAD
Ticket,
Sprint,
Sprint,
Sprint,
"Story Points",
Assignee,
Assignee,
"Component/s",
"Component/s"
FROM [lib://AttachedFiles/Duplicate Columns Example.csv]
(txt, codepage is 28591, embedded labels, delimiter is ',', msq);

I opted for Concatenating the Columns with duplicate field names and then using SubField() to get the proper output in the final table

1) get the column names

ColumnNames:
first 1
LOAD
@1,
@2,
@3,
@4,
@5,
@6,
@7,
@8,
@9
FROM [lib://AttachedFiles/Duplicate Columns Example.csv]
(txt, codepage is 28591, no labels, delimiter is ',', msq);

2) Narrow down the table to Unique fields and concatenate the ColumnNums into unique Column names
let x=NoOfFields('ColumnNames');
trace '-----[' $(x) ']-----';

for i=1 to $(x);
let j=$(i)+2;

ColumnNames2:
load
$(i) as ColumnNum,
'['&@$(i)&']' as ColumnName
resident ColumnNames;
next;

Col3:
NoConcatenate
load distinct *,
if (Peek('ColumnName') = ColumnName,Peek('ColumnLoad') & '&chr(10) & @' & ColumnNum, '@' & ColumnNum) as ColumnLoad
Resident ColumnNames2;

drop table ColumnNames2;

// Remove the Columns that only concatenate some of the values
filter:
inner keep load
ColumnName,
max(ColumnNum) as ColumnNum
resident Col3
group by ColumnName;
drop table filter;

3)Get each script line individually
Final1:
NoConcatenate
Load
'Unique Name' as ColumnName,
ColumnLoad &' as ' & ColumnName&',' as LoadScriptLine
Resident Col3;

Drop table Col3;

4) Consolidate the Columns into one line of script
Final:
NoConcatenate
Load
Concat(LoadScriptLine,chr(10)) as lines
Resident Final1

Group By ColumnName;

Drop table [Final1];

5) Coalesce the Table load script here, then execute it
// Note: If you use something other comma for delimiting you won't see the highlighting error. This still loads
let FullScript=
'tickets:'& chr(10) &
'NoConcatenate '& chr(10) &
'load '& left(FieldValue('lines','1'),len(FieldValue('lines','1'))-1) & chr(10) &
'FROM [lib://AttachedFiles/Duplicate Columns Example.csv]'& chr(10) &
'(txt, utf8, no labels, delimiter is '','', msq,header is 1 lines);';

6) Break out the Duplicate fields

trace $(FullScript);
$(FullScript);

Drop table [Final];
;

7) Coalesce the table w/Concat values
FinalTickets:
NoConcatenate
LOAD
Ticket,
subfield(Sprint,chr(10)) as Sprint,
subfield("Story Points",chr(10)) as "Story Points",
subfield(Assignee,chr(10)) as Assignee,
subfield([Component/s],chr(10)) as [Component/s]
Resident tickets;

😎 cleanup
Drop tables [tickets],ColumnNames;

dandaanilreddy · ‎2022-07-03

Thanks a lot @morganstake for the script.This worked perfectly. 😊

Loading from source where duplicate field names exist