Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello,
I am looking for code for a seemingly simple task. Load multiple csv files and skip a field load if it does not exist. Example- I am loading fields [OptionA] [OptionB] [OptionC] and [OptionD]. In some files, [OptionD] is called [OptionE]. In others it may be [OptionF] [OptionG]. I do not want a load statement for each file since it will take too long. I want to load \\my location\*.csv to load all files. But I need to skip if a field does not exist. I cannot load all fields since I am dealing with 50+ columns and 75+ files. I simply need to load 4 fields but the 4th field can have up to 7 names (they all mean the same thing). I am stuck and I don’t this this is possible in QlikView.
DataAll:
LOAD
Filepath() as Key,
record_status,
call_result,
call_time,
chain_n,
Location,
disposition,
Campaign
FROM
[*.csv]
(txt, codepage is 1252, embedded labels, delimiter is ',', msq);
SUB LoadField(field)
SET ErrorMode=0;
AddField:
LOAD
Filepath() as Key,
[$(field)] as stunum
FROM
[*.csv]
(txt, codepage is 1252, embedded labels, delimiter is ',', msq);
SET ErrorMode=1;
ENDSUB
FOR EACH field IN 'stunum','Stunum','StuNum','ID','Id','id','Print_Id'
CALL LoadField('$(field)')
NEXT
I know, I know, this won't work either
-Rob
Hi.
You could split the load process into two steps: check the fields and store column#, load if ok.
To check the fields you can load only the first string, transform it with crosstable tag and perform checks with applymap (for renaming) or something else. You should also store the column # of each field during transformation.
Then you could use that information to parameterize the load with $-sign expansion and skip the first row using column numbers.
Seems not so hard 😃
Hi
I think this is what you are after:
T_Cols:
LOAD '[' & Concat(@1, '],[') & ']' As ColNames
FROM [Data.xlsx]
(ooxml, no labels, table is Sheet1, filters(Transpose()));
Let zCols = Peek('ColNames');
DROP Table T_Cols;
Data:
LOAD
$(zCols)
FROM [Data.xlsx]
(ooxml, embedded labels, table is Sheet1);
The first transposed load gets the field names, and the second then loads those columns.
HTH
Jonathan
Neither option worked. The columns are not in the same set location. I need a simple skip if the filed name does not exist.
Here's an idea.
AllData:
LOAD filename() as Key, A, B, C
from ...;
SET ErrorMode=0; // Ignore Errors
Option:
LOAD filename() as Key, D as Option
from ...;
Option:
LOAD filename() as Key, E as Option // Continue with F, etc as needed
from ...;
LEFT JOIN (AllData) LOAD * RESIDENT Option;
DROP TABLE Option;
-Rob
Hello,
This option will not work. 75+ files with 7 possible field names. I am looking for a simple load all files in a folder and skip if file does not contain the field name.
I don't think what i am looking for is possible.
I think perhaps I didn't make it clear that you use the wildcard in the from ...
\\my location\*.csv
Yes, you have to code one LOAD statement for each of the possible field names.
You said this "option will not work" and you said the same thing about Whiteline & Jonathan's answers as well, although they looked perfectly viable to me. I don't think you are going to find the "simple load" (like a magic option) but there are a number of viable solutions.
-Rob
It would stop us all guessing if you posted a couple of your source data spreadsheets. Leave the headers intact, but remove most of the data rows, leaving about 10-20 in each.
Great Idea! Here is the setup, Test files are in a folder. I would like to import only these fields
record_status |
chain_n |
Location |
*stunum* |
disposition |
Campaign |
but stunum can be stunum, Stunum,StuNum,ID,Id,id, and Print_Id. Only one of this form will exist in a file. Also there are not 7 files, about 30 files in 20+ folders each. I would like to load all files in a folder with only the fields mentioned.