Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content
Announcements
See why Qlik was named a Leader in the 2025 Gartner® Magic Quadrant™ for Augmented Data Quality Solutions: GET THE REPORT
cancel
Showing results for 
Search instead for 
Did you mean: 
ianwilson
Contributor III
Contributor III

csv file load with header containing data

Hi Guys

I have a set of CSV files to load, which have the format:

Ref1, Ref2

data1,data2,data3,data4,data5

data1,data2,data3,data4,data5

data1,data2,data3,data4,data5

data1,data2,data3,data4,data5

Ref1 and Ref2 relate to the source of the data and have to be included in the table. I need to be able to load this data into a table which looks like:

Ref1,Ref2,data1,data2,data3,data4,data5

Ref1,Ref2,data1,data2,data3,data4,data5

Ref1,Ref2,data1,data2,data3,data4,data5

Ref1,Ref2,data1,data2,data3,data4,data5

The are 100s of files with a date based file name: ABCYYMMDD.csv, so I need to keep a wildcard file load - ABC*.csv. Also the Ref1, Ref2 data is not available anywhere else, such as the file name.

I can see how this might be possible with a nested loop, to read the first line into a couple of variables, then read the rest of the table, but I have not created anything like that in a Qlik load script before. This must be a common problem, but whilst I have found lots of good information about loads, I can't find an example of this. Any help much appreciated.

Thanks

1 Solution

Accepted Solutions
ianwilson
Contributor III
Contributor III
Author

Hi Jonathan

Thanks to your input I've got this working. Sanitised form of the working code below:

Appreciate the help as always.

//**************************************************************************************************//
// Sub routine to step through every file and subdirectory in a path.
// Creates a table containing list of files by name, size and datetime.
// Use the debug to see how it steps through the code
//**************************************************************************************************//
//Subroutine for variable: Root
sub DoDir (Root)
// picks up list of file extensions, in this case only .csv, but could create a comma seperated list - 'csv','txt','dat' etc
for each Ext in 'csv'
// looks at each file, based on the path and file extension construct
  for each File in filelist (Root & '\*.' & Ext)
// standard load for the file attributes
   LOAD '$(File)' as Name,
FileSize( '$(File)' ) as Size,
FileTime( '$(File)' ) as FileTime
autogenerate 1;
next File
next Ext
// routine for every subdirectory
for each Dir in dirlist (Root & '\*' )
call DoDir (Dir)
next Dir
end sub
// starting point for routine where path is specified
call DoDir ('\\SERVER\DIRECTORY\DIRECTORY')


//**************************************************************************************************//
// use the table created above to step through each file
// create table 'CSV' with the required data.
//**************************************************************************************************//
// pickup the file name from the field 'Name' in the table above and populates variable
FOR each varFileName in FieldValueList('Name')

// create new table using the 'First 1' function to only load the first record
Temp1:
FIRST 1
LOAD
@1 as Ref1,
@2 as Ref2,
@3 as Ref3,
@4 as Ref4,
@5 as Ref5
FROM
[$(varFileName)]
(
txt, codepage is 1252, no labels, delimiter is ';', msq, header is 0 lines); // note header is 0 to allow pick up of first line

// Populate variables from Temp1. These will be recreated on the loop for each file. Only need 3 of the 5 reference fields.
Let varRef1 = Peek('Ref1',0,'Temp1');
Let varRef2 = Peek('Ref2',0,'Temp1');
Let varRef4 = Peek('Ref4',0,'Temp1');

drop table Temp1;

// load the main data from the file, using the variables to populate the first few fields for each record.
CSV:
LOAD
'$(varRef1)'
as Ref1,
'$(varRef2)'
as Ref2,
'$(varRef4)'
as Ref4,
@1,
@2,
@3,
@4,
@5,
@6,
@7,
@8,
@9,
@10
FROM
[$(varFileName)] // is the file name used in this step through the loop
(txt, codepage is 1252, no labels, delimiter is ';', msq, header is 1 lines); // note header is 1 to skip first line

Next varFileName

View solution in original post

7 Replies
jpenuliar
Partner - Specialist III
Partner - Specialist III

Hi Ian,

you can start with this code:

temp1:
FIRST 1
LOAD
    @1,
    @2
FROM [lib://JP/Downloads\TestCSV.txt]
(txt, codepage is 28591, no labels, delimiter is ',', msq);


Let vRef1 = Peek('@1',0,'Temp1');
Let vRef2 = Peek('@2',0,'Temp1');

LOAD
'$(vRef1)' as Ref1,
'$(vRef2)' as Ref2,
    @1 as Field1,
    @2 as Field2,
    @3 as Field3,
    @4 as Field4,
    @5 as Field5
FROM [lib://JP/Downloads\TestCSV.txt]
(txt, codepage is 28591, no labels, delimiter is ',', msq, header is 2 lines);

This gives you a table like below:

Anonymous
Not applicable

Hi Ian,

Is the Reference always the in the same place / format as:

Ref1, Ref2

data1,data2,data3,data4,data5

data1,data2,data3,data4,data5

data1,data2,data3,data4,data5

data1,data2,data3,data4,data5

So you have 100's of CSV files that you want to add the specific references of each CSV (1 & 2) to every row of data in the associated CSV?

jpenuliar
Partner - Specialist III
Partner - Specialist III

In order to use the script to load data from multiple files, you will have to build a list of filenames and filepath.

an example is found here :

https://help.qlik.com/en-US/sense/September2017/Subsystems/Hub/Content/Scripting/ScriptControlStatem...

hope this helps.

ianwilson
Contributor III
Contributor III
Author

Hi Jonathan

That looks like it will do the trick. I had seen the "FIRST 1" way of picking out the first record but was struggling to understand how to combine that with the load of multiple files.

I'll reply when I have had chance to pull it all together...

Thanks

ianwilson
Contributor III
Contributor III
Author

Hi Aron

Correct - the header is always the first line and always the same format. The Ref fields may not always be the same length, but are always separated by a semicolon. I think Jonathan's solution above will make this possible.

Thanks

jpenuliar
Partner - Specialist III
Partner - Specialist III

in the help link i forwarded to you:

Example 2 can help you build the list of filenames to load data from.

Example 3 can loop through each file in the table you created using example 2

and a bit of peek() and variable assignment replacing the source filename in the load script.

I've been reading through your original post again, you mention Ref1 and Ref2 does not occur in all your files.

in that case, you will have to check if the variables if the values Ref1 and Ref2 exist during the load.

You will probably end up with more than 1 type of load

ianwilson
Contributor III
Contributor III
Author

Hi Jonathan

Thanks to your input I've got this working. Sanitised form of the working code below:

Appreciate the help as always.

//**************************************************************************************************//
// Sub routine to step through every file and subdirectory in a path.
// Creates a table containing list of files by name, size and datetime.
// Use the debug to see how it steps through the code
//**************************************************************************************************//
//Subroutine for variable: Root
sub DoDir (Root)
// picks up list of file extensions, in this case only .csv, but could create a comma seperated list - 'csv','txt','dat' etc
for each Ext in 'csv'
// looks at each file, based on the path and file extension construct
  for each File in filelist (Root & '\*.' & Ext)
// standard load for the file attributes
   LOAD '$(File)' as Name,
FileSize( '$(File)' ) as Size,
FileTime( '$(File)' ) as FileTime
autogenerate 1;
next File
next Ext
// routine for every subdirectory
for each Dir in dirlist (Root & '\*' )
call DoDir (Dir)
next Dir
end sub
// starting point for routine where path is specified
call DoDir ('\\SERVER\DIRECTORY\DIRECTORY')


//**************************************************************************************************//
// use the table created above to step through each file
// create table 'CSV' with the required data.
//**************************************************************************************************//
// pickup the file name from the field 'Name' in the table above and populates variable
FOR each varFileName in FieldValueList('Name')

// create new table using the 'First 1' function to only load the first record
Temp1:
FIRST 1
LOAD
@1 as Ref1,
@2 as Ref2,
@3 as Ref3,
@4 as Ref4,
@5 as Ref5
FROM
[$(varFileName)]
(
txt, codepage is 1252, no labels, delimiter is ';', msq, header is 0 lines); // note header is 0 to allow pick up of first line

// Populate variables from Temp1. These will be recreated on the loop for each file. Only need 3 of the 5 reference fields.
Let varRef1 = Peek('Ref1',0,'Temp1');
Let varRef2 = Peek('Ref2',0,'Temp1');
Let varRef4 = Peek('Ref4',0,'Temp1');

drop table Temp1;

// load the main data from the file, using the variables to populate the first few fields for each record.
CSV:
LOAD
'$(varRef1)'
as Ref1,
'$(varRef2)'
as Ref2,
'$(varRef4)'
as Ref4,
@1,
@2,
@3,
@4,
@5,
@6,
@7,
@8,
@9,
@10
FROM
[$(varFileName)] // is the file name used in this step through the loop
(txt, codepage is 1252, no labels, delimiter is ';', msq, header is 1 lines); // note header is 1 to skip first line

Next varFileName