Skip to main content
Announcements
Have questions about Qlik Connect? Join us live on April 10th, at 11 AM ET: SIGN UP NOW
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Pre-Selection of single rows from multiple csv files

Hello,

One of our monitoring tools creates csv files in a folder hourly.
The structure of the file name is always the same "location" (e.g. Berlin, Munich) and "time" (hhmmss).
The files itself contain a header and some rows with data.

From all this files I need only the last from every location and from every hour. (below marked <==)
Berlin_091013.csv         
Berlin_091114.csv         
Berlin_091117.csv         
Berlin_091219.csv  <==        
Munich_091005.csv         
Munich_091506.csv         
Munich_091607.csv         
Munich_091708.csv  <==
Munich_101205.csv         
Munich_101206.csv         
Munich_101307.csv         
Munich_101308.csv  <==

And from every of this files I need only the last record.

To exclude all the other files/records would be very helpful because there are about 50000 files in the folder and I need only about 400 rows.

But all my attempts failed.

Thanks in advance.

5 Replies
hic
Former Employee
Former Employee

for each vFile in FileList('*.csv')
Data1:
Load *,
Subfield(FileBaseName(),'_',1) as City,
Time#(Subfield(FileBaseName(),'_',2),'hhmmss') as Time
FROM [$(vFile)] (txt, codepage is 1252, embedded labels, delimiter is ',', msq);
next vFile

Left Join
Load
City,
Max(Time) as MaxTime
resident Data1
Group By City;

Data2:
Noconcatenate
Load * resident Data1
where Time=MaxTime;

Drop Table Data1;

hic
Former Employee
Former Employee

I just saw a simpler solution...

for each vFile in FileList('*.csv')
Data1:
Load *,
Subfield(FileBaseName(),'_',1) as City,
Time#(Subfield(FileBaseName(),'_',2),'hhmmss') as Time
FROM [$(vFile)] (txt, codepage is 1252, embedded labels, delimiter is ',', msq);
next vFile

Inner Join
Load
City,
Max(Time) as Time
resident Data1
Group By City;

Not applicable
Author

Thanks Henric, your solution works very well.

Now to my second question.
How can I work with the results?

Let's assume I got following result:
Munich 20120103 5
Munich 20120104 7
Munich 20120105 6
Berlin 20120103 15
Berlin 20120104 17
Berlin 20120105 16

It would be helpful to create charts with expressions related to the resuls not to all data:
Munich max: 7
Munich avg: 6
Munich min: 5
Berlin max: 17
Berlin avg: 16
Berlin min: 15

20120103 max: 15
20120103 avg: 10
20120103 min:  5
20120104 max: 17
20120104 avg: 12
20120104 min:  7

Thanks.

Not applicable
Author

Thanks Henric, your solution works very well.

Now to my second question.
How can I work with the results?

Let's assume I got following result:
Munich 20120103 5
Munich 20120104 7
Munich 20120105 6
Berlin 20120103 15
Berlin 20120104 17
Berlin 20120105 16

It would be helpful to create charts with expressions related to the resuls not to all data:
Munich max: 7
Munich avg: 6
Munich min: 5
Berlin max: 17
Berlin avg: 16
Berlin min: 15

20120103 max: 15
20120103 avg: 10
20120103 min:  5
20120104 max: 17
20120104 avg: 12
20120104 min:  7

Thanks.

hic
Former Employee
Former Employee

This should be straightforward: Create a chart (right-click on sheet - new sheet object) where you use one dimension, e.g. City, and three expressions, Max(Data), Avg(Data) and Min(Data). I would start with a bar chart or a pivot table and change it later when I know that I have the numbers I look for.

You could have several dimensions in one chart, but since it is easier to read a chart with just one dimension, I would probably make several charts: One with City as dimension and a second with Date/Time as dimension.