Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Join us to spark ideas for how to put the latest capabilities into action. Register here!
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Pre-Selection of single rows from multiple csv files

Hello,

One of our monitoring tools creates csv files in a folder hourly.
The structure of the file name is always the same "location" (e.g. Berlin, Munich) and "time" (hhmmss).
The files itself contain a header and some rows with data.

From all this files I need only the last from every location and from every hour. (below marked <==)
Berlin_091013.csv         
Berlin_091114.csv         
Berlin_091117.csv         
Berlin_091219.csv  <==        
Munich_091005.csv         
Munich_091506.csv         
Munich_091607.csv         
Munich_091708.csv  <==
Munich_101205.csv         
Munich_101206.csv         
Munich_101307.csv         
Munich_101308.csv  <==

And from every of this files I need only the last record.

To exclude all the other files/records would be very helpful because there are about 50000 files in the folder and I need only about 400 rows.

But all my attempts failed.

Thanks in advance.

5 Replies
hic
Former Employee
Former Employee

for each vFile in FileList('*.csv')
Data1:
Load *,
Subfield(FileBaseName(),'_',1) as City,
Time#(Subfield(FileBaseName(),'_',2),'hhmmss') as Time
FROM [$(vFile)] (txt, codepage is 1252, embedded labels, delimiter is ',', msq);
next vFile

Left Join
Load
City,
Max(Time) as MaxTime
resident Data1
Group By City;

Data2:
Noconcatenate
Load * resident Data1
where Time=MaxTime;

Drop Table Data1;

hic
Former Employee
Former Employee

I just saw a simpler solution...

for each vFile in FileList('*.csv')
Data1:
Load *,
Subfield(FileBaseName(),'_',1) as City,
Time#(Subfield(FileBaseName(),'_',2),'hhmmss') as Time
FROM [$(vFile)] (txt, codepage is 1252, embedded labels, delimiter is ',', msq);
next vFile

Inner Join
Load
City,
Max(Time) as Time
resident Data1
Group By City;

Not applicable
Author

Thanks Henric, your solution works very well.

Now to my second question.
How can I work with the results?

Let's assume I got following result:
Munich 20120103 5
Munich 20120104 7
Munich 20120105 6
Berlin 20120103 15
Berlin 20120104 17
Berlin 20120105 16

It would be helpful to create charts with expressions related to the resuls not to all data:
Munich max: 7
Munich avg: 6
Munich min: 5
Berlin max: 17
Berlin avg: 16
Berlin min: 15

20120103 max: 15
20120103 avg: 10
20120103 min:  5
20120104 max: 17
20120104 avg: 12
20120104 min:  7

Thanks.

Not applicable
Author

Thanks Henric, your solution works very well.

Now to my second question.
How can I work with the results?

Let's assume I got following result:
Munich 20120103 5
Munich 20120104 7
Munich 20120105 6
Berlin 20120103 15
Berlin 20120104 17
Berlin 20120105 16

It would be helpful to create charts with expressions related to the resuls not to all data:
Munich max: 7
Munich avg: 6
Munich min: 5
Berlin max: 17
Berlin avg: 16
Berlin min: 15

20120103 max: 15
20120103 avg: 10
20120103 min:  5
20120104 max: 17
20120104 avg: 12
20120104 min:  7

Thanks.

hic
Former Employee
Former Employee

This should be straightforward: Create a chart (right-click on sheet - new sheet object) where you use one dimension, e.g. City, and three expressions, Max(Data), Avg(Data) and Min(Data). I would start with a bar chart or a pivot table and change it later when I know that I have the numbers I look for.

You could have several dimensions in one chart, but since it is easier to read a chart with just one dimension, I would probably make several charts: One with City as dimension and a second with Date/Time as dimension.