Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
I have create an app using qlik sense. The source data is excel files and every file represent one month. So I get every month an excel file to load it. All the files have the same structure, so I use the following approach:
Table:
Load
A,
B,
C,
D,
FROM [lib://AttachedFiles/Jan2019.xlsx]
(ooxml, embedded labels, table is Sheet1);
Concatenate Load
A,
B,
C,
D,
FROM [lib://AttachedFiles/Feb2019.xlsx]
(ooxml, embedded labels, table is Sheet1);
.....
How can I dedicate the month and the year while there is no field can be used as date.
I think it can be solve by generating a number field in every file and use it as month:
Table:
Load
Month,A,B,C,D
1,name,surname,country,city
1,name,surname,country,city
1,name,surname,country,city
FROM [lib://AttachedFiles/Jan2019.xlsx]
(ooxml, embedded labels, table is Sheet1);
Concatenate Load
Month,A,B,C,D
2,name,surname,country,city
2,name,surname,country,city
2,name,surname,country,city
FROM [lib://AttachedFiles/Feb2019.xlsx]
(ooxml, embedded labels, table is Sheet1);
.....
but i don't know how to achieve this in the script while there is too much rows. And how it could be generated automatically every month.
Thanks in advance
The main principle won't change - you need to read the information from the filename. If the filename-logic differs you will need to implement multiple if-loops to identify the pattern (for example checking if the filename contained any numbers, chars, special chars with keepchar(), isnum() and similar functions) and applying then the appropriate converting.
But if there information are missing like the year you need to add them on another way and also the combination of multiple files is quite difficult to handle - I think I would load them twice / several times and divide their measares through the iteration-number.
How expensive those efforts are depends of the variety of the possible combinations. The above mentioned approaches are rather a worse-case scenario and I recommend to solve the difficulties in beforehand applying strict rules to the data-structure and the fielnames of your data - everything else will always cause additionally work.
- Marcus
You could read the filename, for example:
Table:
Load *, month(Date) as Month, year(Date) as Year; // or maybe just using a master-calendar
Load A, B, C, D, date(date#(filebasename(), 'MMMYYYY')) as Date
FROM [lib://AttachedFiles/*.xlsx]
(ooxml, embedded labels, table is Sheet1);
- Marcus
You can create the name of the months from FileName()
This gives you the full name of the table. So something like this:
Right(Subfield(FileName(),'.',1)4) as Year
Left(Subfield(FileName(),'.',1)3) as Month
As long as all your files have format: MMMYYYY.xls
Thank you Marcus and David for quick reply.
I still have the problem while I don't have always the format for the file names.
sometimes it can be like that:
FROM [lib://AttachedFiles/mar+apr.xlsx]
(ooxml, embedded labels, table is Sheet1);
or
FROM [lib://AttachedFiles/DE_201902.xlsx]
(ooxml, embedded labels, table is Sheet1);
Hi,
so you could try something like this
If(Index(filebasename(),'_')>0, Date(Date#(SubField(filebasename(),'_',2),'YYYYMM'),'MM'),
If(Index(filebasename(),'+')>0, Date(Date#(SubField(filebasename(),'+'),'MMM'),'MM'), date(date#(filebasename(), 'MMMYYYY'),'MM')))
The main principle won't change - you need to read the information from the filename. If the filename-logic differs you will need to implement multiple if-loops to identify the pattern (for example checking if the filename contained any numbers, chars, special chars with keepchar(), isnum() and similar functions) and applying then the appropriate converting.
But if there information are missing like the year you need to add them on another way and also the combination of multiple files is quite difficult to handle - I think I would load them twice / several times and divide their measares through the iteration-number.
How expensive those efforts are depends of the variety of the possible combinations. The above mentioned approaches are rather a worse-case scenario and I recommend to solve the difficulties in beforehand applying strict rules to the data-structure and the fielnames of your data - everything else will always cause additionally work.
- Marcus
You what need to create dates using IF conditions.
Btw what do you do when date is mar+apr?
When I have the file name like mar+apr it contain 2 sheets so I load them from the same file
...
FROM [lib://AttachedFiles/mar+apr.xlsx]
(ooxml, embedded labels, table is DE_201903);
...
FROM [lib://AttachedFiles/mar+apr.xlsx]
(ooxml, embedded labels, table is DE_201904);
Can i use the table name in this case?
If there are multiple sheets you will need an additionally loop to load from there - means the classical wildcard-loading won't work. Reading the sheetnames is in general also possible but it requires to load the Excel per ODBC.
Everything is possible but it won't be simple and needs some efforts to consider all the variety ... Therefore again my recommendation of solving these difficulties in beforehand. Probably there are some more challenges with your data ... the best technically capabilities will be quite useless if there is no valid concept about the data and their quality ...
- Marcus
As Marcus is saying, you can create conditions to load just about anything, even use sheet names. However, the more inconsistencies in file&sheet names you have, the more conditions in load script. Is this really the way you want to go? You may be already at the point, when creating the script is more work than actually renaming the files.
The best practice I use with my clients is to agree on naming conventions and use that. Anything with wrong name doesn't get loaded