Skip to main content
Announcements
Have questions about Qlik Connect? Join us live on April 10th, at 11 AM ET: SIGN UP NOW
cancel
Showing results for 
Search instead for 
Did you mean: 
HAAM
Contributor II
Contributor II

Generate month and year fields for a data set that don't contain date fields

Hi,

I have create an app using qlik sense. The source data is excel files and every file represent one month. So I get every month an excel file to load it. All the files have the same structure, so I use the following approach:

Table:
    Load
        A,
        B,
        C,
        D,
FROM [lib://AttachedFiles/Jan2019.xlsx]
(ooxml, embedded labels, table is Sheet1);

Concatenate Load
        A,
        B,
        C,
        D,
FROM [lib://AttachedFiles/Feb2019.xlsx]
(ooxml, embedded labels, table is Sheet1);

.....

How can I dedicate the month and the year while there is no field can be used as date.

I think it can be solve by generating a number field in every file and use it as month:

Table:
    Load

Month,A,B,C,D
1,name,surname,country,city
1,name,surname,country,city
1,name,surname,country,city


FROM [lib://AttachedFiles/Jan2019.xlsx]
(ooxml, embedded labels, table is Sheet1);

Concatenate Load


Month,A,B,C,D
2,name,surname,country,city
2,name,surname,country,city
2,name,surname,country,city


FROM [lib://AttachedFiles/Feb2019.xlsx]
(ooxml, embedded labels, table is Sheet1);

.....

but i don't know how to achieve this in the script while there is too much rows. And how it could be generated automatically every month.

Thanks in advance   

 

Labels (2)
1 Solution

Accepted Solutions
marcus_sommer

The main principle won't change - you need to read the information from the filename. If the filename-logic differs you will need to implement multiple if-loops to identify the pattern (for example checking if the filename contained any numbers, chars, special chars with keepchar(), isnum() and similar functions) and applying then the appropriate converting.

But if there information are missing like the year you need to add them on another way and also the combination of multiple files is quite difficult to handle - I think I would load them twice / several times and divide their measares through the iteration-number.

How expensive those efforts are depends of the variety of the possible combinations. The above mentioned approaches are rather a worse-case scenario and I recommend to solve the difficulties in beforehand applying strict rules to the data-structure and the fielnames of your data - everything else will always cause additionally work.

- Marcus

View solution in original post

11 Replies
marcus_sommer

You could read the filename, for example:

Table:
Load *, month(Date) as Month, year(Date) as Year; // or maybe just using a master-calendar
Load A, B, C, D, date(date#(filebasename(), 'MMMYYYY')) as Date
FROM [lib://AttachedFiles/*.xlsx]
(ooxml, embedded labels, table is Sheet1);

- Marcus

DavidM
Partner - Creator II
Partner - Creator II

You can create the name of the months from FileName()

This gives you the full name of the table. So something like this:

Right(Subfield(FileName(),'.',1)4) as Year

Left(Subfield(FileName(),'.',1)3) as Month

As long as all your files have format: MMMYYYY.xls

HAAM
Contributor II
Contributor II
Author

Thank you Marcus and David for quick reply.

I still have the problem while I don't have always the format for the file names.

sometimes it can be like that:

FROM [lib://AttachedFiles/mar+apr.xlsx]
(ooxml, embedded labels, table is Sheet1); 

or

FROM [lib://AttachedFiles/DE_201902.xlsx]
(ooxml, embedded labels, table is Sheet1);

StarinieriG
Partner - Specialist
Partner - Specialist

Hi,

so you could try something like this

If(Index(filebasename(),'_')>0, Date(Date#(SubField(filebasename(),'_',2),'YYYYMM'),'MM'),

If(Index(filebasename(),'+')>0, Date(Date#(SubField(filebasename(),'+'),'MMM'),'MM'),  date(date#(filebasename(), 'MMMYYYY'),'MM')))

marcus_sommer

The main principle won't change - you need to read the information from the filename. If the filename-logic differs you will need to implement multiple if-loops to identify the pattern (for example checking if the filename contained any numbers, chars, special chars with keepchar(), isnum() and similar functions) and applying then the appropriate converting.

But if there information are missing like the year you need to add them on another way and also the combination of multiple files is quite difficult to handle - I think I would load them twice / several times and divide their measares through the iteration-number.

How expensive those efforts are depends of the variety of the possible combinations. The above mentioned approaches are rather a worse-case scenario and I recommend to solve the difficulties in beforehand applying strict rules to the data-structure and the fielnames of your data - everything else will always cause additionally work.

- Marcus

DavidM
Partner - Creator II
Partner - Creator II

You what need to create dates using IF conditions.

Btw what do you do when date is mar+apr?

HAAM
Contributor II
Contributor II
Author

When I have the file name like mar+apr it contain 2 sheets  so I load them from the same file

...

FROM [lib://AttachedFiles/mar+apr.xlsx]
(ooxml, embedded labels, table is DE_201903);

 

...
FROM [lib://AttachedFiles/mar+apr.xlsx]
(ooxml, embedded labels, table is DE_201904); 

Can i use  the table name in this case?

 

marcus_sommer

If there are multiple sheets you will need an additionally loop to load from there - means the classical wildcard-loading won't work. Reading the sheetnames is in general also possible but it requires to load the Excel per ODBC.

Everything is possible but it won't be simple and needs some efforts to consider all the variety ... Therefore again my recommendation of solving these difficulties in beforehand. Probably there are some more challenges with your data ... the best technically capabilities will be quite useless if there is no valid concept about the data and their quality ...

- Marcus

DavidM
Partner - Creator II
Partner - Creator II

As Marcus is saying, you can create conditions to load just about anything, even use sheet names. However, the more inconsistencies in file&sheet names you have, the more conditions in load script. Is this really the way you want to go? You may be already at the point, when creating the script is more work than actually renaming the files.

The best practice I use with my clients is to agree on naming conventions and use that. Anything with wrong name doesn't get loaded