Solved: How to count historical data when it got replace w... - Qlik Community

Applicable88 · ‎2020-05-24

Hello,

I have a tough question here and since I'm new I don't know how to solve it best. I have two tables. These tables have the (Shelf.No or others) and Coordinate as the connected Key.

Table1

Error-Number	Date	Shelf No. or others
343384762	20.05.20	1111
285645938	21.05.20	3424
375783945	22.05.20	3333
333344322	23.05.20	4444
897689983	24.05.20	7674
234325621	25.05.20	6666

Table2

Coordinate	Materialnumber	Sector
2222	C1111111	Area11
3333	C4234522	Area39
7777	D5930433	Area56
6535	E4749573	Area11
4531	C4234522	Area33
1111	C2345654	Area56

Table1 consist error reports from logistic workers who realizes in the shelves of their corresponding Sectors, errors like "empty shelves", or someone accidentally filled the "wrong material". Then the logistic workers uses the SAP-UI to file a report. For additional information they just need to fill in the 4-digit coordinates in which shelve the error occurred. SAP interface will then automatically create an unique Error No. with a date/timestamp. Qlik gets daily updates from SAP of these reports. Note that this table also consist of reports without a shelve no., since also other departments uses this user interface to report other not shelve and logistic related problems.

Table2 is a self constructed Excel Database.Since the 4-digit error report doesn't tell me where the actual problem, Sector or even Material occurred, so this Table is compulsory. For example it tells me that the shelve No. 2222 consist the Material C1111111 and the shelf is in department sector Area11:

Coordinate	Materialnumber	Sector
2222	C1111111	Area11

So I also could count and visualize which material or which sector made the most problems.

The shelve No. is unique but other shelves can of course also contain the same material. And the sector is of course also not unique, since a sector contains multiple shelves. Normally Table2 is "fixed" there is no automatic update, since coordinates won't change.

Now its not a big problem if Table2 never changes. But since production lines and their materials will change from time to time I have a big problem to return the correct values for my bar diagram and so. So on a monthly basis we sometimes decide to exchange Materials of a few shelves. That means for instance Shelve No.2222 might changed from C1111111to Material "C2222222" all of a sudden.

The problem that occurs now: The old error reports in Tabe1 that counted for the old material C1111111 will also counts for C2222222, instead of making a cut and reset the error of the new exchanged material of the shelf to zero, since I'm having a new material there now and want to start to make a new count with that material.

But we also don't want to lose the historic data of the past when the shelf still contained the old material. Ideally the bar diagram should show both bars . Shelf No. 2222 with new AND old Material.No accumulation of both count in one bar, it has to be counted separately.

Every month or two I do a export of the excel data from the shelf system. I can compare the data with my old excel export to check if there is any changes in materials in the shelves. I will never know exactly when the material was changed. It really depends on how often I check these, but there is NO timestamp when it got replaced with another material. I need to make a monthly update myself. So there is always a little subset how the errors should be account for: for the new or old material. I can use end of month, respectively the day I manually update the excel sheet as the moment where I start to count for the new material.
The only thing I can say, is that I can change the Excel sheet of the Shelf Coordinates to help me count right. But I don't know how to do it.

That was a long one, but I hope some expert can guide me through the right direction.

Thanks in advance and stay safe!

BjoernWollny · ‎2020-05-24

Hi,

I faced a similar problem with assignment of employees to departments. What worked well for me:

Basic Requirement (only first time)

a. Load your Excel file and store >Today() as StartDate< in one Field and '31.12.9999' as EndDate

b. create key of Coordinate, Materialnumber, Sector

c. Save this table as 'historical' QVD

Steps during every Reload:

1. Load your 'historical' QVD from step b) only where EndDate = '31.12.9999' (so these entries are still active / you can also use flags instead of this date)

2. concatenate only data from your excel file, where key not exists (so only changes, or new entries) - also create field StartDate

3. sort by your main field and StartDate => with this changed data sets are directly following each other

4. load table again and use key = peek(key) to detect if there was a change in your combination of Coordinate, Materialnumber, Sector - in case yes, update the EndDate to Today()-1, else write EndDate 31.12.9999

5. if you like, add now all already closed periods to this table - so load all entries from 'historical' QVD where end date <> '31.12.9999'

You can now link this to your SAP table by using intervalmatch with Date and StartDate, EndDate.

attached you find script:

set vMaxDate = date('12/31/2999');

//Load current data set (from today)
daily:
LOAD 
    A
    ,B
    ,C
    ,today() as StartDate
    ,AutoNumberHash128(A, B,C) as Key
FROM [];


//Load QVD with historical data where period is not closed
historical:
NoConcatenate LOAD
    A,
    B,
    C,
    StartDate
     ,AutoNumberHash128(A, B,C) as Key2
FROM [historical.qvd]
(qvd) where EndDate = $(vMaxDate);

//Add new and changed data to historical data (by comparing if there is any change in a column - compare keys)
concatenate Load
            *
resident daily where not exists(Key2, Key);
drop Table daily;

//Sort resulting table by Global Employee Code and by Start Date
// => with this sorting the actual data set is always the first and the old data set is following => in next step end date can be set by start date of previous(peek)
SortTable:
NoConcatenate Load
            *
resident historical order by A, StartDate desc;
drop table historical;


FinalTable:
NoConcatenate Load
            *,
	    if(peek(GlobalEmpCode) = GlobalEmpCode, date(peek(StartDate)-1), $(vMaxDate))as EndDate
resident SortTable;
drop table SortTable;

//Add historical data with closed period
Concatenate LOAD
    A
    ,B
    ,C
    ,StartDate
     ,AutoNumberHash128(A, B,C) as Key2
FROM [historical.qvd]
(qvd) where EndDate <> $(vMaxDate);

//Store Final Table
store FinalTable into  [historical.QVD] (qvd);

View solution in original post

ArnadoSandoval · ‎2020-05-24

Hi @Applicable88

You described what is known as Slowly Changing Dimension, there are plenty of documentation regarding them on the internet, I am sharing some of them!

It will be good if you take this opportunity to learn more about Slowly Changing Dimensions, so, your handling will be a well documented implementation.

Hope this helps,

Arnaldo Sandoval
A journey of a thousand miles begins with a single step.