Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
spikenaylor1
Creator
Creator

qlikview load data only newest date

Hi

trying to load data set

a record for a sample for a batch has been entered on 16 March 2020 with data

it was found that the data was wrong

another record was entered for same batch with the corrected data but on the 17 March 2020

both records now show in the source table.

 

How do I load the data but only ensuring the newest record for that entry is loaded.

I.e. I only want the record for the 17 March 2002 to load, not the 16 March 2020

 

obviously this applies for looking at all records where there may have been corrected info entered in the source tabel

field I am looking at is ResultEnteredOn

 

Many thanks for Looking

 

1 Solution

Accepted Solutions
vamsee
Specialist
Specialist

12 Replies
vamsee
Specialist
Specialist

You have first find the unique key column for the batch.

Then, try

A:
Load
*,
Unique_key,//This key should not include date value
Date,
Unique_key&Date as Key_To_Join
From Source;

Right Join(A)
Load 
Unique_key&Date as Key_To_Join
;
Load
Unique_Key,
Max(Date) as Date
Resident A
Group By Unique_Key
;
Drop Field Key_To_Join from A;
spikenaylor1
Creator
Creator
Author

Cant get this to work, it still shows two entries, cant see what is wrong.

 

A:
Load
LotXLotName,
ResultEnteredOn,
ResultNumericEntry,
ResultSampleNumber,
ResultSampleNumber&ResultEnteredOn as Key_to_Join // create a unique key consisting of the Sample Number and the date entered.

From
$(vQC_ODHData_Ext) (qvd);

Right Join(A)
Load
ResultSampleNumber&ResultEnteredOn as Key_to_Join;

Load
ResultSampleNumber,
Max(ResultEnteredOn) as ResultEnteredOn // Only get the newest Date Entered.
Resident A
Group By ResultSampleNumber;

Drop Field Key_to_Join from A; // Remove the Unique Key as we do not need it anymore.

 

 

NewDateOnly.PNG

vamsee
Specialist
Specialist

I don't see an issue in the screenshot shared, as I see two ResultSampleNumber and you are grouping by ResultSampleNumber.

Are you looking for one record per LOT?

 

NitinK7
Specialist
Specialist

Hi,

Try like below to get only max date(new records)

Tabale_Name:

Load Distainct

        'your key fied name',

       Max(Date filed) as Date Field

     from file path

   group by 'your key fied name';

inner join

Load  *

from your file path;

spikenaylor1
Creator
Creator
Author

Hi there all, thanks for the help, I have tried the examples given to no avail.

Here is a fuller set of data.

NewDateOnly.PNG

So I have loaded from the source a mixed bag of test results for different tests.

This sample set of data is just for one Lot.

In this example Test 5 has had data entered twice. the newer ResultEnteredOn date is the row I want showing . I want to discard the 1st Test 5 row.

All other rows would remain.

 

For Test 1, all the results are calculated at the same time in the source. that is why there is 7 entries.

if the user noticed they had entered something wrong, there would be a newer set of 7 rows for Test 1 with a newer ResultEnteredOn date. I would as with test 5 want to discard the Older test 1 rows keeping only the newer data.

 

any ideas, Would I need to break the loading down to loading for tests and remove older rows there, then combine all filtered data together, or can it be done for this one table.

Many thanks for any help or advice given

 

 

 

 

 

vamsee
Specialist
Specialist

Based on the inputs, my first advice would be to use incremental load on the data source while extracting data from each LOT, rather than taking this approach.

Refer: https://www.analyticsvidhya.com/blog/2014/09/qlikview-incremental-load/

If I got this right, firstly, you want the check to happen based on each Lot, TestReportedName & ResultSampleNumber.

So LOT&TestReportedName &ResultSampleNumber would be your Unique Key.

Try

 

A:
Load
LotXLotName,
Num(TimeStamp#(ResultEnteredOn, 'dd/MM/yyyy hh:mm:ss')) as ResultEnteredOn,
ResultNumericEntry,
ResultSampleNumber,
LotXLotName&TestReportedName&ResultSampleNumber as Unique_Key,
LotXLotName&TestReportedName&ResultSampleNumber&Num(TimeStamp#(ResultEnteredOn, 'dd/MM/yyyy hh:mm:ss')) as Key_to_Join

From
$(vQC_ODHData_Ext) (qvd);

Right Join(A)
Load
Unique_Key&Num(TimeStamp#(ResultEnteredOn, 'dd/MM/yyyy hh:mm:ss'))  as Key_to_Join;

Load
Unique_Key,
Max(ResultEnteredOn) as ResultEnteredOn // Only get the newest Date Entered.
Resident A
Group By Unique_Key;

Drop Field Key_to_Join from A; // Remove the Unique Key as we do not need it anymore.

 

 

vamsee
Specialist
Specialist

The Timestamp() ensures the date format stays intact and join happens perfectly

spikenaylor1
Creator
Creator
Author

till cant get anything to work.

I have provided some sample data

aa:
LOAD * INLINE [
    LotXLotName,ResultEnteredOn,ResultName,ResultNumericEntry,SampleStage,TestReportedName,ResultSampleNumber,LotXDueDate
    AB1234,01/06/2020 18:57:30,Test5,425.24,A,Test5,2921298,25/05/2020 00:00:00
    AB1234,08/06/2020 07:59:14,Test5,361.75,A,Test5,2927646,25/05/2020 00:00:00   		
    AB1234,01/06/2020 19:57:30,Test1a,98,A,Test1,2904981,25/05/2020 00:00:00
    AB1234,01/06/2020 19:57:30,Test1b,24,A,Test1,2904981,25/05/2020 00:00:00 
    AB1234,08/06/2020 08:59:14,Test1a,97,A,Test1,2932587,25/05/2020 00:00:00
    AB1234,08/06/2020 08:59:14,Test1b,35,A,Test1,2932587,25/05/2020 00:00:00       
];

/* This is what I need from the Data, but in reality I have lots of LotXLotName, and lots of different TestReportedNames.

    AB1234,08/06/2020 07:59:14,Test5,361.75,A,Test5,2927646,25/05/2020 00:00:00
    AB1234,08/06/2020 08:59:14,Test1a,97,A,Test1,2932587,25/05/2020 00:00:00
    AB1234,08/06/2020 08:59:14,Test1b,35,A,Test1,2932587,25/05/2020 00:00:00 
*/

A:
Load
LotXLotName,
ResultName,
ResultNumericEntry,
SampleStage,
TestReportedName,
ResultSampleNumber,
LotXDueDate,
Num(TimeStamp#(ResultEnteredOn, 'dd/MM/yyyy hh:mm:ss')) as ResultEnteredOn,
LotXLotName&TestReportedName&ResultSampleNumber as Unique_Key,
LotXLotName&TestReportedName&ResultSampleNumber&Num(TimeStamp#(ResultEnteredOn, 'dd/MM/yyyy hh:mm:ss')) as Key_to_Join

resident aa;


Right Join(A)
Load
Unique_Key&Num(TimeStamp#(ResultEnteredOn, 'dd/MM/yyyy hh:mm:ss'))  as Key_to_Join;

Load
Unique_Key,
Max(ResultEnteredOn) as ResultEnteredOn // Only get the newest Date Entered.
Resident A
Group By Unique_Key;

Drop Field Key_to_Join from A; // Remove the Unique Key as we do not need it anymore.

just cant see why it is not working

 

 

vamsee
Specialist
Specialist

My bad made a minor correction to the join script.

Right Join(A)
Load
Unique_Key&Num(ResultEnteredOn) as Key_to_Join
;

Your final script

aa:
LOAD * INLINE [
    LotXLotName, ResultEnteredOn, ResultName, ResultNumericEntry, SampleStage, TestReportedName, ResultSampleNumber, LotXDueDate
    AB1234, 01/06/2020 18:57:30, Test5, 425.24, A, Test5, 2921298, 25/05/2020 00:00:00
    AB1234, 08/06/2020 07:59:14, Test5, 361.75, A, Test5, 2927646, 25/05/2020 00:00:00
    AB1234, 01/06/2020 19:57:30, Test1a, 98, A, Test1, 2904981, 25/05/2020 00:00:00
    AB1234, 01/06/2020 19:57:30, Test1b, 24, A, Test1, 2904981, 25/05/2020 00:00:00
    AB1234, 08/06/2020 08:59:14, Test1a, 97, A, Test1, 2932587, 25/05/2020 00:00:00
    AB1234, 08/06/2020 08:59:14, Test1b, 35, A, Test1, 2932587, 25/05/2020 00:00:00
];

/* This is what I need from the Data, but in reality I have lots of LotXLotName, and lots of different TestReportedNames.

    AB1234,08/06/2020 07:59:14,Test5,361.75,A,Test5,2927646,25/05/2020 00:00:00
    AB1234,08/06/2020 08:59:14,Test1a,97,A,Test1,2932587,25/05/2020 00:00:00
    AB1234,08/06/2020 08:59:14,Test1b,35,A,Test1,2932587,25/05/2020 00:00:00 
*/

A:
Load
			LotXLotName,
			ResultName,
			ResultNumericEntry,
			SampleStage,
			TestReportedName,
			ResultSampleNumber,
			LotXDueDate,
			(TimeStamp#(ResultEnteredOn, 'dd/MM/yyyy hh:mm:ss')) as ResultEnteredOn,
			LotXLotName&TestReportedName&ResultSampleNumber as Unique_Key,
			LotXLotName&TestReportedName&ResultSampleNumber&Num(TimeStamp#(ResultEnteredOn, 'dd/MM/yyyy hh:mm:ss')) as Key_to_Join

Resident aa;
Drop Table aa;

Right Join(A)
Load
			Unique_Key&Num(ResultEnteredOn)  as Key_to_Join
;

Load
			Unique_Key,
			Max(ResultEnteredOn) as ResultEnteredOn // Only get the newest Date Entered.
Resident A
Group By Unique_Key;

Drop Field Key_to_Join from A; // Remove the Unique Key as we do not need it anymore.