Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
haymarketpaul
Creator III
Creator III

LOAD the latest record with duplicate LastUpdatedDates

QV12 SR3

Wondering the best way to LOAD the latest record when my dataset has this sort of thing...

IDEmailLast Update Date
1me@somewhere.com
10/06/2016
2me@somewhere.com22/03/2016
3me@somewhere.com10/06/2016
4me@somewhere.com03/01/2016

I need the latest record based on [Last Update Date] and then if they are equal (ID 1 & 3 above for example) then the highest ID value (ID is an autonumber) - which would leave me with just one record per email address.

Is it a case of preceding FirstSortedValues (if that's possible?) or is there a better way ?

Any pointers appreciated

1 Solution

Accepted Solutions
sunny_talwar

May be this:

Table:

LOAD Email,

  FirstSortedValue(ID, -([Last Update Date]*1000+ID)) as ID,

  FirstSortedValue([Last Update Date], -([Last Update Date]*1000+ID)) as [Last Update Date]

Group By Email;

LOAD * INLINE [

    ID, Email, Last Update Date

    1, me@somewhere.com, 10/06/2016

    2, me@somewhere.com, 22/03/2016

    3, me@somewhere.com, 10/06/2016

    4, me@somewhere.com, 03/01/2016

];

View solution in original post

11 Replies
narendiran
Partner - Creator
Partner - Creator

Use Incremental load, See the below link

http://www.quickintelligence.co.uk/qlikview-incremental-load/

sunny_talwar

May be this:

Table:

LOAD Email,

  FirstSortedValue(ID, -([Last Update Date]*1000+ID)) as ID,

  FirstSortedValue([Last Update Date], -([Last Update Date]*1000+ID)) as [Last Update Date]

Group By Email;

LOAD * INLINE [

    ID, Email, Last Update Date

    1, me@somewhere.com, 10/06/2016

    2, me@somewhere.com, 22/03/2016

    3, me@somewhere.com, 10/06/2016

    4, me@somewhere.com, 03/01/2016

];

Ponkaviyarasu
Partner - Contributor III
Partner - Contributor III

You can try this method

find max(Last update Date) in existing record then concatenate with new record where not exists(date) and not exists(Email)

Not applicable

Hi Paul,

You can write something like this:

//----------------------------------------------------------------------------

Test:

Load ID, Email, Date(Date) as LastUpdatedDate; 

LOAD * INLINE [

    ID, Email, Date

    1, 123@abc.com, 42571

    2, 123@abc.com, 42557

    3, 123@abc.com, 42571

    4, 123@abc.com, 42565

    5, 123@abc.com, 42514

];

Test2:

Load

FirstSortedValue(ID,LastUpdatedDate) as ID

Resident

Test

Group By LastUpdatedDate;

left join

Load * resident Test;

Drop Table Test;

//----------------------------------------------------------------------------

You can sort the records from the sort menu of Table Box properties based on requirement.

Hope this helps.

Regards,

Sakir

santiago_respane
Specialist
Specialist

Hi,

first of all i'd say Sunny's suggestion is the most appropriate for this case (works perfect, less code, better performance, etc.).

Just to add another possible option here's a solution using inner join operations (more basic solution):

Table:

LOAD * INLINE [

ID,Email,Last Update Date

1,me@somewhere.com,10/06/2016

2,me@somewhere.com,22/03/2016

3,me@somewhere.com,10/06/2016

4,me@somewhere.com,03/01/2016

5,you@somewhere.com,03/01/2016

6,you@somewhere.com,03/08/2016

];

INNER JOIN(Table)

LOAD Email,MAX([Last Update Date]) as [Last Update Date] RESIDENT Table GROUP BY Email;

INNER JOIN(Table)

LOAD Email,MAX(ID) as ID RESIDENT Table GROUP BY Email;

Kind regards,

haymarketpaul
Creator III
Creator III
Author

Perfect - Thank you

haymarketpaul
Creator III
Creator III
Author

Have encountered an issue - in some cases both the Last Update Date and ID are identical (although some other data is different).

This seems to blank (NULL) the Last Update Date and ID fields for that email address.

Is there some way of ensuring that if they are identical that the field remains populated?

sunny_talwar

Try this:

Table:

LOAD Email,

  FirstSortedValue(DISTINCT ID, -([Last Update Date]*1000+ID)) as ID,

  FirstSortedValue(DISTINCT [Last Update Date], -([Last Update Date]*1000+ID)) as [Last Update Date]

Group By Email;

LOAD * INLINE [

    ID, Email, Last Update Date

    1, me@somewhere.com, 10/06/2016

    2, me@somewhere.com, 22/03/2016

    3, me@somewhere.com, 10/06/2016

    4, me@somewhere.com, 03/01/2016

];

haymarketpaul
Creator III
Creator III
Author

i got it......

DISTINCT