Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik GA: Multivariate Time Series in Qlik Predict: Get Details
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Load script Loop: Multiple lines to find most recent transaction

Afternoon,

Thanks in advance of any help you share.

I have the below data and am looking to load only the most recent transaction by PolicyNumber so that I can drop/remove all previous data.

Example:

Table 1 (to be dropped)

PolicyNumberPremium

Transaction

Type

Transaction Data
PN000123

£500

NB01/01/2015
PN000125£800NB

14/01/2015

PN000126£450NB

01/02/2015

PN000123£92ADJ03/02/2015
PN000123£0ADJ04/02/2015
PN000125£72ADJ15/01/2015
PN000128£475NB01/03/2015

I'm looking to reduce the above data load to this: (The most recent transaction by PolicyNumber)

Table 2

PolicyNumberPremium

Transaction

Type

Transaction Data
PN000126£450NB

01/02/2015

PN000123£0ADJ04/02/2015
PN000125£72ADJ15/01/2015
PN000128£475NB01/03/2015

Thanks again

Gary

1 Solution

Accepted Solutions
jonathandienst
Partner - Champion III
Partner - Champion III

Use an inner join to exclude all but the most recent line for each policy:

Policies:

LOAD

  PolicyNumber,

  Premium,

  [Transaction Type],

  [Transaction Data]

FROM ...;

Inner Join (Policies)

LOAD PolicyNumber,

  Max([Transaction Data]) As [Transaction Data]

Resident Policies;

Logic will get you from a to b. Imagination will take you everywhere. - A Einstein

View solution in original post

6 Replies
jonathandienst
Partner - Champion III
Partner - Champion III

Use an inner join to exclude all but the most recent line for each policy:

Policies:

LOAD

  PolicyNumber,

  Premium,

  [Transaction Type],

  [Transaction Data]

FROM ...;

Inner Join (Policies)

LOAD PolicyNumber,

  Max([Transaction Data]) As [Transaction Data]

Resident Policies;

Logic will get you from a to b. Imagination will take you everywhere. - A Einstein
swuehl
MVP
MVP

If your data source allows ORDER BY:

Policies:

LOAD

  PolicyNumber,

  Premium,

  [Transaction Type],

  [Transaction Data]

FROM ...

WHERE NOT EXISTS(PolicyNumber)

ORDER BY PolicyNumber, [Transaction Data] DESC;


If you want to go for Jonathan's approach, I think you need to add a GROUP BY clause:


Policies:

LOAD

  PolicyNumber,

  Premium,

  [Transaction Type],

  [Transaction Data]

FROM ...;

Inner Join (Policies)

LOAD PolicyNumber,

  Max([Transaction Data]) As [Transaction Data]

Resident Policies

GROUP BY PolicyNumber;

Not applicable
Author

Hi Swuehl , could you please explain us 

WHERE NOT EXISTS(PolicyNumber)

ORDER BY PolicyNumber, [Transaction Data] DESC

what it will do ,

as per the Help I know that exists will check  the previously loaded records. but here all the records are loading at a time , could you please explain me the flow of  the statements you have provided.

Thanks

John

swuehl
MVP
MVP

John,

the HELP says: "Determines whether a specific field value exists in a specified field of the data loaded so far."

Loaded so far in includes the data from the current LOAD statement, up to the previous record.

Above WHERE clause will only let pass the first occurence of PolicyNumber in the LOAD (if this is the first appearance in the script run). To fulfill the initial requirements, you need an additional ORDER BY ... DESC, to only retrieve the latest data.

Not applicable
Author

Afternoon all,

Inner Join (Policies)

LOAD PolicyNumber,

  Max([Transaction Data]) As [Transaction Data]

Resident Policies

GROUP BY PolicyNumber;

The above statement works perfectly however I have now changed the Transaction date to a date and time field due to multiple transactions on the same day.

The max function does not appear to work correctly with a date and time field, could someone please advice?

Thanks

Gary

swuehl
MVP
MVP

What do you mean with 'not appear to work correctly with a date and time field'?

It should work as designed, but you probably don't want to get the max Date and max Time calculcated independently, but the max Time on max Date, right? I think you would need to combine both values again to a DateTime value, maybe split the values up only later in the script.

edit:

or try something like

Inner Join (Policies)

LOAD PolicyNumber,

  Max([Transaction Date]) As [Transaction Date],

   frac(Max([Transaction Date]+[Transaction Time])) as [Transaction Time]

Resident Policies

GROUP BY PolicyNumber;