Qlik Community

Qlik Sense Data Connectivity

Discussion board where members can learn more about Qlik Sense Data Connectivity.

Not applicable

How to only import the latest occurrence of an row

Hi, I have a app that traverses many (>100)  excel files, and it's starting to get a app-size problem for me

The obvious solution is that I would only really need the latest version of an item (row) (unique ID column exists, date-time column exists)

any ideas on how i can accomplish that during the import?

basically i want it to add new rows - and replace existing row if imported row is newer only. AND important note across excel files being imported

Tags (2)
5 Replies

Re: How to only import the latest occurrence of an row

First load all the data, then right join that table with a new table with only the maximum date-time values per ID

Data:

LOAD

     ID,

     MyDateTimeField,

     ...other fields...

FROM

     *.xlsx (ooxml, embedded labels, table is Sheet1)

     ;

RIGHT JOIN (Data)

LOAD

     ID, Max(MyDateTimeField) as MyDateTimeField

RESIDENT

     Data

GROUP BY

     ID

     ;

Not applicable

Re: How to only import the latest occurrence of an row

Interesting !

so will that strategy work with the 50mb app limit I have today (cloud plus account),

I mean if i have to load all the data - Iam still temporary exceeding the limit before I can reduce size.

Not applicable

Re: How to only import the latest occurrence of an row

hmm and doing inside the load statement will not work across the >100 sourcefiles...?

Not applicable

Re: How to only import the latest occurrence of an row

Thx  - giving it a try !

Re: How to only import the latest occurrence of an row

If you're worried about the 50MB app limit you could first load only the ID and datetime from all the excel files, calculate the max datetime per ID and then load the rest of the data:

Temp1:

LOAD

     ID,

     MyDateTimeField

FROM

     *.xlsx (ooxml, embedded labels, table is Sheet1)

     ;

Temp2:

LOAD

     ID & '|' & Max(MyDateTimeField) as KEY

RESIDENT

     Temp1

GROUP BY

     ID

     ;

DROP TABLE Temp1;

Result:

LOAD

     *

FROM     

     *.xlsx (ooxml, embedded labels, table is Sheet1)

WHERE

     Exist(KEY, ID & '|' & MyDateTimeField

     ;

DROP TABLE Temp2;