Qlik Community

New to QlikView

Discussion board where members can get started with QlikView.

Re: Removing duplicates, getting latest value via multiple criteria

Record No right now is a text field. You will need to convert into a number field, May be like this:

KeepChar([Record No], '0123456789') * 100/100

May be this code:

Table:

LOAD [Product No],

  FirstSortedValue([Date Created], -([Date Created]+Check)) as [Date Created],

  FirstSortedValue([Record No], -([Date Created]+Check)) as [Record No],

  FirstSortedValue([Date Altered], -([Date Created]+Check)) as [Date Altered],

  FirstSortedValue(Reason, -([Date Created]+Check)) as Reason

Group By [Product No];

LOAD *,

  KeepChar([Record No], '0123456789') * 100/100 as Check;

LOAD * INLINE [

    Product No, Date Created, Record No, Date Altered, Reason

    1, 4/8/2015, RN0001, 4/6/2014, Damaged //Was given as 4/8/2014

    1, 4/8/2015, RN0095, 4/6/2014, AutoRepaired //Was given as 4/8/2014

    1, 4/8/2015, RN0035, 4/11/2014, Damaged

    3, 6/8/2015, RN0008, 6/7/2015, Damaged

    3, 6/8/2015, RN0004, 6/7/2015, AutoRepaired

    3, 6/8/2015, RN0006, 9/9/2015, Damaged

];

Change the data above (in red) as it seems to me that they will be on the same dates. Let me know if that isn't true

Not applicable

Re: Removing duplicates, getting latest value via multiple criteria

Hi SunnyT,

Thank you for the prompt reply.

As for whether they will be on the same dates, the thing is, this example is just like 6 records out of 1 million records that i have, and there are some records that have the same date created, but diff record no, some have same date created and diff date altered but of course diff record no. So i am trying to cover the possibilities here as i can't go through all 1 million records 1 by 1.

Some records no are like CNB1029-102, some are RN10203, so im looking for one formula that works for all, I thought of converting all to numbers and find the largest one, but i always fail to do so using Num().

Using keepchar might face some issue as mentioned records no format are not the same for each product, But however, for each product they will be in the same format.

For example if product 1 its record No is RNxxxx, it will always be like that.

For product say n, if its CNB1029-202, it will always be in that format, just the number changing.

Just an idea, remember the previous solution you posted? the one using RecNo()?
What if i sort my table first using loadscript, by datecreated, recordno, then date altered. That way, will it ever go wrong?

Thank you so much for your assistance, been trying to solve this problem for quite a while now.

Re: Removing duplicates, getting latest value via multiple criteria

If the data can be sorted, I would encourage you to do that and then use RecNo() function.

MVP
MVP

Re: Removing duplicates, getting latest value via multiple criteria

Maybe with a where peek; you can change the order by as you want (ascending/descending on 2, 3, n fields)

Or I'm missing something?

Table:

LOAD * INLINE [

    Product No, Date Created, Record No, Date Altered, Reason

  1, 4/8/2014, RN0001, 4/6/2014, Damaged

    1, 4/8/2014, RN0025, 4/6/2014, AutoRepaired

    1, 4/8/2015, RN9090, 4/11/2014, Damaged

    2, 5/9/2015, RN0002, 3/9/2015, Creation

    3, 6/8/2015, RN0003, 6/7/2015, Damaged

    3, 6/8/2015, RN0004, 6/7/2015, AutoRepaired

    3, 10/10/2015, RN0006, 9/9/2015, Damaged

];

NewTable:

NoConcatenate load *

Resident Table

Where [Product No]<>Peek('Product No')

order by [Product No], [Date Created] desc, [Record No] desc, [Date Altered] desc;

DROP Table Table;

Not applicable

Re: Removing duplicates, getting latest value via multiple criteria

Hi SunnyT,

It is a tricky situation considering different variables affecting this.

As i rmb, RecNo() takes data from the sorted loaded table or the original table?
if from the original table then there is no point sorting my data in a loadscript section beforehand?

basically in SAS, it would be like group by product id,

and if last.datecreated and last.recordno and last.datealtered.

How do i do the same here. hmmmmmmmmmmmmmm

Not applicable

Re: Removing duplicates, getting latest value via multiple criteria

Hi maxgro, do you mind explaining this further?

especially the Where [Product No]<>Peek('Product No')


thank you.

MVP
MVP

Re: Removing duplicates, getting latest value via multiple criteria

Peek('Product No') is the last record read (I copied the QlikView help, below).

With

Where [Product No]<>Peek('Product No')

you get only the records with a different product no, one record for product no 1, one for 2, one for 3.

The order by is needed to choose the first record by product. Change the order by and you have a different first record. The product no must always be the first field in the order by.

The resident is needed because in Qlik you can only sort in a resident load.

The noconcatenate is needed because the fields are the same and I don't want an automatic concatenate.

You can play removing the where peek and adding the peek(....) as a new field.

peek(fieldname [ , row [ , tablename ] ] )

Returns the contents of the fieldname in the record specified by row in the internal table tablename. Data are fetched from the associative QlikView database.

Fieldname must be given as a string (e.g. a quoted literal).

Row must be an integer. 0 denotes the first record, 1 the second and so on. Negative numbers indicate order from the end of the table. -1 denotes the last record read.

If no row is stated, -1 is assumed.

Tablename is a table label, see Table Labels, without the ending colon. If no tablename is stated, the current table is assumed. If used outside the load statement or referring to another table, the tablename must be included.

Examples:

peek( 'Sales' )
returns the value of Sales in the previous record read ( equivalent to previous(Sales) ).

peek( 'Sales', 2 )
returns the value of Sales from the third record read from the current internal table.

peek( 'Sales', -2 )
returns the value of Sales from the second last record read into the current internal table.

peek( 'Sales', 0, 'Tab1' )
returns the value of Sales from the first record read into the input table labeled Tab1.

Load A, B, numsum( B, peek( 'Bsum' ) ) as Bsum...;
creates an accumulation of B in Bsum.

Not applicable

Re: Removing duplicates, getting latest value via multiple criteria

Hi maxgro, thanks for your explaination.

If thats the case, its true i will only get 1 record per product no, but how do i know which one to choose if say i have 3 product no 1, then diff date created, diff record no and diff date altered similar to the table of data above, Do note i have around 1 million records.

Sorry im relatively new to qlikview, do you mind providing a short example?
There are problems i highlighted to Sunnyindia in the above post as well.

So the idea here is that, you sort desc, and pick the first record for each product no right?

How do i get it to read the first record? Since i don't know how many records there are per productno, there can be 1, 2 ,...n

Thank you

MVP
MVP

Re: Removing duplicates, getting latest value via multiple criteria

You choose the record with the order by

With this table

Table:

LOAD * INLINE [

    Product No, Date Created, Record No, Date Altered, Reason

    1, 4/8/2014, RN0001, 4/6/2014, Damaged

    1, 4/8/2014, RN0025, 4/6/2014, AutoRepaired

    1, 4/1/2015, RN9090, 4/11/2014, Damaged

    2, 5/9/2015, RN0002, 3/9/2015, Creation

    3, 6/8/2015, RN0003, 6/7/2015, Damaged

    3, 6/8/2015, RN0004, 6/7/2015, AutoRepaired

    3, 10/10/2015, RN0006, 9/9/2015, Damaged

];

if you want the last (bigger) product by DateCreated,RecordNo,DateAltered.

add an order by product, date created desc, record no desc, date altered desc

NewTable:

NoConcatenate load *

Resident Table

Where [Product No]<>Peek('Product No')

order by [Product No], [Date Created] desc, [Record No] desc, [Date Altered] desc;

DROP Table Table;

Qlik will read the record in this order

The records with product no <> previous product no (the where clause) are the records in bold

    1, 4/1/2015, RN9090, 4/11/2014, Damaged

    1, 4/8/2014, RN0025, 4/6/2014, AutoRepaired

    1, 4/8/2014, RN0001, 4/6/2014, Damaged

  2, 5/9/2015, RN0002, 3/9/2015, Creation

    3, 10/10/2015, RN0006, 9/9/2015, Damaged

    3, 6/8/2015, RN0004, 6/7/2015, AutoRepaired

    3, 6/8/2015, RN0003, 6/7/2015, Damaged


1.png

Do you agree?


If you have different data and / or questions, post them.

Highlighted
MVP
MVP

Re: Removing duplicates, getting latest value via multiple criteria

another example, in this one I want the oldest date altered, first rec no, ; you can see I changed the fields order in order by and replace descending with ascending (asc is the default, no need to specify)

NewTable:

NoConcatenate load *

Resident Table

Where [Product No]<>Peek('Product No')

order by [Product No], [Date Altered], [Record No], [Date Created];

1.png