Skip to main content
Announcements
Have questions about Qlik Connect? Join us live on April 10th, at 11 AM ET: SIGN UP NOW
cancel
Showing results for 
Search instead for 
Did you mean: 
datanibbler
Champion
Champion

Complex sorting scenario - haven't yet had the time to find an easy solution ...

Hi,

I have a complex sorting scenario that i couldn't yet come up with an easy solution for - I think I know a way to do it, but a rather complex one, maybe someone here knows an easier way.

I can try to put together some sample data, but it's rather easy on the data_side:

- I have a number of different item_numbers;

- For every item_number, there is a number of records, each with a unique date-time-stamp;

- What I want as output is a table that has, for every item_nr, only the latest record (with the most up-to-date date-time-stamp)

=> So, to begin with, I would sort in a RESIDENT LOAD by

     - the item_no (asc.)

     - the date-time-stamp (desc)

=> Then I was thinking of using a >> FIRST 1 LOAD << - but that would only fetch the one first record, so I'd need that once per

      item_no

=> To accomplish that, I was thinking of using a >> LOAD DISTINCT [item_no] << to create an auxiliary table and

      to parse this table line by line and in every iteration, make a variable and then >> FIRST 1 LOAD << from the other

      table with a WHERE clause filtering for that one item_no

==> That way I'd get the most up-to-date record for every item_no

<=> The drawback is, that is a rather complex solution. So that's my question - is there an easier way?

Thanks a lot!

Best regards,

DataNibbler

14 Replies
marcus_sommer

Hi DataNibbler,

I could think on two possiblities. The first is to use Peek() or Previous() ? within a sorted load and another is to use an aggregation with firstsortdevale().

- Marcus

datanibbler
Champion
Champion
Author

Hi Marcus,

>> FirstsortedValue << looks good.

With > Previous << I guess I could load RESIDENT the first record of every group but the very first - and I could load that by using a >> FIRST 1 LOAD <<, so that should work, too.

I'll try that and report back here.

Thanks!

marcus_sommer

Hi DataNibbler,

within the script I haven't ever used the firstsortedvalue-aggregation - I use it only within charts - but you should simply try it. But I think that with larger datasets the peek/previous method will be faster even if you really need 3 loadings for it:

1. data-load to get the data into qv (order by is only by resident loads possible)

2. peek/previous logic to flag (or cumulate) your data

3. filter your data with a where-clause appropriate to your flag

and forget First 1 then it will only load one record in total not for each item.

- Marcus

datanibbler
Champion
Champion
Author

Hi Marcus,

it seems to work.

The only thing I'm not really sure about, even if it looks like it worked, is the very first record loaded - that one has no previous record, so I'm not sure the PREVIOUS() function works correctly on that one?

I have done as you said and loaded one field (after sorting by 1) plan_no (asc) and 2) timestamp (desc.)

>> IF(PREVIOUS(ID) <> ID, 1, 0) as First_record_in_group <<

As I said, it looks good - we have checked in the database and it seems the logic has captured the first plan_number with the most up-to-date timestamp.

marcus_sommer

Hi DataNibbler,

AFAIK previous/peek will return NULL within the first record and are therefore very probably different to your check within an if-loop so that you get your specified false-value. Otherwise you could catch the case of the first record with: if(rowno() = 1, this, that).

- Marcus

datanibbler
Champion
Champion
Author

Hi Marcus,

that trick with PREVIOUS() does exactly what it should - unfortunately it turned out I understood the requirement in a way different from my colleague, so what the code does is not what we want as of now. I think I know now what I misunderstood and I will try to explain again:

- We have a number of delivery_plans

- For every delivery_plan, we get a so-called IDOC every so often

- Every IDOC contains a number of records, all of which we need

- Every IDOC has a unique date-time-stamp (all of the records belonging to that IDOC have the same) of when it was transferred to our IT system.

=> For every delivery_plan, I want to extract only the latest IDOC (all of the records which have the most up-to-date

     date-time-stamp)

=> That is the part where the code does what I understood on Friday that it should do, but not what we want: Due to the query using PREVIOUS(), the code grabs only one single record, the very first in a group.

Now, again, I can think of a solution, but I'm not sure if it might be possible in an easier way:

=> We'd have to extract that very first record like we do now and store it in an auxiliary_table, parse it, make two

       variables out of a) the delivery_plan_number and b) the date-time-stamp

     and extract from the original table all the records with an identical date-time-stamp.

What do you think of this?

Thanks a lot!

Best regards,

DataNibbler

marcus_sommer

Hi DataNibbler,

it might be possible to extend the previous-logic to flag not only the last record of a delivery plan else all records to their IDOC. But easier seems to me to split these table like you mentioned it maybe in a main-table with all delivery plans and their timestamp and a flagging on the last and a second table with the IDOC's and a key to the main-table.

- Marcus

Kushal_Chawda

Can you try this

Data:

LOAD IDOC,

            Date

FROM Table;

left join(Data)

LOAD IDOC,

            FirstSortedValue(Date,-Date) as Date,

            1 as LatestDateFlag

Resident Data

group by IDOC;

Final:

noconcatenate

LOAD *

resident Data

where LatestDateFlag=1;

drip table Data;

jonathandienst
Partner - Champion III
Partner - Champion III

How about something like this:

LOAD TransID,

  TransDate,

  ...

Resident ...

Where Not(Exists(transID))

Order By TransID, transDate DESC;

Logic will get you from a to b. Imagination will take you everywhere. - A Einstein