Hi,
I have a complex sorting scenario that I haven't yet found an easy solution for. I think I know a way to do it, but a rather complex one; maybe someone here knows an easier way.
I can try to put together some sample data, but the data side is rather simple:
- I have a number of different item_numbers;
- For every item_number, there is a number of records, each with a unique date-time-stamp;
- What I want as output is a table that has, for every item_nr, only the latest record (with the most up-to-date date-time-stamp)
=> So, to begin with, I would sort in a RESIDENT LOAD by
- the item_no (asc.)
- the date-time-stamp (desc)
=> Then I was thinking of using a >> FIRST 1 LOAD << - but that would only fetch the one first record, so I'd need that once per item_no
=> To accomplish that, I was thinking of using a >> LOAD DISTINCT [item_no] << to create an auxiliary table and
to parse this table line by line and in every iteration, make a variable and then >> FIRST 1 LOAD << from the other
table with a WHERE clause filtering for that one item_no
==> That way I'd get the most up-to-date record for every item_no
<=> The drawback is, that is a rather complex solution. So that's my question - is there an easier way?
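For reference, the loop approach described above might look roughly like this (table and field names are placeholders, and the source table is assumed to be called Source):

```qlik
// Auxiliary table with one row per item_no
ItemList:
LOAD DISTINCT item_no
RESIDENT Source;

// Sorted copy: latest date_time first within each item_no
Sorted:
NOCONCATENATE LOAD *
RESIDENT Source
ORDER BY item_no ASC, date_time DESC;

// One iteration per item_no: fetch its first (= latest) record;
// the repeated loads auto-concatenate into one result table
FOR i = 0 TO NoOfRows('ItemList') - 1
    LET vItem = Peek('item_no', $(i), 'ItemList');
    Latest:
    FIRST 1 LOAD *
    RESIDENT Sorted
    WHERE item_no = '$(vItem)';
NEXT i

DROP TABLES ItemList, Sorted;
```

As the later posts in this thread show, the same result can be had without any loop at all.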
Thanks a lot!
Best regards,
DataNibbler
Hi all,
that code now works fine.
The issue is, it takes much too long. The part of the code that takes ages is the loop - no wonder: in every iteration of the loop, two variables (the plan_number and an ID) are sought in the original table - meaning the code has to search approx. half a million records in each of approx. 4,000 iterations ...
It is also conceivable that renaming every single field takes so long - I introduced that to make sure that the new table - being built bit by bit in the loop - is not instantaneously linked to the old one.
I will attach the code to this post so you can have a look.
I have an idea that it might be possible to just INNER JOIN the aux_table (Temp2) to the original table - or to use KEEP - to throw out all records from the original table which do not match (that commented section).
I just haven't yet had the time to try it out - because of the LOAD scripts running, I can only test that during approx. 15 minutes every hour ...
Would that work?
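If I understand the idea correctly, the join variant could replace the whole loop with something like this (Temp2 holds the plan_number/ID combinations to keep; OriginalTable is a placeholder for the original table's name):

```qlik
// Keep only those records of OriginalTable whose plan_number/ID
// combination also exists in the auxiliary table Temp2
INNER JOIN (OriginalTable)
LOAD plan_number,
     ID
RESIDENT Temp2;

DROP TABLE Temp2;
```

A single join like this runs once over the table instead of searching it 4,000 times, which is where the loop loses all its time.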
Thanks a lot!
Yes, an inner join will also work; with it you can skip one load to get the result:
Data:
LOAD IDOC,
Date
FROM Table;
Inner join(Data)
LOAD IDOC,
Date(Max(Date)) as Date
Resident Data
group by IDOC;
Update: use the Max() function instead.
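The FirstSortedValue() function mentioned in this thread would avoid the join altogether, at the cost of listing every additional field explicitly (table and field names follow the example above; SomeField stands in for any other field you need):

```qlik
// For each IDOC, take Max(Date) plus, via FirstSortedValue(),
// the value of every other field from the record with that date
// (-Date means: sort by Date descending, i.e. latest first)
Latest:
LOAD IDOC,
     Date(Max(Date)) as Date,
     FirstSortedValue(SomeField, -Date) as SomeField
RESIDENT Data
GROUP BY IDOC;
```

Note that FirstSortedValue() returns NULL if several records share the same maximum Date; the join approach keeps all of the tied records instead.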
Why not use a right join (or keep) with a table that calculates the max timestamp per whatever combination of fields you need:
TableX:
LOAD
plan_number,
ID,
datetime,
...other fields
FROM
...source...
;
RIGHT JOIN (TableX)
LOAD
plan_number,
ID,
max(datetime) as datetime
RESIDENT // *
TableX
GROUP BY
plan_number,
ID
;
//* If you're using a qvd as source then reading that may be faster than a resident load. Or perhaps you can push the calculation of the max(datetime) to the database server in a select statement.
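Pushing the aggregation to the database, as suggested, might look roughly like this (the exact SQL depends on your database; source_table and the field names are placeholders):

```qlik
// Let the database compute the max timestamp per key, then
// join back to fetch the complete latest record for each key
TableX:
SQL SELECT t.*
FROM source_table t
JOIN (
    SELECT plan_number, ID, MAX(datetime) AS max_dt
    FROM source_table
    GROUP BY plan_number, ID
) m
ON  t.plan_number = m.plan_number
AND t.ID = m.ID
AND t.datetime = m.max_dt;
```

This way only the already-reduced result set travels over the connection, instead of all half a million records.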
Hi Gysbert,
you're right, reading the qvd might be faster - but then I cannot use the >> ORDER BY <<, can I? I will try with GROUP BY.
@ kushal
I will try with an INNER JOIN and that FirstSortedValue() function. I have used it before, let's see ...
Thanks a lot!
I don't see why you need an order by.
What I want as output is a table that has, for every item_nr, only the latest record (with the most up-to-date date-time-stamp)
You can do that with the approach I posted above.