Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
I have an order table with more than 20 columns with 5 years of data, OrderID should not repeat in 9 Months,
If the order ID is repeated with a one or two-month gap we need to load only the latest row..and remove others that are all duplicates..
If the order is repeated in more than a 9-month gap we need to load that.
Example: Here in the below table 11/17/2023 is the latest update row so we need to show that.. the date 10/17/2023 with a 1-month gap is a duplicate so hide.. then the date with a 1/17/2023 has a 9-month gap from the previous date (10/17/2023) so it's not a duplicate we need to show that.. then following date has a 3-month gap so hide it.
OrderID | Date | Desc | Show/Hide |
6789 | 11/17/2023 | gchc | Show |
6789 | 10/17/2023 | nhrth | Hide |
6789 | 1/17/2023 | gchc | Show |
6789 | 11/18/2022 | etyehet | Hide |
6789 | 10/12/2022 | gchc | Hide |
"Show/Hide" is not a field... I added it for your reference to display the output
you don't have to display that field i have just used it like 'Flag' and 'diff' for reference
tab3:
load
OrderID,Date,Desc
resident tab2
where "Show/Hide"='Show';
this will show only those rows where it is referenced as Show
There is no field called Show/Hide How I can use it in where
check this example below
That is why I created a Flag and diff column , to use where clause on them
Created "Show/Hide" column using preceding load
tab2:
load *,
if(Flag=1 or diff>=9,'Show','Hide') as "Show/Hide";
load *,
if(OrderID<>Previous(OrderID) and Date=ApplyMap('tab1',OrderID),1,0) as Flag,
if(day(previous(Date))-day(Date)<0,
(year(previous(Date))*12+month(previous(Date)))-(year(Date)*12+month(Date))-1,
(year(previous(Date))*12+month(previous(Date)))-(year(Date)*12+month(Date))) as diff
resident tab;
drop table tab;
you can load any columns you want to display in output from the resident table
Hi used this code..
tab:
load * Inline
[
OrderID,Date,Desc
6789,11/17/2023,gchc
6789,8/17/2023,nhrth
6789,1/17/2023,gchc
6789,11/18/2022,etyehet
6789,10/12/2022,gchc
6790,10/17/2023,defg
6790,5/17/2023,defg
6790,12/23/2022,dfdf
];
tab1:
mapping load
OrderID,
max(Date) as maxdate
Resident tab
group by OrderID;
tab2:
load *,
if(Flag=1 or diff>=9,'Show','Hide') as "Show/Hide";
load *,
if(OrderID<>Previous(OrderID) and Date=ApplyMap('tab1',OrderID),1,0) as Flag,
if(day(previous(Date))-day(Date)<0,
(year(previous(Date))*12+month(previous(Date)))-(year(Date)*12+month(Date))-1,
(year(previous(Date))*12+month(previous(Date)))-(year(Date)*12+month(Date))) as diff
resident tab;
drop table tab;
tab3:
load
OrderID,Date,Desc,"Show/Hide"
resident tab2
where "Show/Hide"='Show';
drop Table tab2;
Exit Script;
Current Output:
Expected output:
Along with the above three rows,, need
6789,1/17/2023,gchc
Total 4 rows should display
the previous date before 1/17/2023 is 8/17/2023 for OrderID -6789 ,
the gap is 7 months it is less than 9 months,
the date 11/17/2023 does not exist for OrderID-6790
the table you shared-
tab:
load * Inline
[
OrderID,Date,Desc
6789,11/17/2023,gchc
6789,8/17/2023,nhrth
6789,1/17/2023,gchc
6789,11/18/2022,etyehet
6789,10/12/2022,gchc
6790,10/17/2023,defg
6790,5/17/2023,defg
6790,12/23/2022,dfdf
];
That is what my new Scenario., You should not compare with Previous date.. we should compare with latest Valid row i.e
6789,11/17/2023,gchc
6789,8/17/2023,nhrth
6789,1/17/2023,gchc
Compare 1/17/2023 with 11/17/2023 (Latest Valid Id/row we selected) Now difference is 10 months
try this
tab:
load * Inline
[
OrderID,Date,Desc
6789,11/17/2023,gchc
6789,8/17/2023,nhrth
6789,1/17/2023,gchc
6789,11/18/2022,etyehet
6789,10/12/2022,gchc
6790,10/17/2023,defg
6790,5/17/2023,defg
6790,12/23/2022,dfdf
];
tab1:
mapping load
OrderID,
date(max(Date),'MM/DD/YYYY') as maxdate
Resident tab
group by OrderID;
tab2:
load *,
if(Date=maxdate,1,0) as Flag;
load *,
if(day(maxdate)-day(Date)<0,
(year(maxdate)*12+month(maxdate))-(year(Date)*12+month(Date))-1,
(year(maxdate)*12+month(maxdate))-(year(Date)*12+month(Date))) as diff;
load *,
applymap('tab1',OrderID)as maxdate
resident tab;drop table tab;
tab3:
load
OrderID,Date,Desc
resident tab2
where Flag=1 or diff>=9;
drop table tab2;
OUTPUT
It should be Compared with a recent Valid row or previous green Row
try this
tab:
load * Inline
[
OrderID,Date,Desc
6789,11/17/2023,gchc
6789,8/17/2023,nhrth
6789,1/17/2023,gchc
6789,11/18/2022,etyehet
6789,10/12/2022,gchc
6790,10/17/2023,defg
6790,5/17/2023,defg
6790,12/23/2022,dfdf
];
NoConcatenate
tab1:
load OrderID,Date,Desc
resident tab order by OrderID;
drop table tab;
tab2:
load *,
if(Date1<>previous(Date1),'Show','Hide') as "Show/Hide";
load *,
if(OrderID<>previous(OrderID),Date,
if(((year(peek(Date1))*12+month(peek(Date1)))-(year(Date)*12+month(Date)))<9,peek(Date1),Date)) as Date1
resident tab1;
drop table tab1;
tab3:
load OrderID,Date,Desc
resident tab2 where "Show/Hide"='Show';
drop table tab2;