Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
msmichael
Creator
Creator

Compare two subsets of a table

Hi experts,

I have a table with data like below,

ProdIdDateCategoryAmount 
1013/31/2020A100 
1023/31/2020B150 
1033/31/2020A200 
1053/31/2020A120 
1014/30/2020A150 
1024/30/2020A200 
1034/30/2020B110 
1044/30/2020A180 

 

Basically, a prodId can be in category A or B on a date, but the category can change from date to date. For example, in above table, ProdId 102 was categorized to B on 3/31 but moved to category A on 4/30. Similarly, 103 was in category A but moved to category B. Also, 104 is a new product which only exists on 4/30.

Now I need to show a table on the dashboard, 

GroupCategory ChangeCount (No. of products)Total Amount (on 4/30)
1From Category A to Other Categories1 (because ProdId 103 was moved from category A to other category)110
2From other categories to category A1 (because ProdId 102 was moved from other category to category A)200
3New to category A1 (because ProdId 104 is added to Category on 4/30 only and did not exists on 3/31)180
4Removed from category A1 (because ProdId 105 was in category A on 3/31 but it does not exist on 4/30)0

 

Basically, I need to group the data into 4 groups,

Group 1 is for products that were in category A on 3/31 but moved to other categories on 4/30;

Group 2 is for products that were in category A on 4/30 but in other categories on 3/31;

Group 3 is for products that are newly added on 4/30 to category A, and they did not exist on 3/31;

Group 4 is for products that were in category A on 3/31 but do not exist anymore on 4/30.

 

How can I do this using expressions? 

 

Thanks,

Michael

Labels (1)
2 Solutions

Accepted Solutions
Taoufiq_Zarra

Attached my version :

I assume from the statement that there are only two categories A and B

Loading script :

Data:

LOAD rowno() as ID,* INLINE [
    ProdId, Date, Category, Amount
    101, 3/31/2020, A, 100
    102, 3/31/2020, B, 150
    103, 3/31/2020, A, 200
    105, 3/31/2020, A, 120
    101, 4/30/2020, A, 150
    102, 4/30/2020, A, 200
    103, 4/30/2020, B, 110
    104, 4/30/2020, A, 180
];



output:
load  purgechar(Concat_Cat,'0123456789') as Concat_Categorie,*;
load ProdId, concat(ID&Category,',') as Concat_Cat, FirstSortedValue(Amount,(Date#(Date,'M/DD/YYYY'))) as FirstAmount,FirstSortedValue(Amount,-(Date#(Date,'M/DD/YYYY'))) as LastAmount,Date(Max(Date#(Date,'M/DD/YYYY'))) as MaxDate resident Data Group by ProdId order by ID ;

drop table Data;
drop field Concat_Cat;

 

2- Create Dimension:

Group:

=ValueList('1','2','3','4')

3- create Mesure :

Category Change :

=PICK(MATCH(VALUELIST('1','2','3','4'),'1','2','3','4'),
'From Category A to Other Categories',
'From other categories to category A',
'New to category A',
'Removed from category A'

)

 

Count (No. of products) :

PICK(MATCH(VALUELIST('1','2','3','4'),'1','2','3','4'),
count({<Concat_Categorie={'A,B'}>} ProdId),
count({<Concat_Categorie={'B,A'}>} ProdId),
count({<Concat_Categorie={'A'},MaxDate={"$(=Date#(Max(MaxDate),'M/DD/YYYY'))"}>} ProdId),
count({<Concat_Categorie={'A'},MaxDate={"$(=Date#(Min(MaxDate),'M/DD/YYYY'))"}>} ProdId)

)

 

='Total Amount (on '& Date(Max(MaxDate),'M/DD')&')' :

 

PICK(MATCH(VALUELIST('1','2','3','4'),'1','2','3','4'),
Sum({<Concat_Categorie={'A,B'}>} LastAmount),
Sum({<Concat_Categorie={'B,A'}>} LastAmount),
Sum({<Concat_Categorie={'A'},MaxDate={"$(=Date#(Max(MaxDate),'M/DD/YYYY'))"}>} LastAmount),
0
)

 

Results :

Capture.PNG

 

 

Regards,
Taoufiq ZARRA

"Please LIKE posts and "Accept as Solution" if the provided solution is helpful "

(you can mark up to 3 "solutions") 😉

View solution in original post

Saravanan_Desingh

One solution is.

tab1:
LOAD RowNo() As RowID,* INLINE [
    ProdId, Date, Category, Amount
    101, 3/31/2020, A, 100
    102, 3/31/2020, B, 150
    103, 3/31/2020, A, 200
    105, 3/31/2020, A, 120
    101, 4/30/2020, A, 150
    102, 4/30/2020, A, 200
    103, 4/30/2020, B, 110
    104, 4/30/2020, A, 180
];

Left Join(tab1)
LOAD ProdId, Concat(Category,'', Date) As Str, FirstSortedValue(Category, Date) As Val1, FirstSortedValue(Category, -Date) As Val2
Resident tab1
Group By ProdId
;

Left Join(tab1)
LOAD Date(Min(Date)) As MinDt, Date(Max(Date)) As MaxDt
Resident tab1;

tab2:
LOAD *, Pick(Match(-1, Str Like 'A?*' And Val2<>'A', Str Like '?*A' And Val1<>'A', Str='A' And Date=MaxDt, Str='A' And Date=MinDt), 'Group1', 'Group2', 'Group3', 'Group4') As Group
	  , If(Date=MaxDt, Amount, 0) As FinalAmount
Resident tab1;

tab3:
LOAD Group, Concat(DISTINCT ProdId) As Group_Str, Count(DISTINCT ProdId) As Group_Cnt
	, Sum(FinalAmount) As [Total Amount (on 4/30)]
Resident tab2
Group By Group
;

Drop Table tab1, tab2;

commQV50.PNG

View solution in original post

4 Replies
Taoufiq_Zarra

Attached my version :

I assume from the statement that there are only two categories A and B

Loading script :

Data:

LOAD rowno() as ID,* INLINE [
    ProdId, Date, Category, Amount
    101, 3/31/2020, A, 100
    102, 3/31/2020, B, 150
    103, 3/31/2020, A, 200
    105, 3/31/2020, A, 120
    101, 4/30/2020, A, 150
    102, 4/30/2020, A, 200
    103, 4/30/2020, B, 110
    104, 4/30/2020, A, 180
];



output:
load  purgechar(Concat_Cat,'0123456789') as Concat_Categorie,*;
load ProdId, concat(ID&Category,',') as Concat_Cat, FirstSortedValue(Amount,(Date#(Date,'M/DD/YYYY'))) as FirstAmount,FirstSortedValue(Amount,-(Date#(Date,'M/DD/YYYY'))) as LastAmount,Date(Max(Date#(Date,'M/DD/YYYY'))) as MaxDate resident Data Group by ProdId order by ID ;

drop table Data;
drop field Concat_Cat;

 

2- Create Dimension:

Group:

=ValueList('1','2','3','4')

3- create Mesure :

Category Change :

=PICK(MATCH(VALUELIST('1','2','3','4'),'1','2','3','4'),
'From Category A to Other Categories',
'From other categories to category A',
'New to category A',
'Removed from category A'

)

 

Count (No. of products) :

PICK(MATCH(VALUELIST('1','2','3','4'),'1','2','3','4'),
count({<Concat_Categorie={'A,B'}>} ProdId),
count({<Concat_Categorie={'B,A'}>} ProdId),
count({<Concat_Categorie={'A'},MaxDate={"$(=Date#(Max(MaxDate),'M/DD/YYYY'))"}>} ProdId),
count({<Concat_Categorie={'A'},MaxDate={"$(=Date#(Min(MaxDate),'M/DD/YYYY'))"}>} ProdId)

)

 

='Total Amount (on '& Date(Max(MaxDate),'M/DD')&')' :

 

PICK(MATCH(VALUELIST('1','2','3','4'),'1','2','3','4'),
Sum({<Concat_Categorie={'A,B'}>} LastAmount),
Sum({<Concat_Categorie={'B,A'}>} LastAmount),
Sum({<Concat_Categorie={'A'},MaxDate={"$(=Date#(Max(MaxDate),'M/DD/YYYY'))"}>} LastAmount),
0
)

 

Results :

Capture.PNG

 

 

Regards,
Taoufiq ZARRA

"Please LIKE posts and "Accept as Solution" if the provided solution is helpful "

(you can mark up to 3 "solutions") 😉
Saravanan_Desingh

One solution is.

tab1:
LOAD RowNo() As RowID,* INLINE [
    ProdId, Date, Category, Amount
    101, 3/31/2020, A, 100
    102, 3/31/2020, B, 150
    103, 3/31/2020, A, 200
    105, 3/31/2020, A, 120
    101, 4/30/2020, A, 150
    102, 4/30/2020, A, 200
    103, 4/30/2020, B, 110
    104, 4/30/2020, A, 180
];

Left Join(tab1)
LOAD ProdId, Concat(Category,'', Date) As Str, FirstSortedValue(Category, Date) As Val1, FirstSortedValue(Category, -Date) As Val2
Resident tab1
Group By ProdId
;

Left Join(tab1)
LOAD Date(Min(Date)) As MinDt, Date(Max(Date)) As MaxDt
Resident tab1;

tab2:
LOAD *, Pick(Match(-1, Str Like 'A?*' And Val2<>'A', Str Like '?*A' And Val1<>'A', Str='A' And Date=MaxDt, Str='A' And Date=MinDt), 'Group1', 'Group2', 'Group3', 'Group4') As Group
	  , If(Date=MaxDt, Amount, 0) As FinalAmount
Resident tab1;

tab3:
LOAD Group, Concat(DISTINCT ProdId) As Group_Str, Count(DISTINCT ProdId) As Group_Cnt
	, Sum(FinalAmount) As [Total Amount (on 4/30)]
Resident tab2
Group By Group
;

Drop Table tab1, tab2;

commQV50.PNG

msmichael
Creator
Creator
Author

Thank you both very much! 

Saravanan_Desingh

One more version.

tab1:
LOAD * INLINE [
    ProdId, Date, Category, Amount
    101, 3/31/2020, A, 100
    102, 3/31/2020, B, 150
    103, 3/31/2020, A, 200
    105, 3/31/2020, A, 120
    101, 4/30/2020, A, 150
    102, 4/30/2020, A, 200
    103, 4/30/2020, B, 110
    104, 4/30/2020, A, 180
];

tab2:
LOAD ProdId Resident tab1;
Right Join (tab2)
LOAD Date Resident tab1;

Join(tab1)
LOAD * Resident tab2;

Left Join(tab1)
LOAD ProdId, Concat(DISTINCT If(IsNull(Category),'X',Category),'', Date) As Str
Resident tab1
Group By ProdId
;

Left Join(tab1)
LOAD Date(Max(Date)) As MaxDt
Resident tab1;

tab3:
LOAD *, 'Group'&Match(Str, 'AB', 'BA', 'XA', 'AX') As Group
	  , If(Date=MaxDt, Amount, 0) As FinalAmount
Resident tab1;

tab4:
LOAD Group, Concat(DISTINCT ProdId) As Group_Str, Count(DISTINCT ProdId) As Group_Cnt
	, Sum(DISTINCT FinalAmount) As [Total Amount (on 4/30)]
Resident tab3
Where Group <> 'Group0'
Group By Group
;

Drop Table tab1, tab2, tab3;