Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi everyone,
I need some help to solve the next requirement. I have the Fact aggregate table with fields:
: yearmonth / market / customer / product / sumSales
For a defined set of the products I need to categorize and classify each and all of the customers sumsales , as what their choice was.
Need to create for each customer adicional dimension FirstChoice, SecondChoice,Nochoice but at the SCRIPT LEVEL.
MONTH_YEAR | MARKET | custome | PRODUCT | MTH_QTY |
201201 | vegtables | 108385 | onion | 5 |
201201 | vegtables | 108385 | carrot | 2 |
201201 | vegtables | 108393 | tomato | 4 |
201201 | vegtables | 108393 | carrot | 3 |
201201 | vegtables | 108393 | iogurt | 1 |
201201 | vegtables | 108365 | onion | 2 |
201201 | vegtables | 108365 | carrot | 6 |
201201 | vegtables | 108364 | iogurt | 1 |
For onion and carrots I need to obtain the next data
Customer / yearmonth / market/ FirstChoice/ SecondChoice/Nochoice, for the example data
108385 /201201/ vegetables /onion/carrot/null
108393 /201201/ vegetables / tomato/ carrot/ null
108365 / 201201/ vegetables / carrot/ onion/ null
108364 / 201201 / vegetables / iogurt/null/ 1
Quite complex situation. Any suggestion…?.
I´ve load on script rank for the consumer but need to run through all the table customer by customer to calculate the new dimensions. Ideas are welcome.
Thank you very much.
El mensaje fue editado por: SergioABR Thanks Stephan with your ideas... i´ve arrived to something like..
//MAPING LOAD MAP_PROD: MAPPING LOAD * INLINE [ PRODUCT,FLAG CARROT,Y ONIONS,Y ORANGE,N PATATOES,Y TOMATO,N ];
PRODUCT: LOAD subfield(K_CUST_PROD,'-',2) AS PRODUCT FROM [..\vEGETABLES bASKET.csv] (ansi, txt, delimiter is ';', embedded labels, msq); CUSTOMER: MAPPING LOAD subfield(K_CUST_PROD,'-',1) as CUSTOMER FROM [..\vEGETABLES bASKET.csv] (ansi, txt, delimiter is ';', embedded labels, msq);
let TotCustomer = noofrows(CUSTOMER)-1;
FOR i= 0 TO TotCustomer
LET vCUSTOMER = PEEK(CUSTOMER,$(i),'CUSTOMER') ;
let TotPROD = noofrows(PRODUCT)-1;
FOR j= 0 TO TotPROD
LET VPROD = PEEK(PRODUCT,$(i),'PRODUCT') ; CUST_POS:
LOAD PROD&vCUSTOMER&ITERNO()) AS KEY
,FIRSTSORTEDVALUE(PRODUCT,RANK,1) AS TOP1PROD
,FIRSTSORTEDVALUE(RX_MTH_QTY,RANK,1 AS TOP1VALUE
,IF((FIRSTSORTEDVALUE(PRODUCT,RANK,1))= $(VPROD),'1',
IF((FIRSTSORTEDVALUE(PRODUCT,RANK,2))= $(VPROD),'2','NA')) AS PRODPOS , IF (SUM(RX_MTH_QTY) > 0,'DEAL','NODEAL') AS STATUS GROUP BY CUSTOMER,PROD
HAVING WILDMATCH (PRODUCT,APPLYMAP('MAP_PROD',PRODUCT),'Y'>0 ;
NEXT NEXT WHILE ITERNO() >=10 AND ITERNO()<=12;
BUT IS NOT WORKING............ I HAVE POSTED TWO FILES. ONE IS THE QVW FILE AND A *.CSV FILE WITH DATA.
Thank you
Sorry, still no 100% sure how you want to analyze your data i.e. link your VEG to your CUSTPOS table.
Attached is a sample where I link it via CUSTOMER and TIME_ID.
I don't have much time too look into that anymore, hope you get it working from here.
Regards,
Stefan
I haven't understood your logic for Nochoice, is this indicating that neither onion nor carrots have been in the basket?
Maybe you could work out a solution using a group by load and FirstSortedValue:
LOAD
CUSTOMER, MONTH_YEAR, MARKET
FirstSortedValue( PRODUCT,-MTH_QTY,1) as FirstChoice,
FirstSortedValue( PRODUCT,-MTH_QTY,2) as SecondChoice,
???? as NoChoice
FROM ..... group by CUSTOMER, MONTH_YEAR, MARKET;
If you could have identical monthly quantities for a given combination of customer, monthyear and market, FirstSortedValue() will return null by default, so you might need to decide how to handle that (there are some methods to get a result even for identical sort order values in FirstSortedValue, e.g. by using DISTINCT qualifier or adding a small number to create a sort preference, like using
FirstSortedValue(PRODUCT,-(MTH_QTY+AutoNumber(PRODUCT)/10000),1) as FirstChoice,
when your number of PRODUCTs are much smaller than 10000 and your MTH_QTY are integer).
Hope this helps,
Stefan
Hello,
I have been working and arrived to what i am posting. Receiving a Generic tables must contain al lesat 3 fields message ERROR.
//MAPING LOAD
MAP_PROD:
MAPPING LOAD * INLINE [
PRODUCT,FLAG
CARROT,Y
ONIONS,Y
ORANGE,N
PATATOES,Y
TOMATO,N
];
PRODUCT:
LOAD
subfield(K_CUST_PROD,'-',2) AS PRODUCT
FROM [..\vEGETABLES bASKET.csv] (ansi, txt, delimiter is ';', embedded labels, msq);
CUSTOMER:
MAPPING LOAD
subfield(K_CUST_PROD,'-',1) as CUSTOMER
FROM [..\vEGETABLES bASKET.csv] (ansi, txt, delimiter is ';', embedded labels, msq);
let TotCustomer = noofrows(CUSTOMER)-1;
FOR i= 0 TO TotCustomer
LET vCUSTOMER = PEEK(CUSTOMER,$(i),'CUSTOMER') ;
let TotPROD = noofrows(PRODUCT)-1;
FOR j= 0 TO TotPROD
LET VPROD = PEEK(PRODUCT,$(j),'PRODUCT') ;
CUST_POS:
LOAD
($(VPROD)&$(vCUSTOMER)&TIME_ID) AS KEY
,FIRSTSORTEDVALUE(PRODUCT,RANK,1) AS TOP1PROD
,FIRSTSORTEDVALUE(RX_MTH_QTY,RANK,1 AS TOP1VALUE
,IF((FIRSTSORTEDVALUE(PRODUCT,RANK,1))= $(VPROD),'1',
IF((FIRSTSORTEDVALUE(PRODUCT,RANK,2))= $(VPROD),'2','NA')) AS PRODPOS
, IF (SUM(RX_MTH_QTY) > 0,'DEAL','NODEAL') AS STATUS,
resident VEG GROUP BY CUSTOMER,PROD,TIME_ID;
NEXT
NEXT.
Need help to adjust the two for..next and while extraction...
Thank you
The error message was due to your mapping load using only one field. I don't think you want to use a mapping load here?
I corrected some small syntax errors and the script is now running ... very slowly, due to your loops and embedded load. You might want to start in debug mode and limit the rows returned to some smaller number, maybe some hundred.
Hope this helps,
Stefan
Hi Stefan, thank you very much.
Great it is really working but the numbers are atonishing, fot 2835 records, looped throught 2050 customer and 2 products is producing a 11.623.500 record CUST_POS auxiliar table. Is this how it should be?? 34 min. load ??
Am I doing something wrong??
As well having some dificulties with the repeated values on the rank and going to im`plement your advise, it is producing null value for the Top2Prod,top3prod.. IS THE MINUS SIGN required on the RANK FIELD??
FirstSortedValue(PRODUCT,-(MTH_QTY+AutoNumber(PRODUCT)/10000),1) as FirstChoice,
Should I left join the CUST_POS table with VEG to clean up the data, and save just one value for the combination of keys ??, because now each record is producing +20 lines on the CUST_POS table, and makes it dificult to understand the output of the CUST_POS table
Stefan is the while interaction required ? As I´ve loaded each combination of customer&product&time???
Thank you. Sergio
/[code]
//just load the products I am interested in..
PRODUCT_1:
LOAD * INLINE [
PRODUCT
PATATOES
TOMATO
];
PRODUCT:
NOCONCATENATE LOAD
PRODUCT
RESIDENT PRODUCT_1;
CUSTOMER:
LOAD distinct
subfield(K_CUST_PROD,'-',1) as CUSTOMER
FROM [..\vEGETABLES bASKET.csv] (ansi, txt, delimiter is ';', embedded labels, msq);
let TotCustomer = noofrows('CUSTOMER')-1;
FOR i= 0 TO $(TotCustomer)
LET vCUSTOMER = PEEK('CUSTOMER',$(i),'CUSTOMER') ;
let TotPROD = noofrows('PRODUCT_1')-1;
FOR j= 0 TO $(TotPROD)
LET VPROD = PEEK('PRODUCT',$(j),'PRODUCT') ;
CUST_POS:
LOAD
('$(VPROD)'&'$(vCUSTOMER)'&TIME_ID) AS KEY
,FIRSTSORTEDVALUE( PRODUCT,RANK,1) AS TOP1PROD
,FIRSTSORTEDVALUE( RX_MTH_QTY,RANK,1) AS TOP1VALUE
,FIRSTSORTEDVALUE( PRODUCT,RANK,2) AS TOP2PROD
,FIRSTSORTEDVALUE( RX_MTH_QTY,RANK,2) AS TOP2VALUE
,IF((FIRSTSORTEDVALUE( PRODUCT,RANK,1))= '$(VPROD)','1',
IF((FIRSTSORTEDVALUE( PRODUCT,RANK,2))= '$(VPROD)','2','NA')) AS PRODPOS
, IF (SUM(RX_MTH_QTY) > 0,'DEAL','NODEAL') AS STATUS
resident VEG GROUP BY CUSTOMER,PRODUCT,TIME_ID;
NEXT
NEXT
DROP TABLE PRODUCT_1;
/
Honestly,
I don't really know what you want to achieve with your for loops.
Since you are already grouping by PRODUCT, something like firstsortedvalue(PRODUCT,RANK) within this grouped load doesn't really make sense.
If you could describe your requirements a bit closer, maybe someone can come up with an appropriate solution.
Regards,
Stefan
Hi Stephan,
Fact original table contains all the shopping made by customers and products for each time..
The aggregate table postead as vEGETABLE_bASKET.CSV is the comsumers profile resume.
fOR EXAMPLE ..On ID_TIME 10 FOR THE CUSTOMER 183656 , THE CUSTOMER BOUGHT 2 patatoes and 1 onion.
As we want to analyze the behaviour of the consumer for PATATOES and TOMATO
What was his firstchoice (TOP1Prod) and how much (top1value), top2prod,top2value,top3prod,top3value
In what position did he bought the patatoes and tomatos, 1 or 2 (POSPROD)
Did he bought PATATOES or TOAMTO on that month ( status = DEAL)
At the end I will have for combination of keys the profile of the consumer, classify by the choices he made, what was the first and second.. choice, and if he bought patatoes and tomatos that month...
.
As i need this auxiliar table to link to the fqact table to allowed the user to interact with the data.. How do I acomplish it...?
The consumer can buy up to 5 products but the analysis has to be done just for the patatoes and tomato.
Sugestions are welcome !!
Thank you
Sorry, still no 100% sure how you want to analyze your data i.e. link your VEG to your CUSTPOS table.
Attached is a sample where I link it via CUSTOMER and TIME_ID.
I don't have much time too look into that anymore, hope you get it working from here.
Regards,
Stefan
Hi Stefan.
Perfect...
Thank you very much.
I had arrived to a similar solution making it simple and using the AQL. More association and less for..while...
Glad to had received your time and attention. Really helped !!
Sergio. Posting YOUR script for the PE users..
PRODUCT:
LOAD Distinct
subfield(K_CUST_PROD,'-',2) AS PRODUCT
FROM [..\vEGETABLES bASKET.csv] (ansi, txt, delimiter is ';', embedded labels, msq);
CUSTOMER:
LOAD distinct
subfield(K_CUST_PROD,'-',1) as CUSTOMER
FROM [..\vEGETABLES bASKET.csv] (ansi, txt, delimiter is ';', embedded labels, msq);
For each VPROD in 'PATATOES','TOMATO'
CUST_POS:
LOAD
'$(VPROD)'&CUSTOMER&TIME_ID as KEY
//('$(VPROD)'&'$(vCUSTOMER)'&TIME_ID) AS KEY
,FIRSTSORTEDVALUE(PRODUCT,RANK,1) AS TOP1PROD
,FIRSTSORTEDVALUE(RX_MTH_QTY,RANK,1) AS TOP1VALUE
,IF((FIRSTSORTEDVALUE(PRODUCT,RANK,1))= '$(VPROD)','1',
IF((FIRSTSORTEDVALUE(PRODUCT,RANK,2))= '$(VPROD)','2','NA')) AS PRODPOS
, IF (SUM(RX_MTH_QTY) > 0,'DEAL','NODEAL') AS STATUS
resident VEG GROUP BY CUSTOMER,TIME_ID;
NEXT