Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Good Day
I just want to verify whether this if statement will return the correct results I have a set of data for all vehicles and some ref numbers and customer names appear more than once, so I want to distinguish between repurchases (customers who have bought vehicles before) and first time buyers so I wrote this if statement in my script:
If((Contract_Ref_no) and (Customer_name)>1,'Repurchases','First Time Buyers') AS [Repurchases]
the reason I am skeptical is that there are more repurchases than first time buyers so I was wondering whether it might be the code for some reason.
Thanks in advance.
Seems to me that you want to aggregate record values ("Count duplicates"), so you would need a GROUP BY clause in your script:
LEFT JOIN (YourTable)
LOAD [Contract_Ref_No],
[Customer_name],
If( Count([Customer_name]) >1, 'Repurchases', 'First Time Buyers') as [Repurchases]
RESIDENT YourTable
GROUP BY [Contract_Ref_No], [Customer_name];
Maybe it's sufficient to only use [Contract_Ref_No] in above load instead of the two fields. And it may be better to check on something like OrderNo or OrderDate instead of name:
LEFT JOIN (YourTable)
LOAD [Contract_Ref_No],
If( Count(DISTINCT OrderNo) >1, 'Repurchases', 'First Time Buyers') as [Repurchases]
RESIDENT YourTable
GROUP BY [Contract_Ref_No];
To make this work, all fields need to be part of YourTable, not of several distinct tables in your data model.
Hope this helps,
Stefan
If((Contract_Ref_no) and (Customer_name)>1,'Repurchases','First Time Buyers') AS [Repurchases]
I don't understand what you're trying to test. What is (Contract_Ref_no) and (Customer_name)>1 supposed to do? Contract_Ref_no is probably a number, so likely to be larger than 1, but are your Customer_name values numbers too?
Seems to me that you want to aggregate record values ("Count duplicates"), so you would need a GROUP BY clause in your script:
LEFT JOIN (YourTable)
LOAD [Contract_Ref_No],
[Customer_name],
If( Count([Customer_name]) >1, 'Repurchases', 'First Time Buyers') as [Repurchases]
RESIDENT YourTable
GROUP BY [Contract_Ref_No], [Customer_name];
Maybe it's sufficient to only use [Contract_Ref_No] in above load instead of the two fields. And it may be better to check on something like OrderNo or OrderDate instead of name:
LEFT JOIN (YourTable)
LOAD [Contract_Ref_No],
If( Count(DISTINCT OrderNo) >1, 'Repurchases', 'First Time Buyers') as [Repurchases]
RESIDENT YourTable
GROUP BY [Contract_Ref_No];
To make this work, all fields need to be part of YourTable, not of several distinct tables in your data model.
Hope this helps,
Stefan
Contract_Ref_no is a unique Id each customer is given when they open an account and the customer_name is a string dependent on the name of the customer.
All i wanted to do is to check for duplicates if there is a contract_ref_no and a customer name that appears more than once then it will be regarded as a repurchase.
Hi Swuel
I think it worked it looks more or less correct now first time buyer stats are 97% and Repurchases are at 3%, I would rather remain safe and keep the Contr_Ref_no as well as the Customer name as the key identifiers.
Thanks for your help.