Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello, here is my problem
each month I load the same database structure but obviously with different data
among the fields I load, there are a customer number and a customer name
the number stays the same, but the names occasionally change a character or they change completely (dont' ask :-), so after loading 1-2 years worth of monthly data, I end up with a database which has unique customer numbers, but not unique customer names
and this is a problem because for data analysis we work with customer names, not numbers, because they are easier to remember
so if I select a certain customer name and want to see, for example, monthly sales for the past 12 months, I occasionally get no sales for the period before or after the name change, which is obviously wrong since I still have that customer, it's only that it changed names
so what I want to have is some sort of script or automated procedure that ensures that a customer number has an unique name associated to it at all times in the database and that name needs to be the latest name available
the issue is complicated by the fact that the population of customers is not stable from month to month, i.e. new customers may arrive, old customers may disappear but they are still relevant for analyses over longer periods of time
thank you
Liivu
Hi,
try to use the function FirstValue with group by
example
Load
id,
FirstValue(name) as name
Resident table
group by id;
You can try something like this.
Customer:
Load customerno, customername, customerno as _cno
Resident transactions
Order by date desc
Where not exists(_cno, customerno)
;
Drop field _cno;
Drop field customername in transactions;
You can choose whether to keep this as a dimension table in your data model or to join this table back into your original transaction table.
so I thought of the following algorithm for solving this:
1. load all distinct pairs of customers numbers and names, but I need to load starting from the latest month
2. if the number already exists, do not load a second pair of number and name
3. either create a separate table with numbers and names or use applymap for my current main table
problem is I do not know how to write this script
thank you
Hi,
try to use the function FirstValue with group by
example
Load
id,
FirstValue(name) as name
Resident table
group by id;
You can try something like this.
Customer:
Load customerno, customername, customerno as _cno
Resident transactions
Order by date desc
Where not exists(_cno, customerno)
;
Drop field _cno;
Drop field customername in transactions;
You can choose whether to keep this as a dimension table in your data model or to join this table back into your original transaction table.
@ StarinieriG
thank you very much, this works, just needed to insert an order by date clause otherwise it brings the oldest name, not the newest
@Vegar
thank you very much
your script also works, just needed to change the order of "where" and "order" clauses and also "drop field FROM transactions" instead of "IN transactions"
once again thank you to both of you, this forum is awesome and so is QliK 🙂