Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
A lot of the questions I'm facing seems to depend on me being able to do stuff in QlikView that I haven't been able to figure out yet. I come from a primarily SQL background and I guess it's just that I'm not used to the QlikView way yet but the wall I'm running into often deals with the need for more predicates or some form of per row operation. As I gather from set analysis the set is run once for each graph and not per dimension/row which is sometimes what i want, never mind that the set analysis syntax is quite dense... I've checked the firstsortedvalue, lookup, fieldvalue etc functions but after screwing around some with them they are sometimes close to what I want but not all the way.
I'll give you an example:
Data:
CustomerEvents:
LOAD * Inline [Date,CustomerId,Status
2010-12-05,A,Active
2011-01-01,A,Inactive
2011-01-02,A,Active
2011-02-03,B,Active
2011-02-06,B,Inactive
2011-04-01,C,Active
2011-04-05,C,Upgraded
2011-04-06,C,Inactive
2011-05-03,C,Invoiced
2011-06-08,C,Active]
Question:
For every instance of Inactive customer, check if that customer has returned at a later date than the date the customer became inactive. I.e. is there present a row with status 'Active' any time after the current row regardless of other statuses in between.
SQL-solution, (untested :)):
SELECT ce.*
,(SELECT TOP 1
FROM CustomerEvents ce_sub
WHERE ce_sub.CustomerId = ce.CustomerId
AND ce_sub.Date > ce.Date
AND ce_sub.Status = 'Active') AS Returned_Customer
FROM CustomerEvents ce
WHERE ce.Status = 'Inactive'
Qlikview-solution:
?
With the various lookup functions I can see that there is an event for the customerID but I haven't found the way to add the extra predicate that the date also should be larger than the date for the current row.
If I sort the table according to the status/dates/customerids and we have multiples of statuses I can't be sure that the next or previous event is actually what I'm looking for if I'm using peek etc. See CustomerId C.
But probably I'm just missing something..
I'm aware the performance ramifications of doing stuff like this but a lot of times an answer in 30 minutes is better than no answer at all 🙂
Thanks for the help
Fredrik Ragnar
Hi,
Don't think in SQL, think as there would be additional ETL layer 🙂
CustomerEvents:
LOAD * Inline [Date,CustomerId,Status
2010-12-05,A,Active
2011-01-01,A,Inactive
2011-01-02,A,Active
2011-02-03,B,Active
2011-02-06,B,Inactive
2011-04-01,C,Active
2011-04-05,C,Upgraded
2011-04-06,C,Inactive
2011-05-03,C,Invoiced
2011-06-08,C,Active];
LastInactive:
LOAD
CustomerId,
max(Date) as lastInactive
resident CustomerEvents
where Status='Inactive'
group by CustomerId;
//LastActive:
LEFT KEEP JOIN (LastInactive) LOAD
CustomerId,
max(Date) as lastActive
resident CustomerEvents
where Status='Active'
group by CustomerId;
Flags:
LOAD
CustomerId,
if (lastActive > lastInactive, 'coming back', 'gone') as status
resident LastInactive;
//drop table LastInactive;
-Alex
www.snowflakejoins.com
Bah, ETL! 🙂
Seriously though the data, comes to me in text files and I'm in a very locked down enterprise environment.
Thanks for the solution! It solved my problem and I will add it to my bag of tricks. I actually did something similar just after I posted and got the result I wanted. But if I have a more complex problem I'm still very interested in any other thoughts people might have on this kind of stuff.
Regards
Fredrik Ragnar
hi Fredrik,
seriously though the way Alexandru explained is the best way to go in qlikview.
Your initial sql call has already dragged the data out of the dbms and will store it inside a qlikview document so why not take advantage of the ability to transform this dataset as per your requirements.
With qv make your sql as plain as possible to get maximum data as quickly and cleanly as possible. Then use the script to massage it. Alexandru's solution takes your sql and breaks it down into its component parts and then rebuilds the answer set. This is all happening inside qv so is fast and offers possibilities way beyond sql functionality.
The ETL layer between data source and end user needs is one of teh key strenghts of qv as a solution.
anyway, best of luck in your new environment!
I was probably not very clear. I'm all for massaging the data.
Thanks for the answers and I'll be back next time I hit the wall with a more complicated problem 🙂
Toth, This is the uqltimate solution.I liked it. I think that will give u watever u want. One more question from my side .
Can we do this at front end also .
Hello again,
The solution, while elegant and solving the problem for the data provided actually does not for all cases I'm faced with.
I've added a row for CustomerId C.
As C enters inactive status on 2011-04-06 C would be labeled as nonreturning even though C was back for a month between the actual "churndate" and the last known activity which is inactive status.
Since he in fact was back I would like to mark C as returning the first time and nonreturning the second time he goes into inactive status.
Data:
CustomerEvents:
LOAD * Inline [Date,CustomerId,Status
2010-12-05,A,Active
2011-01-01,A,Inactive
2011-01-02,A,Active
2011-02-03,B,Active
2011-02-06,B,Inactive
2011-04-01,C,Active
2011-04-05,C,Upgraded
2011-04-06,C,Inactive
2011-05-03,C,Invoiced
2011-06-08,C,Active
2011-07-08,C,Inactive]