I currently have a model where I am loading a table detailing where people were based and between what dates, e.g.:
Name  Location  From        To
Bob   London    01/01/2015  31/05/2015
Bob   New York  01/06/2015  31/12/2015
In another table in my model I am given a particular date and need to find out where that person was based at that time. I've got this working using an IntervalMatch, but the number of records is growing quickly and it's starting to put quite a drain on the reload time. Is there a better way of doing this?
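For reference, a minimal version of the setup described above might look like this (the table names, field names, and QVD sources are my assumptions, not taken from the original post):

```
// Location intervals per person (names/sources are assumptions)
Locations:
LOAD Name,
     Location,
     Date#(From, 'DD/MM/YYYY') as From,
     Date#(To, 'DD/MM/YYYY') as To
FROM Locations.qvd (qvd);

// Transactions carrying the date to look up
Transactions:
LOAD Name,
     Date#(TransactionDate, 'DD/MM/YYYY') as TransactionDate
FROM Transactions.qvd (qvd);

// Extended IntervalMatch with Name as an extra key;
// this produces a bridge table linking each transaction
// date to the interval it falls inside
BridgeTable:
IntervalMatch (TransactionDate, Name)
LOAD From, To, Name RESIDENT Locations;
```

The bridge table this produces is what tends to grow large as the data grows, which is the reload cost in question.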
What are the min and max dates in your data - the lowest 'From' and the highest 'To'? If that span turns out to be small, we need to look elsewhere for the cause.
I'm not sure where you are with resident loads and joins, but you might be able to link your location mapping table to the main transaction table with an inner join, so you only get entries for dates that actually appear in the transaction table.
(The assumption I am making is that the IntervalMatch is generating a row for every date between the lowest From and the highest To - in the example above, every day between 01/01/2015 and 31/12/2015.)
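One way to sketch that inner-join idea, assuming the Locations table and IntervalMatch bridge table from the original setup (all names here are assumptions):

```
// Bridge table from the extended IntervalMatch, as before
BridgeTable:
IntervalMatch (TransactionDate, Name)
LOAD From, To, Name RESIDENT Locations;

// Fold the bridge back into the location table with an inner join.
// Rows for intervals that never match a transaction date are dropped,
// so only entries for dates in the transaction table survive.
INNER JOIN (Locations)
LOAD * RESIDENT BridgeTable;

DROP TABLE BridgeTable;
```

After the join, Locations carries TransactionDate directly and links to the transaction table on Name and TransactionDate, with no separate bridge table to keep in memory.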
How many people need this matching? What is the size of the extra table(s) that your IntervalMatch produces? What is the span of days you really need to check against? Are you still using full date values as the linking fields in the IntervalMatch? Dates can be stored much more compactly if you are able to limit their scope.
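On the last point: formatted date fields are dual values (number plus display text), so stripping them down to plain integer day numbers for the linking fields can shrink the model noticeably. A sketch of that idea, with field names assumed from the example:

```
// Store the link fields as plain integers (day numbers) rather than
// formatted dual dates; Floor() also removes any time component.
Transactions:
LOAD Name,
     Num(Floor(Date#(TransactionDate, 'DD/MM/YYYY'))) as DateNum
FROM Transactions.qvd (qvd);

Locations:
LOAD Name,
     Location,
     Num(Floor(Date#(From, 'DD/MM/YYYY'))) as FromNum,
     Num(Floor(Date#(To, 'DD/MM/YYYY'))) as ToNum
FROM Locations.qvd (qvd);
```

You can always reformat a day number back into a display date with Date() in the front end, so nothing is lost for the user.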
I had the same problem in Qlik Sense: the resulting IntervalMatch extra table was around 83,000,000 rows and the reload crashed. So I deleted the fields I didn't need and loaded the data with a DISTINCT clause. Thanks for your suggestion.
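For anyone following the same route, the de-duplication step described above could be sketched like this (the table and field names are assumptions; keep whichever fields your link actually needs):

```
// Rebuild the oversized bridge table keeping only the needed fields,
// de-duplicated; NOCONCATENATE stops Qlik auto-appending it to a
// table with the same field set.
SmallBridge:
NOCONCATENATE
LOAD DISTINCT Name, From, To, TransactionDate
RESIDENT BridgeTable;

DROP TABLE BridgeTable;
```

Dropping unused fields before the DISTINCT matters, because DISTINCT only collapses rows that are identical across every remaining field.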