# Hausdorff Distance

I was trying to think of different ways of highlighting similar sets of data in QlikView and, under 'similarity metrics', found the Hausdorff distance in an old textbook I have (https://www.amazon.co.uk/Nonlinear-workbook-algorithms-expression-programming/dp/9814335789). I found a more comprehensible write-up on the web here: http://cgm.cs.mcgill.ca/~godfried/teaching/cg-projects/98/normand/main.html

In the script the calculation actually drops out quite neatly, so I have attached a sample using some data I found on a resource for teachers in New Zealand:

http://new.censusatschool.org.nz/resource/multivariate-data-sets/

I am not sure anything particularly insightful drops out of that particular data set. I also wonder whether something built into QlikView would show similar relationships, or whether doing the calculation in the script (rather than on the fly, depending on selections) limits its usefulness. Still, I think it is an interesting concept.
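For anyone who wants to check the idea outside QlikView, here is a minimal Python sketch of the general Hausdorff distance between two sets of points. It is not the attached QlikView script, just the textbook definition: for each point in one set take the distance to its nearest point in the other set, keep the largest of those, then do the same in the other direction and take the bigger of the two. The function names and the sample points are made up for illustration and are not taken from the CensusAtSchool data; Euclidean distance is an assumption.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from any point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Hypothetical 2-D points for illustration only (not the census data).
group_a = [(0.0, 0.0), (1.0, 0.0)]
group_b = [(0.0, 1.0), (4.0, 0.0)]

h_ab = directed_hausdorff(group_a, group_b)  # how far group_a strays from group_b
h_ba = directed_hausdorff(group_b, group_a)  # how far group_b strays from group_a
h = hausdorff(group_a, group_b)

print(h_ab, h_ba, h)
```

The two directed distances are generally different, which is why the symmetric version takes the maximum. A smaller result means the two sets sit closer together; a single outlying point in either set is enough to push the value up. This brute-force version compares every pair of points, so it is only suitable for small sets.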

Comments
I added a slider control and had a bit of a play. Looks good to me, though I don't have a clue what it's telling me, so I will have to read the document!

Nice work, thanks for doing this.

I believe the Hausdorff calculation looks at two sets of data and works out how closely they match. In this example, the comparison is between male and female smokers. I think I might use it to compare the behaviour of our customers with that of the general population.

Thanks for the info.

Could you stratify the general population into different groups and look at how similar your customer base is to each group, either to identify potential new customer bases or to better define (and so potentially reach more of) your core demographic?

Thanks for the feedback (both).

I work for the NHS (the UK health service) and we have LOTS of data that would benefit from something like this, so I will be looking at a test implementation at some point in the near future. Again, thanks for the work!

Yes. We already divide our customers into categories, so I hope to use these to compare with the general population to understand where we should concentrate our business efforts.

Hoping this will give us some useful business intelligence.

Thanks

