Not applicable

Looping in load script (KNN Algorithm)


I have the following data:

Customerid  Age  Income (k)  Purchased
1           45   46          Book
2           39   100         TV
3           35   38          DVD
4           69   150         Car Cover
5           58   51          CD player

What I'd like to do is find the nearest neighbor for each customer, meaning which customer ids are relatively close to each other.

The formula I’m using is:

score(X, Y) = SQRT( ((Age(X) - Age(Y)) / (MAX(Age) - MIN(Age)))^2 + ((Income(X) - Income(Y)) / (MAX(Income) - MIN(Income)))^2 )

where Age(X) and Income(X) are the age and income of customer id X.
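To make the formula above concrete, here is a minimal sketch in Python (not Qlik script) of the same range-normalized distance. The names `customers` and `score` are made up for illustration, and the data is the sample table from this post. It assumes each field has at least two distinct values, otherwise the range would be zero.

```python
import math

# Sample data from the post: customer id -> (age, income in k)
customers = {
    1: (45, 46),
    2: (39, 100),
    3: (35, 38),
    4: (69, 150),
    5: (58, 51),
}

def score(x, y, data):
    """Range-normalized Euclidean distance between customers x and y."""
    ages = [age for age, _ in data.values()]
    incomes = [income for _, income in data.values()]
    # Assumption: both ranges are non-zero (at least two distinct values).
    age_range = max(ages) - min(ages)
    income_range = max(incomes) - min(incomes)
    d_age = (data[x][0] - data[y][0]) / age_range
    d_income = (data[x][1] - data[y][1]) / income_range
    return math.sqrt(d_age ** 2 + d_income ** 2)
```

Dividing by the range puts age and income on the same 0 to 1 scale, so the larger income numbers do not dominate the score.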

I'd like to run this in a loop in the load script and get the nearest neighbor for each customer id.

My expected output should be:

Customer, Neighbor, Score

For example:

For customer 5, compared against customer 1:

SQRT((((58 - 45)/(69-35))^2) + ((51 - 46)/(150-38))^2 ) = 0.38495

Customer =5

Neighbor = 1

Score = 0.38495

Checking customer 5 against the other customers results in higher scores, so in the end I need the minimum score for each customer, taken over all the customers it was checked against.
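The "minimum per customer" step could be sketched as below, again in Python rather than Qlik script, as a way to check the expected output by hand. The names `customers` and `nearest_neighbors` are hypothetical, the data is the sample table from this post, and non-zero age and income ranges are assumed.

```python
import math

# Sample data from the post: customer id -> (age, income in k)
customers = {
    1: (45, 46),
    2: (39, 100),
    3: (35, 38),
    4: (69, 150),
    5: (58, 51),
}

def nearest_neighbors(data):
    """Return {customer: (neighbor, score)} using the lowest score per customer."""
    ages = [age for age, _ in data.values()]
    incomes = [income for _, income in data.values()]
    # Assumption: both ranges are non-zero.
    age_range = max(ages) - min(ages)
    income_range = max(incomes) - min(incomes)

    result = {}
    for x, (age_x, income_x) in data.items():
        best = None
        for y, (age_y, income_y) in data.items():
            if x == y:
                continue  # a customer is not its own neighbor
            s = math.sqrt(((age_x - age_y) / age_range) ** 2
                          + ((income_x - income_y) / income_range) ** 2)
            if best is None or s < best[1]:
                best = (y, s)
        result[x] = best
    return result

nearest = nearest_neighbors(customers)
for customer, (neighbor, s) in nearest.items():
    # One row per customer: Customer, Neighbor, Score
    print(customer, neighbor, round(s, 5))
```

For customer 5 this gives neighbor 1 with a score of about 0.38495, matching the example above. Every customer is compared with every other one, so the work grows with the square of the number of customers.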

Thanks for your help,

Tomer

2 Replies
Gysbert_Wassenaar
Partner - Champion III

See the attached qvw. I don't think this is suitable for a very large number of records. You may want to use a specialized tool for this kind of analysis, maybe R with the RWeka package.


talk is cheap, supply exceeds demand
Not applicable
Author

Thanks, it's very helpful.

I will give it a try with Weka as well.

Cheers,

Tomer