# GeoAnalytics

Discussion Board for collaboration regarding Qlik GeoAnalytics.

Announcements
CUSTOMERS ONLY: Now accepting customer applications for the 2023 Luminary Program: SUBMIT NOW
cancel
Showing results for
Did you mean:
Contributor II

## Calculate distances between many points (large data)

Hi,

Basically I have all functions two calculate distances between two data points, but I have a problem because I have large data set and my server get 100% cpu and 100% Ram because of large data set.

There are 2 two tables:

1st is member_addresses table, where are 2 millions unique rows:

MemberId, Lat, lng

10001, 56.95, 24.25

10002, 56.96, 24.24

etc.

2nd table is store location addresses, where are 20'000 unique stores

Storeid, lat, lng

5001, 60.12, 24.86

5002, 60.13, 24.85

I need to find out for each member nearest 5 stores, when I try to join both table together, it creates 2'000'000* 20*000 rows, and it freezes my server. I know, it's not efficient, but can you help to suggest which will be the best way to calculate it?

1 Solution

Accepted Solutions
Specialist

Hi,

You may have to apply some effort, pls refer this qlikview doc.

You can try with grouping of either Long or Lat that will help you to avoid unwanted join.

EDIT:

So the idea is to divide Lon or Lat in group and assign them a key, do it for both the tables (members & stores). Don't join Qlik will make an association itself.

7 Replies
Employee

Hi,

Split up the request, send perhaps 10 000 members at the time.

Thanks,

Patric

Specialist

Hi,

You may have to apply some effort, pls refer this qlikview doc.

You can try with grouping of either Long or Lat that will help you to avoid unwanted join.

EDIT:

So the idea is to divide Lon or Lat in group and assign them a key, do it for both the tables (members & stores). Don't join Qlik will make an association itself.

Contributor II
Author

That's possible, but I am thinking about to create loop.

Taking just one member and running through each store table and then find 5 the nearest stores and drop other 19k rows.

Employee

You can write your own code or just use the loop in the code that the connector produces.

Move the header for the inline table and move server call inside of the "chunk" iteration.

Thanks,

Patric

Contributor II
Author

Thanks.

2 with loops and your solution. Seems your solutions is the fastest. 6x faster than loop.

Specialist

Loops with big data sets are never good. Glad you could resolve your issue.

Contributor II
Author

Yes. Thank again !

Maybe you can help me with one more thing about loading only top 5 the nearest POS points.

I think code can be optimized.

Right now it is slowing down at table: Member_POS_stores_order because of autonumber and ordering.

first 1500

MembershipNo,

status,

type,

place_id,

location_type,

[location/lat],

[location/lng],

If([location/lat]>=0 and [location/lat]<=10,'A',

If([location/lat]>10 and [location/lat]<=20,'B',

If([location/lat]>20 and [location/lat]<=30,'C',

If([location/lat]>30 and [location/lat]<=40,'D',

If([location/lat]>40 and [location/lat]<=50,'E',

If([location/lat]>50 and [location/lat]<=60,'F','Z')))))) as key

FROM

[servername\QVD\All_members_geocoded.qvd]

(qvd)

where 1=1

and Match(status, 'OK');

inner join

id,

name as POS_name,

postal_code,

city,

country,

latitude as POS_lat,

longitude as POS_lng,

brand_name,

If(latitude>=0 and latitude<=10,'A',

If(latitude>10 and latitude<=20,'B',

If(latitude>20 and latitude<=30,'C',

If(latitude>30 and latitude<=40,'D',

If(latitude>40 and latitude<=50,'E',

If(latitude>50 and latitude<=60,'F','Z')))))) as key

FROM

[servername\QVD\Partner_POS_Locator.qvd] (qvd);

MembershipNo as MembershipNo1,

status as status1,

type as type1,

place_id as place_id1,

location_type as location_type1,

[location/lat] as [location/lat1],

[location/lng] as [location/lng1],

id as id1,

POS_name as POS_name1,

postal_code as postal_code1,

city as city1,

country as country1,

POS_lat as POS_lat1,

POS_lng as POS_lng1,

brand_name as brand_name1,

ACOS(SIN([location/lat]*Pi()/180)*SIN(POS_lat*Pi()/180)

+COS([location/lat]*Pi()/180)*COS(POS_lat*Pi()/180)

*COS((POS_lng*Pi()/180)-([location/lng]*Pi()/180)))

*6371000 as [Distance (metres)]

Member_POS_stores_order:

MembershipNo1 as MembershipNo,

[location/lat1] as [location/lat],

[location/lng1] as [location/lng],

id1 as id,

POS_name1 as POS_name,

postal_code1 as postal_code,

city1 as city,

country1 as country,

POS_lat1 as POS_lat,

POS_lng1 as POS_lng,

brand_name1 as brand_name,

[Distance (metres)],

AutoNumber(MembershipNo1&id1&[Distance (metres)],MembershipNo1) as Rank

order by  [Distance (metres)] asc;

Member_POS_Stores_Top5:

MembershipNo,

[location/lat],

[location/lng],

id as POSid,

POS_name,

postal_code,

city,

country,

POS_lat,

POS_lng,

brand_name,

[Distance (metres)],

Rank

resident Member_POS_stores_order

where Rank <=5;

drop table Member_POS_stores_order;

Community Browser