Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I'm trying to do a lookup a geographical place in a grid and find the gridID.
Below is my Grid Table and my initial table. I now want to create a new table that has the Location ID and correct Grid ID.
[Lookup]:
LOAD * inline [
GridID, South, North, West, East
1, 1, 2, 10, 11
2, 1, 2, 11, 12
3, 1, 2, 9, 10
4, 2, 3, 10, 11
5, 2, 3, 11, 12
6, 2, 3, 9, 10
7, 3, 4, 10, 11
8, 3, 4, 11, 12
9, 3, 4, 9, 10
];
[InitialTable]:
Load * inline [
LocationID, Latitude, Longititude
1, 1.2, 9.2
2, 3.4, 11.1
3, 2.1, 10.5
4, 2.8, 9.9
5, 3.7, 9.7
6, 1.6, 10.6
7, 1.9, 11.7
8, 3.9, 10.3
9, 2.3, 11.1
];
The resulting table should be (using Latitude > South and Latitude < North and Longitude > West and Longitude < East )
LocationID | GridID |
1 | 3 |
2 | 8 |
3 | 4 |
4 | 6 |
5 | 9 |
6 | 1 |
7 | 2 |
8 | 7 |
9 | 5 |
How do I go about that? This is to look up longitude and latitude coordinates in bounding boxes. I have over 1M records with location coordinates and over 4000 bounding boxes (grids). How to best ensure there will be no performance issue?
Thanks so much.
Hi,
Try this.
[Lookup]:
LOAD * inline [
GridID, South, North, West, East
1, 1, 2, 10, 11
2, 1, 2, 11, 12
3, 1, 2, 9, 10
4, 2, 3, 10, 11
5, 2, 3, 11, 12
6, 2, 3, 9, 10
7, 3, 4, 10, 11
8, 3, 4, 11, 12
9, 3, 4, 9, 10
];
join
Load * inline [
LocationID, Latitude, Longititude
1, 1.2, 9.2
2, 3.4, 11.1
3, 2.1, 10.5
4, 2.8, 9.9
5, 3.7, 9.7
6, 1.6, 10.6
7, 1.9, 11.7
8, 3.9, 10.3
9, 2.3, 11.1
];
Final_Temp:
Load LocationID as LID, if((Latitude > South and Latitude < North) and (Longititude > West and Longititude < East),1,0) as Flag,GridID as GID
Resident [Lookup];
Final:
Noconcatenate Load LID,GID
Resident Final_Temp where Flag = 1;
Drop table Final_Temp,Lookup;
Regards,
Kaushik Solanki
Hi,
Try this.
[Lookup]:
LOAD * inline [
GridID, South, North, West, East
1, 1, 2, 10, 11
2, 1, 2, 11, 12
3, 1, 2, 9, 10
4, 2, 3, 10, 11
5, 2, 3, 11, 12
6, 2, 3, 9, 10
7, 3, 4, 10, 11
8, 3, 4, 11, 12
9, 3, 4, 9, 10
];
join
Load * inline [
LocationID, Latitude, Longititude
1, 1.2, 9.2
2, 3.4, 11.1
3, 2.1, 10.5
4, 2.8, 9.9
5, 3.7, 9.7
6, 1.6, 10.6
7, 1.9, 11.7
8, 3.9, 10.3
9, 2.3, 11.1
];
Final_Temp:
Load LocationID as LID, if((Latitude > South and Latitude < North) and (Longititude > West and Longititude < East),1,0) as Flag,GridID as GID
Resident [Lookup];
Final:
Noconcatenate Load LID,GID
Resident Final_Temp where Flag = 1;
Drop table Final_Temp,Lookup;
Regards,
Kaushik Solanki
Thanks Kaushik! I really appreciate the solution. That works great for relative small datasets. However, I have about 3200 x 1,500,000 records, so the final temp lookup table will be over 4.5 billion records and my laptop is freezing. The 1,500,000 is growing by the day. Is there a way to more efficiently handle this?
You can try IntervalMatch, although I am not sure if the performance would improve or not, but worth giving it a shot:
[Lookup]:
LOAD * inline [
GridID, South, North, West, East
1, 1, 2, 10, 11
2, 1, 2, 11, 12
3, 1, 2, 9, 10
4, 2, 3, 10, 11
5, 2, 3, 11, 12
6, 2, 3, 9, 10
7, 3, 4, 10, 11
8, 3, 4, 11, 12
9, 3, 4, 9, 10
];
[InitialTable]:
Load * inline [
LocationID, Latitude, Longititude
1, 1.2, 9.2
2, 3.4, 11.1
3, 2.1, 10.5
4, 2.8, 9.9
5, 3.7, 9.7
6, 1.6, 10.6
7, 1.9, 11.7
8, 3.9, 10.3
9, 2.3, 11.1
];
Left Join (InitialTable)
IntervalMatch(Latitude)
LOAD South,
North
Resident Lookup;
Left Join (InitialTable)
IntervalMatch(Longititude)
LOAD West,
East
Resident Lookup;
Hi Sunny, thanks so much for looking at this too. Both solutions worked but neither of them had the performance I was looking for. So I solved it with a query on the SQL server side.