Splitting up GeoAnalytics connector operations

Patric_Nordstrom
Employee

Last Update: Jun 28, 2022 7:42:52 AM

Updated By: Sonja_Bauernfeind

Created date: Dec 13, 2018 10:49:54 AM

Many GeoAnalytics operations can be executed with the indata table split into parts. This article explains which operations can be split and how to modify the code that the connector produces. The operations that cannot be split are mostly the aggregating ones. When loadable tables are used as input, the connector builds the inline tables in loops, and those loops offer a quick way to split. It is of course also possible to write custom code to do the splitting instead.

Calls with large indata tables often cause timeouts on the server side; splitting is a way around that.
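
As a rough illustration of how a chunk size divides the indata, the sketch below only computes and traces the row range of each chunk; it places no call to the server. The values of numRows and chunkSize are hypothetical, and lastChunk, firstRow and lastRow are names introduced here for illustration.

```qlik
// Hypothetical sizes, for illustration only.
Let numRows = 2500;
Let chunkSize = 1000;

// Index of the last chunk that actually contains rows.
Let lastChunk = Ceil(numRows / chunkSize) - 1;

For n = 0 To lastChunk
   // First and last row (0-based, as used by Peek) in this chunk.
   Let firstRow = n * chunkSize;
   Let lastRow = RangeMin(firstRow + chunkSize, numRows) - 1;
   Trace Chunk $(n) covers rows $(firstRow) to $(lastRow);
Next
```

With 2,500 rows and a chunk size of 1,000 this gives three chunks, the last one holding the remaining 500 rows. A smaller chunk size means more, but lighter, calls.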

Splittable ops:
AddressPointLookup
Intersects
IpLookup
NamedAreaLookup
NamedPointLookup
PointToAddressLookup
Routes
TravelAreas
Within
Load*
Closest**

Non-splittable ops:
Closest
Cluster
Dissolve
IntersectsMost
Simplify

Special ops, splittable:
Binning
SpatialIndex

*Load is splittable, but not with the script version below. Instead, load portions of the indata; tables with the same attributes will be auto-concatenated.

**Closest is splittable by the first table. For example, in "for all patients, which is the closest hospital?", the "patients" table can be processed in batches of, for example, 1,000.
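
The auto-concatenation mentioned in the note on Load is ordinary load script behavior and can be seen without the connector. The sketch below uses two hypothetical inline tables (the table name Areas, the field names and the values are invented for illustration) in place of two partial Load calls; how a portion is selected in the Load operation itself is not shown here.

```qlik
// First portion of the data (hypothetical values).
[Areas]:
LOAD * INLINE [
Code,Name
75001,Paris
];

// Second portion. The field names match exactly, so these rows are
// appended to [Areas] instead of forming a new table.
LOAD * INLINE [
Code,Name
69001,Lyon
];

// Both portions now sit in one table.
Let areaRows = NoOfRows('Areas');
Trace Areas holds $(areaRows) rows;
```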

Binning and SpatialIndex

Binning and SpatialIndex differ from the other operations: they do not place any call to the server if the indata are internal geometries, i.e. lat/long points. The operations also produce the same type of results, so the resulting tables can be concatenated.

Example, a Within operation

Before the edit

The code as the connector produces it:

/* Generated by GeoAnalytics for operation Within ---------------------- */
Let [EnclosedInlineTable] = 'POSTCODE' & Chr(9) & 'Postal.Latitude' & Chr(9) & 'Postal.Longitude';
Let numRows = NoOfRows('PostalData');
Let chunkSize = 1000;
Let chunks = numRows/chunkSize;
For n = 0 to chunks
   Let chunkText = '';
   Let chunk = n*chunkSize;
   For i = 0 To chunkSize-1
      Let row = '';
      Let rowNr = chunk+i;
      Exit for when rowNr >= numRows;
      For Each f In 'POSTCODE', 'Postal.Latitude', 'Postal.Longitude'
         row = row & Chr(9) & Replace(Replace(Replace(Replace(Replace(Replace(Peek('$(f)', $(rowNr), 'PostalData'), Chr(39), '\u0027'), Chr(34), '\u0022'), Chr(91), '\u005b'), Chr(47), '\u002f'), Chr(42), '\u002a'), Chr(59), '\u003b');
      Next
      chunkText = chunkText & Chr(10) & Mid('$(row)', 2);
   Next
   [EnclosedInlineTable] = [EnclosedInlineTable] & chunkText;
Next
chunkText=''


Let [EnclosingInlineTable] = 'ClubCode' & Chr(9) & 'Car5mins_TravelArea';
Let numRows = NoOfRows('TravelAreas5');
Let chunkSize = 1000;
Let chunks = numRows/chunkSize;
For n = 0 to chunks
   Let chunkText = '';
   Let chunk = n*chunkSize;
   For i = 0 To chunkSize-1
      Let row = '';
      Let rowNr = chunk+i;
      Exit for when rowNr >= numRows;
      For Each f In 'ClubCode', 'Car5mins_TravelArea'
         row = row & Chr(9) & Replace(Replace(Replace(Replace(Replace(Replace(Peek('$(f)', $(rowNr), 'TravelAreas5'), Chr(39), '\u0027'), Chr(34), '\u0022'), Chr(91), '\u005b'), Chr(47), '\u002f'), Chr(42), '\u002a'), Chr(59), '\u003b');
      Next
      chunkText = chunkText & Chr(10) & Mid('$(row)', 2);
   Next
   [EnclosingInlineTable] = [EnclosingInlineTable] & chunkText;
Next
chunkText=''


[WithinAssociations]:
SQL SELECT [POSTCODE], [ClubCode] FROM Within(enclosed='Enclosed', enclosing='Enclosing')
DATASOURCE Enclosed INLINE tableName='PostalData', tableFields='POSTCODE,Postal.Latitude,Postal.Longitude', geometryType='POINTLATLON', loadDistinct='NO', suffix='', crs='Auto' {$(EnclosedInlineTable)}
DATASOURCE Enclosing INLINE tableName='TravelAreas5', tableFields='ClubCode,Car5mins_TravelArea', geometryType='POLYGON', loadDistinct='NO', suffix='', crs='Auto' {$(EnclosingInlineTable)}
SELECT [POSTCODE], [Enclosed_Geometry] FROM Enclosed
SELECT [ClubCode], [Car5mins_TravelArea] FROM Enclosing;

[EnclosedInlineTable] = '';
[EnclosingInlineTable] = '';

/* End GeoAnalytics operation Within ----------------------------------- */

After edit

The header (the Let statement that initializes the inline table) and the call are moved inside the loop, so that each iteration sends one chunk. chunkSize decides how big each split is.

Note that the first inline table now comes after the second one; this is needed to get the call inside the iteration.

/* Generated by GeoAnalytics for operation Within ---------------------- */

// The enclosing table is built once, outside the split loop, and reused by every call.
Let [EnclosingInlineTable] = 'ClubCode' & Chr(9) & 'Car5mins_TravelArea';
Let numRows = NoOfRows('TravelAreas5');
Let chunkSize = 1000;
Let chunks = numRows/chunkSize;
For n = 0 to chunks
   Let chunkText = '';
   Let chunk = n*chunkSize;
   For i = 0 To chunkSize-1
      Let row = '';
      Let rowNr = chunk+i;
      Exit for when rowNr >= numRows;
      For Each f In 'ClubCode', 'Car5mins_TravelArea'
         row = row & Chr(9) & Replace(Replace(Replace(Replace(Replace(Replace(Peek('$(f)', $(rowNr), 'TravelAreas5'), Chr(39), '\u0027'), Chr(34), '\u0022'), Chr(91), '\u005b'), Chr(47), '\u002f'), Chr(42), '\u002a'), Chr(59), '\u003b');
      Next
      chunkText = chunkText & Chr(10) & Mid('$(row)', 2);
   Next
   [EnclosingInlineTable] = [EnclosingInlineTable] & chunkText;
Next
chunkText='';

// The enclosed table is split: each iteration builds one chunk and places one call.
Let numRows = NoOfRows('PostalData');
Let chunkSize = 1000;
// Ceil(...) - 1 avoids a final iteration with an empty chunk (and an empty call)
// when numRows is an exact multiple of chunkSize.
Let chunks = Ceil(numRows/chunkSize) - 1;
For n = 0 to chunks

   // Header moved inside the loop so the inline table starts fresh for every chunk.
   Let [EnclosedInlineTable] = 'POSTCODE' & Chr(9) & 'Postal.Latitude' & Chr(9) & 'Postal.Longitude';

   Let chunkText = '';
   Let chunk = n*chunkSize;
   For i = 0 To chunkSize-1
      Let row = '';
      Let rowNr = chunk+i;
      Exit for when rowNr >= numRows;
      For Each f In 'POSTCODE', 'Postal.Latitude', 'Postal.Longitude'
         row = row & Chr(9) & Replace(Replace(Replace(Replace(Replace(Replace(Peek('$(f)', $(rowNr), 'PostalData'), Chr(39), '\u0027'), Chr(34), '\u0022'), Chr(91), '\u005b'), Chr(47), '\u002f'), Chr(42), '\u002a'), Chr(59), '\u003b');
      Next
      chunkText = chunkText & Chr(10) & Mid('$(row)', 2);
   Next
   [EnclosedInlineTable] = [EnclosedInlineTable] & chunkText;

   // Call moved inside the loop. Results share the same fields, so they are
   // auto-concatenated into [WithinAssociations].
   [WithinAssociations]:
   SQL SELECT [POSTCODE], [ClubCode] FROM Within(enclosed='Enclosed', enclosing='Enclosing')
   DATASOURCE Enclosed INLINE tableName='PostalData', tableFields='POSTCODE,Postal.Latitude,Postal.Longitude', geometryType='POINTLATLON', loadDistinct='NO', suffix='', crs='Auto' {$(EnclosedInlineTable)}
   DATASOURCE Enclosing INLINE tableName='TravelAreas5', tableFields='ClubCode,Car5mins_TravelArea', geometryType='POLYGON', loadDistinct='NO', suffix='', crs='Auto' {$(EnclosingInlineTable)}
   SELECT [POSTCODE], [Enclosed_Geometry] FROM Enclosed
   SELECT [ClubCode], [Car5mins_TravelArea] FROM Enclosing;

Next

[EnclosedInlineTable] = '';
[EnclosingInlineTable] = '';
chunkText='';
/* End GeoAnalytics operation Within ----------------------------------- */
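
A quick check after the loop can confirm that the results of all chunks ended up in one table. This is a minimal sketch assuming the table label WithinAssociations from the script above; resultRows is a name introduced here for illustration.

```qlik
// Rows collected from all chunks (auto-concatenated under the same label).
Let resultRows = NoOfRows('WithinAssociations');
Trace WithinAssociations holds $(resultRows) rows after the split run;
```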

 

 
Comments
bchastai
Contributor III

@Patric_Nordstrom I know this post is about GeoAnalytics, but I was wondering how we can use GeoOperations Cluster on large datasets? Currently there is a 50k row limit, but the whole point of Cluster is to reduce large data to make it reasonable on a map. Is there any solution for larger data in GeoOperations Cluster?

Antoine04
Partner - Creator III

Hi @Patric_Nordstrom,

I am trying to split up a Load operation in order to get all the postal codes from France.

You say on the article : Load is splittable, but not with the script version below. Instead, load portions of the indata, tables with the same attributes will be auto concatenated

But the thing is that I can not really see how and where I can specify that I want to load only a portion

Could you help me on that point ?

Thank you

Regards,

vasilev
Creator

Hi @Antoine04 ,

I am also trying to split the postal codes from France by using the LOAD operation. Did you find out, how does it work?

 

BR,

Rumen

Emir_Dz
Contributor

It would be very helpful if you could provide an example for operation Routes
