Anonymous
Not applicable

Spark partition before join

Hi, quick question about Spark in Talend.

In order to join data, Spark needs the data that is to be joined (i.e., the rows for each key) to live on the same partition.
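To make the co-location point concrete, here is a minimal pure-Python sketch of a hash-partitioned join. It is only an illustration of the idea, not Spark's or Talend's actual implementation; the function names `hash_partition` and `partitioned_join` and the `num_partitions` parameter are hypothetical and exist only for this example.

```python
from collections import defaultdict


def hash_partition(rows, num_partitions):
    """Group (key, value) rows into partitions by hashing the key."""
    partitions = defaultdict(list)
    for key, value in rows:
        partitions[hash(key) % num_partitions].append((key, value))
    return partitions


def partitioned_join(main_rows, lookup_rows, num_partitions=4):
    """Inner-join two lists of (key, value) rows, one partition at a time.

    Both sides use the same hash function and partition count, so all rows
    sharing a key end up under the same partition id. Each partition can
    then be joined locally without looking at any other partition.
    """
    main_parts = hash_partition(main_rows, num_partitions)
    lookup_parts = hash_partition(lookup_rows, num_partitions)

    joined = []
    for pid, main_part in main_parts.items():
        # Build a local lookup table from the matching partition only.
        local = defaultdict(list)
        for key, value in lookup_parts.get(pid, []):
            local[key].append(value)
        for key, value in main_part:
            for lookup_value in local.get(key, []):
                joined.append((key, value, lookup_value))
    return sorted(joined)
```

If the two sides were partitioned differently, a key's rows could sit under different partition ids and the local join would miss matches, which is why one side (or both) has to be shuffled before a key-based join.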

If we are using any key-based components such as tMap or a join in tSqlRow, is it wise to use these components without partitioning first for small files, and rely on Spark repartitioning the lookup flow based on the main flow?

Is there a guideline on when we must partition explicitly versus when we can rely on the Spark framework to repartition, especially when the lookup data is too big to broadcast but not very heavy either, e.g. > 1 GB?

 
