Hi All,
I am new to Talend Big Data and am migrating all my DI jobs to Spark for faster execution.
I came across the tSQLRow component, which I read uses Spark SQL for execution. In my tests, operations such as joins and aggregations ran faster in tSQLRow than in components like tMap and tAggregateRow.
The only difference I could see was that the other Talend components work on RDDs, whereas tSQLRow works on DataFrames.
I was wondering whether those Talend components can also work on DataFrames instead of RDDs.
With the current design I am moving almost every key-based operation into tSQLRow, which hurts the readability of my jobs.
Any comments regarding this would be appreciated.
Do you mean that Talend is not handling RDDs even in Spark jobs? I could see functions related to RDDs in the generated code, and I could also see code related to DataFrames. However, tMap deals with RDDs and tSQLRow deals with DataFrames.
Hi Team,
I am using Talend 7.3.1. Kindly let me know whether Talend uses RDDs or DataFrames when I design a normal job without tSQLRow, using tMap, Azure Gen2 and Databricks 5.5 LTS.
Thanks,
Viswa