Best Approach for Project Design

mshafeeq — Fri, 15 Nov 2024 21:39:26 GMT

Hello All,

We are about to launch a new phase in which we need to extract data from 350-500 data sources (Oracle, MSSQL, Impala, Greenplum, and many others).

What is the best approach/design for this phase: extracting the previous day's data (sysdate - 1) from these sources, writing it to Parquet files, and then moving the files to HDFS?

I am worried about running into memory issues on the server where the jobs will run. What should I consider, and how should I set up logging for all of these tables/jobs?
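To make the memory and logging concern concrete, here is a minimal sketch of what a single extraction job could look like. It is written in Python rather than as a Talend job, and it is only an illustration under stated assumptions. The names `previous_day_window`, `fetch_in_batches`, `run_job`, and `write_batch` are hypothetical. SQLite stands in for the real sources, so the `?` placeholder style and the text date comparison would differ per database, and the Parquet/HDFS step is left as a callback.

```python
import logging
import sqlite3
from datetime import date, datetime, timedelta

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("extract")


def previous_day_window(today):
    """Return (start, end) covering the whole day before `today` (end exclusive)."""
    end = datetime.combine(today, datetime.min.time())
    return end - timedelta(days=1), end


def fetch_in_batches(cursor, batch_size):
    """Yield lists of rows so only one batch is held in memory at a time."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            return
        yield rows


def run_job(conn, table, date_column, today, write_batch, batch_size=10_000):
    """Extract yesterday's rows from one table and pass each batch to `write_batch`.

    `write_batch(table, rows)` is a hypothetical hook where the Parquet write
    (and later the move to HDFS) would happen.
    """
    start, end = previous_day_window(today)
    # Assumption: table and column names come from a trusted job config.
    sql = (f"SELECT * FROM {table} "
           f"WHERE {date_column} >= ? AND {date_column} < ?")
    cur = conn.cursor()
    total = 0
    try:
        cur.execute(sql, (start.isoformat(sep=" "), end.isoformat(sep=" ")))
        for rows in fetch_in_batches(cur, batch_size):
            write_batch(table, rows)
            total += len(rows)
        log.info("table=%s status=ok rows=%d window=[%s, %s)",
                 table, total, start, end)
    except Exception:
        # One log line per failed table, with the traceback attached.
        log.exception("table=%s status=failed rows_before_error=%d",
                      table, total)
        raise
    return total
```

The idea is that one generic job is driven by a list of (source, table, date column) entries instead of 350-500 hand-built jobs, that rows are streamed in fixed-size batches so memory use does not grow with table size, and that every table produces one log line with its status and row count.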

Can anyone help me sort this out?

topic Best Approach for Project Design in Talend Studio
