Best Approach for Project Design


Hello All,

We are about to launch a new phase in which we need to extract data from 350-500 data sources of various types (Oracle, MSSQL, Impala, Greenplum, and many others).

What is the best approach/design for extracting the previous day's data (sysdate - 1) from these sources, writing it to Parquet files, and then moving the files to HDFS?
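For illustration only, here is a minimal sketch of one way to express the "sysdate - 1" filter, written in Python rather than as a Talend job. It computes the previous calendar day as a half-open range [start, end) so the same parameterized query can be reused for every source. The function names, the table and column names, and the `?` placeholder style are assumptions; placeholder syntax differs between database drivers.

```python
from datetime import date, datetime, time, timedelta
from typing import Tuple


def yesterday_window(today: date) -> Tuple[datetime, datetime]:
    """Return (start, end) covering the whole calendar day before `today`.

    The range is half-open: start is inclusive, end is exclusive.
    """
    end = datetime.combine(today, time.min)   # today at 00:00:00
    start = end - timedelta(days=1)           # yesterday at 00:00:00
    return start, end


def build_extract_query(table: str, date_column: str) -> str:
    """Build a parameterized extract query for one table.

    Assumption: table and column names come from trusted job
    configuration, never from user input, because they are
    interpolated into the SQL text.
    """
    return (
        f"SELECT * FROM {table} "
        f"WHERE {date_column} >= ? AND {date_column} < ?"
    )


# Hypothetical usage: one query template, one window, many sources.
start, end = yesterday_window(date.today())
query = build_extract_query("orders", "created_at")
```

Comparing the date column against two bound values, instead of wrapping the column in a date function, keeps the filter portable across the different engines and usually lets each source use an index on that column.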

I am worried about memory issues on the server the jobs will run on. What should I consider, and how should I set up logging for all of these tables/jobs?
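As a rough sketch of the memory and logging concern, the pattern below reads a result set in fixed-size batches and logs progress per job, so only one batch is held in memory at a time. It is Python rather than a Talend job, and it uses an in-memory SQLite database purely as a stand-in for a real source. The names `extract_in_batches`, `sink`, and `orders_extract` are hypothetical; in a real pipeline the sink would be whatever appends a batch to the Parquet file. Note that some database drivers buffer the full result on the client unless a fetch size or server-side cursor is configured, so batching in code alone may not be enough.

```python
import logging
import sqlite3

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)


def extract_in_batches(conn, query, params, sink, job_name, batch_size=10_000):
    """Stream rows to `sink` in batches of at most `batch_size` rows.

    Returns the total number of rows read. One logger per job name
    makes it possible to filter the log by table/job later.
    """
    log = logging.getLogger(job_name)
    cur = conn.cursor()
    total = 0
    try:
        cur.execute(query, params)
        while True:
            rows = cur.fetchmany(batch_size)  # at most batch_size rows
            if not rows:
                break
            sink(rows)                        # e.g. append to a Parquet writer
            total += len(rows)
            log.info("batch written rows=%d total=%d", len(rows), total)
    except Exception:
        log.exception("extract failed after %d rows", total)
        raise
    finally:
        cur.close()
    log.info("extract finished total=%d", total)
    return total


# Demo with an in-memory SQLite table standing in for a real source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, created_at TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(i, "2024-02-29 10:00:00") for i in range(25)],
)

batches = []  # stand-in sink: just collects each batch
total = extract_in_batches(
    conn,
    "SELECT id, created_at FROM orders WHERE created_at >= ? AND created_at < ?",
    ("2024-02-29 00:00:00", "2024-03-01 00:00:00"),
    batches.append,
    job_name="orders_extract",
    batch_size=10,
)
```

With 350-500 sources, the batch size and the number of jobs running in parallel together bound peak memory, so both would likely need to be configurable per source.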

Can anyone help me sort this out?



Talend Big Data

v8.x