mshafeeq
Contributor II

Best Approach for Project Design

Hello All,

We are about to launch a new phase in which we will extract data from 350-500 data sources (Oracle, MSSQL, Impala, Greenplum, and many others).

What is the best approach/design for this phase? For each of these sources we need to extract the data for sysdate - 1, write it to a Parquet file, and then move the file to HDFS.

I am worried about running into memory issues on the server the jobs will run on. What should I consider, and how should I set up logging for all of these tables/jobs?
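
To make the memory and logging concern concrete, here is a minimal sketch of what one per-table extraction step could look like. It is only an illustration under assumptions: it uses plain Python with a DB-API style cursor, `sqlite3` stands in for the real sources, the function names (`yesterday_window`, `extract_in_chunks`) and the table/column names are hypothetical, and writing the Parquet file and moving it to HDFS are left out. The idea is to compute the sysdate - 1 window once, pass it as bound parameters, and read rows with `fetchmany` so the job code only holds one chunk at a time (whether the driver itself buffers the full result set depends on the driver).

```python
import logging
import sqlite3
from datetime import date, datetime, timedelta

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)
log = logging.getLogger("extract")


def yesterday_window(today):
    """Return (start, end) datetimes covering the previous calendar day."""
    start = datetime.combine(today - timedelta(days=1), datetime.min.time())
    return start, start + timedelta(days=1)


def extract_in_chunks(conn, table, date_column, today, chunk_size=10_000):
    """Yield lists of yesterday's rows, at most chunk_size rows per list."""
    start, end = yesterday_window(today)
    # Assumption: table and column names come from trusted job metadata.
    # The "?" placeholder is sqlite3's style; other drivers use other styles.
    sql = (
        f"SELECT * FROM {table} "
        f"WHERE {date_column} >= ? AND {date_column} < ?"
    )
    cur = conn.cursor()
    try:
        cur.execute(sql, (start.isoformat(sep=" "), end.isoformat(sep=" ")))
        total = 0
        while True:
            rows = cur.fetchmany(chunk_size)
            if not rows:
                break
            total += len(rows)
            yield rows
        log.info("table=%s status=OK rows=%d", table, total)
    except Exception:
        # One log line per failed table, with the traceback attached.
        log.exception("table=%s status=FAILED", table)
        raise
    finally:
        cur.close()


if __name__ == "__main__":
    # Stand-in source: an in-memory SQLite table with text timestamps.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, created_at TEXT)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, "2024-05-09 08:00:00"), (2, "2024-05-10 08:00:00")],
    )
    for chunk in extract_in_chunks(conn, "orders", "created_at", date(2024, 5, 10)):
        # A real job would append each chunk to the Parquet file here.
        print(len(chunk), "rows in this chunk")
```

With 350-500 sources, the table name, date column, and connection details would presumably come from a metadata table or config file so that one generic job can loop over them, and the same log line format could be written to a central log table instead of the console.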

Can anyone help me sort this out?

0 Replies