Anonymous
Not applicable

Loading data from CSV file to Snowflake

Hello,

I'm totally new to Talend and have just started learning. I created a few simple jobs to load data into Snowflake tables just to get some hands-on experience. My requirement is to load a huge volume of data into a Snowflake table. The table has 80 million records and has to be refreshed (truncate and load) in Snowflake every day.

I have a couple of options:

1) One is to copy the data from the source table (in Oracle) to the target table (in Snowflake) using the tDBInput, tDBOutputBulk and tDBBulkExec components. I tried this option and it seems to be very slow. How do I run multiple jobs in parallel so that it goes faster?

2) The second option: I have the entire data set (80 million records) available in a CSV file. I believe the job will be much faster when using the CSV file rather than accessing the Oracle table. Is there a way to have multiple jobs read the CSV file in parallel?

I'd appreciate it if someone could provide input on this.

 

Labels (1)
  • Cloud

1 Reply
Anonymous
Not applicable
Author

Hi,

 

    From the details you have given, my understanding is that the data fetch from Oracle is what is taking time. Why don't you fetch the data in parallel for different partitions and merge them later in the flow, before the Snowflake Bulk Output?
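Outside the Talend canvas, the partition-and-fetch idea looks roughly like the sketch below (plain Python rather than a Talend job, just to illustrate the pattern). The connection details, table name, ID column and ranges are all assumptions, not taken from your job.

```python
# Minimal sketch of partitioned parallel extraction from Oracle, assuming a
# numeric ID column to split on. Connection details, table and column names
# are placeholders.
import csv
import concurrent.futures

import oracledb  # python-oracledb driver

CREDS = dict(user="etl_user", password="***", dsn="dbhost:1521/ORCLPDB1")  # hypothetical

def dump_partition(part_id: int, lo: int, hi: int) -> str:
    """Fetch one ID range from Oracle and spool it to its own CSV chunk."""
    out_path = f"/tmp/orders_part_{part_id}.csv"
    with oracledb.connect(**CREDS) as conn, open(out_path, "w", newline="") as f:
        cur = conn.cursor()
        cur.arraysize = 10_000  # fetch in large batches
        cur.execute(
            "SELECT * FROM orders WHERE order_id >= :lo AND order_id < :hi",
            lo=lo, hi=hi,
        )
        writer = csv.writer(f)
        writer.writerow([d[0] for d in cur.description])  # header row
        while True:
            rows = cur.fetchmany()
            if not rows:
                break
            writer.writerows(rows)
    return out_path

if __name__ == "__main__":
    # Split 80M IDs into 8 ranges and extract them concurrently.
    ranges = [(i, i * 10_000_000, (i + 1) * 10_000_000) for i in range(8)]
    with concurrent.futures.ThreadPoolExecutor(max_workers=8) as pool:
        chunks = list(pool.map(lambda r: dump_partition(*r), ranges))
    print("Extracted chunks:", chunks)
```

The same split can be modelled in the Studio with several tDBInput flows, each carrying a WHERE clause on its own key range, feeding the Snowflake bulk output.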

 

     If you have the 80 million records handy in a CSV file, you can consider using it as the source file for the Snowflake Bulk Output Exec component. You may still have to increase the Talend job's memory to the maximum possible so that it reads larger data chunks in one go.
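Splitting the CSV into several smaller, compressed chunks before staging also helps, because Snowflake loads multiple staged files in parallel during COPY INTO. Below is a rough sketch of the daily truncate-and-load cycle, again in plain Python with placeholder account, table and stage names rather than your actual setup:

```python
# Minimal sketch of a truncate-and-bulk-load into Snowflake from CSV chunks,
# mirroring what the Snowflake bulk output/exec components do. Account,
# warehouse, table and file paths are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",      # hypothetical
    user="etl_user",
    password="***",
    warehouse="LOAD_WH",
    database="ANALYTICS",
    schema="PUBLIC",
)
cur = conn.cursor()

# Daily refresh: empty the target table first.
cur.execute("TRUNCATE TABLE orders")

# Upload the (ideally pre-split) CSV chunks to the table's internal stage.
# Snowflake can load multiple staged files in parallel during COPY INTO.
cur.execute("PUT file:///tmp/orders_part_*.csv @%orders AUTO_COMPRESS=TRUE")

# Bulk load everything staged for the table.
cur.execute(
    "COPY INTO orders "
    "FROM @%orders "
    "FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1 FIELD_OPTIONALLY_ENCLOSED_BY = '\"')"
)
cur.close()
conn.close()
```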

 

Warm Regards,
Nikhil Thampi

Please show your appreciation to Talend community members by giving Kudos for sharing their time on your query. If your query is answered, please mark the topic as resolved 🙂