Anonymous
Not applicable

Large CSV file import

I have many large files; each one contains more than 2 million rows of .CSV data.

I need to import them gradually into a MySQL table because I cannot load an entire CSV file into memory.

Could you guide me on which component I should use to do that?

I tried the tFileInputDelimited component, but it reads the whole file and consumes all of my laptop's memory.

1 Solution

Accepted Solutions
TRF
Champion II

You may also have a loop over your tFileInputDelimited with the Limit parameter set to the maximum number of rows you are able to manage at a time (for example 250,000), and set the Header parameter dynamically based on the loop index to skip previously imported rows.
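
As a minimal sketch of that loop, assuming a tLoop component auto-named tLoop_1 and a 250,000-row chunk size (both placeholders):

// tLoop_1 ("For" type): From = 1, To = 8, Step = 1 -- 8 passes cover ~2M rows at 250k each (assumption)
// tFileInputDelimited "Header" field (rows to skip on this pass):
((Integer)globalMap.get("tLoop_1_CURRENT_ITERATION") - 1) * 250000
// tFileInputDelimited "Limit" field (rows to read on this pass):
250000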

You may also split your big file into small chunk files and iterate over the list of created files.

If you connect tFileInputFullRow to tFileOutputDelimited and set the advanced parameter "Split output in several files" to the number of lines you are able to manage at once, this should be an easier solution.
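
For that split approach, a settings sketch (labels may differ slightly between TOS versions; file paths are placeholders):

// tFileInputFullRow_1: File name = "/data/big.csv", schema = a single String column, e.g. "line"
// tFileOutputDelimited_1, Advanced settings:
//   check "Split output in several files"
//   Rows in each output file = 250000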


15 Replies
manodwhb
Champion II

@phancongphuoc, if you are getting a memory error, you can increase the JVM memory by following the link below.

 

https://community.talend.com/t5/Installing-and-Upgrading/Configure-to-use-a-JVM/td-p/112893 
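
For reference, the relevant lines in the Studio .ini file look like the following (the file name varies by OS and Studio version, and the values are only examples):

-vmargs
-Xms512m
-Xmx4096m

For a job run from the Studio, the equivalent is the Run view > Advanced settings > "Use specific JVM arguments" option, where you can add something like -Xmx4096m.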

 

You can also try the tMysqlBulkExec component to improve performance. To learn more about tMysqlBulkExec, see the link below.

 

https://help.talend.com/reader/jomWd_GKqAmTZviwG_oxHQ/YhYqawgnulVXpdzE1lJ6cg
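
Under the hood, tMysqlBulkExec runs MySQL's LOAD DATA INFILE, so the database reads the file itself instead of streaming every row through the job's JVM. A minimal plain-JDBC sketch of that idea (table name, file path, delimiters, and credentials are all placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class BulkLoadSketch {
    public static void main(String[] args) throws Exception {
        // assumption: MySQL Connector/J is on the classpath; URL/credentials are placeholders
        try (Connection con = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/mydb?allowLoadLocalInfile=true",
                "user", "password");
             Statement st = con.createStatement()) {
            // MySQL reads the CSV directly, so the 2M rows never pass through this JVM's heap
            st.execute(
                "LOAD DATA LOCAL INFILE '/data/big.csv' "
                + "INTO TABLE my_table "
                + "FIELDS TERMINATED BY ';' ENCLOSED BY '\"' "
                + "LINES TERMINATED BY '\\n' "
                + "IGNORE 1 LINES");
        }
    }
}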


Anonymous
Not applicable
Author

Hello Manohar,

I started with Talend just about 3 hours ago. Could you please tell me which one I should use for my case: TOS_DI or TOS_BD?

Anonymous
Not applicable
Author

Dear TRF,

It seems your suggestion is very appropriate for my case.

Could you guide me through the steps to do that? I am very new to TOS.

Many thanks

Anonymous
Not applicable
Author

Dear TRF

 

The screenshot shows what I am trying to do (connect tFileInputFullRow to tFileOutputDelimited).

But how do I configure tFileInputFullRow, since it requires a File Name?

 


[Screenshot attached: Capture1.PNG]
TRF
Champion II

The job design should be like this:

tFileInputFullRow --> tFileOutputDelimited (just to create small files - set the schema to a single field)
        |
   onSubJobOK
        |
tFileList --> tFileInputDelimited (with the real schema) --> tMap --> IHSDatabase
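
A few key settings for the second subjob (component names are the auto-generated defaults, and the chunk directory is a placeholder):

// tFileList_1: Directory = "/data/chunks", FileMask = "*.csv"
// tFileInputDelimited_1: File name/Stream picks up the file currently iterated:
((String)globalMap.get("tFileList_1_CURRENT_FILEPATH"))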

Anonymous
Not applicable
Author

Dear TRF

I got the OutOfMemory error message.

I still wonder: if tFileInputFullRow takes the file row by row, then reading row by row with tFileInputDelimited should make more sense. Am I right?

 



[Screenshot attached: Capture1.PNG]
manodwhb
Champion II

@phancongphuoc, increase the JVM memory and see.

Follow the link below to set the JVM.

 

https://community.talend.com/t5/Installing-and-Upgrading/Configure-to-use-a-JVM/td-p/112893 

 

Anonymous
Not applicable
Author

Hi Manohar,

I am using 64-bit TOS and I have no problem with Java, as described in your link.

I am still confused about what you mentioned regarding