hi,
it depends on your job design - whether you handle this yourself or not
by default - everything will start again from the beginning
if your target supports INSERT IGNORE and the table has a primary key, the job can avoid duplicates
if not, you must manage it manually; one variant - use autocommit, check the number of rows already loaded into the target, then read the file from row N+1 (see the two sketches below)
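to illustrate the INSERT IGNORE idea - a minimal JDBC sketch, assuming a MySQL-style target and a hypothetical table customers(id primary key, name); it only shows the SQL feature the point relies on, not your actual job:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class InsertIgnoreSketch {
    public static void main(String[] args) throws Exception {
        // hypothetical connection details - replace with your own target
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/demo", "user", "password")) {
            // INSERT IGNORE skips rows whose primary key already exists,
            // so re-running the load after a failure does not create duplicates
            String sql = "INSERT IGNORE INTO customers (id, name) VALUES (?, ?)";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setInt(1, 42);
                ps.setString(2, "Alice");
                ps.executeUpdate(); // a second run with the same id inserts nothing
            }
        }
    }
}

other databases have their own equivalents (for example ON CONFLICT DO NOTHING in PostgreSQL).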
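and for the manual variant (count what is already in the target, then skip that many lines of the file) - a minimal sketch with the same hypothetical table and a header-less customers.csv; it assumes rows are loaded in file order, which is what makes the row count a valid resume point:

import java.io.BufferedReader;
import java.io.FileReader;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ResumeFromRowN {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/demo", "user", "password")) {
            // 1. how many rows made it into the target before the failure?
            long alreadyLoaded;
            try (Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery("SELECT COUNT(*) FROM customers")) {
                rs.next();
                alreadyLoaded = rs.getLong(1);
            }
            // 2. skip that many lines of the source file, then continue the load
            try (BufferedReader in = new BufferedReader(new FileReader("customers.csv"))) {
                String line;
                long lineNo = 0;
                while ((line = in.readLine()) != null) {
                    lineNo++;
                    if (lineNo <= alreadyLoaded) {
                        continue; // this row is already in the target
                    }
                    // ... parse and insert the remaining rows here, committing as you go
                }
            }
        }
    }
}

in a Talend job the same idea maps to reading the count into a context variable and using it as the number of rows to skip in the file input component (e.g. the Header field of tFileInputDelimited).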
When loading from a file to a DB, always use a tSchemaComplianceCheck component before inserting into the database, and use manual commit every time. When records are being loaded from a file to the DB, the job's memory may need to be increased, so do a test run to check memory usage before the real run.
follow this design
tfileinput => tmap => tschemacompliance => tmssqloutput/Any db
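on the memory point - if the pre-run memory check is what is meant, here is a tiny, purely illustrative sketch you could drop into a tJava component at the start of the job:

// how much heap does this job actually have?
long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
System.out.println("Max heap available to this job: " + maxHeapMb + " MB");
// if this is too small for the file being loaded, raise -Xmx in the job's JVM settings before the run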
@ksingh wrote:
When loading from a file to a DB, always use a tSchemaComplianceCheck component before inserting into the database, and use manual commit every time. When records are being loaded from a file to the DB, the job's memory may need to be increased, so do a test run to check memory usage before the real run.
follow this design
tfileinput => tmap => tschemacompliance => tmssqloutput/Any db
not 100% agree
this does not work for huge files - neither the first point nor the second
it will work well for a relatively small number of rows ... but what about 10M rows?
if the data is dirty - you need a separate job to clean it
if the data is huge - a single transaction will kill your Talend job and your database server, so you need to commit more often rather than trying to commit everything at the end
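to make "commit more often" concrete - a minimal JDBC sketch with the same hypothetical customers table and an arbitrary commit interval; in a Talend job the equivalent is the commit-interval option of the DB output/connection components rather than hand-written code:

import java.io.BufferedReader;
import java.io.FileReader;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class BatchedCommitLoad {
    private static final int COMMIT_EVERY = 10_000; // arbitrary - tune to your data and server

    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/demo", "user", "password");
             BufferedReader in = new BufferedReader(new FileReader("customers.csv"));
             PreparedStatement ps = conn.prepareStatement(
                     "INSERT INTO customers (id, name) VALUES (?, ?)")) {
            conn.setAutoCommit(false); // we decide when to commit
            String line;
            long count = 0;
            while ((line = in.readLine()) != null) {
                String[] f = line.split(",", -1);
                ps.setInt(1, Integer.parseInt(f[0]));
                ps.setString(2, f[1]);
                ps.addBatch();
                if (++count % COMMIT_EVERY == 0) {
                    ps.executeBatch();
                    conn.commit(); // many small transactions instead of one huge one
                }
            }
            ps.executeBatch(); // flush the last partial batch
            conn.commit();
        }
    }
}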