INESBK
Creator

[CDC] Insert only data that has changed since last run (Tos_DI)

I would like to insert only the new data into my database. I tried incremental loading by comparing my source (a set of files) with my target (a SQL Server table) using an inner join, but because the number of rows inserted in the database is huge, this solution is not feasible.
So I thought of doing CDC by date comparison (the date of the last run against the current date).
Unfortunately, I don't know how to do it.
Can someone help me, please?

 

27 Replies
INESBK
Creator
Author

My structure is like this:

I have a folder and subfolder structure where each subfolder contains a .dbf file (the file name represents the sensor).

This file contains thousands of lines, each representing a value of this sensor at some point in time.

Here is an example of a sensor file: "c:\folder\subfolder\128.dbf"

I need to change the structure of this name to "folder.subfolder.MES".

So the form of the input file name (in tFileList) differs from the form of the file name stored in the database.

For this reason I use tFileList to browse all the files and pass each file name to tSystem, which runs a Python script that reads the file and changes the name structure. I then apply other transformations with tMap and finally insert the rows into the table.
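To make the renaming step concrete, here is a minimal Python sketch of it. It is not the actual script used in the job: the function name `sensor_name` is hypothetical, and it assumes that the drive and the file name are dropped and that the last part is always the literal suffix "MES", which the post does not confirm.

```python
from pathlib import PureWindowsPath


def sensor_name(file_path: str, suffix: str = "MES") -> str:
    """Turn a path like c:\\folder\\subfolder\\128.dbf into folder.subfolder.MES.

    Assumption: the suffix is a fixed literal ("MES"); the real rule may differ.
    """
    path = PureWindowsPath(file_path)
    # Keep only the folder names: skip the drive/root anchor and the file name.
    folders = [part for part in path.parent.parts if part != path.anchor]
    return ".".join(folders + [suffix])


print(sensor_name(r"c:\folder\subfolder\128.dbf"))  # folder.subfolder.MES
```

`PureWindowsPath` parses Windows-style paths on any operating system, so the same function works wherever the job runs.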

 

So, if I understood correctly, we must follow alternative 4?

 

vapukov
Master II


@INESBK wrote:

 

So, if I understood correctly, we must follow alternative 4?

 


If you ask my opinion: no, you do not need to go exactly this way :-), and you do not need Python to change the name structure.

But you can go any way, of course.

 

 

INESBK
Creator
Author

No, I have to use Python to read the structure of the .dbf file and add a column that contains the file name in this form.

 

[If you ask my opinion: no, you do not need to go exactly this way :-)]

And if I do not use alternative 4, then which solution can solve my problem?

vapukov
Master II

 

If you want a full solution, you must provide full information.

There are a lot of working ideas above (taken from real working processes),

 

but I do not want to guess how it will work in your case if you do not provide:

- the file structure, with a full description: what we have in the files, what we have after the Python script, how the data is organised (sorted, unsorted, etc.)

- the structure of the database: columns, indexes, etc.

- what exceptions are possible

 

Based on the information already presented, you just need to:

1) query SQL Server for the last time stored for the selected sensor (the sensor name being taken by Python from the file name)

2) read all files for this sensor and filter the input data using tFilter or tMap on the time column, keeping rows whose value is greater than the one taken from the database

 

That is all.
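The two steps above can be sketched outside Talend in a few lines of Python. This is only an illustration under stated assumptions, not the job itself: it uses an in-memory SQLite database as a stand-in for SQL Server so that it runs anywhere, and the table `measurements`, its columns `sensor`, `measured_at`, `value`, and the function names are all hypothetical.

```python
import sqlite3
from datetime import datetime


def last_loaded_time(conn, sensor):
    """Step 1: ask the database for the newest timestamp already stored for a sensor."""
    row = conn.execute(
        "SELECT MAX(measured_at) FROM measurements WHERE sensor = ?", (sensor,)
    ).fetchone()
    return datetime.fromisoformat(row[0]) if row[0] else None


def new_rows_only(rows, last_time):
    """Step 2: keep only rows whose timestamp is greater than the stored one."""
    if last_time is None:  # nothing loaded yet, so every row is new
        return list(rows)
    return [row for row in rows if row[0] > last_time]


def load_increment(conn, sensor, rows):
    """Read the last value first, then filter, then insert; returns the inserted count."""
    last_time = last_loaded_time(conn, sensor)
    fresh = new_rows_only(rows, last_time)
    conn.executemany(
        "INSERT INTO measurements (sensor, measured_at, value) VALUES (?, ?, ?)",
        [(sensor, ts.isoformat(), value) for ts, value in fresh],
    )
    conn.commit()
    return len(fresh)


# Demo with an in-memory SQLite table standing in for the SQL Server target.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (sensor TEXT, measured_at TEXT, value REAL)")
first_run = [(datetime(2024, 1, 1, 10, 0), 1.5), (datetime(2024, 1, 1, 11, 0), 1.7)]
second_run = first_run + [(datetime(2024, 1, 1, 12, 0), 1.9)]
print(load_increment(conn, "folder.subfolder.MES", first_run))   # 2
print(load_increment(conn, "folder.subfolder.MES", second_run))  # 1
```

In the Talend job, step 1 corresponds to the database input component and step 2 to the tFilter or tMap condition. Note that a strict "greater than" comparison skips any row that arrives later with a timestamp equal to or older than the stored maximum, so it suits data that is only ever appended in time order.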

It is like Google: the more precise the question, the more relevant the search result.

You can search for "Green" or for "Green Pub London", and the results will be a little different 🙂

INESBK
Creator
Author

Thank you very much for your advice and your time.

I tried this:

[images: 0683p000009LudF.png, 0683p000009LujH.png]

I got this error:

[image: 0683p000009LujM.png]

vapukov
Master II

It could be because tMSSQLInput does not run before tMap.

In your job this process is independent: it runs somewhere in a parallel world.

 

You can do something like:

[image: 0683p000009LujW.png]

 

It is just an example, but be careful about the order: first read the value, then use it.

INESBK
Creator
Author

Ah, finally! Thank you once again, and sorry if I did not explain well; I am a beginner with Talend.

 

Thanks

vapukov
Master II

Welcome to community! 🙂