Hi All,
I am extracting some chats from a REST API; the incoming data is in JSON. So far I have been able to extract data from the API and put it into the database. What I want to do next is check whether a chat has already been inserted before inserting it into the table. Some more information: the chat table has millions of records because it stores all the conversations, one row per message. So there are two tables here:
CHAT_DETAILS - this stores everything (millions of records)
CHAT_MASTER - this stores the unique Chat_ID and some other information (a couple of hundred thousand records). This is the table where I want to check for incoming Chat_IDs before inserting.
If Chat_ID exists in CHAT_MASTER = STOP THE JOB
If Chat_ID doesn't exist in CHAT_MASTER = continue inserting
You can:
1) use tFlowToIterate and read data from the database with this ID via tDBInput, then use a Run If trigger depending on whether the component returned data, for example:
SELECT CASE WHEN EXISTS (SELECT 1 FROM table WHERE Chat_ID = Chat_ID) THEN 1 ELSE 0 END AS ID_Exists FROM dual
2) as I understand it this is Oracle (?), and because the component does not support INSERT IGNORE, you can use a tOracleRow component and, with a prepared query, run something like:
INSERT INTO table (a, b, c) SELECT ?, ?, ? FROM dual WHERE NOT EXISTS (SELECT 1 FROM table WHERE Chat_ID = Chat_ID)
in both cases use a variable in place of the second Chat_ID (see the sketch after this list)
3) use triggers on the table
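A minimal sketch of what the two expressions for option 1 could look like, assuming the flow going into tFlowToIterate is named row1 and the query result is stored by a tSetGlobalVar under the key ID_EXISTS (all of these names are assumptions):

// tOracleInput query inside the iteration, built as a Java string; tFlowToIterate
// exposes each column of the iterated row in globalMap as "<rowName>.<columnName>".
"SELECT CASE WHEN EXISTS (SELECT 1 FROM CHAT_MASTER WHERE CHAT_ID = '"
    + globalMap.get("row1.CHAT_ID")
    + "') THEN 1 ELSE 0 END AS ID_EXISTS FROM dual"

// "Run if" condition on the trigger that continues with the insert, assuming the
// ID_EXISTS column is declared as Integer in the schema:
((Integer) globalMap.get("ID_EXISTS")) == 0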
you can also:
put a tMap component between extractJSON and oracleOutput.
connect it with extractJSON and oracleOutput components
add a tOracleInput component (CHAT_MASTER) as lookup
tMap configuration looks like this:
Important:
"Lookup Mode"l has to be set to: Load on every row,
"Catch lookup inner join reject" must be set to true)
you need to add a global Map Key - name it KEY_CHAT_ID and connect it with the CHAT_ID from the main row (row 3)
The tOracleInput configuration/query looks like this:
In the WHERE condition you can see the key you've defined before in the tMap.
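Since the screenshots are not reproduced here, a rough sketch of what that lookup query could look like, assuming the global key is named KEY_CHAT_ID as above:

// tOracleInput (lookup) query as a Java string; because the lookup is loaded on
// every main row, the tMap global key is read from globalMap for each CHAT_ID.
"SELECT CHAT_ID FROM CHAT_MASTER WHERE CHAT_ID = '" + globalMap.get("KEY_CHAT_ID") + "'"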
Add a tOracleCommit component after the oracleOutput. You have to commit after every inserted row
because in your JSON file the same CHAT_ID might appear more than once (maybe - I don't know your data).
Close the Oracle connection in the Post-Job, not in the tOracleCommit component - deselect the Close Connection check box there.
Hi odisys,
Thanks a lot for taking the time to provide a detailed example of the solution; this works great. One of the most important steps in all this for me is
" If Chat_ID exists in CHAT_MASTER = STOP THE JOB "
What it means is that as soon as I hit any chat ID that already exists in the CHAT_MASTER table, the entire job should be stopped/killed. In the example you provided, the job continues by inserting the ones that are not present and excluding the ones that are already present. Can you please help me understand how I can get this condition to work: "If Chat_ID exists in CHAT_MASTER = STOP THE JOB"?
Hi @vapukov,
Thanks for your advice and help. I am quite new to Talend, have never used any of those components, and am not sure what to do. Option 1 you provided seems to be the most suitable, since for me it's important to be able to stop/kill the job as soon as I hit the condition "if chat_id already exists, stop the job". I will try to understand and learn about the components you mentioned.
Hi gr44,
just so I get this right:
let's say you have 200 different chats (CHAT_IDs) in your JSON.
Let's also say the first 100 are not yet in the CHAT_MASTER table.
Number 101 is.
Do you want to a) write the first 100 records and stop the job as soon as you reach number 101, or b) skip number 101 and continue with the remaining records?
If it's the first variant: what if just record 101 is already in the database, but 102-200 are not (just as 1-100 are not)? If you exit the job at 101, then 102-200 will not be written. Your use case probably makes sense, but it's a bit difficult for me to understand.
best regards
Hi Odisys,
I want to write the first 100 records into the database and exit the job when reaching the 101st.
"If it´s the first variant: what if just record 101 is already in the database but 102 -200 are not (as 1 - 100 are not). If you exit the job at 101 then 102 - 200 will not be written. Probably your use-case makes sense but it´s a bit different to understand for me"
Your point is completely valid and I was a bit concerned about this; however, the JSON output is always sorted by date. Let's say there are 500 records and I have already successfully inserted 400 of them. The first 100 records from the JSON output are the new ones that are not in my tables, and from 101 onward they are already in my table. So in this scenario it wouldn't be an issue.
However, in a scenario where the JSON output was randomly sorted, my approach wouldn't work, and what you mentioned would indeed become an issue.
Why I want to exit as soon as I hit any Chat_ID that already exists, and not continue:
Hope that clarifies somewhat 🙂
@gr44 wrote:
Hi @vapukov,
Thanks for your advice and help. I am quite new to Talend, have never used any of those components, and am not sure what to do. Option 1 you provided seems to be the most suitable, since for me it's important to be able to stop/kill the job as soon as I hit the condition "if chat_id already exists, stop the job". I will try to understand and learn about the components you mentioned.
Hi!
it is similar to the example from @odisys
same idea - use the CHAT_ID from the request to query the filtered data from the database and then JOIN in tMap
Hi @gr44
OK, I got it 🙂
Possible solution - I don't have my private notebook here, where I created the screenshots for the earlier post, so I have to explain it in words.
Add a tDie component to the job.
Connect a second output row from tMap to tDie - name it stop/exit or something else.
Inside the tMap change the join model on the left side to Left outer join.
On the right side - in output update
Click the Filter symbol - it should be the arrow pointing to the right; just hover the mouse over the buttons.
Enter ->
Relational.ISNULL(row4.CHAT_ID)
On the right side - in output stop (or how you named it)
Click the Filter symbol
Enter ->
!Relational.ISNULL(row4.CHAT_ID)
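For the tDie added above, a possible "Die message" expression - assuming the second tMap output row is named stop and that its columns can be referenced in the expression (both are assumptions):

// Hypothetical tDie "Die message"; stops the job and reports the offending Chat_ID.
"Chat_ID " + stop.CHAT_ID + " already exists in CHAT_MASTER - stopping the job"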
I would also rename the output "update" to "insert" to have a better/clearer understanding of what that part of the job is doing.
Best regards. There might be typos in my posting, but I guess you get the idea.