Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik Connect 2026! Turn data into bold moves, April 13 -15: Learn More!
cancel
Showing results for 
Search instead for 
Did you mean: 
jammann
Contributor
Contributor

Howto speed up Oracle loading job

Hi all
I have an load job (TOS 4.2) from a positional file into an Oracle table that works well, but is a bit slow. I'm looking into ways to make it faster without giving up features.
I have a simple tFileInputPositional -> tMap (merges in an ID column) -> tOracleOutput.
The file has ~1 Mio records, and ~25 fields, of which I only need ~10 to import into the table.
When I load the file, do the input parsing and the mapping, and write it out to a delimited file, the job performs at ~18'000 rows/sec on my laptop, which is good.
When I redirect the output into the Oracle table, the performance drops to ~1'500 rows/sec. This is WITHOUT using batch insert mode. The reason I don't use batch inserts is that within the ~1 Mio records, I have ~20 which are duplicates according to a unique constraint on the Oracle table. I need to respect this unique constraint, so I can not simply use the batch insert (which of course makes loading much faster).
Things I have considered (but not tried out everything yet)
- splitting the output flow into several (say 3-5) streams and move them into separate tOracleOutput (did not really gain me anything)
- sort the records in the job (tSortRow or tExternalSortRow) and filter out the duplicate records somehow in Talend, then use Oracle batch insert mode
- maybe some other way of elminitating the duplicates (without global sorting, just "keep the keys" and compare if I already saw it)
Or maybe some other ideas??
CU, Joe
Labels (2)
9 Replies
Anonymous
Not applicable

It looks like you're barking up the right tree if that's any consolation.
Theres a few different ways to make this work...
with TOS, to parallelize a job, you would need to kind of "do it yourself"
1) create a "generic" job that loads an input file-- make the filename a context var. Export the job as an autonomous script.
2) create a father job that will split your input file and then with a tSystem call your generic job and background it, repeating with different chunks. you can pass context paramaters by using the "--context-param fileName=foo.csv" option.
There are drawbacks to this approach-- error handling will be awkward and all the exporting during development will rapidly get irritating.
Another option would be to load to a temp table where you have no constraints and then use ELT to load data correctly to your true target.
jammann
Contributor
Contributor
Author

Hehe, thanks for your understanding :-)
I'll be moving to TIS as soon as I got by the administrative hurdles to fork some money out. I already had a TIS evaluation earlier so I know the parallelization feature of the TIS tOracleOutput will be available and it should probably give enough speed.
For time being, I was able to get rid of the constraints (I am actually loading into a staging table). I'm now at ~6'000 rows/sec which is fair.
I'll have a look at the ELT components tomorrow - to be honest, I knew that they are able to process things much faster, but always found them rather limiting in features. But I guess I'll have to learn living with them :-)
Thanks!
CU, Joe
Anonymous
Not applicable

Hello,
question :
would a round-robin approach that splits records on x output connection help with the inserts?
something like :
-> out1
input -> round-robin -> out2
-> out3
meaning that first record in inout goes to out1, second to out2, third to out3, forth to out1 etc...
?
If so it's quite easy I think to create a tRoundRobin component, I'll see if I can provide one this morning maybe.
Bye
jammann
Contributor
Contributor
Author

Hi Saburo
thanks for your idea and willingness to code it! I did something similar myself: I know that the incoming records have a fairly even distribution on one of the key elements, so I configured a tMap with several outputs, going into separate tOracleOutput. This did NOT bring me any relevant speedup.
I guess you would really need to be able to create some multithreading to gain real speedups. Otherwise I guess you're ending up waiting 1ms for out1, then for the next row 1ms for out2, and so on. The tRoundRobin would probably not bring too much speedup, if it is not multithreaded.
That's just a rough guess. I think the "parallelize" feature of the TIS tOracleOut is able to do exactly that, multithreading. That's why it really brings something. But I think to keep it under control, one needs to do it in one component. A generic multithreading tRoundRobin seems like a very interesting idea, but I would be afraid of the potential complexity that comes in when it is used in different job scenarios. Could quickly become a nightmare, I guess.
CU, Joe
Anonymous
Not applicable

Look, I just finished implementing it, you may try it, I will upload it in a couple of minutes.
With another old etl tool I used a round robin to parallelize loads to oracle (at that time it was Oracle 😎 and it actually improved a bit.
The point is that you can use a separate session, which actually fires up another thread, so, from the point of view of the receiver rdbms it is multi thread.
This helps if the bottleneck is the upload, it obviously does not help if your bottleneck is the input part 0683p000009MACn.png
I will post the component in a couple of minutes, I tested it with reowgenerator and tlogrows and it seems to work ok.
Bye
Anonymous
Not applicable

Done, you can find the component in Exchange.
It's a simple modification of tReplicate, did not test it much, but should be enough just to verify if it improves a bit the performances.
Let me know.
Bye
jammann
Contributor
Contributor
Author

Hi Saburo
sorry, I was at a customer site today, only got round now to do some testing.
I did first tests on a smaller sample, with ~50'000 rows which go as "INSERT or UPDATE" into a table.
- reading the file, mapping and writing it to a file: takes around 5.5 sec at ~13'500 rows/sec (so reading should not be the bottleneck)
- same, but write directly into a tOracleOutput with INSERT or UPDATE and thus batching off: around 39 sec at ~1'300 rows/sec (here comes the bottleneck 🙂
- same, but write into a tRoundRobin with 1 output: same, around 39 sec at ~1'300 rows/sec (just to be sure 🙂
- same, but write into a tRoundRobin with 2 output: same, around 39 sec at ~1'300 rows/sec ()
- same, but write into a tRoundRobin with 4 output: a tad faster, around 37 sec at ~1'350 rows/sec ()
With the long running job and 1 Mio rows I did not that many tests (just takes too long). But also here, using 4 outputs instead of one is only slightly faster (certainly within the normal deviations of different test runs).
So I'm afraid the tRoundRobin component did not help me, sorry. I even tried to run the whole job with the multi threaded flag (though I must admit that I never quite understood what the flag does :-), with more or less the same results.
CU, Joe
Anonymous
Not applicable

Yes, I checked how the database output components are created.
The statement is not inserted in a thread, so it does not release the control until the query is fully executed anyway.
Not sure how easy it is to change that behavior, the components seem to perform several "executeupdate" because of the update/insert etc.
Normally using JDBC, when I insert records I fire up a thread, so I wrongly assumed it was the same in TOS components.
It would not necessarily need to be multi-threaded, single (non blocking) thread could be enough using 2 or 3 output with separate connections.
... or you can use TIS 0683p000009MACn.png
Sorry it did not help.
jammann
Contributor
Contributor
Author

Thanks again for trying it! I also looked at the option of doing this in multiple threads.
Creating the threads in the tRoundRobin component seems not very attractive to me. Since you don't really know what the job designer will place after the component, this could quickly become a nightmare.
I also thought about extending the tOracleOutput component. I guess this would be the much more natural place, since it could be localized in one component. It also looks to me like a "doable task", not extremely complex. I just don't know if I would "step on Talends feet" with this, because the parallelizing is certainly one important sales argument for TIS over TOS.
Since I will ultimately be using TIS anyway for my job, it is not that important for me. But I think that we could improve the performance of such loads greatly with relatively little effort.
CU, Joe