Hi,
I am loading data from Netezza into a Vertica database. I have 1.3 cr records, and the load takes almost 2 hours.
I tried using bulk load, and I also enabled the cursor option in the NetezzaInput component.
I also tried loading the data using the Dynamic data type option in Edit schema, but I got the following error:
org.netezza.error.NzSQLException: An existing connection was forcibly closed by the remote host
Can anyone help me with this error, and suggest any other option for loading the data faster?
Thanks,
Bharath.
I'm assuming that cr corresponds to crore (10,000,000). Is that correct? If so, that is a rate of around 1800 rows a second. Are your source and target dbs local or remote (that will have an impact)? Have you got a comparison of rates that you expect and have seen from other software?
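For anyone who wants to check that estimate, here is a minimal sketch of the arithmetic. It assumes 1.3 crore means 13,000,000 rows and that the load takes exactly 2 hours; the function name is made up for illustration.

```python
# Rough throughput check: rows loaded divided by elapsed seconds.
ROWS = 13_000_000          # 1.3 crore, assuming 1 crore = 10,000,000
DURATION_S = 2 * 60 * 60   # "almost 2hr", taken here as exactly 2 hours


def average_rows_per_second(rows: int, seconds: int) -> float:
    """Average load rate over the whole run."""
    return rows / seconds


rate = average_rows_per_second(ROWS, DURATION_S)
print(f"{rate:.0f} rows/s")
```

This is only the average over the whole run; the instantaneous rate shown in the job can be higher or lower at any moment.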
Hi rhall,
Yes, the data is 1.3 crore records, and the databases are in a remote location. The rate is around 1600 to 2000 rows/second.
That is not a fixed rate, though; it keeps going up and down on its own.
Thanks,
Bharath.
Hmmmm, it sounds like you first need to work out whether your network bandwidth is sufficient to deal with the load. 1800 records a second may not be too bad if the dbs are located remotely, especially if they are in different remote locations. Can you test the bandwidth during a run?
Hi,
Thanks for your quick response.
By bandwidth you mean the Ethernet speed, right? If so, the speed is 1.5 MB/s.
If not, could you please let me know where I need to check it?
Thanks,
Bharath.
I'm afraid I cannot answer your questions about your network, I was merely suggesting you consider that unless you can prove that you can get faster performance from other tools on the same infrastructure. Problems like this are very difficult to isolate and people generally don't consider the most obvious reasons for why performance may not be what is expected.
If 1.5MB/s is 1.5 megabytes per second, your network speed is not very fast. If it is megabits per second, it is even worse. You also have to consider your source and target dbs' upload and download bandwidths AND whether anyone else is using your network for anything else at the same time.
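To put rough numbers on the megabytes versus megabits point, here is a small sketch of the row rate a link could carry at best. The average row size of 500 bytes is purely an assumption for illustration (the real row width is not known from this thread), and protocol overhead is ignored.

```python
# Upper bound on rows/s that a link can carry, ignoring protocol overhead.
MEGABYTE = 1_000_000
ASSUMED_ROW_BYTES = 500  # hypothetical average row size, not from the thread


def max_rows_per_second(link_bytes_per_s: float, row_bytes: int) -> float:
    """Best-case row rate if the link is the only bottleneck."""
    return link_bytes_per_s / row_bytes


link_if_megabytes = 1.5 * MEGABYTE      # 1.5 MB/s read as megabytes
link_if_megabits = 1.5 * MEGABYTE / 8   # 1.5 Mb/s read as megabits

print(max_rows_per_second(link_if_megabytes, ASSUMED_ROW_BYTES))
print(max_rows_per_second(link_if_megabits, ASSUMED_ROW_BYTES))
```

With a wider row the ceiling drops in proportion, so it is worth measuring the real average row size before drawing conclusions.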
Here is an example of how location can affect performance. I was on a project where I was running Talend on an internal network where the performance was reasonable (I cannot remember the number of rows per second). I went off site and ran the same code in a different country. The performance went down by a factor of around 20. The data had to be retrieved from my source db over the internet to my laptop, processed and then sent back to the target db over the internet. The latency this caused was massive. You need to consider this before assuming there is anything wrong with the job.
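A simplified way to see why latency hurts so much is to model each batch of rows as costing one network round trip plus a fixed amount of work per row. This is only a sketch: the round-trip times, the per-row cost and the batch sizes below are all hypothetical values, not measurements from this thread or from the project described above.

```python
# Toy model: each batch pays one round trip plus a fixed per-row cost.
PER_ROW_S = 0.0005   # hypothetical processing/transfer time per row
LAN_RTT_S = 0.001    # hypothetical round trip on an internal network
WAN_RTT_S = 0.100    # hypothetical round trip over the internet


def rows_per_second(batch_size: int, round_trip_s: float, per_row_s: float) -> float:
    """Rows/s when every batch waits for one round trip before the next starts."""
    return batch_size / (round_trip_s + batch_size * per_row_s)


for batch in (1, 100, 10_000):
    lan = rows_per_second(batch, LAN_RTT_S, PER_ROW_S)
    wan = rows_per_second(batch, WAN_RTT_S, PER_ROW_S)
    print(f"batch={batch:>6}: LAN {lan:8.0f} rows/s, WAN {wan:8.0f} rows/s")
```

In this model the gap between the two networks is largest when rows travel one at a time and shrinks as the batch grows, which is why fetch size and batch size settings are worth checking on a remote connection.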
Thanks for your quick response.
The speed is 1.5 megabytes/sec, so I will treat this as a network bandwidth problem.
Is there any other option that can load the data faster (in less than 2 hours)?
Thanks,
Bharath.
I wasn't saying that it is definitely a network issue. I was saying that you need to diagnose this like a doctor might diagnose a patient. The patient might have a pain in their hip that is caused by a defect in their foot. You need to know what the limitations are and what you should realistically be able to expect.
Yes, I will definitely try to diagnose this issue. Thanks for taking the time to help me out.
As I mentioned earlier, I was getting the following exception:
org.netezza.error.NzSQLException: An existing connection was forcibly closed by the remote host
Do you have any idea what causes this?
Thanks,
Bharath.
I'm afraid I cannot answer that question without trying it out, and I do not have a suitable environment.