We need to stop an error that pops up randomly and worries our users and management. Please advise.
I have a job that runs on a Windows Server 2016 machine. We are using Talend Open Studio for Big Data 7.1.1.21081026_1147 and Java version 1.8.0_231.
This job connects to an SFTP server to pull down files, process them, and push them back up. It runs every 15 minutes; sometimes there are files to process and sometimes there are none, which the job checks with tJava tests. The job works 99% of the time, but at least 2-3 times per week we get an error that says:
com.jcraft.jsch.JSchException: Session.connect: java.net.SocketException: Connection reset
However, this FTP connection does work; if it didn't, the job would fail every time and never succeed. Instead it fails at random.
10:15:01|xxxxxxx|xxxxxxx|xxxxxxx|6124|LOCAL_PROJECT|Process_Name|xxxxxxx|0.1|Dev|tFTPConnection_1|begin||
09:15:01 Exception in component tFTPConnection_1 (Process_Name)
09:15:01 com.jcraft.jsch.JSchException: Session.connect: java.net.SocketException: Connection reset
09:15:01 at com.jcraft.jsch.Session.connect(Session.java:558)
09:15:01 2021-01-08 10:15:01|xxxxxxx|xxxxxxx|xxxxxxx|6124|LOCAL_PROJECT|Process_Name|xxxxxxx|0.1|Dev|tFTPConnection_1|end|failure|109
09:15:01 2021-01-08 10:15:01|xxxxxxx|xxxxxxx|xxxxxxx|6124|LOCAL_PROJECT|Process_Name|xxxxxxx|0.1|Dev|tPostjob_1|begin||
09:15:01 2021-01-08 10:15:01|xxxxxxx|xxxxxxx|xxxxxxx|6124|LOCAL_PROJECT|Process_Name|xxxxxxx|0.1|Dev|tPostjob_1|end|success|0
09:15:01 at com.jcraft.jsch.Session.connect(Session.java:183)
09:15:01 at local_project.Process_Name_0_1.Process_Name.tFTPConnection_1Process(Process_Name.java:2249)
09:15:01 at local_project.Process_Name_0_1.Process_Name.tJava_1Process(Process_Name.java:12727)
09:15:01 at local_project.Process_Name_0_1.Process_Name.runJobInTOS(Process_Name.java:13818)
09:15:01 at local_project.Process_Name_0_1.Process_Name.main(Process_Name.java:13448)
09:15:01 2021-01-08 10:15:01|xxxxxxx|xxxxxxx|xxxxxxx|6124|LOCAL_PROJECT|Process_Name|xxxxxxx|0.1|Dev||end|failure|188
09:15:01 Result: 1
09:15:01 Could Not Find C:\xxxxx\xxxxx\xxxxx-localhost-dispatch-script.tmp.bat
09:15:02 Failed: NonZeroResultCode: Result code was 1
There could be many reasons for this error.
1) Are you transferring a large amount of data?
2) Sometimes the job waits too long while processing the data, and the firewall then resets the connection.
This job does not transfer large sets of data, and the runs that fail are not transferring any data at all. They just connect to the FTP server to see whether there are any files to transfer; if there are none, the job closes.
If you are not already doing so, try closing the FTP connection with tFTPClose after the job completes.
I am using tFTPClose and still getting the error.
What's even stranger is that it runs every 15 minutes, 24/7, and the errors don't seem to follow a pattern:
Hour--Daily Run No.--Day
4:45:00 AM--4--Fri
10:15:00 AM--42--Fri
3:15:00 PM--62--Fri
1:30:00 AM--7--Sat
6:30:00 AM--27--Sat
1:15:00 PM--54--Sat
5:59:00 PM--73--Sun
9:30:00 PM--87--Sun
5:30:00 AM--23--Mon
8:59:00 AM--37--Mon
Another clue: I can run this manually with the .bat file, in the UI, or with RunDeck (scheduler software), over and over, and never see this error. It only happens when the RunDeck scheduler runs the .bat files.
Just for testing, could you extend the interval from 15 minutes to 1 hour?
Also check with your network team whether any large data transmission was running on the same server at the times your job failed.
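If the reset turns out to be a transient network event rather than a job problem, one possible mitigation is to retry the connection a few times before failing the run. The Java below is only a sketch of that idea. The class name RetryConnect and the method withRetry are hypothetical, the demo simulates the "Connection reset" with a plain SocketException instead of calling JSch, and wiring a retry around tFTPConnection inside a Talend job is not shown here.

```java
import java.net.SocketException;
import java.util.concurrent.Callable;

public class RetryConnect {

    /**
     * Runs the action and retries it when it throws, up to maxAttempts
     * attempts in total. The wait between attempts starts at
     * initialDelayMs and doubles after each failure.
     */
    public static <T> T withRetry(Callable<T> action, int maxAttempts, long initialDelayMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMs;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                System.err.println("Attempt " + attempt + " failed: " + e.getMessage());
                if (attempt < maxAttempts) {
                    Thread.sleep(delay);
                    delay *= 2;
                }
            }
        }
        // Every attempt failed: rethrow the last error so the job still reports failure.
        throw last;
    }

    public static void main(String[] args) throws Exception {
        // Simulated connection: fails twice with "Connection reset", then succeeds.
        int[] calls = {0};
        String result = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new SocketException("Connection reset");
            }
            return "connected";
        }, 5, 100L);
        System.out.println(result + " after " + calls[0] + " attempts");
        // prints "connected after 3 attempts"
    }
}
```

A retry like this only hides the symptom, so it is still worth logging each failed attempt (as the sketch does on stderr) and comparing the times with what the network team sees.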
Hello,
Please disable the firewall and try again, and check that everything works using the ftp command.
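Since the job talks to an SFTP server, a quick check at the TCP level can also help separate network problems from job problems. The Java below is a minimal sketch, not a full SFTP test: it opens a socket and reads the first line the server sends, which for an SSH server is normally its identification string starting with "SSH-". The class name SshBannerProbe and the method readBanner are hypothetical, and the host and port are whatever you pass on the command line. Running it from the same scheduler, at the same times as the job, would show whether the reset also happens outside Talend.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class SshBannerProbe {

    /**
     * Opens a TCP connection to host:port and returns the first line the
     * server sends, or null if the server closes without sending anything.
     * timeoutMs applies to both the connect and the read.
     */
    public static String readBanner(String host, int port, int timeoutMs) throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            socket.setSoTimeout(timeoutMs);
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.US_ASCII));
            return in.readLine();
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.out.println("usage: java SshBannerProbe <host> <port>");
            return;
        }
        try {
            String banner = readBanner(args[0], Integer.parseInt(args[1]), 5000);
            System.out.println("Server answered: " + banner);
        } catch (IOException e) {
            // A "Connection reset" here would point at the network path, not the job.
            System.out.println("Probe failed: " + e);
        }
    }
}
```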
Best regards
Sabrina