Qlik Talend Cloud: Talend Remote Engine Heartbeat is failing hourly with 'GOAWAY received' error

Mency_Yu
Support

Last Update:

Feb 5, 2025 11:44:27 PM

Updated By:

Xiaodi_Shi

Created date:

Dec 9, 2024 3:01:55 AM

The Talend Remote Engine heartbeat fails hourly with a 'GOAWAY received' error, and karaf.log shows the following:

2024-08-01T10:16:58,638 | ERROR | pool-28-thread-1 | JAXRSUtils | 469 - org.apache.cxf.cxf-rt-frontend-jaxrs - 3.6.2 | | Problem with writing the data, class org.talend.ipaas.rt.engine.model.HeartbeatInfo, ContentType: application/json
2024-08-01T10:16:58,638 | WARN | pool-28-thread-1 | PhaseInterceptorChain | 467 - org.apache.cxf.cxf-core - 3.6.2 | | Interceptor for {http://pairing.rt.ipaas.talend.org/}PairingService has thrown exception, unwinding now
org.apache.cxf.interceptor.Fault: Problem with writing the data, class org.talend.ipaas.rt.engine.model.HeartbeatInfo, ContentType: application/json
Caused by: java.io.IOException: /<IP>:<port>: GOAWAY received

The resolution below may also apply when the following error appears in the log:

Caused by: java.io.IOException: HTTP/1.1 header parser received no bytes
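
To check whether these errors really recur about once an hour, the matching lines can be pulled out of karaf.log with a short script. The following is a minimal Python sketch, not a Talend tool: it assumes the timestamp layout shown in the excerpt above, and the function names find_heartbeat_errors and gaps_in_minutes are made up for illustration.

```python
import re
from datetime import datetime

# Timestamp prefix as it appears in the karaf.log excerpt,
# e.g. "2024-08-01T10:16:58,638 | ERROR | ..." (assumed to be stable).
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}),\d{3} \|")

# The two error signatures named in this article.
SIGNATURES = ("GOAWAY received", "header parser received no bytes")


def find_heartbeat_errors(lines):
    """Return a list of (timestamp, signature) pairs found in the log lines.

    'Caused by' lines carry no timestamp of their own, so each match is
    attributed to the most recent timestamped line. Repeated matches for
    the same timestamp and signature are counted once.
    """
    hits = []
    last_seen = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if match:
            last_seen = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
        if last_seen is None:
            continue
        for signature in SIGNATURES:
            hit = (last_seen, signature)
            if signature in line and (not hits or hits[-1] != hit):
                hits.append(hit)
    return hits


def gaps_in_minutes(hits):
    """Minutes between consecutive errors, rounded to the nearest minute."""
    times = [timestamp for timestamp, _ in hits]
    return [round((b - a).total_seconds() / 60) for a, b in zip(times, times[1:])]
```

Passing the lines of a karaf.log file to find_heartbeat_errors and the result to gaps_in_minutes should give values close to 60 if the failures follow the hourly pattern described here.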

 

Resolution

  1. Stop the Remote Engine.
  2. Edit the <RemoteEngineInstallation>/etc/org.talend.ipaas.rt.pairing.agent.cfg file and replace the line
    heartbeat.interval=60 (the default)
    with
    heartbeat.interval=65
  3. In the file <RemoteEngineInstallation>/etc/system.properties, remove or comment out the line below (if it exists):
    org.apache.cxf.transport.http.forceVersion=1.1
  4. Restart the Remote Engine.
Do not set the heartbeat interval to more than 180 seconds (3 minutes). If Talend Cloud does not receive a heartbeat from the Remote Engine for more than 3 minutes, it shows the Remote Engine's status as unavailable.
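
For anyone who has to apply steps 2 and 3 on several engines, the two file edits can be scripted. The following Python sketch is only an illustration under stated assumptions: the helper names (set_heartbeat_interval, comment_out_force_version, apply_workaround) are hypothetical, the install directory is whatever <RemoteEngineInstallation> is on your machine, and the Remote Engine is assumed to be stopped before it runs and restarted afterwards.

```python
import re
from pathlib import Path


def set_heartbeat_interval(cfg_text, seconds=65):
    """Replace the heartbeat.interval line, or append one if it is missing."""
    new_line = f"heartbeat.interval={seconds}"
    pattern = re.compile(r"^[ \t]*heartbeat\.interval[ \t]*=[^\r\n]*", re.MULTILINE)
    if pattern.search(cfg_text):
        return pattern.sub(new_line, cfg_text)
    separator = "" if cfg_text.endswith("\n") or not cfg_text else "\n"
    return f"{cfg_text}{separator}{new_line}\n"


def comment_out_force_version(props_text):
    """Comment out org.apache.cxf.transport.http.forceVersion if it is set."""
    pattern = re.compile(
        r"^([ \t]*)(org\.apache\.cxf\.transport\.http\.forceVersion[ \t]*=)",
        re.MULTILINE,
    )
    return pattern.sub(r"\1# \2", props_text)


def apply_workaround(install_dir, seconds=65):
    """Apply both edits under <install_dir>/etc (engine assumed stopped)."""
    etc = Path(install_dir) / "etc"
    cfg = etc / "org.talend.ipaas.rt.pairing.agent.cfg"
    # latin-1 round-trips every byte, so unrelated content is left untouched.
    cfg.write_text(
        set_heartbeat_interval(cfg.read_text(encoding="latin-1"), seconds),
        encoding="latin-1",
    )
    props = etc / "system.properties"
    if props.exists():
        props.write_text(
            comment_out_force_version(props.read_text(encoding="latin-1")),
            encoding="latin-1",
        )
```

Back up both files before running anything like this; the sketch overwrites them in place.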

 

Cause

The connection to the server pair.xx.cloud.talend.com is kept open for 60 seconds. Setting heartbeat.interval to more than 60 seconds allows this connection to close between heartbeats, which prevents the "GOAWAY received" error message.
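
Taken together with the note in the Resolution section, this leaves a window of more than 60 and at most 180 seconds for the interval. The following Python sketch encodes just those two limits as stated in this article; the function name check_heartbeat_interval and the returned messages are made up for illustration.

```python
# Limits taken from this article, not from any Talend API.
KEEPALIVE_SECONDS = 60           # how long the pairing connection is kept
UNAVAILABLE_AFTER_SECONDS = 180  # engine shown as unavailable beyond this


def check_heartbeat_interval(seconds):
    """Classify a heartbeat.interval value against the article's two limits."""
    if seconds <= KEEPALIVE_SECONDS:
        return "too low: the 60-second connection may still be in use"
    if seconds > UNAVAILABLE_AFTER_SECONDS:
        return "too high: Talend Cloud may show the engine as unavailable"
    return "ok"
```

With these limits, the suggested value of 65 and the 90 and 120 second values mentioned in the comments all fall inside the window.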

 

Related Content

For more information about the heartbeat concept and the heartbeat interval, refer to the documentation:

Monitoring-remote-engine-health

 

Internal Investigation ID(s)

 Internal Jira Case ID: TMC-4122

 

Environment

Talend Cloud  

Comments
Tsubasanut
Partner - Contributor

Hi there.

This didn't help me. The heartbeat interval is now 90 seconds (I can verify that in karaf.log), but I'm still getting the error roughly once an hour.

I do not have the setting

org.apache.cxf.transport.http.forceVersion=1.1

in system.properties. 

From googling, I understand this problem occurs when the server and the agent use different HTTP versions. That probably means my Engine is running HTTP 1.1 by default.
I tried calling the server in a browser with something like TalendEngineServer:port, and the dev console shows HTTP 1.1 is being used.
I'm not versed in configuring local instances. Any recommendations on what I should change in the settings?

YSavitski
Contributor

Hi there,

We have a similar issue while using the Remote Engine. 

We raised heartbeat.interval to 120 seconds, and it still didn't solve the problem.

Please help us with that.

Regards,
Yauheni

Xiaodi_Shi
Support

Hello @YSavitski 

Thanks for leaving your feedback here. I need a little more information to address your issue.

Which Remote Engine version are you using? What is the impact on running jobs (task/plan execution at the moment)? Is the Talend server down?

This looks like a network connectivity issue between the Remote Engine and Talend Cloud. The GOAWAY message suggests that the connection was closed unexpectedly, often due to network instability, firewall restrictions, or TLS/SSL issues. Please try the suggested solutions below and continue monitoring the running RE. If they do not help, please download a new RE, clean up the old installation, reinstall, and try again.

  • For the Remote Engine "GOAWAY received" error messages, we suggest increasing the socket timeout to 5 minutes.
    To do so, follow these steps:
    Stop the RE.
    Add org.talend.remote.client.socketReadTimeout=30000 to system.properties in the RE/etc folder.
    Start the RE.
  • If your network is slow, the connection might time out before the heartbeat completes.
    In that case, increase the timeout in the Remote Engine configuration file:
    Open <install_dir>/etc/org.talend.remote.jobserver.server.cfg
    Update the heartbeat timeout setting:
    heartbeat.delay = 300
    Restart the RE services.
  • If the Remote Engine uses a self-signed certificate, ensure the CA certificate is trusted.
    Test the connection using cURL:
    curl -v https://pair.us.cloud.talend.com
    If there is an SSL/TLS error, update your certificates.

We could locate your issue based on the reported RE karaf error log. You can create a topic on the Qlik forum here for further investigation.

Best regards

Sabrina

 

YSavitski
Contributor

Hi @Xiaodi_Shi 

Remote Engine version: 2.13.8 build 276

What is the impact on the running jobs (task/plan execution for the moment)?

- We still struggle with periodic massive outages of our jobs, with this error reason:
Failure details:
Failure type: UNDEFINED_ERROR
Failure message: Engine has been restarted


We have checked all available logs on the Remote Engine and did not see any exceptions other than the ones mentioned in this thread.


Solution, continue monitoring the RE running, if no success, please Download, clean up and re install a new RE and try again

This is what we're doing continuously (every day now) while trying to figure out and solve the "Engine has been restarted" issue.

I will try increasing the mentioned timeouts and check the results.

Thank you

Regards,
Yauheni

MBourassa1682971203
Contributor

Hi, I have the same issue with my Remote Engine.

Periodically, I see: IOException invoking https://pair.eu.cloud.talend.com/v2/engine/3e0f465c-4f29-4be2-980e-bdf66e6adb64/heartbeat: /10.128.70.72:54015: GOAWAY received

I use Remote Engine 2.13.9.

I have a job that calls an API. Last night, I got this error:

Caused by: java.io.IOException: IOException invoking https://api.digitalrecruiters.com/public/v1/job-applications/detailed?limit=100&page=32: /10.128.70.72:65463: GOAWAY received

 

Shicong_Hong
Support

@MBourassa1682971203 this is a known defect. If the workaround mentioned in the article doesn't work, please wait for the next version of Remote Engine that contains a fix.

Regards

Shicong

MBourassa1682971203
Contributor

Hi,

Thank you. I read about this issue on the internet, and I understand it is related to Java?

https://bugs.openjdk.org/browse/JDK-8335181

What happens if I run my Remote Engine with Java 24?

zulu24.28.85-ca-fx-jdk24.0.0-win_x64

Thank you

 

 

 

Shicong_Hong
Support

@MBourassa1682971203 Java 17 is the supported Java environment for executing Talend artifacts. If you install another Java version, you may get a java.lang.UnsupportedClassVersionError for some Job executions.

Supported Java versions for running Talend artifacts

Regards

Shicong