Stitch - Data Loading Fails When Using SSH Tunnel to Amazon Redshift Destination
We are experiencing an issue when loading data into Amazon Redshift as the destination in Stitch, specifically when using an SSH tunnel for connection security.
Issue Details:
The only change we made was switching the destination connection from a publicly accessible Redshift instance to one accessed through an SSH tunnel.
Without the SSH tunnel:
Test connection: ✅ Successful
Data loading: ✅ Successful
However, this setup is a security risk: the database must not be publicly accessible.
With the SSH tunnel:
Test connection: ✅ Successful
Data loading: ❌ Fails with an S3CurlException error
The error message indicates that the Redshift cluster times out connecting to S3 (s3.eu-central-1.amazonaws.com, port 443) when it tries to read the manifest file staged there for the load.
Error Message:
ERROR: Problem reading manifest file - S3CurlException: Failed to connect to s3.eu-central-1.amazonaws.com port 443 after 50001 ms: Timeout was reached, CurlError 28, multiCurlError 0, CanRetry 1, UserError 0
Detail:
-----------------------------------------------
error: Problem reading manifest file - S3CurlException: Failed to connect to s3.eu-central-1.amazonaws.com port 443 after 50001 ms: Timeout was reached, CurlError 28, multiCurlError 0, CanRetry 1, UserError 0
code: 9001
context: s3://com-stitchdata-prod-loaders-staging-eu-central-1/ip-10-5-172-173-28081-8ae1151e-6107-4c74-abee-56aa3d18f04f/clients/208203/manifest_8776950644449573834.json
query: 35136095
location: s3_utility.cpp:387
process: padbmaster [pid=1073922562]
-----------------------------------------------
Additional Context:
The connection to S3 appears to be failing only when the SSH tunnel is enabled, even though the test connection succeeds.
The issue does not occur when Redshift is publicly accessible.
We need to resolve this urgently, as keeping the database publicly accessible is not an acceptable security posture.
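To help narrow this down, below is a minimal sketch of a reachability check for the S3 endpoint named in the error. The host and port are taken from the error message; the function name `can_connect` and the 10-second timeout are our own illustrative choices. It assumes the script is run from a host that shares the Redshift cluster's subnet and route table, which is only an approximation of the cluster's own network path and may not reproduce the failure exactly.

```python
import socket

# Endpoint and port are taken from the error message above.
S3_HOST = "s3.eu-central-1.amazonaws.com"
S3_PORT = 443


def can_connect(host: str, port: int, timeout: float = 10.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS resolution failures.
        return False


if __name__ == "__main__":
    reachable = can_connect(S3_HOST, S3_PORT)
    print(f"{S3_HOST}:{S3_PORT} reachable: {reachable}")
```

If this check also times out from inside the private subnet, that would suggest the subnet has no outbound route to S3, rather than a problem with the SSH tunnel itself.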
Questions for Support:
Why does data loading to Redshift fail when using an SSH tunnel while the test connection still succeeds?
Are there additional configurations required for Stitch to properly load data when using an SSH tunnel?
Could there be a network routing issue with the SSH tunnel preventing access to AWS S3?
Are there specific firewall rules, allowlists, or additional settings required for SSH tunnel-based connections to work with Stitch?
Request for Resolution:
We need guidance on how to configure Stitch properly to support SSH tunnel connections while ensuring data loading works as expected.
We would appreciate any assistance from the Qlik support team!