Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
To configure a proxy for a Talend Data Catalog server on Linux:
JAVA_OPTS="-D\"catalina.data=${CATALINA_DATA}\" -D\"connector.port=${M_CONNECTOR_PORT}\" -D\"shutdown.port=${M_SHUTDOWN_PORT}\" -D\"server.fqdn=${M_SERVER_FQDN}\" -Dlog4j2.disableJmx=true -Djava.awt.headless=true -Dhttp.proxyHost=<IP or HostName> -Dhttp.proxyPort=3128 -Dhttps.proxyHost= <IP or HostName> -Dhttps.proxyPort=3128"
$ps -ef | grep java
Note the effect using a proxy will have if you have SAML/SSO enabled. See Configuring the SAML server for details.
This article describes how to connect Talend Studio to Starburst (Galaxy or Enterprise) using the Trino JDBC driver. It covers driver download, JDBC connection configuration inside Talend Studio, and load script patterns for querying Starburst catalogs. It is intended for use alongside Qlik Sense Enterprise on Windows (client-managed) environments where Talend Studio is used as the ETL layer to stage or transform data before it is consumed by Qlik.
Content
Before configuring the connection, ensure the following requirements are met:
Talend Studio connects to Starburst using the Trino JDBC driver. Unlike the ODBC driver used for Qlik Sense direct connections, no platform installer is required. The driver is a single JAR file.
https://repo1.maven.org/maven2/io/trino/trino-jdbc/479/trino-jdbc-479.jar
C:\Drivers\Starburst\trino-jdbc-479.jar
Check the Starburst documentation for the recommended JDBC driver version that matches your Starburst Galaxy or Enterprise deployment. The version referenced above (479) is the version used in the Data Nova Conference demo environment.
Talend Studio uses a generic JDBC database connection type to connect to Starburst. The steps below apply to both Starburst Galaxy and Starburst Enterprise. The only difference is the JDBC URL and credentials.
Complete the connection form with the following parameters:
|
Parameter |
Value |
|
JDBC URL (Galaxy) |
jdbc:trino://<cluster>.galaxy.starburst.io:443/<catalog> |
|
JDBC URL (Enterprise) |
jdbc:trino://<coordinator-host>:8443/<catalog> |
|
Driver JAR |
trino-jdbc-479.jar (full local path to the JAR downloaded in Download the Trino JDBC Driver) |
|
Driver Class |
io.trino.jdbc.TrinoDriver |
|
User ID |
Service account username (e.g. starburst_service) |
|
Password |
Service account password or personal access token |
|
Schema |
Target schema within the catalog (e.g. transactional_data) |
Example JDBC URL for the Data Nova Conference demo environment (Starburst Galaxy):
jdbc:trino://stardock-qlik.starport.starburst.io:443/transactional_data
If the connection test fails with an SSL error when connecting to Starburst Galaxy, ensure that Java's certificate store trusts the public CA that signed the Galaxy endpoint certificate. For Starburst Enterprise with a self-signed certificate, import the certificate into the JRE keystore using keytool.
The table below summarises the JDBC parameters that differ between Starburst Galaxy and Starburst Enterprise:
|
Parameter |
Starburst Galaxy |
Starburst Enterprise |
|
Host |
<cluster>.galaxy.starburst.io |
<coordinator-hostname-or-ip> |
|
Port |
443 (HTTPS) |
8443 (HTTPS) or 8080 (HTTP) |
|
SSL |
Always enabled |
Enabled for HTTPS deployments |
|
Authentication |
Username / password (LDAP) or OAuth2 |
LDAP, Kerberos, or password |
|
JDBC URL pattern |
jdbc:trino://<cluster>.galaxy.starburst.io:443/<catalog> |
jdbc:trino://<host>:8443/<catalog> |
The Trino JDBC driver is the same JAR for both Galaxy and Enterprise. Only the JDBC URL, port, and authentication settings differ.
Starburst organises data in a three-level hierarchy: catalog > schema > table. The JDBC URL specifies the default catalog. The schema is set in the Talend connection configuration and can be overridden at the component level.
Include the catalog name as the path component of the JDBC URL:
jdbc:trino://stardock-qlik.starport.starburst.io:443/transactional_data
In this example, transactional_data is the catalog. All queries run through this connection will default to that catalog unless a three-part table name is used.
When querying tables across multiple catalogs within the same Talend job, use three-part table names in tDBInput or tDBRow SQL queries:
SELECT
customer_id,
customer_name,
region,
created_date
FROM transactional_data.retail.customers
WHERE created_date >= DATE '2024-01-01'
Always use three-part table names (catalog.schema.table) in SQL queries inside Talend components to avoid ambiguity, even when a default catalog is set in the JDBC URL.
Starburst uses the Trino query engine. The following SQL conventions differ from standard ANSI SQL and must be observed when writing SQL inside Talend components (tDBInput, tDBRow, tDBOutput):
|
Feature |
Trino Syntax |
Avoid |
|
Date literal |
DATE '2024-01-01' |
TO_DATE(), CONVERT() |
|
String concat |
CONCAT(a, b) or a || b |
a + b |
|
Top N rows |
LIMIT n |
TOP n, ROWNUM |
|
Table names |
catalog.schema.table |
schema.table (ambiguous) |
|
Case sensitivity |
Double-quote identifiers if mixed case |
Unquoted mixed-case names |
|
Write operations |
INSERT INTO and CREATE TABLE AS SELECT supported; UPDATE and DELETE require Delta or Iceberg connector |
Direct UPDATE/DELETE on Hive connector |
The following patterns cover common scenarios for reading from and writing to Starburst within a Talend Studio job.
Use Talend context variables to pass date parameters into SQL predicates for incremental data loads:
Define a context variable:
context.lastLoadDate = 2024-01-01
Reference it in the tDBInput query:
SELECT *
FROM transactional_data.retail.transactions
WHERE transaction_date >= DATE '" + context.lastLoadDate + "'
For production incremental loads, persist the last successful load date to a Talend context file or a control table in Starburst so that the value survives job restarts.
Environment
This article describes how to connect Qlik Sense Enterprise on Windows (client-managed) to Starburst Galaxy using the Starburst ODBC driver. It covers driver installation, data connection configuration, and load script patterns for querying Starburst Galaxy catalogs via Trino SQL.
Content
Before configuring the connection, ensure the following requirements are met on the Qlik Sense server or any node that will execute data load tasks:
Starburst provides a dedicated ODBC driver for Galaxy that is based on the Simba/Magnitude Trino ODBC driver. Use only the Galaxy-specific build to ensure compatibility with Galaxy authentication and TLS settings.
Install the 64-bit driver even if the Qlik Sense process appears as 32-bit. Qlik Sense on Windows runs as a 64-bit process.
A System DSN makes the driver available to all Windows services, including the Qlik Sense Engine and Scheduler services.
|
Setting |
Value |
|
DSN Name |
StarburstGalaxy (used in the Qlik connection string) |
|
Host |
<your-cluster>.galaxy.starburst.io |
|
Port |
443 |
|
Catalog |
Leave blank (set per query or in the Qlik load script) |
|
Authentication |
LDAP (username/password) or OAuth2 (token) |
|
SSL |
Enabled (required for Galaxy) |
|
SSL Certificate |
Use the system certificate store |
If the test connection fails with a TLS error, ensure that the Starburst Galaxy root certificate is trusted by the Windows certificate store on the Qlik Sense Enterprise on Windows server.
The connection name Data Nova Conference:Starburst Galaxy is the canonical name used in all Qlik load scripts for this integration. Changing it requires updating all dependent scripts.
The following patterns cover common scenarios for loading data from Starburst Galaxy into Qlik Sense Enterprise on Windows.
Establish the Connection
Always use the following statement at the top of the load script tab that accesses Starburst Galaxy:
LIB CONNECT TO 'Data Nova Conference:Starburst Galaxy';
Basic SQL SELECT
Use three-part table names (catalog.schema.table) in all Trino SQL statements:
Customers:
SQL SELECT
customer_id,
customer_name,
region,
created_date
FROM burstbank.retail.customers
WHERE created_date >= DATE '2024-01-01';
Joining Tables
Perform joins in SQL rather than in the Qlik data model to reduce data transfer volume:
Orders:
SQL SELECT
o.order_id,
o.customer_id,
o.order_date,
o.total_amount,
c.customer_name,
c.region
FROM burstbank.retail.orders o
JOIN burstbank.retail.customers c ON o.customer_id = c.customer_id;
Incremental Load
Use Qlik variables to pass date parameters into the SQL predicate for incremental data loads:
LET vLastLoadDate = Date(Today()-1, 'YYYY-MM-DD');
Transactions:
SQL SELECT *
FROM burstbank.retail.transactions
WHERE transaction_date >= DATE '$(vLastLoadDate)';
Starburst Galaxy uses the Trino query engine. The following SQL conventions differ from standard ANSI SQL and must be observed:
|
Feature |
Trino Syntax |
Avoid |
|
Date literal |
DATE '2024-01-01' |
TO_DATE(), CONVERT() |
|
String concat |
CONCAT(a, b) or a || b |
a + b |
|
Top N rows |
LIMIT n |
TOP n, ROWNUM |
|
Table names |
catalog.schema.table |
schema.table (ambiguous) |
|
Case sensitivity |
Double-quote identifiers if mixed case |
Unquoted mixed-case names |
Environment
This article describes how to connect Qlik Sense Enterprise on Windows (client-managed) to Starburst Enterprise (self-managed, on-premises or private cloud) using the Starburst ODBC driver. It covers driver installation, data connection configuration, and load script patterns for querying Starburst Enterprise catalogs via Trino SQL.
Content
Before configuring the connection, ensure the following requirements are met on the Qlik Sense server or any node that will execute data load tasks:
Starburst provides a dedicated ODBC driver for Starburst Enterprise based on the Simba/Magnitude Trino ODBC driver. Download the Enterprise-specific build from your Starburst Enterprise distribution. Do not use the Galaxy driver.
Install the 64-bit driver. Qlik Sense on Windows runs as a 64-bit process regardless of the label shown in Windows Task Manager.
A System DSN makes the driver available to all Windows services, including the Qlik Sense Engine and Scheduler services.
|
Setting |
Value |
|
DSN Name |
StarburstEnterprise (used in the Qlik connection string) |
|
Host |
<coordinator-hostname-or-ip> |
|
Port |
8443 (HTTPS) or 8080 (HTTP — not recommended for production) |
|
Catalog |
Leave blank (set per query or in the Qlik load script) |
|
Authentication |
LDAP, Kerberos, or Password (depends on Starburst Enterprise configuration) |
|
SSL |
Enabled for HTTPS deployments; disabled for HTTP-only test environments |
|
SSL Certificate |
Path to the PEM or PFX certificate file, or use the system certificate store if the CA is trusted |
If Kerberos authentication is required, ensure the Qlik Sense service account has a valid Kerberos ticket and that the Kerberos configuration file (krb5.ini) is present at C:\Windows\krb5.ini on the Qlik Sense server.
Unlike Starburst Galaxy, the connection name for Starburst Enterprise deployments can be customised to match your environment. Ensure the name is consistent across all load scripts that use this connection.
The following patterns cover common scenarios for loading data from Starburst Enterprise into Qlik Sense. The patterns are identical to those used with Starburst Galaxy; the only difference is the connection name in the LIB CONNECT TO statement.
Establish the Connection
Replace the connection name with the one configured in Create a Data Connection in Qlik Sense:
LIB CONNECT TO 'Starburst Enterprise:Production';
Basic SQL SELECT
Use three-part table names (catalog.schema.table) in all Trino SQL statements:
Customers:
SQL SELECT
customer_id,
customer_name,
region,
created_date
FROM hive.sales.customers
WHERE created_date >= DATE '2024-01-01';
Joining Tables
Perform joins in SQL rather than in the Qlik data model to reduce data transfer volume:
Orders:
SQL SELECT
o.order_id,
o.customer_id,
o.order_date,
o.total_amount,
c.customer_name,
c.region
FROM hive.sales.orders o
JOIN hive.sales.customers c ON o.customer_id = c.customer_id;
Use Qlik variables to pass date parameters into the SQL predicate for incremental data loads:
LET vLastLoadDate = Date(Today()-1, 'YYYY-MM-DD');
Transactions:
SQL SELECT *
FROM hive.sales.transactions
WHERE transaction_date >= DATE '$(vLastLoadDate)';
Starburst Enterprise uses the Trino query engine. The same Trino SQL conventions apply as for Starburst Galaxy:
|
Feature |
Trino Syntax |
Avoid |
|
Date literal |
DATE '2024-01-01' |
TO_DATE(), CONVERT() |
|
String concat |
CONCAT(a, b) or a || b |
a + b |
|
Top N rows |
LIMIT n |
TOP n, ROWNUM |
|
Table names |
catalog.schema.table |
schema.table (ambiguous) |
|
Case sensitivity |
Double-quote identifiers if mixed case |
Unquoted mixed-case names |
The table below summarises the key differences between the two deployment models relevant to a Qlik Sense integration:
|
Area |
Starburst Galaxy |
Starburst Enterprise |
|
Hosting |
Fully managed SaaS (Starburst-hosted) |
Self-managed (on-premises or private cloud) |
|
Default port |
443 (HTTPS) |
8443 (HTTPS) or 8080 (HTTP) |
|
Authentication |
LDAP or OAuth2 |
LDAP, Kerberos, password, or custom |
|
ODBC driver |
Galaxy-specific build |
Enterprise-specific build |
|
Connection name |
Data Nova Conference:Starburst Galaxy (canonical) |
Defined by the administrator per deployment |
|
TLS certificate |
Signed by a public CA — trusted automatically |
May require manual certificate import |
|
Catalog examples |
burstbank.retail.customers |
hive.sales.customers |
Environment
This article describes how to connect Qlik Cloud Analytics to Starburst Enterprise (self-managed, on-premises, or private cloud) using a Qlik Data Gateway - Direct Access and the Starburst ODBC driver. It covers gateway installation and registration, ODBC driver and DSN configuration on the gateway host, TLS certificate handling for self-managed deployments, and data connection setup in Qlik Cloud. It is intended for Qlik Cloud tenants that need to query Starburst Enterprise catalogs directly from Qlik Cloud apps and scripts.
Content
Qlik Cloud Analytics is a fully managed SaaS platform. Because Starburst Enterprise runs within a customer-managed network, a Qlik Data Gateway - Direct Access must be deployed within the same network as, or with routable access to, the Starburst Enterprise coordinator node. The gateway acts as a secure proxy, relaying data load requests from Qlik Cloud to Starburst Enterprise over the ODBC connection configured on the gateway host.
The connection flow will look as follows:
Qlik Cloud tenant → Qlik Data Gateway (Direct Access) → Starburst ODBC DSN → Starburst Enterprise coordinator
The gateway host is a Windows or Linux server that you manage. It must have outbound HTTPS access to both the Qlik Cloud tenant (port 443) and the Starburst Enterprise coordinator (port 8443 for HTTPS, or 8080 for HTTP). The gateway is typically deployed inside the same private network as the Starburst Enterprise cluster.
Before beginning, ensure the following requirements are met:
A Qlik Data Gateway - Direct Access is required to connect Qlik Cloud Analytics to Starburst Enterprise. Unlike Qlik Sense on Windows (client-managed), Qlik Cloud does not execute data load scripts directly. All connections to on-premises or privately hosted data sources are routed through the gateway.
A Qlik Data Gateway - Direct Access must be installed and registered to your Qlik Cloud tenant before a Starburst ODBC data connection can be created or used. Attempting to create the connection without a registered gateway will result in an error.
Because Starburst Enterprise runs inside a customer-managed network, the gateway host is typically deployed within that same network. This ensures the gateway can reach the Starburst Enterprise coordinator without requiring firewall rules that expose the coordinator to the public internet.
The gateway must be installed on a host that:
blockquote For high availability, install the gateway on two or more hosts within the Starburst Enterprise network and register all of them to the same Qlik Cloud tenant. Qlik Cloud will distribute data load requests across available gateway instances.
The Starburst ODBC driver must be installed on the gateway host, not on the Qlik Cloud infrastructure. Qlik Cloud reads data through the gateway, so the driver must be present and configured where the gateway is running.
Use the Starburst Enterprise-specific ODBC driver build, not the Starburst Galaxy driver.
Install the 64-bit driver on the gateway host. The Qlik Data Gateway - Direct Access runs as a 64-bit process and requires the 64-bit ODBC driver. Use ODBC Data Source Administrator (64-bit) for all DSN configuration.
Starburst Enterprise deployments commonly use HTTPS with a certificate signed by a private or internal CA. Unlike Starburst Galaxy, whose certificate is signed by a publicly trusted CA, the Starburst Enterprise certificate chain may not be automatically trusted by Windows. The certificate must be imported into the Windows certificate store on the gateway host before the ODBC DSN can establish a TLS connection.
If the Starburst Enterprise deployment uses a certificate signed by an intermediate CA, import the full chain: the root CA certificate and any intermediate CA certificates. Importing only the leaf (server) certificate is not sufficient for TLS validation.
After importing the certificate, verify that the gateway host trusts the Starburst Enterprise endpoint by opening a browser on the gateway host and navigating to the coordinator URL (such as https://<coordinator-host>:8443). If the page loads without a certificate warning, the certificate chain is trusted correctly.
This step is only required for HTTPS deployments with private or self-signed certificates. It is not applicable for Starburst Enterprise using a certificate signed by a publicly trusted CA (such as Let's Encrypt, DigiCert).
A System DSN must be created on the gateway host. This is the DSN that Qlik Cloud will reference when the data connection is used in a load script or app.
Configure the DSN with the following settings:
|
Setting |
Value |
|
DSN Name |
StarburstEnterprise (must match the name used when creating the Qlik Cloud data connection) |
|
Host |
<coordinator-hostname-or-ip> |
|
Port |
8443 (HTTPS) or 8080 (HTTP — not recommended for production) |
|
Catalog |
Leave blank (set per query or in the Qlik load script) |
|
Authentication |
LDAP, Kerberos, or password (depends on Starburst Enterprise configuration) |
|
SSL |
Enabled for HTTPS deployments; disabled for HTTP-only test environments |
|
SSL Certificate |
Use the system certificate store (after completing Configure TLS Certificate Trust on the Gateway Host) |
Test the DSN connection while logged in as the same Windows service account that will run the Qlik Data Gateway service. DSN access permissions are per-user for User DSNs and per-machine for System DSNs. Always use System DSN for service accounts.
If Kerberos authentication is required, ensure the Qlik Data Gateway service account has a valid Kerberos principal and that the Kerberos configuration file (krb5.ini) is present at C:\Windows\krb5.ini on the gateway host before configuring the DSN.
The Qlik Data Gateway - Direct Access installer is downloaded from the Qlik Cloud Administration Center and registers the gateway to your tenant during installation. The gateway itself is product-agnostic, meaning the installation steps are identical for Starburst Galaxy and Starburst Enterprise connections.
The gateway service runs under a Windows service account (or as a systemd service on Linux). Ensure this account has Read access to the System DSN registry keys and is granted Log on as a service rights on Windows. For Kerberos authentication, the service account must also have a valid Kerberos ticket.
Once the gateway is registered and the ODBC DSN is configured on the gateway host, create the ODBC data connection in Qlik Cloud.
|
Field |
Value |
|
Data gateway |
Select the Direct Access Gateway registered in Install and Register the Qlik Data Gateway - Direct Access |
|
DSN |
StarburstEnterprise (the System DSN configured on the gateway host in Configure an ODBC DSN on the Gateway Host) |
|
Username |
Starburst Enterprise user account username (leave blank for Kerberos SSO) |
|
Password |
Starburst Enterprise user account password (leave blank for Kerberos SSO) |
|
Connection name |
Administrator-defined (e.g. Starburst Enterprise:Production) |
Unlike Starburst Galaxy, the connection name for Starburst Enterprise is administrator-defined and can reflect the deployment environment (e.g. Starburst Enterprise:Production, Starburst Enterprise:Staging). Ensure the name is used consistently in all load scripts that reference this connection.
Load scripts in Qlik Cloud apps use the same patterns as Qlik Sense on Windows. The LIB CONNECT TO connection name must match the data connection created in Create a Data Connection in Qlik Cloud.
Replace the connection name with the one configured in Create a Data Connection in Qlik Cloud:
LIB CONNECT TO 'Starburst Enterprise:Production';
Use three-part table names (catalog.schema.table) in all Trino SQL statements:
Customers:
SQL SELECT
customer_id,
customer_name,
region,
created_date
FROM hive.sales.customers
WHERE created_date >= DATE '2024-01-01';
Perform joins in SQL rather than in the Qlik data model to reduce data transfer volume through the gateway:
Orders:
SQL SELECT
o.order_id,
o.customer_id,
o.order_date,
o.total_amount,
c.customer_name,
c.region
FROM hive.sales.orders o
JOIN hive.sales.customers c ON o.customer_id = c.customer_id;
Use Qlik variables to pass date parameters into the SQL predicate for incremental data loads:
LET vLastLoadDate = Date(Today()-1, 'YYYY-MM-DD');
Transactions:
SQL SELECT *
FROM hive.sales.transactions
WHERE transaction_date >= DATE '$(vLastLoadDate)';
Rename non-key fields that share names across tables by prefixing the table name. Island tables should be commented out:
Orders:
SQL SELECT
order_id,
customer_id, // Key — keep as-is
order_date AS Order_order_date, // Renamed to avoid synthetic key
total_amount AS Order_total_amount
FROM hive.sales.orders;
In Qlik Cloud, scheduled reloads run through the gateway. Ensure the gateway service is running and the gateway host has network access to the Starburst Enterprise coordinator at the time scheduled reloads are triggered.
Starburst Enterprise uses the Trino query engine. The following SQL conventions differ from standard ANSI SQL and must be observed in all Qlik load script SQL statements:
|
Feature |
Trino Syntax |
Avoid |
|
Date literal |
DATE '2024-01-01' |
TO_DATE(), CONVERT() |
|
String concat |
CONCAT(a, b) or a || b |
a + b |
|
Top N rows |
LIMIT n |
TOP n, ROWNUM |
|
Table names |
catalog.schema.table |
schema.table (ambiguous) |
|
Case sensitivity |
Double-quote identifiers if mixed case |
Unquoted mixed-case names |
The table below summarises the key differences in the Qlik Cloud connection setup between Starburst Enterprise and Starburst Galaxy:
|
Area |
Starburst Enterprise |
Starburst Galaxy |
|
Coordinator host |
Customer-managed (on-premises or private cloud) |
Starburst-managed SaaS |
|
Default port |
8443 (HTTPS) or 8080 (HTTP) |
443 (HTTPS) |
|
ODBC driver build |
Enterprise-specific build from Starburst portal |
Galaxy-specific build from galaxy.starburst.io |
|
TLS certificate |
May require manual import into Windows certificate store |
Signed by public CA — trusted automatically |
|
Authentication |
LDAP, Kerberos, or password |
LDAP or OAuth2 |
|
Gateway network placement |
Deploy inside the Starburst Enterprise private network |
Deploy with outbound access to port 443 |
|
Connection name |
Administrator-defined (e.g. Starburst Enterprise:Production) |
Data Nova Conference:Starburst Galaxy (canonical) |
|
Catalog examples |
hive.sales.customers |
burstbank.retail.customers |
|
Load script syntax |
Identical — LIB CONNECT TO + Trino SQL SELECT |
Identical — LIB CONNECT TO + Trino SQL SELECT |
Environment
This article describes how to connect Qlik Cloud Analytics to Starburst Galaxy using a Qlik Data Gateway - Direct Access and the Starburst ODBC driver. It covers gateway installation and registration, ODBC driver and DSN configuration on the gateway host, and data connection setup in Qlik Cloud. It is intended for Qlik Cloud tenants that need to query Starburst Galaxy catalogs directly from Qlik Cloud apps and scripts.
Content
Qlik Cloud Analytics is a fully managed SaaS platform. Because Starburst Galaxy is not directly accessible from the Qlik Cloud infrastructure without network configuration, a Qlik Data Gateway - Direct Access must be deployed within the same network as, or with routable access to, the Starburst Galaxy cluster endpoint. The gateway acts as a secure proxy, relaying data load requests from Qlik Cloud to Starburst Galaxy over the ODBC connection configured on the gateway host.
The connection flow will look as follows:
Qlik Cloud tenant → Qlik Data Gateway (Direct Access) → Starburst ODBC DSN → Starburst Galaxy cluster
The gateway host is a Windows or Linux server that you manage. It must have outbound HTTPS access to both the Qlik Cloud tenant (port 443) and the Starburst Galaxy cluster endpoint (port 443).
Before beginning, ensure the following requirements are met:
A Qlik Data Gateway - Direct Access must be installed and registered to your Qlik Cloud tenant before a Starburst ODBC data connection can be created or used. Attempting to create the connection without a registered gateway will result in an error.
A Qlik Data Gateway - Direct Access is required to connect Qlik Cloud Analytics to Starburst Galaxy. All connections to on-premises or privately hosted data sources are routed through the gateway.
The gateway must be installed on a host that:
For high availability, install the gateway on two or more hosts and register all of them to the same Qlik Cloud tenant. Qlik Cloud will distribute data load requests across available gateway instances.
The Starburst ODBC driver must be installed on the gateway host. Qlik Cloud reads data through the gateway, so the driver must be present and configured where the gateway is running.
Install the 64-bit driver on the gateway host. The Qlik Data Gateway - Direct Access runs as a 64-bit process and requires the 64-bit ODBC driver. Use ODBC Data Source Administrator (64-bit) for all DSN configuration.
A System DSN must be created on the gateway host. This is the DSN that Qlik Cloud will reference when the data connection is used in a load script or app.
|
Setting |
Value |
|
DSN Name |
StarburstGalaxy (must match the name used when creating the Qlik Cloud data connection) |
|
Host |
<your-cluster>.galaxy.starburst.io |
|
Port |
443 |
|
Catalog |
Leave blank (set per query or in the Qlik load script) |
|
Authentication |
LDAP (username/password) or OAuth2 (token) |
|
SSL |
Enabled (required for Galaxy) |
|
SSL Certificate |
Use system certificate store |
Test the DSN connection while logged in as the same Windows service account that will run the Qlik Data Gateway service. DSN access permissions are per-user for User DSNs and per-machine for System DSNs. Always use System DSN for service accounts.
The Qlik Data Gateway - Direct Access installer is downloaded from the Qlik Cloud Administration activity center and registers the gateway to your tenant during installation.
The gateway service runs under a Windows service account (or as a systemd service on Linux). Ensure this account has Read access to the System DSN registry keys and is granted Log on as a service rights on Windows.
Once the gateway is registered and the ODBC DSN is configured on the gateway host, create the ODBC data connection in Qlik Cloud.
|
Field |
Value |
|
Data gateway |
Select the Direct Access Gateway registered in Install and Register the Qlik Data Gateway - Direct Access |
|
DSN |
StarburstGalaxy (the System DSN configured on the gateway host in Configure an ODBC DSN on the Gateway Host) |
|
Username |
Starburst Galaxy service account username |
|
Password |
Starburst Galaxy service account password or personal access token |
|
Connection name |
Data Nova Conference:Starburst Galaxy |
The connection name Data Nova Conference:Starburst Galaxy is the canonical name used in all Qlik load scripts for this integration. Changing it requires updating all dependent scripts. This name is consistent across Qlik Sense Enterprise on Windows and Qlik Cloud (this article).
Load scripts in Qlik Cloud apps use the same patterns as Qlik Sense on Windows. The LIB CONNECT TO connection name must match the data connection created in Create a Data Connection in Qlik Cloud.
Always use the following statement at the top of the load script tab that accesses Starburst Galaxy:
LIB CONNECT TO 'Data Nova Conference:Starburst Galaxy';
Use three-part table names (catalog.schema.table) in all Trino SQL statements:
Customers:
SQL SELECT
customer_id,
customer_name,
region,
created_date
FROM hive.sales.customers
WHERE created_date >= DATE '2024-01-01';
Perform joins in SQL rather than in the Qlik data model to reduce data transfer volume through the gateway:
Orders:
SQL SELECT
o.order_id,
o.customer_id,
o.order_date,
o.total_amount,
c.customer_name,
c.region
FROM hive.sales.orders o
JOIN hive.sales.customers c ON o.customer_id = c.customer_id;
Use Qlik variables to pass date parameters into the SQL predicate for incremental data loads:
LET vLastLoadDate = Date(Today()-1, 'YYYY-MM-DD');
Transactions:
SQL SELECT *
FROM hive.sales.transactions
WHERE transaction_date >= DATE '$(vLastLoadDate)';
Rename non-key fields that share names across tables by prefixing the table name. Island tables should be commented out:
Orders:
SQL SELECT
order_id,
customer_id, // Key — keep as-is
order_date AS Order_order_date, // Renamed to avoid synthetic key
total_amount AS Order_total_amount
FROM hive.sales.orders;
In Qlik Cloud, scheduled reloads run through the gateway. Ensure the gateway service is running and the gateway host has network access to Starburst Galaxy at the time scheduled reloads are triggered.
Starburst Galaxy uses the Trino query engine. The following SQL conventions differ from standard ANSI SQL and must be observed in all Qlik load script SQL statements:
|
Feature |
Trino Syntax |
Avoid |
|
Date literal |
DATE '2024-01-01' |
TO_DATE(), CONVERT() |
|
String concat |
CONCAT(a, b) or a || b |
a + b |
|
Top N rows |
LIMIT n |
TOP n, ROWNUM |
|
Table names |
catalog.schema.table |
schema.table (ambiguous) |
|
Case sensitivity |
Double-quote identifiers if mixed case |
Unquoted mixed-case names |
The table below summarises the key differences in the Starburst connection setup between Qlik Cloud Analytics and Qlik Sense Enterprise on Windows (client-managed):
|
Area |
Qlik Cloud Analytics |
Qlik Sense on Windows |
|
Deployment |
SaaS (Qlik-managed) |
Client-managed |
|
ODBC driver location |
Gateway host (not Qlik Cloud servers) |
Qlik Sense server |
|
Gateway required |
Yes — Direct Access Gateway mandatory |
No — connects directly via QMC |
|
DSN configuration |
On the gateway host, must be System DSN |
On the Qlik Sense server, must be System DSN |
|
Connection creation |
Qlik Cloud hub or Management Console |
Qlik Management Console (QMC) |
|
Connection name |
Data Nova Conference:Starburst Galaxy (canonical) |
Data Nova Conference:Starburst Galaxy (canonical) |
|
Load script syntax |
Identical — LIB CONNECT TO + SQL SELECT |
Identical — LIB CONNECT TO + SQL SELECT |
|
Scheduled reloads |
Via Qlik Cloud scheduler, executed through gateway |
Via QMC task scheduler, executed on Qlik Sense server |
Environment
NPrinting has a library of APIs that can be used to customize many native NPrinting functions outside the NPrinting Web Console.
An example of two of the more common capabilities available via NPrinting APIs are as follows
These and many other public NPrinting APIs can be found here: Qlik NPrinting API
In the Qlik Sense data load editor of your Qlik Sense app, two REST connections are required (These two REST Connectors must also be configured in the QlikView Desktop application>load where the API's are used. See Nprinting Rest API Connection through QlikView desktop)
Requirements of REST user account:
Creating REST "GET" connections
Note: Replace QlikServer3.domain.local with the name and port of your NPrinting Server
NOTE: replace domain\administrator with the domain and user name of your NPrinting service user account
Creating REST "POST" connections
Note: Replace QlikServer3.domain.local with the name and port of your NPrinting Server
NOTE: replace domain\administrator with the domain and user name of your NPrinting service user account
Ensure to enter the 'Name' Origin and 'Value' of the Qlik Sense (or QlikView) server address in your POST REST connection only.
Replace https://qlikserver1.domain.local with your Qlik sense (or QlikView) server address.
Ensure that the 'Origin' Qlik Sense or QlikView server is added as a 'Trusted Origin' on the NPrinting Server computer
NOTE: The information in this article is provided as-is and to be used at own discretion. NPrinting API usage requires developer expertise and usage therein is significant customization outside the turnkey NPrinting Web Console functionality. Depending on tool(s) used, customization(s), and/or other factors ongoing, support on the solution below may not be provided by Qlik Support.
During Change Data Capture (CDC) replication, the target Oracle database experienced a transient CPU spike.
Database performance monitoring identified that the high CPU consumption was driven by the following automated bulk query generated by Qlik Replicate:
UPDATE /*+ PARALLEL(tempview) */ ( SELECT /*+ PARALLEL("CDC"."CATB_ENTRIES_DECOUPLING") PARALLEL("QLIK_TARGET"."attrep_changes69C66930_0000001") */ ...
The underlying cause of the resource spike was the execution of this query using aggressive Oracle PARALLEL hints, which exhausted available target DB CPU/memory resources.
To prevent future resource contention and CPU/memory alerts in the target database, parallelism hints must be disabled or tuned within Qlik Replicate. This involves modifying two internal parameters across the Full Load and CDC stages: bulkUseParallel and directPathParallelLoad.
bulkUseParallel is for CDC / Change Processing and is enabled by default (true). It instructs Qlik Replicate to inject the Oracle PARALLEL hint into bulk DML statements for better target performance.
Setting it to false stops queries such as UPDATE /*+ PARALLEL(tempview) */ ... from executing in parallel, preventing CPU spikes.
This may cause a slight performance degradation during high-volume CDC processing.
directPathParallelLoad is for Full Load and enabled by default (true). It enables Direct Path loading using parallel processing only during the initial Full Load phase. It has no impact on the daily CDC.
Setting it to false proactively protects the DB during table reloads.
Will likely increase the time required to complete Full Load operations.
After recreating the Amazon EMR cluster and updating the Storage Zone configuration in Qlik Compose with the new EMR master IP, the connection validation failed with the following error:
Storage Zone connection failed
COMPOSE-E-ENGCONFAI
Java connection failed
SYS-E-GNRLERR, Required driver class not found:
com.simba.hive.jdbc41.HS2Driver
This issue caused the Storage Zone connection to fail, impacting data pipeline operations.
sudo service compose-agent stop
sudo service compose-agent start
sudo service compose-agent status
The Hive JDBC driver (HiveJDBC41.jar) required for connecting to EMR HiveServer2 was either:
Qlik Compose does not dynamically load new JDBC drivers. The driver becomes available only after restarting the Compose Agent service.
See Prerequisites | Qlik Compose.
The purpose of this post is to help you install the database drivers necessary to allow your Qlik Data Gateway to communicate with your company's servers once you have completed the Qlik Data Gateway installation itself.
If you are anything like me, perhaps you panicked a bit at the thought of installing the Qlik Data Gateway in a Linux environment. I have a lot of experience with .EXE installations in Windows environments. You know "Next – Next – Next – Finish." But an .RPM file? I had never even see that extension type before. If you were a Linux connoisseur beforehand, you probably guessed that my image for this post is an homage to the Fedora flavor of Linux. Otherwise you just thought it was an advertisement for the new "Raiders of the Lost Data" movie.
In any event, by now you have created your first Data Gateway, applied the registration key, completed the setup instructions and thankfully the command to check your Data Gateway service shows that it is running.
When you go back to the Data Gateway section of the Management Console and do a refresh your eyes fill you with happiness because your brand spanking new Data Movement Gateway shows "Connected.”
A lesser person would go celebrate right now. But you've decided to try and connect to a source before doing your happy dance. So, you create a new Data Integration project to the destination of your choice. While you will ultimately have many different data sources, let's imagine that you decide to start with a "SQL Server (Log Based)" connection, as your first source test.
You input the server connection details, but your SQL Server doesn't use a standard port for security. Finally, you find information online that you should input your server IP followed by a "comma and the port #". As an example, if your servers IP is 39.30.3.1 and your security port is 12345 you would input "39.30.3.1,12345'. Next you input the user and password credentials. Your last step is to choose the database. Easy peezy, lemon squeezey. Right?
You press the "Load databases" button but suddenly a dialog comes up telling you that the Data Gateway can't connect because it can't find a SQL Server driver.
Your heart starts beating quickly but naturally as a pro, you remain calm on the outside. Eventually you realize that whether on Windows or Linux, applications have always required drivers to communicate with servers. This is nothing new, we just got excited when we saw that connected message and thought we were done. Upon going back to the setup guide
you realize that there is in fact a link labeled "Setting up Data Gateway – Data Movement source connections."
So, you go ahead and click the link and it takes you to:
Wow, so many sources, and so many additional links to click to ensure the required drivers are in place for the sources your company will need. All the documentation is there, but I know firsthand that it can get a bit overwhelming, especially if Linux isn't your native language, which is the reason for this post.
Obviously every one of you reading this works in an environment that may require different data source connections than the others. Thus, there is no way for me to predict and help with your exact configuration. However, odds are strong that most of you likely require at least: SQL Server, Databricks, Snowflake, Postgres or MySQL, various combinations of them, or perhaps all of them.
As tedious, or imposing as it may be, I highly recommend you walk through the documentation for each data source you will need. But thanks to my buddy John Neal, I have attached a Linux shell script that can be executed to configure all 5 of those data sources for you. Given the many flairs and versions and configurations of Linux I can't ensure that it will work for everyone, but at least it is a start for those that may want to press an easy button, and those that like me may be somewhat or brand new to Linux.
If you choose to take advantage of it, understand that it is only being offered a shelp, and is not meant to replace the documentation. To utilize it you will need to do the following (Please note in my examples I have changed to the root user. If you are logged in as a normal user account, you may need to use SUDO "super user do"):
If all went well with the installation your output should look like similar to the following image that was part of my file:
It's almost time to do our happy dance, but let's hold off until we test. In my starting example I asked you to assume we wanted to test against a "SQL Server (Log Based) connection." When we left off it was because we got an error message we had no driver while trying to load the list of databases. I will try that again.
Oh no, the heart rate is going up again.
We have successfully installed the Qlik Data Gateway. We have successfully installed the required drivers. Yet, we are getting this new error message. Let's focus on our breathing and try and digest the situation. What could cause our attempt to connect to our data source to timeout? I got it.
It's likely network security. We know what we want to talk to. We know the location. We know the credentials. But our networks aren't always wide open to do the talking. Resolving your connectivity/firewall issues may or not be with your abilities and if you are like me, you may need to seek the help of your IT/Networking team.
When I reached out to my friendly IT guru, here within Qlik, he was able to help me get everything in place so that my Linux server could speak with my database servers, including all of the needed ports.
Once they were completed I was able to test and sure enough my data connection succeeded.
Whether or not you do a happy dance, as I did, I hope that this post has helped you get to that sweet smell of success. After all, someone has to be known as the amazing person who got your Qlik Data Gateway going so that others in the Data Engineering team could create all of those lights out Qlik Cloud Data Integration projects that would be feeding data in near real time to all of those wonderul analytics use cases. Hopefully with the help of the documentation and this post, that person is you my friends.
Challenge
One of the things I've long admired about the Qlik Community is their willingness to help each other through this Community site. If you are a Linux guru and are so inclined I would love to see you share other versions of the shell script that I have started. Maybe your organization is using another flair/version of Linux and you needed to make a few tweaks to my file. Maybe your organization needed Oracle added and you can tweak my file. Whatever the reason, I sure hope you will give back to the community by sharing all of those tweaks here. Who knows, your help might help them be able to do their happy dance. And we all know the world is a better place when more people do their happy dance.
Related Content
Qlik Data Gateway - Data Movement prerequisites and Limitations - https://help.qlik.com/en-US/cloud-services/Subsystems/Hub/Content/Sense_Hub/Gateways/dm-gateway-prerequisites.htm
Setting up the Data Movement gateway - https://help.qlik.com/en-US/cloud-services/Subsystems/Hub/Content/Sense_Hub/Gateways/dm-gateway-setting-up.htm
PS - I created both of the images here using a generative AI solution called MidJourney. I hope they've added to the fun of this post.
Qlik Talend Studio job fails connecting to the Data Stewardship Application in Qlik Talend Cloud with the error below:
tDataStewardshipTaskInput_1 Unable to connect to Talend Data Stewardship.
java.io.IOException: Unable to connect to Talend Data Stewardship.
Caused by: java.net.UnknownHostException: No such host is known (tds.eu.cloud.talend.com)
This error indicates that the Qlik Talend Studio job was unable to resolve the hostname to an IP address at that instant, in this case, Qlik Talend Cloud Data Stewardship.
The Data Load Editor in Qlik Sense Enterprise on Windows 2025 experiences noticeable performance issues.
Qlik Sense May 2025 SR 6 and higher releases.
A workaround is available. It is viable as long as the Qlik SAP Connector is not in use.
No service restart is required.
SUPPORT-6006
Beginning from Qlik Replicate version 2024.05, a new checkbox was added to Log Stream Staging tasks: Retrieve all source columns on UPDATE
The option is available in Task Settings (A) > Change Processing > Change Processing Tuning (B)
It is enabled (C) by default.
When Retrieve all source columns on UPDATE is enabled, it will cause any table added to the task to issue an ALTER on the table to enable supplemental logging on all columns if the source database is Oracle.
For high-transaction tables, enable Supplemental Logging on all columns during off-peak hours manually before adding them to the Qlik Replicate task.
In previous Qlik Replicate versions, supplemental logging was not required on all columns and was enabled only on Primary Key Columns. But with that new checkbox, it is required to be added to all columns.
When any new table is added to the Log Stream Staging task, Qlik Replicate issues an ALTER TABLE command to enable Supplemental Logging on all columns. This command can fail on high-transaction or busy tables in the source Oracle DB.
The information in this article is provided as-is and to be used at own discretion. Depending on tool(s) used, customization(s), and/or other factors ongoing support on the solution below may not be provided by Qlik Support.
When using Key Pair authentication and creating a new Snowflake connection, you might encounter the following error:
Illegal Argument
The provided private key file (.p8) is not in the correct format, or the key file password is invalid.
To get the actual error, install SnowSQL utility in the Linux machine where the Qlik Data Movement gateway is installed and try to connect to the same account:
snowsql -a <account> -u <username> --private-key-path <path to file/rsa_key.p8>
This will provide the exact error on why the connectivity is failing and assist in identifying which root cause applies.
Starting from Qlik Replicate versions 2024.5 and 2024.11, Microsoft SQL Server 2012 and 2014 are no longer supported. Supported SQL Server versions include 2016, 2017, 2019, and 2022. For up-to-date information, see Support Source Endpoints for your respective version.
Attempting to connect to unsupported versions, both on-premise and cloud, can result in various errors.
Examples of reported Errors:
The system view sys.column_encryption_keys is only available starting from SQL Server 2016. Attempting to query this view on earlier versions results in errors.
Reference: sys.column_encryption_keys (Microsoft Docs)
Upgrade your SQL Server instances to a supported version (2016 or later) to ensure compatibility with Qlik Replicate 2024.5 and above.
00375940, 00376089
Qlik Sense Connectors are missing from the Data source except few REST connectors.
Repair Qlik Sense with the Qlik Sense Setup file (identical version).
Encryption keys:
Encryption keys will be stored in either "C:\Users\{sense service user}\AppData\Roaming\Qlik\QwcKeys\" Or "C:\Users\{sense service user}\AppData\Roaming\Qlik\Keys\"
An error occurred / Failed to load connection error message in Qlik Sense - Server Has No Internet
After upgrading Qlik Talend Studio to patch R2025-08 or later, jobs using the tS3Connection or tS3List components fail with the error:
Exception in component tS3List_1
java.lang.IllegalStateException: Connection pool shut down
An additional change with the new SDK version is how the AWS "region" is handled. In Qlik Talend Studio R2025-07 and earlier, the Region and Endpoint field is shown as a dropdown below.
Using the DEFAULT value works for most cases:
With the new SDK 2.x, the region field is less flexible. The region must explicitly be defined to work correctly. To resolve connectivity issues after upgrading, it is best to explicitly define the AWS region used under the new Region text field:
With the release of R2025-08, Qlik Talend Studio migrated AWS component dependencies from Amazon Web Services SDK version 1.x to SDK version 2.x. This move was prompted by SDK 1.x having reached end of life as of December 31, 2025.
The Amazon DynamoDB, Amazon SQS, and Amazon S3 components were all updated. For the full release notes, see R2025-08 Talend Studio 8.0 - New Features.
When using IBM DB2 for iSeries as a source in Qlik Replicate, the task may report a warning if journal receiver numbers are not continuous.
A typical warning message looks like:
[SOURCE_CAPTURE ]W: Journal entry sequence '2026' was read from journal receiver 'APSUPDB.QSQJRN0118'. The previous entry was read from receiver 'APSUPDB.QSQJRN0116'. Check if a receiver has been detached. (db2i_endpoint_capture.c:1836)
Qlik Replicate reports this condition as a warning only. There is no impact on task execution or data integrity:
This warning can be safely ignored unless accompanied by other errors or abnormal task behavior.
On the IBM DB2 for iSeries side, 'Check if a receiver has been detached' can occur if, for example, the process is holding or locking the journal. This temporarily prevents the system from creating or attaching the next journal receiver. In such cases, a receiver number may be allocated but never successfully created, resulting in a gap in the receiver numbering.
This behavior is normal on IBM i and does not indicate a defect. The system assigns journal receiver numbers, but sequential continuity is not guaranteed. IBM i only guarantees that receiver numbers increase monotonically, not that every number will exist.
00420963, 00423959