Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
This forum is where any logged-in member can create a knowledge article.
This article shows you how to configure two Talend Data Catalog bridges to harvest metadata from Snowflake and trace data lineage. It covers the Snowflake Database (via JDBC) and Snowflake Database SQL DML (DI/ETL) Script (SnowSQL) Talend Data Catalog bridges.
Configure the import setup parameters for harvesting metadata from a Snowflake database in Talend Data Catalog:
Define the following parameters on the Import Setup tab: Host, Warehouse, Databases, and the User and Password for authentication.
Select the schemas to be harvested in the Schema field. This example uses the TDC_TEST, TESTSCHEMA1, and TESTSCHEMA2 schemas.
Go to the Overview tab to view an overview of the objects harvested from the Snowflake database.
This example SnowSQL script contains a create view Data Manipulation Language (DML) statement:
create view IF NOT EXISTS testschema2.v_employees ( employee_id, employee_name ) as select rvalue:employee_id::number, rvalue:employee_name::string from testschema1.employees;
You will harvest metadata from this example SnowSQL script in the steps below.
To configure the parameters for the Snowflake Database SQL DML (DI/ETL) Script (SnowSQL) bridge in Talend Data Catalog:
Define the Directory where the SnowSQL file is located on the Import Setup tab.
Add any necessary Include and Exclude filters to select the files to be harvested.
Go to the Overview tab to view the imported SnowSQL script.
Notice that after harvesting metadata (database objects and SnowSQL) using both bridges and building the configuration, the Architecture Diagram displays a bi-directional arrow, because the same Snowflake database model is the source and destination connection to the database.
Go to the Data Flow tab to view the data flow diagram for the SnowSQL script. Referring to the SnowSQL script, Talend Data Catalog creates a view on one schema (TESTSCHEMA2) with a select statement on another schema (TESTSCHEMA1).
You can also view the data flow diagram for the database object (schema level), which displays the three schemas.
Finally, you can display the data lineage for a column (EMPLOYEE_NAME).