Anonymous

SQL Server to Snowflake error

Hello,

I am attempting to copy several tables from SQL Server to Snowflake, running Talend locally. I have done this successfully for several other tables; I believe this table is just too large. Can you tell me if there is a setting I should change, whether resizing the Snowflake warehouse would correct this, or if there is a technique I can use to easily break the load into smaller pieces? I can see that the files are all transferred successfully, so I am not sure whether it is really a memory error or if it just times out.

The error is as follows:

 

Aug 02, 2018 1:50:37 PM net.snowflake.client.core.SFStatement executeFileTransfer
INFO: Completed transferring data
Exception in thread "ProcessQueueThread" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOfRange(Unknown Source)
at java.lang.String.<init>(Unknown Source)
at org.codehaus.jackson.util.TextBuffer.contentsAsString(TextBuffer.java:343)
at org.codehaus.jackson.impl.ReaderBasedParser.getText(ReaderBasedParser.java:202)
at org.codehaus.jackson.map.deser.std.BaseNodeDeserializer.deserializeObject(JsonNodeDeserializer.java:203)
at org.codehaus.jackson.map.deser.std.BaseNodeDeserializer.deserializeArray(JsonNodeDeserializer.java:224)
at org.codehaus.jackson.map.deser.std.BaseNodeDeserializer.deserializeObject(JsonNodeDeserializer.java:200)
at org.codehaus.jackson.map.deser.std.JsonNodeDeserializer.deserialize(JsonNodeDeserializer.java:58)
at org.codehaus.jackson.map.deser.std.JsonNodeDeserializer.deserialize(JsonNodeDeserializer.java:15)
at org.codehaus.jackson.map.ObjectMapper._readValue(ObjectMapper.java:2704)
at org.codehaus.jackson.map.ObjectMapper.readTree(ObjectMapper.java:1344)
at org.apache.avro.Schema$Parser.parse(Schema.java:1032)
at org.apache.avro.Schema$Parser.parse(Schema.java:1020)
at org.talend.daikon.properties.property.SchemaProperty.getValue(SchemaProperty.java:61)
at org.talend.daikon.properties.property.SchemaProperty.getValue(SchemaProperty.java:34)
at org.talend.components.snowflake.runtime.SnowflakeResultListener.addError(SnowflakeResultListener.java:72)
at net.snowflake.client.loader.ProcessQueue.run(ProcessQueue.java:173)
at java.lang.Thread.run(Unknown Source)
Aug 02, 2018 3:53:41 PM net.snowflake.client.core.SFStatement execute
INFO: execute: commit
Aug 02, 2018 3:53:42 PM net.snowflake.client.core.SFStatement execute
INFO: execute: rollback
Aug 02, 2018 3:53:42 PM net.snowflake.client.loader.StreamLoader finish
WARNING: Failed to rollback
Aug 02, 2018 3:53:42 PM net.snowflake.client.loader.StreamLoader finish
WARNING: Execute After SQL failed to run: null
net.snowflake.client.jdbc.SnowflakeSQLException: Authentication token has expired. The user must authenticate again.
at net.snowflake.client.jdbc.SnowflakeUtil.checkErrorAndThrowException(SnowflakeUtil.java:99)
at net.snowflake.client.core.SessionUtil.renewSession(SessionUtil.java:1188)
at net.snowflake.client.core.SFSession.renewSession(SFSession.java:513)
at net.snowflake.client.core.SFStatement.executeHelper(SFStatement.java:381)
at net.snowflake.client.core.SFStatement.executeQueryInternal(SFStatement.java:197)
at net.snowflake.client.core.SFStatement.executeQuery(SFStatement.java:149)
at net.snowflake.client.core.SFStatement.execute(SFStatement.java:531)
at net.snowflake.client.jdbc.SnowflakeStatementV1.executeInternal(SnowflakeStatementV1.java:204)
at net.snowflake.client.jdbc.SnowflakeStatementV1.execute(SnowflakeStatementV1.java:239)
at net.snowflake.client.loader.StreamLoader.finish(StreamLoader.java:536)
at org.talend.components.snowflake.runtime.SnowflakeWriter.close(SnowflakeWriter.java:222)
at org.talend.components.snowflake.runtime.SnowflakeWriter.close(SnowflakeWriter.java:46)
at local_project.sql_to_snowflake_0_1.sql_to_snowflake.tDBInput_15Process(sql_to_snowflake.java:3604)
at local_project.sql_to_snowflake_0_1.sql_to_snowflake.runJobInTOS(sql_to_snowflake.java:4467)
at local_project.sql_to_snowflake_0_1.sql_to_snowflake.main(sql_to_snowflake.java:4121)

Exception in component tDBOutput_21 (sql_to_snowflake)
java.io.IOException: net.snowflake.client.loader.Loader$ConnectionError: net.snowflake.client.jdbc.SnowflakeSQLException: Authentication token has expired. The user must authenticate again.
at org.talend.components.snowflake.runtime.SnowflakeWriter.close(SnowflakeWriter.java:224)
at org.talend.components.snowflake.runtime.SnowflakeWriter.close(SnowflakeWriter.java:46)
at local_project.sql_to_snowflake_0_1.sql_to_snowflake.tDBInput_15Process(sql_to_snowflake.java:3604)
at local_project.sql_to_snowflake_0_1.sql_to_snowflake.runJobInTOS(sql_to_snowflake.java:4467)
at local_project.sql_to_snowflake_0_1.sql_to_snowflake.main(sql_to_snowflake.java:4121)
Caused by: net.snowflake.client.loader.Loader$ConnectionError: net.snowflake.client.jdbc.SnowflakeSQLException: Authentication token has expired. The user must authenticate again.
at net.snowflake.client.loader.StreamLoader.finish(StreamLoader.java:547)
at org.talend.components.snowflake.runtime.SnowflakeWriter.close(SnowflakeWriter.java:222)
... 4 more
Caused by: net.snowflake.client.jdbc.SnowflakeSQLException: Authentication token has expired. The user must authenticate again.
at net.snowflake.client.jdbc.SnowflakeUtil.checkErrorAndThrowException(SnowflakeUtil.java:99)
at net.snowflake.client.core.SessionUtil.renewSession(SessionUtil.java:1188)
at net.snowflake.client.core.SFSession.renewSession(SFSession.java:513)
at net.snowflake.client.core.SFStatement.executeHelper(SFStatement.java:381)
at net.snowflake.client.core.SFStatement.executeQueryInternal(SFStatement.java:197)
at net.snowflake.client.core.SFStatement.executeQuery(SFStatement.java:149)
at net.snowflake.client.core.SFStatement.execute(SFStatement.java:531)
at net.snowflake.client.jdbc.SnowflakeStatementV1.executeInternal(SnowflakeStatementV1.java:204)
at net.snowflake.client.jdbc.SnowflakeStatementV1.execute(SnowflakeStatementV1.java:239)
at net.snowflake.client.loader.StreamLoader.finish(StreamLoader.java:536)
... 5 more

Anonymous

Hello,

Could you please post your job design screenshots on the forum? That will help us understand your workflow.

Best regards

Sabrina

Anonymous

[Screenshot: job design]

Anonymous

Hi,

 

Could you please add a tMap between the two components, with "store on disk" activated and a "Temp data directory path" provided? The current issue occurs because your system memory is exhausted by the large number of incoming records.

This approach will make sure that memory is not exhausted by the incoming record volume.

[Screenshot: tMap placed between the source and target components]

 

 

[Screenshot: tMap settings with store on disk enabled]

 

Could you please mark the topic as resolved if this approach fixes your issue?

 

Warm Regards,

 

Nikhil Thampi

Anonymous
Not applicable
Author

I tested out Snowflake for 6 months and loved it. Redshift is OK until it's time to increase storage/nodes: it took 3 days for a new cluster to come up, and during that time we basically had to pause ingestion. That problem doesn't exist for Snowflake. We never had to worry about capacity and were able to have separate clusters for ingestion and analytics.

We had 100 TB of data we wanted to test out on Snowflake. We spun up their largest cluster size, were able to load it in about 20 hours, then shut it down and spun up smaller clusters for analytics afterward. Shutting down clusters after 30 minutes of idle time was awesome, and auto-wake on query is nice too.

JSON support in Snowflake is much better than in Redshift. Granted, it's been 2 years since I last used Redshift, so things may have gotten better over there. Query times were impressive also, and easily beat our Teradata queries. Their support was great too, while Redshift support was kind of spotty. We had to do a lot of workload management with Redshift to prevent queries from getting blocked; in Snowflake you just spin up more clusters if you run into concurrency issues. Each business unit or product can have its own cluster, all working on the same data.

Cloning 100 TB of data to dev literally took seconds, and the clones are completely separate: changes in dev don't affect prod at all. One of the best features. Sadly we didn't have a product/unit willing to pay for it. They're much cheaper than Redshift also.

 

Anonymous

Sorry this took so long. Perfect solution, worked great!
Anonymous

Hi Max,

 

Thank you for the information. I have been pleasantly surprised with it thus far. Coming from a Hadoop platform, the ease of management is fantastic. As a former DBA, I like that it is an MPP database with many of the built-in RDBMS features. It seems like getting people up to speed on it will be far quicker and easier than with Hadoop.

Anonymous

High-level steps:

1. Install the Snowflake ODBC driver.
2. Configure the system DSN for Snowflake.
3. Configure the linked server provider.
4. Configure the linked server.
5. Test the created linked server.

Sub-steps for the high-level steps:

1. Install the Snowflake ODBC driver:
   - Obtain the 32-bit or 64-bit ODBC driver for Windows from Support.
   - Launch the installation wizard by executing the installation .exe file, and follow the instructions in the dialog boxes.
2. Configure a system DSN for Snowflake:
   - Launch the ODBC Data Source Administrator, located at Control Panel\System and Security\Administrative Tools.
   - Click the System DSN tab and configure the Snowflake data source. A sample data source configuration is shown below:

[Screenshot: sample Snowflake DSN configuration]
