Anonymous

SQL Server to Snowflake error

Hello,

I am attempting to copy several tables from SQL Server to Snowflake, running Talend locally. I have done this successfully for several other tables; I believe this table is just too large. Can you tell me if there is a setting I should change, whether resizing the Snowflake warehouse would correct this, or if there is a technique I can use to easily break the load into smaller pieces? I can see that the files are all transferred successfully, so I am not sure whether it is really a memory error or if it just times out.

The error is as follows:

 

Aug 02, 2018 1:50:37 PM net.snowflake.client.core.SFStatement executeFileTransfer
INFO: Completed transferring data
Exception in thread "ProcessQueueThread" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOfRange(Unknown Source)
at java.lang.String.<init>(Unknown Source)
at org.codehaus.jackson.util.TextBuffer.contentsAsString(TextBuffer.java:343)
at org.codehaus.jackson.impl.ReaderBasedParser.getText(ReaderBasedParser.java:202)
at org.codehaus.jackson.map.deser.std.BaseNodeDeserializer.deserializeObject(JsonNodeDeserializer.java:203)
at org.codehaus.jackson.map.deser.std.BaseNodeDeserializer.deserializeArray(JsonNodeDeserializer.java:224)
at org.codehaus.jackson.map.deser.std.BaseNodeDeserializer.deserializeObject(JsonNodeDeserializer.java:200)
at org.codehaus.jackson.map.deser.std.JsonNodeDeserializer.deserialize(JsonNodeDeserializer.java:58)
at org.codehaus.jackson.map.deser.std.JsonNodeDeserializer.deserialize(JsonNodeDeserializer.java:15)
at org.codehaus.jackson.map.ObjectMapper._readValue(ObjectMapper.java:2704)
at org.codehaus.jackson.map.ObjectMapper.readTree(ObjectMapper.java:1344)
at org.apache.avro.Schema$Parser.parse(Schema.java:1032)
at org.apache.avro.Schema$Parser.parse(Schema.java:1020)
at org.talend.daikon.properties.property.SchemaProperty.getValue(SchemaProperty.java:61)
at org.talend.daikon.properties.property.SchemaProperty.getValue(SchemaProperty.java:34)
at org.talend.components.snowflake.runtime.SnowflakeResultListener.addError(SnowflakeResultListener.java:72)
at net.snowflake.client.loader.ProcessQueue.run(ProcessQueue.java:173)
at java.lang.Thread.run(Unknown Source)
Aug 02, 2018 3:53:41 PM net.snowflake.client.core.SFStatement execute
INFO: execute: commit
Aug 02, 2018 3:53:42 PM net.snowflake.client.core.SFStatement execute
INFO: execute: rollback
Aug 02, 2018 3:53:42 PM net.snowflake.client.loader.StreamLoader finish
WARNING: Failed to rollback
Aug 02, 2018 3:53:42 PM net.snowflake.client.loader.StreamLoader finish
WARNING: Execute After SQL failed to run: null
net.snowflake.client.jdbc.SnowflakeSQLException: Authentication token has expired. The user must authenticate again.
at net.snowflake.client.jdbc.SnowflakeUtil.checkErrorAndThrowException(SnowflakeUtil.java:99)
at net.snowflake.client.core.SessionUtil.renewSession(SessionUtil.java:1188)
at net.snowflake.client.core.SFSession.renewSession(SFSession.java:513)
at net.snowflake.client.core.SFStatement.executeHelper(SFStatement.java:381)
at net.snowflake.client.core.SFStatement.executeQueryInternal(SFStatement.java:197)
at net.snowflake.client.core.SFStatement.executeQuery(SFStatement.java:149)
at net.snowflake.client.core.SFStatement.execute(SFStatement.java:531)
at net.snowflake.client.jdbc.SnowflakeStatementV1.executeInternal(SnowflakeStatementV1.java:204)
at net.snowflake.client.jdbc.SnowflakeStatementV1.execute(SnowflakeStatementV1.java:239)
at net.snowflake.client.loader.StreamLoader.finish(StreamLoader.java:536)
at org.talend.components.snowflake.runtime.SnowflakeWriter.close(SnowflakeWriter.java:222)
at org.talend.components.snowflake.runtime.SnowflakeWriter.close(SnowflakeWriter.java:46)
at local_project.sql_to_snowflake_0_1.sql_to_snowflake.tDBInput_15Process(sql_to_snowflake.java:3604)
at local_project.sql_to_snowflake_0_1.sql_to_snowflake.runJobInTOS(sql_to_snowflake.java:4467)
at local_project.sql_to_snowflake_0_1.sql_to_snowflake.main(sql_to_snowflake.java:4121)

Exception in component tDBOutput_21 (sql_to_snowflake)
java.io.IOException: net.snowflake.client.loader.Loader$ConnectionError: net.snowflake.client.jdbc.SnowflakeSQLException: Authentication token has expired. The user must authenticate again.
at org.talend.components.snowflake.runtime.SnowflakeWriter.close(SnowflakeWriter.java:224)
at org.talend.components.snowflake.runtime.SnowflakeWriter.close(SnowflakeWriter.java:46)
at local_project.sql_to_snowflake_0_1.sql_to_snowflake.tDBInput_15Process(sql_to_snowflake.java:3604)
at local_project.sql_to_snowflake_0_1.sql_to_snowflake.runJobInTOS(sql_to_snowflake.java:4467)
at local_project.sql_to_snowflake_0_1.sql_to_snowflake.main(sql_to_snowflake.java:4121)
Caused by: net.snowflake.client.loader.Loader$ConnectionError: net.snowflake.client.jdbc.SnowflakeSQLException: Authentication token has expired. The user must authenticate again.
at net.snowflake.client.loader.StreamLoader.finish(StreamLoader.java:547)
at org.talend.components.snowflake.runtime.SnowflakeWriter.close(SnowflakeWriter.java:222)
... 4 more
Caused by: net.snowflake.client.jdbc.SnowflakeSQLException: Authentication token has expired. The user must authenticate again.
at net.snowflake.client.jdbc.SnowflakeUtil.checkErrorAndThrowException(SnowflakeUtil.java:99)
at net.snowflake.client.core.SessionUtil.renewSession(SessionUtil.java:1188)
at net.snowflake.client.core.SFSession.renewSession(SFSession.java:513)
at net.snowflake.client.core.SFStatement.executeHelper(SFStatement.java:381)
at net.snowflake.client.core.SFStatement.executeQueryInternal(SFStatement.java:197)
at net.snowflake.client.core.SFStatement.executeQuery(SFStatement.java:149)
at net.snowflake.client.core.SFStatement.execute(SFStatement.java:531)
at net.snowflake.client.jdbc.SnowflakeStatementV1.executeInternal(SnowflakeStatementV1.java:204)
at net.snowflake.client.jdbc.SnowflakeStatementV1.execute(SnowflakeStatementV1.java:239)
at net.snowflake.client.loader.StreamLoader.finish(StreamLoader.java:536)
... 5 more

Anonymous

Hello,

Could you please post your job design screenshots on the forum? That will help us understand your workflow.

Best regards

Sabrina

Anonymous

[Screenshot: job design]

Anonymous

Hi,

 

Could you please add a tMap between the two components, with "store on disk" activated and a "Temp data directory path" provided? The current issue occurs because your system memory is exhausted by the large number of incoming records.

This approach will make sure that memory is not exhausted by the incoming record volume.

[Screenshot: tMap placed between the source and target components]

 

 

[Screenshot: tMap settings with store on disk enabled]

 

Could you please mark the topic as resolved if this approach fixes your issue?

 

Warm Regards,

 

Nikhil Thampi

Anonymous
Not applicable
Author

I tested out Snowflake for 6 months and loved it. Redshift is OK until it's time to increase storage/nodes: it took 3 days for a new cluster to come up, and during that time we basically had to pause ingestion. That problem doesn't exist for Snowflake. We never had to worry about capacity and were able to have separate clusters for ingestion and analytics.

We had 100 TB of data we wanted to test out on Snowflake. We spun up their largest cluster size, were able to load it in about 20 hours, then shut it down and spun up smaller clusters for analytics afterward. Shutting down clusters after 30 minutes of idle time was awesome, and auto-wake on query is nice too.

JSON support in Snowflake is much better than in Redshift. Granted, it's been 2 years since I last used Redshift, so things may have gotten better over there. Query times were impressive also, and easily beat our Teradata queries. Their support was great too, while Redshift support was kind of spotty. We had to do a lot of workload management with Redshift to prevent queries from getting blocked; in Snowflake you just spin up more clusters if you run into concurrency issues. Each business unit or product can have its own cluster, all working on the same data.

Cloning 100 TB of data to dev literally took seconds, and the clones are completely separate: changes in dev don't affect prod at all. One of the best features. Sadly we didn't have a product/unit willing to pay for it. They're much cheaper than Redshift also.

 

Anonymous

Sorry this took so long. Perfect solution, worked great!
Anonymous

Hi Max,

 

Thank you for the information. I have been pleasantly surprised with it thus far. Coming from a Hadoop platform, the ease of management is fantastic. As a former DBA, I like that it is an MPP database with many of the built-in RDBMS features. It seems like getting people up to speed on it will be far quicker and easier than with Hadoop.

Anonymous

High-level steps:

1. Install the Snowflake ODBC driver.
2. Configure the system DSN for Snowflake.
3. Configure the linked server provider.
4. Configure the linked server.
5. Test the created linked server.

Sub-steps for the high-level steps:

1. Install the Snowflake ODBC driver:
   - Obtain the 32-bit or 64-bit ODBC driver for Windows from Support.
   - Launch the installation wizard by executing the installation .exe file, and follow the instructions in the dialog boxes.
2. Configure a system DSN for Snowflake:
   - Launch the ODBC Data Source Administrator, located at Control Panel\System and Security\Administrative Tools.
   - Click the System DSN tab and configure the Snowflake data source. A sample data source configuration is shown below:

[Screenshot: sample Snowflake DSN configuration]
