Skip to main content
Announcements
See what Drew Clarke has to say about the Qlik Talend Cloud launch! READ THE BLOG
cancel
Showing results for 
Search instead for 
Did you mean: 
ruili
Contributor II
Contributor II

Handling LOB columns w/o PK

I'm trying to replicate a table from MS SQL Server to SF, and setting the task to be Full Load only. This table has an LOB column and it has no PK.

Per the post attached below: "A full load-only task can get around a limitation with LOB columns that are in a table with no primary key.", I followed the post to set the task so that "Full Load Processing is ON", but the LOB column is still skipped when loading.
https://community.qlik.com/t5/Official-Support-Articles/An-Introduction-to-Qlik-Replicate-Tasks-Full...

Could you please kindly advise if this is not a valid walkaround, and the only solution is to add PK to the source table? 

Labels (1)
1 Solution

Accepted Solutions
john_wang
Support
Support

Hello @ruili ,

I used the same table in SQL Server source DB, replicate it to Snowflake , the behavior cannot be reproduced in my labs still.

In the Full Load ONLY task, the small LOBs will be replicated 'inline'. 

create table testlobnopk (id integer,name char(20),notes varchar(max));

During the Full Load, the CLOB is replicated by inline mode. The target table creation was done by Replicate. From task log files we see:

2024-03-26T10:27:14:628072 [SOURCE_UNLOAD ]V: Execute: 'SELECT [id],[name],[notes] FROM [dbo].[testlobnopk]' (ar_odbc_stmt.c:2784)

2024-03-26T10:27:15:659174 [TARGET_LOAD ]T: Create table statement: CREATE TABLE "dbo"."testlobnopk" ( "id" INTEGER, "name" VARCHAR(20), "notes" VARCHAR(8192) ) (provider_syntax_manager.c:1790)

The CLOB is replicated to Snowflake successfully.

The table in Snowflake:

john_wang_0-1711421349448.png

john_wang_1-1711421366825.png

Regards,

John.

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!

View solution in original post

12 Replies
Heinvandenheuvel
Specialist III
Specialist III

@ruili  wrote  "but the LOB column is still skipped when loading."

Where there any messages referring to LOB in the reptask log? Notably that message in the link you nicely provided? "Column 'MyLob' was removed from table definition dbo.No_PK_LOB': the column data type is LOB and the table has no primary key or unique index"

LOB are supposed to work for Full-load on non-PK tables, but they still have to be enabled. Was that done?

TASK SETTINGS --> Meta Data --> Target Metadata  -->  [X] Replicate LOB columns

LOB's are not supported for no-PK CDC tasks because the way Replicate decided to implement lobs is by NOT fetching them from the Tx log, NOT storing them in memory, or on disk whilst waiting for commit, by NOT storing them for say an insert and then optimize them away by an update in the same TX, by NOT loading them from the Transaction Log only to find out the row is filtered out.

Instead the lobs are fetched (with rules as per task metadata settings) during the final target apply stage using a singleton lookup (select) by PK for (all the) lobs for each row about to be send to target.  I'm not saying this implementation choice was the only one or the best one, but indicating how it works today and thus why a PK is needed.

Hein.

john_wang
Support
Support

Hello @ruili ,

Besides @Heinvandenheuvel comment, in a Full Load ONLY task, the small LOBs will be replicated 'inline'. I've conducted a quick test in my labs, the below table:

create table testlobnopk (id integer,name char(20),notes varchar(max));

During the Full Load, the CLOB is replicated by inline mode rather than dropping it, or via lookup call.

See the unload SQL:

2024-03-19T10:59:03:451159 [SOURCE_UNLOAD ]V: Execute: 'SELECT [id],[name],[notes] FROM [dbo].[testlobnopk]'

The CLOB is replicated to Oracle successfully.

BTW, "SF" means Snowflake in your scenario (rather than "Salesforce"), is that correct?

Hope this helps.

John.

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!
ruili
Contributor II
Contributor II
Author

Hi Hein, 

Thanks for looking into this, yes, I set up the task exactly as you posted:

Metadata > Target Metadata > Replicate LOB Columns checked > Limit LOB size to (KB) 16

Full Load > Full Load Settings > Full Load Processing is ON > Target Table Preparation "DROP and CREATE table"

I fully understand that PK is needed for CDC, but was just wondering if full load should work for as a short-term temp walkaround.
I also added some info when replying to John, please kindly review that response as well.

Thanks,

Rui

ruili
Contributor II
Contributor II
Author

Hi John, 

Yes, SF means Snowflake, and the LOB column from our source MS SQL Server table is set to varchar(max) so it is a pretty large one.
And Snowflake doesn't support LOB, it is recommended to use BINARY for BLOB and VARCHAR for CLOB instead.

Could you please try to replicate the same practice with SF as the target? 

I also tried another walk around but didn't work:
Global Rules > New Rule > Transformation > Convert data type > Column name is like + Data type is: CLOB > Transformation Action - Set target data type to STRING; Subtype Regular; Length 8000

However the process didn't reach this transformation before erroring out.
BTW the error message was:
"Column 'col_name' was removed from table definition 'dbo.tbl_name': the column data type is LOB and the table has no primary key or unique index"

Thanks,

Rui

 

SushilKumar
Support
Support

Hello @ruili 

Is there any challenge to create the PK on the Source table . if you don't want to create at Source endpoint then use Replicate table selection to define the key .

Check the length of Lob in the Source database table and adjusts in the replicate.

thanks & regards,

Sushil Kumar 

john_wang
Support
Support

Hello @ruili ,

I used the same table in SQL Server source DB, replicate it to Snowflake , the behavior cannot be reproduced in my labs still.

In the Full Load ONLY task, the small LOBs will be replicated 'inline'. 

create table testlobnopk (id integer,name char(20),notes varchar(max));

During the Full Load, the CLOB is replicated by inline mode. The target table creation was done by Replicate. From task log files we see:

2024-03-26T10:27:14:628072 [SOURCE_UNLOAD ]V: Execute: 'SELECT [id],[name],[notes] FROM [dbo].[testlobnopk]' (ar_odbc_stmt.c:2784)

2024-03-26T10:27:15:659174 [TARGET_LOAD ]T: Create table statement: CREATE TABLE "dbo"."testlobnopk" ( "id" INTEGER, "name" VARCHAR(20), "notes" VARCHAR(8192) ) (provider_syntax_manager.c:1790)

The CLOB is replicated to Snowflake successfully.

The table in Snowflake:

john_wang_0-1711421349448.png

john_wang_1-1711421366825.png

Regards,

John.

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!
Heinvandenheuvel
Specialist III
Specialist III

@SushilKumar  " if you don't want to create at Source endpoint then use Replicate table selection to define the key ." 

I don't see how that might help. Did you try? Please elaborate, or correct.

In the User Guide under "Transform actions " item  "Set a column as a primary key/unique key or disable a column's primary key/unique key " it reads "Select the desired row in the Output table and then click the cell in the Key column. A key icon will be displayed"

If you try this on the Input side, it is only reporting what it found you cannot modify the Kkey column there

Hein.

ruili
Contributor II
Contributor II
Author

@Heinvandenheuvel Thanks for pointing it out.
@SushilKumar Plus adding pk to the table would be the best and most straightforward solution if it is feasible. However there are roadblocks/impediments resulting the no pk situation, and that why we are trying to deal with it.

ruili
Contributor II
Contributor II
Author

Hi John, 

Thanks for the advise.

I tried the same using full load mode with the change processing set to OFF, then it didn't trigger the error message, and the table was successfully loaded into Snowflake.

So looks like it is the change processing set to ON which triggers the LOB column loading issue?