<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Kafka Partition strategy by primary key in Qlik Replicate</title>
    <link>https://community.qlik.com/t5/Qlik-Replicate/Kafka-Partition-strategy-by-primary-key/m-p/1894038#M1873</link>
    <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.qlik.com/t5/user/viewprofilepage/user-id/138350"&gt;@Yves_s&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;Sorry, I still have not fully understood.&amp;nbsp;Let me try to confirm the question another way.&lt;/P&gt;&lt;P&gt;Assume you are using the PK (the primary key column in the source table; in your sample it is "&lt;STRONG&gt;COL1&lt;/STRONG&gt;") as the partition number in Kafka. For INSERT and DELETE operations the messages go to the correct partition, and UPDATE messages also go to the correct partition as long as the PK value was not changed. However, sometimes the PK is changed (scenario 3 below):&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;1. In Full Load, the row goes to partition "1" (because the "&lt;STRONG&gt;COL1&lt;/STRONG&gt;" value is "1")&lt;/LI&gt;&lt;LI&gt;2. In a CDC update where the PK is NOT changed, the row goes to partition "1" (because the "&lt;STRONG&gt;COL1&lt;/STRONG&gt;" value is "1")&lt;/LI&gt;&lt;LI&gt;3. In a CDC update where the PK is changed, the row goes to partition "2" (because the "&lt;STRONG&gt;COL1&lt;/STRONG&gt;" value is now "2")&lt;BR /&gt;(assume you updated the PK with: "update tbl set COL1= 2 where COL1= 1")&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;By default the update message in scenario (3) goes to partition "2", but you want it to still go to partition "1". Is this what you are looking for?&lt;/P&gt;&lt;P&gt;Thank you,&lt;/P&gt;&lt;P&gt;John.&lt;/P&gt;</description>
    <pubDate>Thu, 17 Feb 2022 02:25:33 GMT</pubDate>
    <dc:creator>john_wang</dc:creator>
    <dc:date>2022-02-17T02:25:33Z</dc:date>
    <item>
      <title>Kafka Partition strategy by primary key</title>
      <link>https://community.qlik.com/t5/Qlik-Replicate/Kafka-Partition-strategy-by-primary-key/m-p/1892987#M1852</link>
      <description>&lt;P&gt;&lt;SPAN&gt;Hello, &lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;I would like some information about partitioning for Kafka.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;When choosing the Partition strategy:&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;- By message key&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;and&amp;nbsp;&lt;SPAN&gt;Message key:&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;-&amp;nbsp;&lt;SPAN&gt;Primary key columns&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;In the case of an update,&amp;nbsp;which primary key values are used:&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;1) the values after modification (in the data object), or&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;2) the values before modification (in the beforeData object)?&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;And is it possible to choose between the two?&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Thank you in advance for your answer&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;yves savean&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 15 Feb 2022 10:19:10 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Qlik-Replicate/Kafka-Partition-strategy-by-primary-key/m-p/1892987#M1852</guid>
      <dc:creator>Yves_s</dc:creator>
      <dc:date>2022-02-15T10:19:10Z</dc:date>
    </item>
    <item>
      <title>Re: Kafka Partition strategy by primary key</title>
      <link>https://community.qlik.com/t5/Qlik-Replicate/Kafka-Partition-strategy-by-primary-key/m-p/1893751#M1859</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.qlik.com/t5/user/viewprofilepage/user-id/138350"&gt;@Yves_s&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;I am not sure I understood the question exactly.&lt;/P&gt;&lt;P&gt;Are you trying to use the "&lt;SPAN&gt;Primary key columns" value to decide which&amp;nbsp;&lt;/SPAN&gt;partition a message goes to? Are you asking whether the partition number can be decided by the primary key before-image value or by the primary key after-image value?&lt;SPAN&gt;&amp;nbsp;If this is your question, then we have options. If not, please let me know the exact question; a sample would be helpful.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Regards,&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;John.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 16 Feb 2022 14:02:03 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Qlik-Replicate/Kafka-Partition-strategy-by-primary-key/m-p/1893751#M1859</guid>
      <dc:creator>john_wang</dc:creator>
      <dc:date>2022-02-16T14:02:03Z</dc:date>
    </item>
    <item>
      <title>Re: Kafka Partition strategy by primary key</title>
      <link>https://community.qlik.com/t5/Qlik-Replicate/Kafka-Partition-strategy-by-primary-key/m-p/1893775#M1860</link>
      <description>&lt;P&gt;When we write an Avro message to the topic, the message schema looks like this:&lt;/P&gt;
&lt;LI-CODE lang="java"&gt;{
	"type": "record",
	"name": "DataRecord",
	"namespace": "com.attunity.queue.msg.TABLE",
	"fields": [{
		"name": "data",
		"type": {
			"type": "record",
			"name": "Data",
			"fields": [{
				"name": "COL1",
				"type": ["null", "string"],
				"default": null
			}, {
				"name": "COL2",
				"type": ["null", {
					"type": "bytes",
					"precision": 5,
					"scale": 0,
					"logicalType": "decimal"
				}],
				"default": null
			}]
		}
	}, {
		"name": "beforeData",
		"type": ["null", "Data"],
		"default": null
	}, {
		"name": "headers",
		"type": {
			"type": "record",
			"name": "Headers",
			"namespace": "com.attunity.queue.msg",
			"fields": [{
				"name": "operation",
				"type": {
					"type": "enum",
					"name": "operation",
					"symbols": ["INSERT", "UPDATE", "DELETE", "REFRESH"]
				}
			}, {
				"name": "changeSequence",
				"type": "string"
			}, {
				"name": "timestamp",
				"type": "string"
			}, {
				"name": "streamPosition",
				"type": "string"
			}, {
				"name": "transactionId",
				"type": "string"
			}, {
				"name": "changeMask",
				"type": ["null", "bytes"],
				"default": null
			}, {
				"name": "columnMask",
				"type": ["null", "bytes"],
				"default": null
			}, {
				"name": "transactionEventCounter",
				"type": ["null", "long"],
				"default": null
			}, {
				"name": "transactionLastEvent",
				"type": ["null", "boolean"],
				"default": null
			}]
		}
	}]
}&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The "beforeData" section&amp;nbsp;&lt;SPAN&gt;is filled in&amp;nbsp;in the case of an update.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;My question is: &lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;when we partition by primary key (in this example the primary key is COL1):&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;For inserts and deletes, the value used&amp;nbsp;&lt;SPAN&gt;is the one found in the "data" section.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;But for an update,&amp;nbsp;is the value used the one found in the "beforeData" section? Or is it always the value found in the&amp;nbsp;"data" section?&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;I am thinking of the case where it is not a real primary key:&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;1) an insert is executed&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;data.COL1 = 1 -&amp;gt; partition 1 of the topic&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;2) an update is executed&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;data.COL1 = 2 -&amp;gt; (another partition), but beforeData.COL1 = 1 -&amp;gt; partition 1 of the topic&lt;/SPAN&gt;&lt;/P&gt;
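&lt;P&gt;&lt;SPAN&gt;To make the two possibilities concrete, here is a minimal Python sketch of the scenario above. It is only an illustration, not Replicate code: the records are plain dicts that mirror the "data" and "beforeData" sections, and partition_for, key_from_after, key_from_before and NUM_PARTITIONS are hypothetical names. The mapping from key to partition (numeric key modulo the partition count) is an assumption made for the example, not the real Kafka or Replicate partitioner.&lt;/SPAN&gt;&lt;/P&gt;

```python
from typing import Optional

NUM_PARTITIONS = 5  # hypothetical topic size, for illustration only


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Toy mapping from key to partition: numeric key modulo partition count.

    This is a stand-in, not the partitioner used by Kafka or Qlik Replicate.
    """
    return int(key) % num_partitions


def key_from_after(record: dict) -> str:
    """Use the key as it is after the change (data.COL1)."""
    return record["data"]["COL1"]


def key_from_before(record: dict) -> str:
    """Prefer the key before the change (beforeData.COL1) when it exists."""
    before: Optional[dict] = record.get("beforeData")
    if before is not None and before.get("COL1") is not None:
        return before["COL1"]
    # Inserts carry no before-image, so fall back to data.COL1.
    return record["data"]["COL1"]


# 1) insert with COL1 = 1
insert = {"data": {"COL1": "1"}, "beforeData": None,
          "headers": {"operation": "INSERT"}}
# 2) update that changes COL1 from 1 to 2
update = {"data": {"COL1": "2"}, "beforeData": {"COL1": "1"},
          "headers": {"operation": "UPDATE"}}

print(partition_for(key_from_after(insert)))   # 1
print(partition_for(key_from_after(update)))   # 2
print(partition_for(key_from_before(update)))  # 1
```

&lt;P&gt;&lt;SPAN&gt;With the key taken from "data", the update lands on a different partition than the insert; with the key taken from "beforeData", it stays on the same one.&lt;/SPAN&gt;&lt;/P&gt;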
&lt;P&gt;&lt;SPAN&gt;I hope that was clear.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 16 Feb 2022 14:29:35 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Qlik-Replicate/Kafka-Partition-strategy-by-primary-key/m-p/1893775#M1860</guid>
      <dc:creator>Yves_s</dc:creator>
      <dc:date>2022-02-16T14:29:35Z</dc:date>
    </item>
    <item>
      <title>Re: Kafka Partition strategy by primary key</title>
      <link>https://community.qlik.com/t5/Qlik-Replicate/Kafka-Partition-strategy-by-primary-key/m-p/1894038#M1873</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.qlik.com/t5/user/viewprofilepage/user-id/138350"&gt;@Yves_s&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;Sorry, I still have not fully understood.&amp;nbsp;Let me try to confirm the question another way.&lt;/P&gt;&lt;P&gt;Assume you are using the PK (the primary key column in the source table; in your sample it is "&lt;STRONG&gt;COL1&lt;/STRONG&gt;") as the partition number in Kafka. For INSERT and DELETE operations the messages go to the correct partition, and UPDATE messages also go to the correct partition as long as the PK value was not changed. However, sometimes the PK is changed (scenario 3 below):&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;1. In Full Load, the row goes to partition "1" (because the "&lt;STRONG&gt;COL1&lt;/STRONG&gt;" value is "1")&lt;/LI&gt;&lt;LI&gt;2. In a CDC update where the PK is NOT changed, the row goes to partition "1" (because the "&lt;STRONG&gt;COL1&lt;/STRONG&gt;" value is "1")&lt;/LI&gt;&lt;LI&gt;3. In a CDC update where the PK is changed, the row goes to partition "2" (because the "&lt;STRONG&gt;COL1&lt;/STRONG&gt;" value is now "2")&lt;BR /&gt;(assume you updated the PK with: "update tbl set COL1= 2 where COL1= 1")&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;By default the update message in scenario (3) goes to partition "2", but you want it to still go to partition "1". Is this what you are looking for?&lt;/P&gt;&lt;P&gt;Thank you,&lt;/P&gt;&lt;P&gt;John.&lt;/P&gt;</description>
      <pubDate>Thu, 17 Feb 2022 02:25:33 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Qlik-Replicate/Kafka-Partition-strategy-by-primary-key/m-p/1894038#M1873</guid>
      <dc:creator>john_wang</dc:creator>
      <dc:date>2022-02-17T02:25:33Z</dc:date>
    </item>
    <item>
      <title>Re: Kafka Partition strategy by primary key</title>
      <link>https://community.qlik.com/t5/Qlik-Replicate/Kafka-Partition-strategy-by-primary-key/m-p/1894107#M1875</link>
      <description>&lt;P&gt;That's exactly it.&amp;nbsp;How can I keep partition "1" if the "fake primary key" changes?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 17 Feb 2022 07:27:30 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Qlik-Replicate/Kafka-Partition-strategy-by-primary-key/m-p/1894107#M1875</guid>
      <dc:creator>Yves_s</dc:creator>
      <dc:date>2022-02-17T07:27:30Z</dc:date>
    </item>
    <item>
      <title>Re: Kafka Partition strategy by primary key</title>
      <link>https://community.qlik.com/t5/Qlik-Replicate/Kafka-Partition-strategy-by-primary-key/m-p/1894331#M1877</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.qlik.com/t5/user/viewprofilepage/user-id/138350"&gt;@Yves_s&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;You can use the steps below to control which partition the messages go to:&lt;/P&gt;&lt;P&gt;1. In the table settings, add a new column named "$partition" with the default type string(50), as shown:&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp;&amp;nbsp;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="john_wang_0-1645106144780.png" style="width: 400px;"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/72538iC5DA6CE03C32676E/image-size/medium?v=v2&amp;amp;px=400" role="button" title="john_wang_0-1645106144780.png" alt="john_wang_0-1645106144780.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;2. Click "fx" and enter the expression "ifnull($BI__part,$part)", where "part" is the name of the column you want to use to control the partition number.&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp;&amp;nbsp;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="john_wang_1-1645106286797.png" style="width: 400px;"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/72539i4DC955BF459AC1C9/image-size/medium?v=v2&amp;amp;px=400" role="button" title="john_wang_1-1645106286797.png" alt="john_wang_1-1645106286797.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; In your example, if you want to use "COL1" to determine the partition, the expression should be:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;ifnull($BI__COL1,$COL1)&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; Note that this is only a sample and works only for a small number of partitions (for example, the topic has 5 partitions and the COL1 values are only 0 to 4). If you have many partitions, you need to modify the expression to fit your environment.&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; By the way, "ifnull" is there for the Full Load stage: in Full Load $BI__COL1 is null, so the column value $COL1 is used instead.&lt;/P&gt;&lt;P&gt;Hope this helps.&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;John.&lt;/P&gt;</description>
      <pubDate>Thu, 17 Feb 2022 14:05:38 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Qlik-Replicate/Kafka-Partition-strategy-by-primary-key/m-p/1894331#M1877</guid>
      <dc:creator>john_wang</dc:creator>
      <dc:date>2022-02-17T14:05:38Z</dc:date>
    </item>
    <item>
      <title>Re: Kafka Partition strategy by primary key</title>
      <link>https://community.qlik.com/t5/Qlik-Replicate/Kafka-Partition-strategy-by-primary-key/m-p/1894643#M1878</link>
      <description>&lt;P&gt;It's perfect, thanks.&lt;/P&gt;</description>
      <pubDate>Fri, 18 Feb 2022 07:01:51 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Qlik-Replicate/Kafka-Partition-strategy-by-primary-key/m-p/1894643#M1878</guid>
      <dc:creator>Yves_s</dc:creator>
      <dc:date>2022-02-18T07:01:51Z</dc:date>
    </item>
  </channel>
</rss>

