<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: tOracleSCDELT and duplicate rows in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/tOracleSCDELT-and-duplicate-rows/m-p/2347411#M114669</link>
    <description>&lt;P&gt;Hi Sabrina,&lt;/P&gt; 
&lt;P&gt;I managed to find the root cause - the issue is with how we were passing data over to the component and the expected behaviour of it, rather than the component itself.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;As we were passing in multiple change rows to the component where it expected a single row to be present, it was inserting the whole block of change rows instead of processing them one by one. We've now modified the source system to only provide the single most recent row when there's an update.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;For anyone stumbling across the same topic - the SCD components pull from your source table by selecting the following:&lt;/P&gt; 
&lt;PRE&gt;SELECT * FROM sourceTable 
WHERE sourceTable.ID = targetTable.ID
AND ((sourceTable.col1 &amp;lt;&amp;gt; targetTable.col1) OR (sourceTable.col2 &amp;lt;&amp;gt; targetTable.col2) OR ... )&lt;/PRE&gt; 
&lt;P&gt;This means if there's more than one change row for a given ID in your source table, the whole set will be selected and inserted.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Version is Talend 6.2 Big Data.&lt;/P&gt;</description>
    <pubDate>Wed, 06 Dec 2017 08:38:36 GMT</pubDate>
    <dc:creator>A_Froggatt</dc:creator>
    <dc:date>2017-12-06T08:38:36Z</dc:date>
    <item>
      <title>tOracleSCDELT and duplicate rows</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tOracleSCDELT-and-duplicate-rows/m-p/2347409#M114667</link>
      <description>&lt;P&gt;Can anyone shed some light on how the Oracle SCD components perform their updates; is it as a set of rows or row by row?&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;We're encountering an issue where if there are multiple rows for the same ID in a source table - where the non-key fields differ - the whole set of rows is inserted as 'current', rather than just the final row. For this data set it's more than likely that it can be updated more than once a day and we'd like to retain that history, however we're currently getting a massive rate of duplication.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;For example, if we have a dimension table with the following already in place:&amp;nbsp;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-center" image-alt="Dimension Start" style="width: 767px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009Lroo.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/142339iBE8784311218731E/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009Lroo.png" alt="0683p000009Lroo.png" /&gt;&lt;/span&gt;&lt;SPAN class="lia-inline-image-caption" onclick="event.preventDefault();"&gt;Dimension Start&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;Then run a tOracleSCDELT component on this data in staging, with UNIQUE_ID as the Source Key,&amp;nbsp;and the rest Type 2 columns:&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-center" image-alt="Staging Start" style="width: 370px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009Lrq5.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/153483i0637FAEDEE80C5C6/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009Lrq5.png" alt="0683p000009Lrq5.png" /&gt;&lt;/span&gt;&lt;SPAN class="lia-inline-image-caption" onclick="event.preventDefault();"&gt;Staging Start&lt;/SPAN&gt;&lt;/SPAN&gt;We &lt;STRONG&gt;expected&lt;/STRONG&gt; the following result:&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-center" image-alt="Dimension Expected" style="width: 768px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009LrgL.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/141894i0D3ADBE98F1FC53E/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009LrgL.png" alt="0683p000009LrgL.png" /&gt;&lt;/span&gt;&lt;SPAN class="lia-inline-image-caption" onclick="event.preventDefault();"&gt;Dimension Expected&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;However the &lt;STRONG&gt;actual&lt;/STRONG&gt; result is:&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-center" image-alt="Dimension Actual" style="width: 771px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009Lrcs.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/132112i93E8E3CD9A682036/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009Lrcs.png" alt="0683p000009Lrcs.png" /&gt;&lt;/span&gt;&lt;SPAN class="lia-inline-image-caption" onclick="event.preventDefault();"&gt;Dimension Actual&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 16 Nov 2024 09:00:22 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tOracleSCDELT-and-duplicate-rows/m-p/2347409#M114667</guid>
      <dc:creator>A_Froggatt</dc:creator>
      <dc:date>2024-11-16T09:00:22Z</dc:date>
    </item>
    <item>
      <title>Re: tOracleSCDELT and duplicate rows</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tOracleSCDELT-and-duplicate-rows/m-p/2347410#M114668</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;
&lt;P&gt;Could you please indicate on which build version you got this issue?&lt;/P&gt;
&lt;P&gt;Best regards&lt;/P&gt;
&lt;P&gt;Sabrina&lt;/P&gt;</description>
      <pubDate>Wed, 06 Dec 2017 08:15:33 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tOracleSCDELT-and-duplicate-rows/m-p/2347410#M114668</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-12-06T08:15:33Z</dc:date>
    </item>
    <item>
      <title>Re: tOracleSCDELT and duplicate rows</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tOracleSCDELT-and-duplicate-rows/m-p/2347411#M114669</link>
      <description>&lt;P&gt;Hi Sabrina,&lt;/P&gt; 
&lt;P&gt;I managed to find the root cause - the issue is with how we were passing data over to the component and the expected behaviour of it, rather than the component itself.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;As we were passing in multiple change rows to the component where it expected a single row to be present, it was inserting the whole block of change rows instead of processing them one by one. We've now modified the source system to only provide the single most recent row when there's an update.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;For anyone stumbling across the same topic - the SCD components pull from your source table by selecting the following:&lt;/P&gt; 
&lt;PRE&gt;SELECT * FROM sourceTable 
WHERE sourceTable.ID = targetTable.ID
AND ((sourceTable.col1 &amp;lt;&amp;gt; targetTable.col1) OR (sourceTable.col2 &amp;lt;&amp;gt; targetTable.col2) OR ... )&lt;/PRE&gt; 
&lt;P&gt;This means if there's more than one change row for a given ID in your source table, the whole set will be selected and inserted.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Version is Talend 6.2 Big Data.&lt;/P&gt;</description>
      <pubDate>Wed, 06 Dec 2017 08:38:36 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tOracleSCDELT-and-duplicate-rows/m-p/2347411#M114669</guid>
      <dc:creator>A_Froggatt</dc:creator>
      <dc:date>2017-12-06T08:38:36Z</dc:date>
    </item>
  </channel>
</rss>

