<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to remove duplicates (rows that have the same id, keep the most recent) (&amp;quot;Distinct&amp;quot; does not work)? in App Development</title>
    <link>https://community.qlik.com/t5/App-Development/How-to-remove-duplicates-rows-that-have-the-same-id-keep-the/m-p/2498741#M102920</link>
    <description>&lt;P&gt;Hi!&lt;BR /&gt;To achieve the goal of keeping only the most recent record for each tickets.id based on the tickets.updated_at field, you can use Qlik's aggregation capabilities.&lt;/P&gt;
&lt;P&gt;Group by tickets.id and determine the maximum tickets.updated_at value for each group.&lt;BR /&gt;Use this maximum value to filter and keep only the most recent record for each tickets.id.&lt;/P&gt;
&lt;P&gt;Below is an example script to accomplish this:&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;// Step 1: Load the data with an additional column for the maximum updated_at per tickets.id&lt;BR /&gt;TicketsWithMaxDate:&lt;BR /&gt;LOAD&lt;BR /&gt;[tickets.id],&lt;BR /&gt;[tickets.updated_at],&lt;BR /&gt;Max([tickets.updated_at]) AS MaxUpdatedAt&lt;BR /&gt;RESIDENT tickets&lt;BR /&gt;GROUP BY [tickets.id];&lt;/P&gt;
&lt;P&gt;// Step 2: Join this information back with the original table to filter only the most recent rows&lt;BR /&gt;MostRecentTickets:&lt;BR /&gt;LOAD&lt;BR /&gt;*&lt;BR /&gt;RESIDENT tickets&lt;BR /&gt;WHERE Exists([tickets.id]) AND Exists([tickets.updated_at], MaxUpdatedAt);&lt;/P&gt;
&lt;P&gt;// Step 3: Drop intermediate tables to clean up memory&lt;BR /&gt;DROP TABLE TicketsWithMaxDate;&lt;/P&gt;
&lt;P&gt;// Step 4: Store the deduplicated data into a .qvd file&lt;BR /&gt;STORE MostRecentTickets INTO [lib://QVDFOLDER/Test/tickets.qvd] (qvd);&lt;/P&gt;
&lt;P&gt;// Step 5: Drop temporary table&lt;BR /&gt;DROP TABLE MostRecentTickets;&lt;/P&gt;</description>
    <pubDate>Thu, 19 Dec 2024 08:23:11 GMT</pubDate>
    <dc:creator>diegozecchini</dc:creator>
    <dc:date>2024-12-19T08:23:11Z</dc:date>
    <item>
      <title>How to remove duplicates (rows that have the same id, keep the most recent) ("Distinct" does not work)?</title>
      <link>https://community.qlik.com/t5/App-Development/How-to-remove-duplicates-rows-that-have-the-same-id-keep-the/m-p/2498685#M102916</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;We have the following table that we want to save as a .qvd file:&lt;/P&gt;
&lt;PRE&gt;[tickets]:&lt;BR /&gt;LOAD &lt;BR /&gt;[id] AS [tickets.id],&lt;BR /&gt;[created_at] AS [tickets.created_at],&lt;BR /&gt;[updated_at] AS [tickets.updated_at],&lt;BR /&gt;[fr_escalated] AS [tickets.fr_escalated],&lt;BR /&gt;[spam] AS [tickets.spam],&lt;BR /&gt;[email_config_id] AS [tickets.email_config_id],&lt;BR /&gt;[group_id] AS [tickets.group_id],&lt;BR /&gt;[priority] AS [tickets.priority],&lt;BR /&gt;[requester_id] AS [tickets.requester_id],&lt;BR /&gt;[responder_id] AS [tickets.responder_id],&lt;BR /&gt;[source] AS [tickets.source],&lt;BR /&gt;[company_id] AS [tickets.company_id],&lt;BR /&gt;[status] AS [tickets.status],&lt;BR /&gt;[subject] AS [tickets.subject],&lt;BR /&gt;[association_type] AS [tickets.association_type],&lt;BR /&gt;[support_email] AS [tickets.support_email],&lt;BR /&gt;[product_id] AS [tickets.product_id],&lt;BR /&gt;[type] AS [tickets.type],&lt;BR /&gt;[due_by] AS [tickets.due_by],&lt;BR /&gt;[fr_due_by] AS [tickets.fr_due_by],&lt;BR /&gt;[is_escalated] AS [tickets.is_escalated],&lt;BR /&gt;[associated_tickets_count]AS [tickets.associated_tickets_count],&lt;BR /&gt;[internal_agent_id] AS [tickets.internal_agent_id],&lt;BR /&gt;[internal_group_id] AS [tickets.internal_group_id],&lt;BR /&gt;[nr_due_by] AS [tickets.nr_due_by],&lt;BR /&gt;[nr_escalated] AS [tickets.nr_escalated],&lt;BR /&gt;[to_emails] AS [tickets.to_emails],&lt;BR /&gt;[form_id] AS [tickets.form_id]&lt;BR /&gt;RESIDENT RestConnectorMasterTable&lt;BR /&gt;WHERE NOT IsNull([__KEY_root]);&lt;BR /&gt;&lt;BR /&gt;Store [tickets] into [lib://QVDFOLDER/Test/tickets.qvd] (qvd);&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;But before saving we want to remove duplicates: rows that have the same &lt;STRONG&gt;tickets.id&lt;/STRONG&gt; (&lt;STRONG&gt;tickets.id&lt;/STRONG&gt; should be unique). &lt;SPAN class="HwtZe"&gt;&lt;SPAN class="jCAhz ChMk0b"&gt;&lt;SPAN class="ryNqvb"&gt;Also, the record with&amp;nbsp;most recent &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN class="jCAhz ChMk0b"&gt;&lt;SPAN class="ryNqvb"&gt;&lt;STRONG&gt;tickets.updated_at&lt;/STRONG&gt; should be kept. &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="HwtZe"&gt;&lt;SPAN class="jCAhz ChMk0b"&gt;&lt;SPAN class="ryNqvb"&gt;We tried using "&lt;EM&gt;Distinct&lt;/EM&gt;" - but that doesn't remove duplicates at all (because, as far as we understand, it applies to entire rows, not just a specific field like &lt;EM&gt;tickets.id&lt;/EM&gt;).&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;Do you have any suggestions on how we can resolve this?&lt;/P&gt;</description>
      <pubDate>Wed, 18 Dec 2024 23:00:11 GMT</pubDate>
      <guid>https://community.qlik.com/t5/App-Development/How-to-remove-duplicates-rows-that-have-the-same-id-keep-the/m-p/2498685#M102916</guid>
      <dc:creator>RoyBatty</dc:creator>
      <dc:date>2024-12-18T23:00:11Z</dc:date>
    </item>
    <item>
      <title>Re: How to remove duplicates (rows that have the same id, keep the most recent) ("Distinct" does not work)?</title>
      <link>https://community.qlik.com/t5/App-Development/How-to-remove-duplicates-rows-that-have-the-same-id-keep-the/m-p/2498741#M102920</link>
      <description>&lt;P&gt;Hi!&lt;BR /&gt;To achieve the goal of keeping only the most recent record for each tickets.id based on the tickets.updated_at field, you can use Qlik's aggregation capabilities.&lt;/P&gt;
&lt;P&gt;Group by tickets.id and determine the maximum tickets.updated_at value for each group.&lt;BR /&gt;Use this maximum value to filter and keep only the most recent record for each tickets.id.&lt;/P&gt;
&lt;P&gt;Below is an example script to accomplish this:&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;// Step 1: Load the data with an additional column for the maximum updated_at per tickets.id&lt;BR /&gt;TicketsWithMaxDate:&lt;BR /&gt;LOAD&lt;BR /&gt;[tickets.id],&lt;BR /&gt;[tickets.updated_at],&lt;BR /&gt;Max([tickets.updated_at]) AS MaxUpdatedAt&lt;BR /&gt;RESIDENT tickets&lt;BR /&gt;GROUP BY [tickets.id];&lt;/P&gt;
&lt;P&gt;// Step 2: Join this information back with the original table to filter only the most recent rows&lt;BR /&gt;MostRecentTickets:&lt;BR /&gt;LOAD&lt;BR /&gt;*&lt;BR /&gt;RESIDENT tickets&lt;BR /&gt;WHERE Exists([tickets.id]) AND Exists([tickets.updated_at], MaxUpdatedAt);&lt;/P&gt;
&lt;P&gt;// Step 3: Drop intermediate tables to clean up memory&lt;BR /&gt;DROP TABLE TicketsWithMaxDate;&lt;/P&gt;
&lt;P&gt;// Step 4: Store the deduplicated data into a .qvd file&lt;BR /&gt;STORE MostRecentTickets INTO [lib://QVDFOLDER/Test/tickets.qvd] (qvd);&lt;/P&gt;
&lt;P&gt;// Step 5: Drop temporary table&lt;BR /&gt;DROP TABLE MostRecentTickets;&lt;/P&gt;</description>
      <pubDate>Thu, 19 Dec 2024 08:23:11 GMT</pubDate>
      <guid>https://community.qlik.com/t5/App-Development/How-to-remove-duplicates-rows-that-have-the-same-id-keep-the/m-p/2498741#M102920</guid>
      <dc:creator>diegozecchini</dc:creator>
      <dc:date>2024-12-19T08:23:11Z</dc:date>
    </item>
  </channel>
</rss>

