<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Clean a field with itself in QlikView</title>
    <link>https://community.qlik.com/t5/QlikView/Clean-a-field-with-itself/m-p/2072271#M1223677</link>
    <description>&lt;P&gt;Hi Pierre,&amp;nbsp;&lt;/P&gt;
&lt;P&gt;no they don't. MYFIELD is the result of combined cleaning actions on 3 other fields. So the table actually look like this:&amp;nbsp;&lt;/P&gt;
&lt;TABLE border="1" width="100%"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD width="25%"&gt;Field1&lt;/TD&gt;
&lt;TD width="25%"&gt;Field2&lt;/TD&gt;
&lt;TD width="25%"&gt;Field3&lt;/TD&gt;
&lt;TD width="25%"&gt;MyField&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;I'm trying to clean further MYFIELD to achieve the result in the above example.&lt;/P&gt;
&lt;P&gt;if the current result in MyField is "TOM" for instance instead of "TOMATE" it means that "TOMATE" is never shown in the 3 other fields, but "TOM", while I, being a human, know that "TOM" stands for "TOMATE". Note that the "TOM" can be in the middle of any of the 3 fields.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks !&lt;/P&gt;</description>
    <pubDate>Wed, 17 May 2023 08:01:28 GMT</pubDate>
    <dc:creator>Delphines</dc:creator>
    <dc:date>2023-05-17T08:01:28Z</dc:date>
    <item>
      <title>Clean a field with itself</title>
      <link>https://community.qlik.com/t5/QlikView/Clean-a-field-with-itself/m-p/2071477#M1223632</link>
      <description>&lt;P&gt;Hi Everyone,&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I have a field composed of text, and I have kind of duplicates in it that I would like to eliminate. I say "kind of" because actually there is the proper text I would like to keep and abbreviations of that text I'm trying to get rid of. here an example:&amp;nbsp;&lt;/P&gt;
&lt;TABLE width="225"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD width="104px"&gt;
&lt;P&gt;&lt;STRONG&gt;MYFIELD&lt;/STRONG&gt;&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="120px"&gt;
&lt;P&gt;&lt;STRONG&gt;EXPECTED RESULT&lt;/STRONG&gt;&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="104px"&gt;
&lt;P&gt;TOM&lt;/P&gt;
&lt;/TD&gt;
&lt;TD rowspan="3" width="120px"&gt;
&lt;P&gt;TOMATE&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="104px"&gt;
&lt;P&gt;TOMA&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="104px"&gt;
&lt;P&gt;TOMATE&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="104px"&gt;
&lt;P&gt;CAROT&lt;/P&gt;
&lt;/TD&gt;
&lt;TD rowspan="3" width="120px"&gt;
&lt;P&gt;CAROTTE&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="104px"&gt;
&lt;P&gt;CAROTTE&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="104px"&gt;
&lt;P&gt;CARTE&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="104px"&gt;
&lt;P&gt;CHOUX&lt;/P&gt;
&lt;/TD&gt;
&lt;TD rowspan="2" width="120px"&gt;
&lt;P&gt;CHOUX ROUGE&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="104px"&gt;
&lt;P&gt;CHOUX ROUGE&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;as you can see, I keep each time the longest description, and sometimes the abbreviation is done by taking off some letters inside the proper name, not only at the end of it.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I was thinking of something like "if row1 included into row2 then row2" but I didn't succeed to find the right formula for it, and on the other hand it won't work for "CARTE" becoming "CAROTTE" in the above example.&lt;/P&gt;
&lt;P&gt;Do you have any idea of how I could achieve this?&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks a lot for your help !&lt;/P&gt;</description>
      <pubDate>Mon, 15 May 2023 16:21:22 GMT</pubDate>
      <guid>https://community.qlik.com/t5/QlikView/Clean-a-field-with-itself/m-p/2071477#M1223632</guid>
      <dc:creator>Delphines</dc:creator>
      <dc:date>2023-05-15T16:21:22Z</dc:date>
    </item>
    <item>
      <title>Re: Clean a field with itself</title>
      <link>https://community.qlik.com/t5/QlikView/Clean-a-field-with-itself/m-p/2071585#M1223638</link>
      <description>&lt;P&gt;Hi, Try below.&lt;/P&gt;
&lt;P&gt;MapProperText: &lt;BR /&gt;Mapping LOAD * Inline [&lt;BR /&gt;ChangeFrom,ChangeTo&lt;BR /&gt;TOM,TOMATE&lt;BR /&gt;TOMATE,TOMATE&lt;BR /&gt;TOMA,TOMATE&lt;BR /&gt;TOMATE,TOMATE&lt;BR /&gt;CAROT,CAROTTE&lt;BR /&gt;CAROTTE,CAROTTE&lt;BR /&gt;CAROTTE,CAROTTE&lt;BR /&gt;CARTE,CAROTTE&lt;BR /&gt;CHOUX,CHOUX ROUGE&lt;BR /&gt;CHOUX ROUGE,CHOUX ROUGE&lt;BR /&gt;];&lt;/P&gt;
&lt;P&gt;Data:&lt;BR /&gt;LOAD * ,&lt;BR /&gt;ApplyMap('MapProperText',MYFIELD) as "Proper Name"&lt;BR /&gt;Inline [&lt;BR /&gt;MYFIELD&lt;BR /&gt;TOM&lt;BR /&gt;TOMATE&lt;BR /&gt;TOMA&lt;BR /&gt;TOMATE&lt;BR /&gt;CAROT&lt;BR /&gt;CAROTTE&lt;BR /&gt;CAROTTE&lt;BR /&gt;CARTE&lt;BR /&gt;CHOUX&lt;BR /&gt;CHOUX ROUGE&lt;BR /&gt;];&lt;/P&gt;</description>
      <pubDate>Mon, 15 May 2023 22:02:50 GMT</pubDate>
      <guid>https://community.qlik.com/t5/QlikView/Clean-a-field-with-itself/m-p/2071585#M1223638</guid>
      <dc:creator>BrunPierre</dc:creator>
      <dc:date>2023-05-15T22:02:50Z</dc:date>
    </item>
    <item>
      <title>Re: Clean a field with itself</title>
      <link>https://community.qlik.com/t5/QlikView/Clean-a-field-with-itself/m-p/2071712#M1223642</link>
      <description>Hi Pierre,&lt;BR /&gt;&lt;BR /&gt;Thanks for your suggestion but I don't think it'll work as it seems rather manual i.e. I'm giving the answer with the mapping table while I need something looking at the field and picking the right term each time. My left column is already the result of some cleaning actions, and actually contains about 1M rows, and the source of it is dynamic, so I can't definitely have a "manual" mapping table.&lt;BR /&gt;&lt;BR /&gt;Kind regards,&lt;BR /&gt;&lt;BR /&gt;Delphine&lt;BR /&gt;</description>
      <pubDate>Tue, 16 May 2023 06:53:18 GMT</pubDate>
      <guid>https://community.qlik.com/t5/QlikView/Clean-a-field-with-itself/m-p/2071712#M1223642</guid>
      <dc:creator>Delphines</dc:creator>
      <dc:date>2023-05-16T06:53:18Z</dc:date>
    </item>
    <item>
      <title>Re: Clean a field with itself</title>
      <link>https://community.qlik.com/t5/QlikView/Clean-a-field-with-itself/m-p/2072263#M1223676</link>
      <description>&lt;P&gt;Do the values have codes or IDs to distinguish between variants?&lt;/P&gt;</description>
      <pubDate>Wed, 17 May 2023 08:03:07 GMT</pubDate>
      <guid>https://community.qlik.com/t5/QlikView/Clean-a-field-with-itself/m-p/2072263#M1223676</guid>
      <dc:creator>BrunPierre</dc:creator>
      <dc:date>2023-05-17T08:03:07Z</dc:date>
    </item>
    <item>
      <title>Re: Clean a field with itself</title>
      <link>https://community.qlik.com/t5/QlikView/Clean-a-field-with-itself/m-p/2072271#M1223677</link>
      <description>&lt;P&gt;Hi Pierre,&amp;nbsp;&lt;/P&gt;
&lt;P&gt;no they don't. MYFIELD is the result of combined cleaning actions on 3 other fields. So the table actually look like this:&amp;nbsp;&lt;/P&gt;
&lt;TABLE border="1" width="100%"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD width="25%"&gt;Field1&lt;/TD&gt;
&lt;TD width="25%"&gt;Field2&lt;/TD&gt;
&lt;TD width="25%"&gt;Field3&lt;/TD&gt;
&lt;TD width="25%"&gt;MyField&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;I'm trying to clean further MYFIELD to achieve the result in the above example.&lt;/P&gt;
&lt;P&gt;if the current result in MyField is "TOM" for instance instead of "TOMATE" it means that "TOMATE" is never shown in the 3 other fields, but "TOM", while I, being a human, know that "TOM" stands for "TOMATE". Note that the "TOM" can be in the middle of any of the 3 fields.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks !&lt;/P&gt;</description>
      <pubDate>Wed, 17 May 2023 08:01:28 GMT</pubDate>
      <guid>https://community.qlik.com/t5/QlikView/Clean-a-field-with-itself/m-p/2072271#M1223677</guid>
      <dc:creator>Delphines</dc:creator>
      <dc:date>2023-05-17T08:01:28Z</dc:date>
    </item>
    <item>
      <title>Re: Clean a field with itself</title>
      <link>https://community.qlik.com/t5/QlikView/Clean-a-field-with-itself/m-p/2072375#M1223681</link>
      <description>&lt;P&gt;&lt;a href="https://community.qlik.com/t5/user/viewprofilepage/user-id/209744"&gt;@Delphines&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So, your field named "Myfield" has values TOM, TOMA, TOMATE. From here you need to find the maximum length string&amp;nbsp;Is my understanding correct?&lt;/P&gt;
&lt;P&gt;Or your Field1 contains "TOM", Field2 contains "TOMA" and Field3 contains "TOMATE" and you do a concat and have all three together in "Myfield" column?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 17 May 2023 10:46:42 GMT</pubDate>
      <guid>https://community.qlik.com/t5/QlikView/Clean-a-field-with-itself/m-p/2072375#M1223681</guid>
      <dc:creator>_Iswarya_</dc:creator>
      <dc:date>2023-05-17T10:46:42Z</dc:date>
    </item>
    <item>
      <title>Re: Clean a field with itself</title>
      <link>https://community.qlik.com/t5/QlikView/Clean-a-field-with-itself/m-p/2072488#M1223684</link>
      <description>&lt;P&gt;I think you could use a mapping approach but you will need some degree of manual work because you need to define the final mapping-results. This means for each possible search-value must exists a valid return-value.&lt;/P&gt;
&lt;P&gt;It's not mandatory necessary to create such mapping manually else the search-values might be derived from the final values, for example:&lt;/P&gt;
&lt;P&gt;temp: &lt;BR /&gt;load *, mid(F, 1, L - iterno() + 1) as R while iterno() &amp;lt;= L;&lt;BR /&gt;load *, len(F) as L;&lt;BR /&gt;load 'TOMATO' as F, autogenerate 1;&lt;/P&gt;
&lt;P&gt;which applies per internal while-loop multiple value-parts in extra records. Such logic might be applied several times to run it backwards and forwards and/or to pick values in between and might be also combined with additionally conditions in regard to the min/max lengths (returning just a single char or twice won't be sensible).&lt;/P&gt;
&lt;P&gt;With a similar logic you may also consider any typos by reversing chars against each other and/or replacing them with similar chars. Of course this won't be trivial but you may start it with just a few iterations.&lt;/P&gt;
&lt;P&gt;Very important to create such listing is to add recno(), rowno() and iterno() as additionally fields because they are needed to sort the mapping-table. A mapping returned always the first matching and therefore you will look at first for: TOMATO before you look for TOM. Further for the above you couldn't use applymap() else mapsubstring() which won't directly return you final value else it will place the final value within calling-string and needs to be afterwards extracted, for example with textbetween() on additionally wrapped chars around your final values.&lt;/P&gt;
&lt;P&gt;Mapsubstring() has further the benefit that it could replace multiple values within a string which might be also helpful to rank and sort the matches.&lt;/P&gt;
&lt;P&gt;Beside this it might be useful not to apply everything within a single mapping else you may use multiple ones - again to increase the quality of the matching and the ranking. Another point which may minimize the needed efforts might be not to do this against a concatenated string of 3 fields else against the single fields.&lt;/P&gt;
&lt;P&gt;If it looked like a quite heavy work - it is. But I think there aren't many alternatives and if they won't be really easier at least not to derive a valid final value. Nevertheless you may look on features like:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://help.qlik.com/en-US/cloud-services/Subsystems/Hub/Content/Sense_Hub/Scripting/StringFunctions/LevenshteinDist.htm" target="_blank"&gt;LevenshteinDist - script and chart function | Qlik Cloud Help&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 17 May 2023 14:52:23 GMT</pubDate>
      <guid>https://community.qlik.com/t5/QlikView/Clean-a-field-with-itself/m-p/2072488#M1223684</guid>
      <dc:creator>marcus_sommer</dc:creator>
      <dc:date>2023-05-17T14:52:23Z</dc:date>
    </item>
    <item>
      <title>Re: Clean a field with itself</title>
      <link>https://community.qlik.com/t5/QlikView/Clean-a-field-with-itself/m-p/2072508#M1223686</link>
      <description>&lt;P&gt;hi Iswarya,&amp;nbsp;&lt;/P&gt;
&lt;P&gt;MyField contains TOM, TOMA, TOMATE, ... and yes I'm looking for the longest word, but between TOM, TOMA and TOMATE. I'm also looking fo rthe longest word between CARO, CAROT, CAROTTE who are in the same field.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Field1, Field2 and Field3 contain a lot more text each, I do clean the 3 of them to get the TOM/TOMA/TOMATE.&lt;/P&gt;</description>
      <pubDate>Wed, 17 May 2023 15:17:52 GMT</pubDate>
      <guid>https://community.qlik.com/t5/QlikView/Clean-a-field-with-itself/m-p/2072508#M1223686</guid>
      <dc:creator>Delphines</dc:creator>
      <dc:date>2023-05-17T15:17:52Z</dc:date>
    </item>
    <item>
      <title>Re: Clean a field with itself</title>
      <link>https://community.qlik.com/t5/QlikView/Clean-a-field-with-itself/m-p/2072510#M1223687</link>
      <description>&lt;P&gt;hi Marcus,&amp;nbsp;&lt;/P&gt;
&lt;P&gt;thanks, sounds interesting, I'll dig further in that direction. I know it's a heavy work&amp;nbsp;&lt;span class="lia-unicode-emoji" title=":winking_face:"&gt;😉&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 17 May 2023 15:20:29 GMT</pubDate>
      <guid>https://community.qlik.com/t5/QlikView/Clean-a-field-with-itself/m-p/2072510#M1223687</guid>
      <dc:creator>Delphines</dc:creator>
      <dc:date>2023-05-17T15:20:29Z</dc:date>
    </item>
  </channel>
</rss>

