<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Clean and Filter unstructured text box in QlikView</title>
    <link>https://community.qlik.com/t5/QlikView/Clean-and-Filter-unstructured-text-box/m-p/1210646#M880239</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thanks Steve,&lt;/P&gt;&lt;P&gt;I appreciate the time you're taking to respond.&amp;nbsp; However, I'm using an &lt;STRONG&gt;ODBC connection&lt;/STRONG&gt; to a data provider called &lt;STRONG&gt;Quickbase&lt;/STRONG&gt;.&amp;nbsp; Think of it as an Access DB, but SaaS.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;No flat files, no Excel, just an old-timey ODBC connection with fewer options and not much of a wizard.&amp;nbsp; I used Table Files to create the prototype and didn't have these problems, but now that I have switched to a direct ODBC connection, I find that the data coming from the back end differs from what comes out of the front end as .csv files.&amp;nbsp; This difference is what's causing me grief, such as the hidden CRLF characters described in my original post.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Any other suggestions would be appreciated.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Cheers~!&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;mfc&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Tue, 11 Oct 2016 15:35:30 GMT</pubDate>
    <dc:creator />
    <dc:date>2016-10-11T15:35:30Z</dc:date>
    <item>
      <title>Clean and Filter unstructured text box</title>
      <link>https://community.qlik.com/t5/QlikView/Clean-and-Filter-unstructured-text-box/m-p/1210642#M880235</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hello My Qlik Friends,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I have a data source to work with that doesn't exactly conform to Codd's Rules.&amp;nbsp; To make things worse, it's been modified over time.&amp;nbsp; Nevertheless, I'm working with what I have.&amp;nbsp; I have a field that contains a "WorkGroup".&amp;nbsp; It's a free-text field that may also contain embedded notes.&amp;nbsp; It may also contain multiple WorkGroups separated by CRLF codes.&amp;nbsp; My goal is to normalize as much as possible and extract the first conforming WorkGroup from the first line, without any extraneous characters.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;In the script, I have used variations of the following code (this example is the last variation I tried before posting this request):&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE __default_attr="sql" __jive_macro_name="code" class="jive_macro_code jive_text_macro _jivemacro_uid_14761367583125756" jivemacro_uid="_14761367583125756"&gt;
&lt;P&gt;If(index([Affected Workgroup], ' ')&amp;gt;0, KeepChar(Upper(Left([Affected Workgroup],index([Affected Workgroup], ' '))), '_ABCDEFGHIJKLMNOPQRSTUVWXYZ'),&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; If(index([Affected Workgroup], Chr(13))&amp;gt;0,&amp;nbsp; KeepChar(Upper(Left([Affected Workgroup],index([Affected Workgroup], Chr(13)-1))), '_ABCDEFGHIJKLMNOPQRSTUVWXYZ'),&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; If(index([Affected Workgroup], Chr(10))&amp;gt;0,&amp;nbsp; KeepChar(Upper(Left([Affected Workgroup],index([Affected Workgroup], Chr(10)-2))), '_ABCDEFGHIJKLMNOPQRSTUVWXYZ'),&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; If(index([Affected Workgroup], '(')&amp;gt;0, KeepChar(Upper(Left([Affected Workgroup],index([Affected Workgroup], '(')-1)), '_ABCDEFGHIJKLMNOPQRSTUVWXYZ'),&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; Upper(KeepChar([Affected Workgroup], '_ABCDEFGHIJKLMNOPQRSTUVWXYZ')))))) As AffectedWG_Clean,&lt;/P&gt;

&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The input data looks like this:&lt;/P&gt;&lt;TABLE border="1" class="jiveBorder" height="253.4" style="border: 1px solid #000000; width: 691.4px; height: 255.4px;" width="689.4"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TH style="text-align: center; background-color: #6690bc; color: #ffffff; padding: 2px;" valign="middle"&gt;&lt;STRONG&gt;Affected Workgroup&lt;/STRONG&gt;&lt;/TH&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD style="padding: 2px;"&gt;AAG_PAPCARE_CUSTSVC_PHN (6 in customer follow-up, 1 in coaching, 1 in a meeting, 1 ACW extended)&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD style="padding: 2px;"&gt;&lt;P&gt;AAG_PAPCARE_CUSTSVC_PHN&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;4 agents in project&lt;/P&gt;&lt;P&gt;4 agents cust follow-up status&lt;/P&gt;&lt;P&gt;3 agents coaching status&lt;/P&gt;&lt;P&gt;6 agents ACW extended status&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD style="padding: 2px;"&gt;&lt;P&gt;AAG_PAPCARE_CUSTSVC_PHN&lt;/P&gt;&lt;P&gt;AAG_PAPCARE_GOLD_PHN&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD style="padding: 2px;"&gt;AAG_PAPCARE_DIAMOND_PHN&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD style="padding: 2px;"&gt;AAG_PAPCARE_CUSTSVC_PHN (2 agents in project)&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;When I drop the multi-line examples into Notepad++, I get the following:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;IMG alt="1.jpg" class="jive-image image-1" src="https://community.qlik.com/legacyfs/online/140207_1.jpg" style="height: auto;" /&gt;&lt;/P&gt;&lt;P&gt;The problem I'm trying to solve today, and on which I'm hoping one of you can point me in the right direction, is this:&amp;nbsp; when I run the code above against the data shown here, for those rows that contain double sets of CRLF (multi-line with spaces in between), I get this:&amp;nbsp; &lt;STRONG&gt;AAG_PAPCARE_DIAMOND_PHNAAG_PAPCARE_QBOA_DIAMOND_PHN&lt;/STRONG&gt;, a concatenation of the first two lines, rather than simply the characters to the Left() of the CRLF.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;To be clear, in the above example, all I need is the value of the very first line, nothing else.&amp;nbsp; I hope I've explained this well enough.&amp;nbsp; If not, I'm happy to supply any further information necessary.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thank you for reading this, and special kudos to anyone supplying clues.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Cheers~!&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Matthew Cummings&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 25 Nov 2020 16:16:04 GMT</pubDate>
      <guid>https://community.qlik.com/t5/QlikView/Clean-and-Filter-unstructured-text-box/m-p/1210642#M880235</guid>
      <dc:creator />
      <dc:date>2020-11-25T16:16:04Z</dc:date>
    </item>
    <item>
      <title>Re: Clean and Filter unstructured text box</title>
      <link>https://community.qlik.com/t5/QlikView/Clean-and-Filter-unstructured-text-box/m-p/1210643#M880236</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I would try changing the settings in the file import wizard so that CR/LF are treated as new lines; picking a different character set or delimiter may make this work.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;You can then treat each line independently and parse it.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Good luck!&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 10 Oct 2016 22:31:56 GMT</pubDate>
      <guid>https://community.qlik.com/t5/QlikView/Clean-and-Filter-unstructured-text-box/m-p/1210643#M880236</guid>
      <dc:creator>stevedark</dc:creator>
      <dc:date>2016-10-10T22:31:56Z</dc:date>
    </item>
    <item>
      <title>Re: Clean and Filter unstructured text box</title>
      <link>https://community.qlik.com/t5/QlikView/Clean-and-Filter-unstructured-text-box/m-p/1210644#M880237</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Steve,&lt;/P&gt;&lt;P&gt;I connect to the source using an ODBC connection.&amp;nbsp; I'm also new to QlikView.&amp;nbsp; I took a look at the "&lt;STRONG&gt;Connect...&lt;/STRONG&gt;" options but didn't find any that sounded like they would bring about the result you suggested.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Can you be more specific, or is that option only available for Data Files?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thank you for your response &lt;IMG src="https://community.qlik.com/legacyfs/online/emoticons/happy.png" /&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 10 Oct 2016 23:04:48 GMT</pubDate>
      <guid>https://community.qlik.com/t5/QlikView/Clean-and-Filter-unstructured-text-box/m-p/1210644#M880237</guid>
      <dc:creator />
      <dc:date>2016-10-10T23:04:48Z</dc:date>
    </item>
    <item>
      <title>Re: Clean and Filter unstructured text box</title>
      <link>https://community.qlik.com/t5/QlikView/Clean-and-Filter-unstructured-text-box/m-p/1210645#M880238</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi Matthew,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;In order to load in a flat file your best bet is to use the &lt;STRONG&gt;Table Files&lt;/STRONG&gt; button.&amp;nbsp; Once you have selected the source file in there (&lt;EM&gt;you may need to change the drop down from &lt;STRONG&gt;All Data Files &lt;/STRONG&gt;to &lt;STRONG&gt;All Files&lt;/STRONG&gt; if your file extension is not recognised&lt;/EM&gt;) you will be given a number of options about how to read the file in.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;As the data are unstructured you may find that reading it in with no embedded header, as a fixed width file with no boundaries, then gives you the best ability to parse the file using the load script.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Apologies if my previous response confused, but I had assumed you were already using the &lt;STRONG&gt;Table Files&lt;/STRONG&gt; button.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Cheers,&lt;/P&gt;&lt;P&gt;Steve&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 11 Oct 2016 05:59:54 GMT</pubDate>
      <guid>https://community.qlik.com/t5/QlikView/Clean-and-Filter-unstructured-text-box/m-p/1210645#M880238</guid>
      <dc:creator>stevedark</dc:creator>
      <dc:date>2016-10-11T05:59:54Z</dc:date>
    </item>
    <item>
      <title>Re: Clean and Filter unstructured text box</title>
      <link>https://community.qlik.com/t5/QlikView/Clean-and-Filter-unstructured-text-box/m-p/1210646#M880239</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thanks Steve,&lt;/P&gt;&lt;P&gt;I appreciate the time you're taking to respond.&amp;nbsp; However, I'm using an &lt;STRONG&gt;ODBC connection&lt;/STRONG&gt; to a data provider called &lt;STRONG&gt;Quickbase&lt;/STRONG&gt;.&amp;nbsp; Think of it as an Access DB, but SaaS.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;No flat files, no Excel, just an old-timey ODBC connection with fewer options and not much of a wizard.&amp;nbsp; I used Table Files to create the prototype and didn't have these problems, but now that I have switched to a direct ODBC connection, I find that the data coming from the back end differs from what comes out of the front end as .csv files.&amp;nbsp; This difference is what's causing me grief, such as the hidden CRLF characters described in my original post.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Any other suggestions would be appreciated.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Cheers~!&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;mfc&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 11 Oct 2016 15:35:30 GMT</pubDate>
      <guid>https://community.qlik.com/t5/QlikView/Clean-and-Filter-unstructured-text-box/m-p/1210646#M880239</guid>
      <dc:creator />
      <dc:date>2016-10-11T15:35:30Z</dc:date>
    </item>
    <item>
      <title>Re: Clean and Filter unstructured text box</title>
      <link>https://community.qlik.com/t5/QlikView/Clean-and-Filter-unstructured-text-box/m-p/1210647#M880240</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;You could try a replace statement on chr(13) and chr(10) to remove the tricky characters?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Steve&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 11 Oct 2016 16:30:02 GMT</pubDate>
      <guid>https://community.qlik.com/t5/QlikView/Clean-and-Filter-unstructured-text-box/m-p/1210647#M880240</guid>
      <dc:creator>stevedark</dc:creator>
      <dc:date>2016-10-11T16:30:02Z</dc:date>
    </item>
    <item>
      <title>Re: Clean and Filter unstructured text box</title>
      <link>https://community.qlik.com/t5/QlikView/Clean-and-Filter-unstructured-text-box/m-p/1210648#M880241</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Steve,&lt;/P&gt;&lt;P&gt;Ultimately, the Replace() function worked!&amp;nbsp; I could not get either Index() or Replace() to recognize Chr(13) (CR), even though Notepad++ recognized the CR code (see the screenshot above); this turned out to be a major source of confusion.&amp;nbsp; Then I dropped the snippet into a hex editor, which revealed the &lt;A href="http://www.maxi-pedia.com/line+termination+line+feed+versus+carriage+return+0d0a"&gt;hex values 0D 0A&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;IMG alt="1.jpg" class="jive-image image-1" src="https://community.qlik.com/legacyfs/online/140389_1.jpg" style="height: 178px; width: 620px;" /&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Once I knew there was no CR and only LF, I knew the path I needed to take.&amp;nbsp; It took further manipulation and toying with replacement values, but I found that the bar '|' worked where the space ' ' didn't.&amp;nbsp; I'm learning a lot about working with hidden characters, so this was a good exercise.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thank you for your help, Sir!&amp;nbsp;&amp;nbsp; &lt;IMG src="https://community.qlik.com/legacyfs/online/emoticons/check.png" /&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Matthew Cummings&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 12 Oct 2016 15:49:44 GMT</pubDate>
      <guid>https://community.qlik.com/t5/QlikView/Clean-and-Filter-unstructured-text-box/m-p/1210648#M880241</guid>
      <dc:creator />
      <dc:date>2016-10-12T15:49:44Z</dc:date>
    </item>
  </channel>
</rss>

