<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic tExtractRegex usage and escaping for talend or java in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/tExtractRegex-usage-and-escaping-for-talend-or-java/m-p/2358546#M123552</link>
    <description>&lt;P&gt;I have a column in my data that I am trying to break into 5 columns on a | delimiter. I ended up using&amp;nbsp;&lt;STRONG&gt;tExtractRegexFields and finally got a pattern with capture groups to work in regex testers, but the Talend regex won't escape the pipe ( | ), and I get odd results after the tExtractRegexFields and tConvert steps (split into strings, then try to cast).&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;The regex tester is here&amp;nbsp;&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;Here is my pattern:&amp;nbsp;^([0-9\.]*) \| ([0-9\.]*) \| ([0-9\.]*) \| (.*) \| (.*)$&lt;/STRONG&gt;&lt;BR /&gt;Here is a sample of the messy data column:&amp;nbsp;1 | 6.39 | 9.76 | FL500S | FILTER ASY - OIL&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;In the debug console, the tLog output after the regex has the | column separator replaced with [] so I can see whether the pipes were removed and new columns were made.&lt;BR /&gt;The tLog row before the regex uses :: as the column separator instead of |.&lt;BR /&gt;&lt;BR /&gt;Repair_Order SoLine SoPartLine Qty Cost List Part Part_Description&lt;BR /&gt;Repair_Order SoLine SoPartLine Qty_Cost_List_Part_Desc&lt;BR /&gt;6262880::3::1::1 | 1736.33 | 2315.11 | 7L3Z7000ABRM | AUTOMATIC T&lt;BR /&gt;6262880 [] 3 [] 1 [] 1 [] &amp;nbsp;[] &amp;nbsp;[] &amp;nbsp;[]&amp;nbsp;&lt;BR /&gt;6262880 [] 3 [] 1 [] &amp;nbsp;[] 1736.33 | 2315.11 | 7L3Z7000ABRM | AUTOMATIC [] &amp;nbsp;[] &amp;nbsp;[]&amp;nbsp;&lt;BR /&gt;6262880::3::2::1 | 600.00 | 600.00 | 7L3Z7000ABRM-C | 7L3Z 7000 A&lt;BR /&gt;6262880 [] 3 [] 2 [] 1 [] &amp;nbsp;[] &amp;nbsp;[] &amp;nbsp;[]&amp;nbsp;&lt;BR /&gt;6262880 [] 3 [] 2 [] &amp;nbsp;[] 600.00 | 600.00 | 7L3Z7000ABRM-C | 7L3Z 7000 [] &amp;nbsp;[] &amp;nbsp;[]&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Wed, 29 Mar 2017 17:36:40 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2017-03-29T17:36:40Z</dc:date>
    <item>
      <title>tExtractRegex usage and escaping for talend or java</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tExtractRegex-usage-and-escaping-for-talend-or-java/m-p/2358546#M123552</link>
      <description>&lt;P&gt;I have a column in my data that I am trying to break into 5 columns on a | delimiter. I ended up using&amp;nbsp;&lt;STRONG&gt;tExtractRegexFields and finally got a pattern with capture groups to work in regex testers, but the Talend regex won't escape the pipe ( | ), and I get odd results after the tExtractRegexFields and tConvert steps (split into strings, then try to cast).&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;The regex tester is here&amp;nbsp;&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;Here is my pattern:&amp;nbsp;^([0-9\.]*) \| ([0-9\.]*) \| ([0-9\.]*) \| (.*) \| (.*)$&lt;/STRONG&gt;&lt;BR /&gt;Here is a sample of the messy data column:&amp;nbsp;1 | 6.39 | 9.76 | FL500S | FILTER ASY - OIL&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;In the debug console, the tLog output after the regex has the | column separator replaced with [] so I can see whether the pipes were removed and new columns were made.&lt;BR /&gt;The tLog row before the regex uses :: as the column separator instead of |.&lt;BR /&gt;&lt;BR /&gt;Repair_Order SoLine SoPartLine Qty Cost List Part Part_Description&lt;BR /&gt;Repair_Order SoLine SoPartLine Qty_Cost_List_Part_Desc&lt;BR /&gt;6262880::3::1::1 | 1736.33 | 2315.11 | 7L3Z7000ABRM | AUTOMATIC T&lt;BR /&gt;6262880 [] 3 [] 1 [] 1 [] &amp;nbsp;[] &amp;nbsp;[] &amp;nbsp;[]&amp;nbsp;&lt;BR /&gt;6262880 [] 3 [] 1 [] &amp;nbsp;[] 1736.33 | 2315.11 | 7L3Z7000ABRM | AUTOMATIC [] &amp;nbsp;[] &amp;nbsp;[]&amp;nbsp;&lt;BR /&gt;6262880::3::2::1 | 600.00 | 600.00 | 7L3Z7000ABRM-C | 7L3Z 7000 A&lt;BR /&gt;6262880 [] 3 [] 2 [] 1 [] &amp;nbsp;[] &amp;nbsp;[] &amp;nbsp;[]&amp;nbsp;&lt;BR /&gt;6262880 [] 3 [] 2 [] &amp;nbsp;[] 600.00 | 600.00 | 7L3Z7000ABRM-C | 7L3Z 7000 [] &amp;nbsp;[] &amp;nbsp;[]&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 29 Mar 2017 17:36:40 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tExtractRegex-usage-and-escaping-for-talend-or-java/m-p/2358546#M123552</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-03-29T17:36:40Z</dc:date>
    </item>
    <item>
      <title>Re: tExtractRegex usage and escaping for talend or java</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tExtractRegex-usage-and-escaping-for-talend-or-java/m-p/2358547#M123553</link>
      <description>regex tester at &lt;A href="http://www.regextester.com/?fam=97285" target="_blank" rel="nofollow noopener noreferrer"&gt;www.regextester.com/?fam=97285&lt;/A&gt;</description>
      <pubDate>Wed, 29 Mar 2017 17:39:27 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tExtractRegex-usage-and-escaping-for-talend-or-java/m-p/2358547#M123553</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-03-29T17:39:27Z</dc:date>
    </item>
    <item>
      <title>Re: tExtractRegex usage and escaping for talend or java</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tExtractRegex-usage-and-escaping-for-talend-or-java/m-p/2358548#M123554</link>
      <description>So the fix is a double backslash, to escape the escape character in Java, I guess. I was getting 2 or 3 rows per record because the grouping wasn't working: an unescaped pipe means alternation (either/or) in regex, which made the issue harder for me to catch while debugging.
&lt;BR /&gt;"^([0-9]*) \\| (.*) \\| (.*) \\| (.*) \\| (.*)$" used to parse: "6 | 3.75 | 7.86 | XO5W20BFS | MOTORCRAFT SAE 5W-20"
&lt;BR /&gt;gave me the results I wanted
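&lt;BR /&gt;
&lt;BR /&gt;A minimal sketch of the same escaping in plain Java, outside Talend. The class name PipeSplitSketch and the extract helper are made up for illustration, and it assumes the component's pattern field is read like a Java string literal, which is what the double-backslash fix suggests:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PipeSplitSketch {
    // In Java source, "\\|" is the two-character regex \| (a literal pipe).
    // An unescaped | would be read as alternation instead.
    static final Pattern ROW = Pattern.compile(
        "^([0-9]*) \\| (.*) \\| (.*) \\| (.*) \\| (.*)$");

    // Hypothetical helper: returns the five captured fields,
    // or null when the row does not match the pattern.
    static String[] extract(String row) {
        Matcher m = ROW.matcher(row);
        if (!m.matches()) {
            return null;
        }
        return new String[] {
            m.group(1), m.group(2), m.group(3), m.group(4), m.group(5)
        };
    }

    public static void main(String[] args) {
        String[] fields = extract("6 | 3.75 | 7.86 | XO5W20BFS | MOTORCRAFT SAE 5W-20");
        // Join with [] the same way the tLog separator was set up.
        System.out.println(String.join(" [] ", fields));
    }
}
```

&lt;BR /&gt;Each capture group becomes one output column, so a row that does not match the whole pattern yields no fields at all rather than a partial split.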
&lt;BR /&gt;
&lt;BR /&gt;6298055::3::2::6 | 3.75 | 7.86 | XO5W20BFS | MOTORCRAFT SAE 5W-20
&lt;BR /&gt;
&lt;BR /&gt;6298055 [] 3 [] 2 [] 6 [] 3.75 [] 7.86 [] XO5W20BFS [] MOTORCRAFT SAE 5W-20</description>
      <pubDate>Wed, 29 Mar 2017 21:44:44 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tExtractRegex-usage-and-escaping-for-talend-or-java/m-p/2358548#M123554</guid>
      <dc:creator>_AnonymousUser</dc:creator>
      <dc:date>2017-03-29T21:44:44Z</dc:date>
    </item>
    <item>
      <title>Re: tExtractRegex usage and escaping for talend or java</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tExtractRegex-usage-and-escaping-for-talend-or-java/m-p/2358549#M123555</link>
      <description>Glad to hear you fixed your issue. I think you can also split a field on a delimiter character using the tExtractDelimitedFields component.</description>
      <pubDate>Wed, 29 Mar 2017 22:32:32 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tExtractRegex-usage-and-escaping-for-talend-or-java/m-p/2358549#M123555</guid>
      <dc:creator>cterenzi</dc:creator>
      <dc:date>2017-03-29T22:32:32Z</dc:date>
    </item>
  </channel>
</rss>

