<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Address Standardization, possibly using tExtractRegexFields in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/Address-Standardization-possibly-using-tExtractRegexFields/m-p/2266886#M45957</link>
    <description>Hi mw629,&lt;BR /&gt;Have you tried &lt;A href="https://help.talend.com/search/all?query=tExtractDelimitedFields&amp;amp;content-lang=en" target="_blank" rel="nofollow noopener noreferrer"&gt;TalendHelpCenter:tExtractDelimitedFields&lt;/A&gt;? It can generate multiple columns from a given column in a delimited file.&lt;BR /&gt;&lt;BR /&gt;Best regards&lt;BR /&gt;Sabrina</description>
    <pubDate>Wed, 19 Nov 2014 06:33:31 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2014-11-19T06:33:31Z</dc:date>
    <item>
      <title>Address Standardization, possibly using tExtractRegexFields</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Address-Standardization-possibly-using-tExtractRegexFields/m-p/2266885#M45956</link>
      <description>Hi, I am using an enterprise version of Talend for Data Services.
&lt;BR /&gt;I am trying to standardize address data for a large data set, but the Google components don't work for me for two reasons: (1) they are extremely slow, and (2) I run out of available queries, because the data set has over 40,000 records.
&lt;BR /&gt;The addresses can arrive in two formats.
&lt;BR /&gt;Format 1, five separate columns:
&lt;BR /&gt;Address Line, City, State, Zip, Country
&lt;BR /&gt;Example: 100 Main St | New York | New York | 90909 | US
&lt;BR /&gt;Format 2, one column:
&lt;BR /&gt;Address 
&lt;BR /&gt;Example: 100 Main St, New York, New York, 90909, US 
&lt;BR /&gt;I need to have the data separated like this: 
&lt;BR /&gt;Address Number, Street Name, City, State, Zip Code, Country 
&lt;BR /&gt; 
&lt;BR /&gt;I am having trouble getting the regex right, as I am new to Java and to the way Talend works. Is there a better way to do this, or can anyone suggest how to set up the regex?
&lt;BR /&gt;The job process is currently: 
&lt;BR /&gt;FTPget---tFileInputDelimited---tMap (modifying columns)---tSplitRow (used to pivot certain items)---tHashOutput
&lt;BR /&gt;Somewhere in that flow I need to split out the address fields. Any help with this would be great! Thank you.</description>
      <pubDate>Thu, 13 Nov 2014 15:17:19 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Address-Standardization-possibly-using-tExtractRegexFields/m-p/2266885#M45956</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2014-11-13T15:17:19Z</dc:date>
    </item>
    <item>
      <title>Re: Address Standardization, possibly using tExtractRegexFields</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Address-Standardization-possibly-using-tExtractRegexFields/m-p/2266886#M45957</link>
      <description>Hi mw629,&lt;BR /&gt;Have you tried &lt;A href="https://help.talend.com/search/all?query=tExtractDelimitedFields&amp;amp;content-lang=en" target="_blank" rel="nofollow noopener noreferrer"&gt;TalendHelpCenter:tExtractDelimitedFields&lt;/A&gt;? It can generate multiple columns from a given column in a delimited file.&lt;BR /&gt;&lt;BR /&gt;Best regards&lt;BR /&gt;Sabrina</description>
      <pubDate>Wed, 19 Nov 2014 06:33:31 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Address-Standardization-possibly-using-tExtractRegexFields/m-p/2266886#M45957</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2014-11-19T06:33:31Z</dc:date>
    </item>
  </channel>
</rss>

