<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Deliminator for Input Files in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/Deliminator-for-Input-Files/m-p/2271427#M49031</link>
    <description>Sure. I am unable to figure out how to post a screenshot to explain this better. The first 4 rows are as expected for the three columns: Product Width and Product Height should be "0".&lt;BR /&gt;The bottom 4 rows are similar; however, the columns have shifted midway through Product Description. In the source data, the raw text looks like " ...Handle, inside...". The comma between Handle and inside appears to be treated as a new column boundary, causing the output to shift over for that row.&lt;BR /&gt;My delimiter is set to pipes ("|"), so I am not sure why a comma would be read as a delimiter as well.&lt;BR /&gt;Output Result&lt;BR /&gt;Product Description	| Product Width	| Product Height&lt;BR /&gt;#######  REAR DOOR  Lock &amp;amp; hardware  Actuator  Toyota  Camry  1997-2001 |	0 | 0&lt;BR /&gt;#######  COWL  Cowl  Front insulator  Toyota  Camry  2010-2011 |	0	|	0&lt;BR /&gt;########FENDER  Structural components &amp;amp; rails  Seal  Toyota  Camry  2010-2011 |	0 |	0	&lt;BR /&gt;#######  FENDER  Structural components &amp;amp; rails  Seal  Toyota  Camry  2010-2011 |	0 |	0	&lt;BR /&gt;--------------------------------------------------------------------------------------------------------------&lt;BR /&gt;########  FRONT DOOR  Lock &amp;amp; hardware  Handle |	 inside  Toyota  Camry  2005-2006 |	0&lt;BR /&gt;########  REAR DOOR  Lock &amp;amp; hardware  Handle |	 inside  Toyota  Camry  2005-2006 |	0&lt;BR /&gt;########  REAR DOOR  Lock &amp;amp; hardware  Handle |	 inside  Toyota  Camry  2005-2006 |	0&lt;BR /&gt;#######  FRONT DOOR  Lock &amp;amp; hardware  Handle |	 inside  Toyota  Camry  2005-2006 |	0</description>
    <pubDate>Wed, 30 Oct 2013 01:14:53 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2013-10-30T01:14:53Z</dc:date>
    <item>
      <title>Deliminator for Input Files</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Deliminator-for-Input-Files/m-p/2271425#M49029</link>
      <description>I am having difficulty getting an input file to read correctly. The file is delimited by pipes ("|"), and for the most part this works. However, there is a description column which occasionally contains commas (","), and these are picked up as delimiters as well. I have double-checked my schema to make sure it separates by pipes ("|") only, and checked the input expression in my mapping as well. 
&lt;BR /&gt;Is there anything I could have missed? I have also gone into the advanced options and selected the CSV option to put quotation marks around each field between the pipes to see if this would mitigate the problem, but so far it has not. 
&lt;BR /&gt;Suggestions? Having the source files sent ahead of time with quotation marks around each field between pipes cannot be done, unless someone knows a way to use sed or some other Linux approach to clean up the file before it enters the Talend job.</description>
      <pubDate>Sun, 27 Oct 2013 20:29:37 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Deliminator-for-Input-Files/m-p/2271425#M49029</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-10-27T20:29:37Z</dc:date>
    </item>
    <item>
      <title>Re: Deliminator for Input Files</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Deliminator-for-Input-Files/m-p/2271426#M49030</link>
      <description>Hi 
&lt;BR /&gt;Can you please give us an example of your data, and the result you expect?
&lt;BR /&gt;Shong</description>
      <pubDate>Mon, 28 Oct 2013 05:24:31 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Deliminator-for-Input-Files/m-p/2271426#M49030</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-10-28T05:24:31Z</dc:date>
    </item>
    <item>
      <title>Re: Deliminator for Input Files</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Deliminator-for-Input-Files/m-p/2271427#M49031</link>
      <description>Sure. I am unable to figure out how to post a screenshot to explain this better. The first 4 rows are as expected for the three columns: Product Width and Product Height should be "0".&lt;BR /&gt;The bottom 4 rows are similar; however, the columns have shifted midway through Product Description. In the source data, the raw text looks like " ...Handle, inside...". The comma between Handle and inside appears to be treated as a new column boundary, causing the output to shift over for that row.&lt;BR /&gt;My delimiter is set to pipes ("|"), so I am not sure why a comma would be read as a delimiter as well.&lt;BR /&gt;Output Result&lt;BR /&gt;Product Description	| Product Width	| Product Height&lt;BR /&gt;#######  REAR DOOR  Lock &amp;amp; hardware  Actuator  Toyota  Camry  1997-2001 |	0 | 0&lt;BR /&gt;#######  COWL  Cowl  Front insulator  Toyota  Camry  2010-2011 |	0	|	0&lt;BR /&gt;########FENDER  Structural components &amp;amp; rails  Seal  Toyota  Camry  2010-2011 |	0 |	0	&lt;BR /&gt;#######  FENDER  Structural components &amp;amp; rails  Seal  Toyota  Camry  2010-2011 |	0 |	0	&lt;BR /&gt;--------------------------------------------------------------------------------------------------------------&lt;BR /&gt;########  FRONT DOOR  Lock &amp;amp; hardware  Handle |	 inside  Toyota  Camry  2005-2006 |	0&lt;BR /&gt;########  REAR DOOR  Lock &amp;amp; hardware  Handle |	 inside  Toyota  Camry  2005-2006 |	0&lt;BR /&gt;########  REAR DOOR  Lock &amp;amp; hardware  Handle |	 inside  Toyota  Camry  2005-2006 |	0&lt;BR /&gt;#######  FRONT DOOR  Lock &amp;amp; hardware  Handle |	 inside  Toyota  Camry  2005-2006 |	0</description>
      <pubDate>Wed, 30 Oct 2013 01:14:53 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Deliminator-for-Input-Files/m-p/2271427#M49031</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-10-30T01:14:53Z</dc:date>
    </item>
    <item>
      <title>Re: Deliminator for Input Files</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Deliminator-for-Input-Files/m-p/2271428#M49032</link>
      <description>I work with the tFileInputDelimited component nearly every day and I have never seen it mix up the separators. Could you please provide a screenshot of the basic settings of the input component?&lt;BR /&gt;To be honest, I cannot believe this. I am pretty sure there is a misconfiguration.</description>
      <pubDate>Wed, 30 Oct 2013 19:32:11 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Deliminator-for-Input-Files/m-p/2271428#M49032</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-10-30T19:32:11Z</dc:date>
    </item>
    <item>
      <title>Re: Deliminator for Input Files</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Deliminator-for-Input-Files/m-p/2271429#M49033</link>
      <description>Hey jlolling, 
&lt;BR /&gt;I'm not allowed to post images or screen shots as my account is too new I believe (10 posts minimum first). Hopefully the info below will be enough. 
&lt;BR /&gt;This was designed as a Metadata Schema &amp;gt; File Delimited 
&lt;BR /&gt;------------------------------------------------------------------ 
&lt;BR /&gt;File: "input_file" 
&lt;BR /&gt;Format: Unix 
&lt;BR /&gt;Encoding: US-ASCII 
&lt;BR /&gt;Field Separator: Custom ANSI &amp;gt; "|" 
&lt;BR /&gt;Escape Char Settings: {checked} Delimited 
&lt;BR /&gt;File_Delimited object in the ETL Job 
&lt;BR /&gt;--------------------------------------------- 
&lt;BR /&gt;Basic Settings: 
&lt;BR /&gt;CSV Row Separator: LF ("\n") 
&lt;BR /&gt;Field Separator: "|" 
&lt;BR /&gt;Text enclosure: """ 
&lt;BR /&gt;Escape Char: """ 
&lt;BR /&gt;CSV Options: {checked} 
&lt;BR /&gt;Schema: DELIM: {schema I created and listed above} 
&lt;BR /&gt;Advanced Settings: 
&lt;BR /&gt;Advanced Separator (for numbers) Thousands separator: "," &amp;lt;---- The description field is varchar so I do not expect this setting would be the cause 
&lt;BR /&gt; 
&lt;BR /&gt;My solution for now is to run a Linux script that removes all commas up front, after which everything runs as expected. I would prefer not to move forward with this method, so any input would be much appreciated.</description>
      <pubDate>Fri, 01 Nov 2013 16:22:05 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Deliminator-for-Input-Files/m-p/2271429#M49033</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-11-01T16:22:05Z</dc:date>
    </item>
    <item>
      <title>Re: Deliminator for Input Files</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Deliminator-for-Input-Files/m-p/2271430#M49034</link>
      <description>Could you please provide an example of your input data? Shong has already asked for it. So far we have seen only the wrong result. I will then try to reproduce the problem and give you feedback on my test results.
&lt;BR /&gt;BTW what version of Talend do you use?</description>
      <pubDate>Sat, 02 Nov 2013 11:28:50 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Deliminator-for-Input-Files/m-p/2271430#M49034</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-11-02T11:28:50Z</dc:date>
    </item>
    <item>
      <title>Re: Deliminator for Input Files</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Deliminator-for-Input-Files/m-p/2271431#M49035</link>
      <description>The current placement of quotation marks is a bit sporadic and I am unable to force the source file provider to correct this. I was unable to get the data to read correctly, so my first step was a Linux sed command that removed every double quote (") in the file. This affects cases such as "Handle, inside", where the quotation marks were originally placed correctly around the text. 
&lt;BR /&gt;Make Name|Model Name|Year|Section Name|Category Name| Sub Category Name|Part#|item Name|item Price|item Description,,,,, 
&lt;BR /&gt;"Toyota|Camry|""2005""-""2006""|""FRONT DOOR""|""Lock &amp;amp; hardware""|""Handle"," inside""|""XXXXXXXXXXX""|""All"," Charcoal Left""|00.00|"" HANDLE SUB-ASSY- DOO""",,, 
&lt;BR /&gt;"Toyota|Camry|""2005""-""2006""|""REAR DOOR""|""Lock &amp;amp; hardware""|""Handle"," inside""|""XXXXXXXXXXXX""|""All"," Charcoal Left""|00.00|""""",,, 
&lt;BR /&gt;"Toyota|Camry|""2005""-""2006""|""REAR DOOR""|""Lock &amp;amp; hardware""|""Handle"," inside""|""XXXXXXXXXXXX""|""All"," Charcoal Left""|00.00|""""",,, 
&lt;BR /&gt;"Toyota|Camry|""2005""-""2006""|""FRONT DOOR""|""Lock &amp;amp; hardware""|""Handle"," inside""|""XXXXXXXXXXXX""|""All"," Charcoal Right""|00.00|"" HANDLE SUB-ASSY- DOO""",,,</description>
      <pubDate>Mon, 04 Nov 2013 04:55:20 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Deliminator-for-Input-Files/m-p/2271431#M49035</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-11-04T04:55:20Z</dc:date>
    </item>
    <item>
      <title>Re: Deliminator for Input Files</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Deliminator-for-Input-Files/m-p/2271432#M49036</link>
      <description>OK, I have a solution that needs no external tools like sed:
&lt;BR /&gt;tFileInputDelimited ---&amp;gt; tJavaRow ---&amp;gt; tFileExtractDelimited ---&amp;gt; .....
&lt;BR /&gt;in tFileInputDelimited:
&lt;BR /&gt;set the field delimiter to something that never occurs in your content so that the line is not split; we want the whole line.
&lt;BR /&gt;set the schema to a single column: line
&lt;BR /&gt;in tJavaRow:
&lt;BR /&gt;
&lt;PRE&gt;output_row.line = input_row.line.replaceAll("\"", "");&lt;/PRE&gt;
&lt;BR /&gt;in tFileExtractDelimited:
&lt;BR /&gt;set your target schema (your 10 columns) and take care that the input schema, with only the line column, is left unchanged (here the input and output schemas differ)
&lt;BR /&gt;set the delimiter to "|"
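&lt;BR /&gt;As a rough illustration of the steps above, here is a minimal plain-Java sketch that can be run outside Talend. It assumes the extract step splits on the pipe character. The class name PipeLineCleaner, the helper methods stripQuotes and splitOnPipe, and the shortened sample line are hypothetical and are not part of any Talend API.

```java
class PipeLineCleaner {
    // Mirrors the tJavaRow step: remove every double quote from the raw line.
    static String stripQuotes(String line) {
        return line.replaceAll("\"", "");
    }

    // Mirrors the extract step, assuming the pipe is the only delimiter.
    // The -1 limit keeps trailing empty fields instead of dropping them.
    static String[] splitOnPipe(String line) {
        return line.split("\\|", -1);
    }

    public static void main(String[] args) {
        // Hypothetical, shortened sample modeled on the rows posted in this thread.
        String raw = "\"Toyota|Camry|\"\"2005\"\"-\"\"2006\"\"|\"\"Handle\",\" inside\"\"|00.00\",,,";
        String[] fields = splitOnPipe(stripQuotes(raw));
        for (String field : fields) {
            System.out.println("[" + field + "]");
        }
    }
}
```

&lt;BR /&gt;With the sample line this prints five fields, and "Handle, inside" stays in one field because only the pipe is used for splitting. The trailing commas remain attached to the last field, as in the output below.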
&lt;BR /&gt;This way you will receive an output like this (the pipe is from my tLogRow):
&lt;BR /&gt;make model year section category subcategory part item item_price item_desc
&lt;BR /&gt;Toyota|Camry|2005-2006|FRONT DOOR|Lock &amp;amp; hardware|Handle, inside|XXXXXXXXXXX|All, Charcoal Left|00.00|HANDLE SUB-ASSY- DOO,,,
&lt;BR /&gt;Toyota|Camry|2005-2006|REAR DOOR|Lock &amp;amp; hardware|Handle, inside|XXXXXXXXXXXX|All, Charcoal Left|00.00|,,,
&lt;BR /&gt;Toyota|Camry|2005-2006|REAR DOOR|Lock &amp;amp; hardware|Handle, inside|XXXXXXXXXXXX|All, Charcoal Left|00.00|,,,
&lt;BR /&gt;Toyota|Camry|2005-2006|FRONT DOOR|Lock &amp;amp; hardware|Handle, inside|XXXXXXXXXXXX|All, Charcoal Right|00.00|HANDLE SUB-ASSY- DOO,,,</description>
      <pubDate>Mon, 04 Nov 2013 07:18:55 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Deliminator-for-Input-Files/m-p/2271432#M49036</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-11-04T07:18:55Z</dc:date>
    </item>
    <item>
      <title>Re: Deliminator for Input Files</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Deliminator-for-Input-Files/m-p/2271433#M49037</link>
      <description>Thanks jlolling, this should help a lot. I had not really considered a tJavaRow approach but I like the way you send the data through. 
&lt;BR /&gt;I'll keep you posted if I have any more questions, and thank you for the help!</description>
      <pubDate>Tue, 05 Nov 2013 00:25:02 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Deliminator-for-Input-Files/m-p/2271433#M49037</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-11-05T00:25:02Z</dc:date>
    </item>
  </channel>
</rss>

