<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: [resolved] Collect rejects from tFileInputDelimited in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/resolved-Collect-rejects-from-tFileInputDelimited/m-p/2248941#M33633</link>
    <description>Take a look at this documentation: &lt;A href="https://help.talend.com/search/all?query=tExtractDelimitedFields&amp;amp;content-lang=en" rel="nofollow noopener noreferrer"&gt;https://help.talend.com/search/all?query=tExtractDelimitedFields&amp;amp;content-lang=en&lt;/A&gt;&lt;BR /&gt;This is what it says about the field separator:&lt;BR /&gt;&lt;FONT size="2"&gt;&lt;FONT face="noto, Helvetica, Arial, sans-serif"&gt;Since this component uses regex to split a field and the regex syntax uses special characters as operators, make sure to precede the regex operator you use as a field separator by a double backslash. For example, you have to use "\\|" instead of "|".&lt;/FONT&gt;&lt;/FONT&gt;</description>
    <pubDate>Tue, 20 Dec 2016 14:58:41 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2016-12-20T14:58:41Z</dc:date>
    <item>
      <title>[resolved] Collect rejects from tFileInputDelimited</title>
      <link>https://community.qlik.com/t5/Talend-Studio/resolved-Collect-rejects-from-tFileInputDelimited/m-p/2248931#M33623</link>
      <description>Hello Team,&lt;BR /&gt;I need to process a delimited file and collect all the rejects.&lt;BR /&gt;To do so, I use tFileInputDelimited -&amp;gt; Rejects -&amp;gt; tFileOutputDelimited, where tFileOutputDelimited is configured to write to a .txt file. Unfortunately, this output file contains partially parsed rejected lines along with an error message. What I really need to collect in this file is the original lines from the input file that were rejected, without any additional information.&lt;BR /&gt;Here is an example. In the input file I have the following line:&lt;BR /&gt;&lt;I&gt;"AAA|BBB"|Literacy 9 Student Book A|NSB2B/NSB2C|Softcover|Softcover|1024118|0176398163|9780176398163|Literacy 9 Student Book A|ITEM-B2B|Online Student Centre, 5 year||Literacy 9 Student Book A Online Student Centre, 5 year|31|PRDONLYSUP|158&lt;/I&gt;&lt;BR /&gt;This line will be rejected since it is not properly formed according to the CSV schema that I defined. In the output file, instead of the full line above, I only get the following:&lt;BR /&gt;&lt;I&gt;"AAA|BBB"|Literacy 9 Student Book A|NSB2B/NSB2C|Softcover|||||||||||||For input string: "Softcover" - Line: 0&lt;/I&gt;&lt;BR /&gt;Is there any way to collect the original input lines?&lt;BR /&gt;Thank you!&lt;BR /&gt;Svetlana</description>
      <pubDate>Wed, 14 Dec 2016 13:41:39 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/resolved-Collect-rejects-from-tFileInputDelimited/m-p/2248931#M33623</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-12-14T13:41:39Z</dc:date>
    </item>
    <item>
      <title>Re: [resolved] Collect rejects from tFileInputDelimited</title>
      <link>https://community.qlik.com/t5/Talend-Studio/resolved-Collect-rejects-from-tFileInputDelimited/m-p/2248932#M33624</link>
      <description>Hi,
&lt;BR /&gt;What does your expected result look like? How did you define the CSV schema?
&lt;BR /&gt;Have you tried to use the component 
&lt;A href="https://help.talend.com/search/all?query=tSchemaComplianceCheck&amp;amp;content-lang=en" target="_blank" rel="nofollow noopener noreferrer"&gt;TalendHelpCenter:tSchemaComplianceCheck&lt;/A&gt; which is used to validate all input rows against a reference schema or check types, nullability, length of rows against reference values to see if it works?
&lt;BR /&gt;Best regards
&lt;BR /&gt;Sabrina</description>
      <pubDate>Thu, 15 Dec 2016 03:52:03 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/resolved-Collect-rejects-from-tFileInputDelimited/m-p/2248932#M33624</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-12-15T03:52:03Z</dc:date>
    </item>
    <item>
      <title>Re: [resolved] Collect rejects from tFileInputDelimited</title>
      <link>https://community.qlik.com/t5/Talend-Studio/resolved-Collect-rejects-from-tFileInputDelimited/m-p/2248933#M33625</link>
      <description>Hi Sabrina, 
&lt;BR /&gt;My expected result will contain the full line from the input file that was rejected. 
&lt;BR /&gt;For example, this line will throw an exception: 
&lt;BR /&gt;"AAA|BBB"|Literacy 9 Student Book A|NSB2B/NSB2C|Softcover|Softcover|1024118|0176398163|9780176398163|Literacy 9 Student Book A|ITEM-B2B|Online Student Centre, 5 year||Literacy 9 Student Book A Online Student Centre, 5 year|31|PRDONLYSUP|158 
&lt;BR /&gt;In this case, my expected result (the rejects file) will contain this line in full: 
&lt;BR /&gt;"AAA|BBB"|Literacy 9 Student Book A|NSB2B/NSB2C|Softcover|Softcover|1024118|0176398163|9780176398163|Literacy 9 Student Book A|ITEM-B2B|Online Student Centre, 5 year||Literacy 9 Student Book A Online Student Centre, 5 year|31|PRDONLYSUP|158 
&lt;BR /&gt;My schema definition is attached as a screenshot. 
&lt;BR /&gt;I did not try to use tSchemaComplianceCheck component, so I am going to take a look at it. 
&lt;BR /&gt;Thank you! 
&lt;BR /&gt;Svetlana 
&lt;BR /&gt; 
&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009MCHs.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/130160i7A49AF8B331669A4/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009MCHs.png" alt="0683p000009MCHs.png" /&gt;&lt;/span&gt;</description>
      <pubDate>Thu, 15 Dec 2016 13:54:54 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/resolved-Collect-rejects-from-tFileInputDelimited/m-p/2248933#M33625</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-12-15T13:54:54Z</dc:date>
    </item>
    <item>
      <title>Re: [resolved] Collect rejects from tFileInputDelimited</title>
      <link>https://community.qlik.com/t5/Talend-Studio/resolved-Collect-rejects-from-tFileInputDelimited/m-p/2248934#M33626</link>
      <description>Hi Sabrina, 
&lt;BR /&gt;tSchemaComplianceCheck doesn't seem to help. The line that I want in the rejects file fails because its number of columns differs from what is defined in the schema, so it always fails in tFileInputDelimited without ever reaching the tSchemaComplianceCheck component (see the attached screenshot). 
&lt;BR /&gt;Thank you! 
&lt;BR /&gt;Svetlana 
&lt;BR /&gt; 
&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009MCen.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/130652i8DA908B11638A88D/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009MCen.png" alt="0683p000009MCen.png" /&gt;&lt;/span&gt;</description>
      <pubDate>Thu, 15 Dec 2016 14:47:11 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/resolved-Collect-rejects-from-tFileInputDelimited/m-p/2248934#M33626</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-12-15T14:47:11Z</dc:date>
    </item>
    <item>
      <title>Re: [resolved] Collect rejects from tFileInputDelimited</title>
      <link>https://community.qlik.com/t5/Talend-Studio/resolved-Collect-rejects-from-tFileInputDelimited/m-p/2248935#M33627</link>
      <description>Hi,
&lt;BR /&gt;
&lt;BLOCKQUOTE&gt;
 &lt;TABLE border="1"&gt;
  &lt;TBODY&gt;
   &lt;TR&gt;
    &lt;TD&gt;&lt;I&gt;&lt;FONT size="1"&gt;&lt;FONT face="Verdana, Helvetica, Arial, sans-serif"&gt;"AAA|BBB"|Literacy 9 Student Book A|NSB2B/NSB2C|Softcover|||||||||||||For input string: "Softcover" - Line: 0&lt;/FONT&gt;&lt;/FONT&gt;&lt;/I&gt;&lt;/TD&gt;
   &lt;/TR&gt;
  &lt;/TBODY&gt;
 &lt;/TABLE&gt;
&lt;/BLOCKQUOTE&gt;
&lt;BR /&gt;
&lt;FONT size="1"&gt;&lt;FONT face="Verdana, Helvetica, Arial, sans-serif"&gt;This field contains string values such as "&lt;/FONT&gt;&lt;/FONT&gt;
&lt;I&gt;&lt;FONT size="1"&gt;&lt;FONT face="Verdana, Helvetica, Arial, sans-serif"&gt;Softcover&lt;/FONT&gt;&lt;/FONT&gt;&lt;/I&gt;
&lt;FONT size="1"&gt;&lt;FONT face="Verdana, Helvetica, Arial, sans-serif"&gt;", but you are using "Long" data type to read it.&amp;nbsp;&lt;/FONT&gt;&lt;/FONT&gt;
&lt;FONT size="1"&gt;&lt;FONT face="Verdana, Helvetica, Arial, sans-serif"&gt;Try to read this column with string data type and&amp;nbsp;&lt;/FONT&gt;&lt;/FONT&gt;
&lt;FONT size="1"&gt;&lt;FONT face="Verdana, Helvetica, Arial, sans-serif"&gt;validate the input rows against a reference schema by using&amp;nbsp;&lt;/FONT&gt;&lt;/FONT&gt;
&lt;FONT size="1"&gt;&lt;FONT face="Verdana, Helvetica, Arial, sans-serif"&gt;tSchemaComplianceCheck to see if it works?&lt;/FONT&gt;&lt;/FONT&gt;
&lt;BR /&gt;
&lt;FONT size="1"&gt;&lt;FONT face="Verdana, Helvetica, Arial, sans-serif"&gt;Best regards&lt;/FONT&gt;&lt;/FONT&gt;
&lt;BR /&gt;
&lt;FONT size="1"&gt;&lt;FONT face="Verdana, Helvetica, Arial, sans-serif"&gt;Sabrina&lt;/FONT&gt;&lt;/FONT&gt;</description>
      <pubDate>Fri, 16 Dec 2016 09:55:02 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/resolved-Collect-rejects-from-tFileInputDelimited/m-p/2248935#M33627</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-12-16T09:55:02Z</dc:date>
    </item>
    <item>
      <title>Re: [resolved] Collect rejects from tFileInputDelimited</title>
      <link>https://community.qlik.com/t5/Talend-Studio/resolved-Collect-rejects-from-tFileInputDelimited/m-p/2248936#M33628</link>
      <description>Hi Sabrina, 
&lt;BR /&gt;You are absolutely right: this line is deliberately incorrect. If I remove the "AAA|BBB" part, it is processed without any issues. My goal here is to collect all original lines from the input file that may throw an exception. Our program will process an input CSV file from our client application, and there is absolutely no guarantee that all the lines in the file will be well formed. We need to process what we can and collect the rest in a rejects file to send back to the client, so that they can fix these rejects and resubmit them. So I need to be able to send back each original line as it came from their input file, meaning I need to have this full incorrect line in my rejects file: 
&lt;BR /&gt;"AAA|BBB"|Literacy 9 Student Book A|NSB2B/NSB2C|Softcover|Softcover|1024118|0176398163|9780176398163|Literacy 9 Student Book A|ITEM-B2B|Online Student Centre, 5 year||Literacy 9 Student Book A Online Student Centre, 5 year|31|PRDONLYSUP|158 
&lt;BR /&gt;Thank you, 
&lt;BR /&gt;Svetlana</description>
      <pubDate>Fri, 16 Dec 2016 14:13:05 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/resolved-Collect-rejects-from-tFileInputDelimited/m-p/2248936#M33628</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-12-16T14:13:05Z</dc:date>
    </item>
    <item>
      <title>Re: [resolved] Collect rejects from tFileInputDelimited</title>
      <link>https://community.qlik.com/t5/Talend-Studio/resolved-Collect-rejects-from-tFileInputDelimited/m-p/2248937#M33629</link>
      <description>&lt;FONT size="2"&gt;&lt;FONT face="Verdana, sans-serif"&gt;Hi,&lt;/FONT&gt;&lt;/FONT&gt;
&lt;BR /&gt;
&lt;FONT size="2"&gt;&lt;FONT face="Verdana, sans-serif"&gt;Could you please try to read this column with string data type (Actually, there is no check for String in talend)and&amp;nbsp;validate the input rows against a reference schema by using&amp;nbsp;tSchemaComplianceCheck to see if it works?&lt;/FONT&gt;&lt;/FONT&gt;
&lt;BR /&gt;
&lt;FONT size="2"&gt;&lt;FONT face="Verdana, sans-serif"&gt;Best regards&lt;/FONT&gt;&lt;/FONT&gt;
&lt;BR /&gt;
&lt;FONT size="2"&gt;&lt;FONT face="Verdana, sans-serif"&gt;Sabrina&lt;/FONT&gt;&lt;/FONT&gt;</description>
      <pubDate>Mon, 19 Dec 2016 08:25:26 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/resolved-Collect-rejects-from-tFileInputDelimited/m-p/2248937#M33629</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-12-19T08:25:26Z</dc:date>
    </item>
    <item>
      <title>Re: [resolved] Collect rejects from tFileInputDelimited</title>
      <link>https://community.qlik.com/t5/Talend-Studio/resolved-Collect-rejects-from-tFileInputDelimited/m-p/2248938#M33630</link>
      <description>Hi Sabrina, 
&lt;BR /&gt;I am not sure which column you are referring to, but I tried the following: 
&lt;BR /&gt;1. If I modify the schema and read all columns according to their types (first two columns as strings), then yes, everything works fine: all lines reach the tSchemaComplianceCheck component and can be validated. 
&lt;BR /&gt;2. When I modify the input so that it does not match the schema (in this case, one extra string column at the beginning of the line, which is read as a string), please see the attached screenshot. As you can see, the rejects happened in tFileInputDelimited; they did not even reach tSchemaComplianceCheck for schema validation. The rejects happen because one of the fields of this line was expected to be a long but turned out to be a String. 
&lt;BR /&gt; 
&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009MCes.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/145449iDE71B210AC08FC54/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009MCes.png" alt="0683p000009MCes.png" /&gt;&lt;/span&gt; 
&lt;BR /&gt;3. I also tried to read the entire line as one big "input" field of type String and pass it to tSchemaComplianceCheck, hoping it would figure out how to parse it, but it did not, and it gave me an "input cannot be resolved or not a field" error: 
&lt;BR /&gt; 
&lt;BR /&gt; 
&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009MCaw.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/149035iE7FB24D3F415A921/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009MCaw.png" alt="0683p000009MCaw.png" /&gt;&lt;/span&gt; 
&lt;BR /&gt;The only workaround that I see so far is to read each line of the input file as one long String and pass it to tExtractDelimitedFields for parsing. Then, when it fails to parse a line, I can use tJavaRow to collect the value of the row that was passed to tExtractDelimitedFields. But I am facing a strange problem here: for whatever reason, it does not parse my input line correctly. It splits every letter into a separate field (see the screenshot below). If you could help me figure this one out, I can use this workaround to collect the information that I need. The configuration of tExtractDelimitedFields is attached. 
&lt;BR /&gt; 
&lt;BR /&gt; 
&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009MCex.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/131801iD585F7CFE3C6E879/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009MCex.png" alt="0683p000009MCex.png" /&gt;&lt;/span&gt; 
&lt;BR /&gt; 
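&lt;BR /&gt;A minimal sketch of this workaround in plain Java, outside Talend, in case it helps to see the idea in isolation. It assumes a hypothetical three-column schema whose second column must parse as a long; the class RejectCollector and its methods are illustrative names, not Talend APIs. Each raw line is kept as is, split on a literal pipe, and written unchanged to the rejects when parsing fails.

```java
public class RejectCollector {

    // Hypothetical schema for this sketch: exactly 3 columns,
    // and the column at index 1 must parse as a long.
    static final int EXPECTED_COLUMNS = 3;
    static final int LONG_COLUMN = 1;

    // "\\|" is the Java string literal for the regex \| (a literal pipe).
    // A bare "|" would be regex alternation of two empty patterns, which
    // matches the empty string at every position in the line.
    static String[] splitFields(String line) {
        // The -1 limit keeps trailing empty fields such as "a||".
        return line.split("\\|", -1);
    }

    // Returns true when the line fits the hypothetical schema.
    static boolean isValid(String line) {
        String[] fields = splitFields(line);
        if (fields.length != EXPECTED_COLUMNS) {
            return false;
        }
        try {
            Long.parseLong(fields[LONG_COLUMN]);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Returns the rejected lines exactly as they were read, one per line.
    static String collectRejects(String[] lines) {
        StringBuilder rejects = new StringBuilder();
        for (String line : lines) {
            if (!isValid(line)) {
                rejects.append(line).append('\n');
            }
        }
        return rejects.toString();
    }

    public static void main(String[] args) {
        String[] input = {
            "Book A|1024118|Softcover",
            "\"AAA|BBB\"|Book A|Softcover|1024118"
        };
        // Prints only the second line, untouched.
        System.out.print(collectRejects(input));
    }
}
```

&lt;BR /&gt;This sketch does not honor quoted fields, so "AAA|BBB" counts as two columns, which is what makes the second sample line fail the column-count check.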
&lt;BR /&gt;Thank you! 
&lt;BR /&gt;Svetlana 
&lt;BR /&gt; 
&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009MCcn.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/151842i482FB12D121DC5D6/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009MCcn.png" alt="0683p000009MCcn.png" /&gt;&lt;/span&gt;</description>
      <pubDate>Mon, 19 Dec 2016 15:14:42 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/resolved-Collect-rejects-from-tFileInputDelimited/m-p/2248938#M33630</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-12-19T15:14:42Z</dc:date>
    </item>
    <item>
      <title>Re: [resolved] Collect rejects from tFileInputDelimited</title>
      <link>https://community.qlik.com/t5/Talend-Studio/resolved-Collect-rejects-from-tFileInputDelimited/m-p/2248939#M33631</link>
      <description>I figured it out. As the field separator, I have to use "\\|" instead of "|". It works now.&lt;BR /&gt;Thank you.&lt;BR /&gt;Svetlana</description>
      <pubDate>Tue, 20 Dec 2016 12:56:22 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/resolved-Collect-rejects-from-tFileInputDelimited/m-p/2248939#M33631</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-12-20T12:56:22Z</dc:date>
    </item>
    <item>
      <title>Re: [resolved] Collect rejects from tFileInputDelimited</title>
      <link>https://community.qlik.com/t5/Talend-Studio/resolved-Collect-rejects-from-tFileInputDelimited/m-p/2248940#M33632</link>
      <description>What is the difference between "|" and "\\|"?</description>
      <pubDate>Tue, 20 Dec 2016 14:53:19 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/resolved-Collect-rejects-from-tFileInputDelimited/m-p/2248940#M33632</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-12-20T14:53:19Z</dc:date>
    </item>
    <item>
      <title>Re: [resolved] Collect rejects from tFileInputDelimited</title>
      <link>https://community.qlik.com/t5/Talend-Studio/resolved-Collect-rejects-from-tFileInputDelimited/m-p/2248941#M33633</link>
      <description>Take a look at this documentation: &lt;A href="https://help.talend.com/search/all?query=tExtractDelimitedFields&amp;amp;content-lang=en" rel="nofollow noopener noreferrer"&gt;https://help.talend.com/search/all?query=tExtractDelimitedFields&amp;amp;content-lang=en&lt;/A&gt;&lt;BR /&gt;This is what it says about the field separator:&lt;BR /&gt;&lt;FONT size="2"&gt;&lt;FONT face="noto, Helvetica, Arial, sans-serif"&gt;Since this component uses regex to split a field and the regex syntax uses special characters as operators, make sure to precede the regex operator you use as a field separator by a double backslash. For example, you have to use "\\|" instead of "|".&lt;/FONT&gt;&lt;/FONT&gt;</description>
      <pubDate>Tue, 20 Dec 2016 14:58:41 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/resolved-Collect-rejects-from-tFileInputDelimited/m-p/2248941#M33633</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-12-20T14:58:41Z</dc:date>
    </item>
    <item>
      <title>Re: [resolved] Collect rejects from tFileInputDelimited</title>
      <link>https://community.qlik.com/t5/Talend-Studio/resolved-Collect-rejects-from-tFileInputDelimited/m-p/2248942#M33634</link>
      <description>Hi,&lt;BR /&gt;Did you try joining the input data file with the reject file in a separate subjob after the first one has finished?&lt;BR /&gt;A little transformation plus some adjustment of the separators should work.&lt;BR /&gt;Regards,&lt;BR /&gt;TRF</description>
      <pubDate>Tue, 20 Dec 2016 15:31:23 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/resolved-Collect-rejects-from-tFileInputDelimited/m-p/2248942#M33634</guid>
      <dc:creator>TRF</dc:creator>
      <dc:date>2016-12-20T15:31:23Z</dc:date>
    </item>
    <item>
      <title>Re: [resolved] Collect rejects from tFileInputDelimited</title>
      <link>https://community.qlik.com/t5/Talend-Studio/resolved-Collect-rejects-from-tFileInputDelimited/m-p/2248943#M33635</link>
      <description>Hi TRF,&lt;BR /&gt;My problem was getting the rejects file in the first place. The rejected lines returned by tFileInputDelimited were not the complete original lines. It would return partially processed lines, so if parsing failed on the very first column of a line, I would not have anything that I could compare with the input data. My backup plan, however, was to compare the input data with the successfully processed data to find out what was rejected. The solution with tExtractDelimitedFields works fine for my purposes.&lt;BR /&gt;Cheers,&lt;BR /&gt;Svetlana</description>
      <pubDate>Tue, 20 Dec 2016 15:50:45 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/resolved-Collect-rejects-from-tFileInputDelimited/m-p/2248943#M33635</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-12-20T15:50:45Z</dc:date>
    </item>
  </channel>
</rss>

