<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: tsqoopimport : how to merge generated files in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/tsqoopimport-how-to-merge-generated-files/m-p/2331037#M100061</link>
    <description>&lt;P&gt;OK, I've found it: the tHDFSCopy component has an option that can merge the files.&lt;/P&gt;</description>
    <pubDate>Wed, 07 Feb 2018 14:05:09 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2018-02-07T14:05:09Z</dc:date>
    <item>
      <title>tsqoopimport : how to merge generated files</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tsqoopimport-how-to-merge-generated-files/m-p/2331034#M100058</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I use the tSqoopImport component to import Oracle tables into HDFS.&lt;/P&gt;&lt;P&gt;The component generates 4 files (part-m-xxxx) when I configure it with 4 mappers.&lt;/P&gt;&lt;P&gt;Afterwards, how can I merge those 4 files into one file?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I use TOS for Big Data 6.1.1.&lt;/P&gt;</description>
      <pubDate>Sat, 16 Nov 2024 08:47:17 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tsqoopimport-how-to-merge-generated-files/m-p/2331034#M100058</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2024-11-16T08:47:17Z</dc:date>
    </item>
    <item>
      <title>Re: tsqoopimport : how to merge generated files</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tsqoopimport-how-to-merge-generated-files/m-p/2331035#M100059</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt; 
&lt;P&gt;Does the source data come from 2 different sources that share the same schema?&lt;/P&gt; 
&lt;P&gt;You can use a tFileList to iterate over the files, with a tFileInput* component linked by a row to a tFileOutputDelimited in append mode.&lt;/P&gt; 
&lt;P&gt;Let us know if it works.&lt;/P&gt; 
&lt;P&gt;Best regards&lt;/P&gt; 
&lt;P&gt;Sabrina&lt;/P&gt;</description>
      <pubDate>Wed, 31 Jan 2018 07:02:46 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tsqoopimport-how-to-merge-generated-files/m-p/2331035#M100059</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2018-01-31T07:02:46Z</dc:date>
    </item>
    <item>
      <title>Re: tsqoopimport : how to merge generated files</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tsqoopimport-how-to-merge-generated-files/m-p/2331036#M100060</link>
      <description>&lt;P&gt;If the data isn't huge, you can try configuring Sqoop to just use one mapper: that way, it will generate one file.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;If you don't want to go that route, Sabrina is mostly correct, except that you'll need tHDFSFileList to iterate over the files. Instead of merging them, this will iterate over them, so you can do whatever ETL work you need to do.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;David&lt;/P&gt;</description>
      <pubDate>Sat, 03 Feb 2018 22:31:55 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tsqoopimport-how-to-merge-generated-files/m-p/2331036#M100060</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2018-02-03T22:31:55Z</dc:date>
    </item>
    <item>
      <title>Re: tsqoopimport : how to merge generated files</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tsqoopimport-how-to-merge-generated-files/m-p/2331037#M100061</link>
      <description>&lt;P&gt;OK, I've found it: the tHDFSCopy component has an option that can merge the files.&lt;/P&gt;</description>
      <pubDate>Wed, 07 Feb 2018 14:05:09 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tsqoopimport-how-to-merge-generated-files/m-p/2331037#M100061</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2018-02-07T14:05:09Z</dc:date>
    </item>
  </channel>
</rss>

