<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Kafka input file store into HDFS. in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/Kafka-input-file-store-into-HDFS/m-p/2276953#M52872</link>
    <description>&lt;P&gt;If it is also slow when writing to a local file or tLogRow, you need to seriously investigate your network architecture.&lt;/P&gt;&lt;P&gt;Kafka is extremely fast, and there are no visible bottlenecks in this job.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Have you tested your Kafka connection with any other tools, such as a command-line client?&lt;/P&gt;</description>
    <pubDate>Thu, 21 Mar 2019 09:01:30 GMT</pubDate>
    <dc:creator>vapukov</dc:creator>
    <dc:date>2019-03-21T09:01:30Z</dc:date>
    <item>
      <title>Kafka input file store into HDFS.</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Kafka-input-file-store-into-HDFS/m-p/2276950#M52869</link>
      <description>&lt;P&gt;I am trying to store Kafka input in a file in HDFS, but the job runs for a long time. The file is created in HDFS, but no content is written to it. Can you please help me understand what I am doing wrong?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 21 Mar 2019 04:04:33 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Kafka-input-file-store-into-HDFS/m-p/2276950#M52869</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2019-03-21T04:04:33Z</dc:date>
    </item>
    <item>
      <title>Re: Kafka input file store into HDFS.</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Kafka-input-file-store-into-HDFS/m-p/2276951#M52870</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;First of all, I suggest you delete the connection between Kafka Connection and KafkaInput.&lt;/P&gt; 
&lt;P&gt;These are 2 independent parts:&lt;/P&gt; 
&lt;UL&gt; 
 &lt;LI&gt;everything that runs before the job&lt;/LI&gt; 
 &lt;LI&gt;and the real job&lt;/LI&gt; 
&lt;/UL&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Second, many parameters affect total performance, but inserting each record non-stop directly into HDFS might not be the best idea.&lt;/P&gt; 
&lt;P&gt;Try to test this in the main part of the job.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Create an infinite loop:&lt;/P&gt; 
&lt;UL&gt; 
 &lt;LI&gt;fetch a limited number of records, for example 10000 messages&lt;/LI&gt; 
 &lt;LI&gt;store them all in a local CSV file&lt;/LI&gt; 
 &lt;LI&gt;append the file to HDFS&lt;/LI&gt; 
&lt;/UL&gt; 
&lt;P&gt;It could be faster overall.&lt;/P&gt; 
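&lt;P&gt;The loop above can be sketched in plain Python, as a rough illustration only and not the Talend job itself. The names fetch_batch, run_batches and append_to_hdfs are hypothetical: messages stands in for a Kafka consumer, and append_to_hdfs stands in for whatever actually appends a local file to HDFS. The sketch stops once the source is exhausted so that it terminates; a real streaming job would keep polling.&lt;/P&gt; 
&lt;PRE&gt;
```python
import csv
import itertools
import os
import tempfile


def fetch_batch(messages, batch_size):
    """Take at most batch_size rows from an iterator."""
    return list(itertools.islice(messages, batch_size))


def run_batches(messages, append_to_hdfs, batch_size=10000):
    """Ship rows in batches instead of one record at a time.

    messages: any iterable of rows (assumed stand-in for a Kafka consumer).
    append_to_hdfs: hypothetical callable that appends a local file to HDFS.
    Returns the number of batches shipped.
    """
    messages = iter(messages)
    batches = 0
    while True:
        batch = fetch_batch(messages, batch_size)
        if not batch:
            break  # source exhausted; a real job would keep polling here
        # Store the whole batch in a local CSV file first.
        fd, path = tempfile.mkstemp(suffix=".csv")
        with os.fdopen(fd, "w", newline="") as handle:
            csv.writer(handle).writerows(batch)
        try:
            append_to_hdfs(path)  # one bulk append per batch
        finally:
            os.remove(path)  # clean up the temporary local file
        batches += 1
    return batches
```
&lt;/PRE&gt; 
&lt;P&gt;With a small batch_size and a stand-in append_to_hdfs that only logs the file path, this can be used to check the batching logic before pointing it at a real cluster.&lt;/P&gt; 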
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;P.S.&lt;/P&gt; 
&lt;P&gt;But as I mentioned above, real performance depends on many factors, and network latency is one of them.&lt;/P&gt; 
&lt;P&gt;For example:&lt;/P&gt; 
&lt;UL&gt; 
 &lt;LI&gt;zip 100 000 2 KB files&lt;/LI&gt; 
 &lt;LI&gt;transfer them to the remote server over a 1 Gb network and unzip them all there&lt;/LI&gt; 
&lt;/UL&gt; 
&lt;P&gt;This will be 50+ times faster than copying the files directly. The same goes for databases: import from CSV or batch inserts can be up to 100+ times faster than inserting single rows.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 21 Mar 2019 06:46:19 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Kafka-input-file-store-into-HDFS/m-p/2276951#M52870</guid>
      <dc:creator>vapukov</dc:creator>
      <dc:date>2019-03-21T06:46:19Z</dc:date>
    </item>
    <item>
      <title>Re: Kafka input file store into HDFS.</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Kafka-input-file-store-into-HDFS/m-p/2276952#M52871</link>
      <description>&lt;P&gt;Following your suggestion, I tried storing the data from the Kafka topic in CSV files and in tLogRow, since this topic has only 280 records. But in both cases it takes a long time.&lt;/P&gt;</description>
      <pubDate>Thu, 21 Mar 2019 08:44:59 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Kafka-input-file-store-into-HDFS/m-p/2276952#M52871</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2019-03-21T08:44:59Z</dc:date>
    </item>
    <item>
      <title>Re: Kafka input file store into HDFS.</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Kafka-input-file-store-into-HDFS/m-p/2276953#M52872</link>
      <description>&lt;P&gt;If it is also slow when writing to a local file or tLogRow, you need to seriously investigate your network architecture.&lt;/P&gt;&lt;P&gt;Kafka is extremely fast, and there are no visible bottlenecks in this job.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Have you tested your Kafka connection with any other tools, such as a command-line client?&lt;/P&gt;</description>
      <pubDate>Thu, 21 Mar 2019 09:01:30 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Kafka-input-file-store-into-HDFS/m-p/2276953#M52872</guid>
      <dc:creator>vapukov</dc:creator>
      <dc:date>2019-03-21T09:01:30Z</dc:date>
    </item>
  </channel>
</rss>

