<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Designing Spark batch Job for implementing ETL on Hive tables in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/Designing-Spark-batch-Job-for-implementing-ETL-on-Hive-tables/m-p/2338083#M106359</link>
    <description>&lt;P&gt;Hi Rohini,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I believe a Spark Batch Job will be the better choice in your case, and you can use the tHiveInput and tHiveOutput components. For all Sybase DB related operations, you can use a Standard Job. So it is all about orchestrating your Jobs to run one after another.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Perform your regular tasks with a Standard Job and any big data related activities with a Big Data Job.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Warm Regards,&lt;BR /&gt;Nikhil Thampi&lt;/P&gt;
</description>
    <pubDate>Tue, 28 May 2019 13:54:43 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2019-05-28T13:54:43Z</dc:date>
    <item>
      <title>Designing Spark batch Job for implementing ETL on Hive tables</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Designing-Spark-batch-Job-for-implementing-ETL-on-Hive-tables/m-p/2338082#M106358</link>
      <description>&lt;P&gt;Hi Everyone,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am converting legacy ETL logic (C++, shell scripts) to a Talend and big data environment.&lt;/P&gt;&lt;P&gt;I have imported the data from Sybase into Hive tables, and now I want to read the data, perform transformations, and load it into target Hive tables. The primary challenges are the data volume and the complex transformation logic. Which of the approaches below would give better performance, and why?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;1. Creating a Standard Job using ELT Hive components&lt;/P&gt;&lt;P&gt;2. Creating a Spark Batch Job&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If there is any other way, please share it.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Rohini&lt;/P&gt;</description>
      <pubDate>Sat, 16 Nov 2024 05:43:41 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Designing-Spark-batch-Job-for-implementing-ETL-on-Hive-tables/m-p/2338082#M106358</guid>
      <dc:creator>Rohini_B01</dc:creator>
      <dc:date>2024-11-16T05:43:41Z</dc:date>
    </item>
    <item>
      <title>Re: Designing Spark batch Job for implementing ETL on Hive tables</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Designing-Spark-batch-Job-for-implementing-ETL-on-Hive-tables/m-p/2338083#M106359</link>
      <description>&lt;P&gt;Hi Rohini,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I believe a Spark Batch Job will be the better choice in your case, and you can use the tHiveInput and tHiveOutput components. For all Sybase DB related operations, you can use a Standard Job. So it is all about orchestrating your Jobs to run one after another.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Perform your regular tasks with a Standard Job and any big data related activities with a Big Data Job.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Warm Regards,&lt;BR /&gt;Nikhil Thampi&lt;/P&gt;
</description>
      <pubDate>Tue, 28 May 2019 13:54:43 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Designing-Spark-batch-Job-for-implementing-ETL-on-Hive-tables/m-p/2338083#M106359</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2019-05-28T13:54:43Z</dc:date>
    </item>
  </channel>
</rss>

