<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>article Using Hive components in MapR Spark Jobs to read Hive MapR-DB tables in Official Support Articles</title>
    <link>https://community.qlik.com/t5/Official-Support-Articles/Using-Hive-components-in-MapR-Spark-Jobs-to-read-Hive-MapR-DB/ta-p/2151856</link>
    <description>&lt;DIV class="talend-tkb-migrated-content"&gt;&lt;DIV class="lia-message-template-content-zone"&gt; 
 &lt;H1&gt;Overview&lt;/H1&gt; 
 &lt;P&gt;This article explains how to use Talend Hive components in MapR Spark Batch Jobs to read from Hive MapR-DB tables. Because MapR lets you query MapR-DB tables through a Hive View, this article also covers how to set up Talend Jobs to read from a Hive View of a MapR-DB table.&lt;/P&gt; 
 &lt;P&gt;&amp;nbsp;&lt;/P&gt; 
 &lt;H1&gt;Environment&lt;/H1&gt; 
 &lt;UL&gt;&lt;LI&gt;Talend Studio 6.5.1&lt;/LI&gt;&lt;LI&gt;MapR 6.0.1&lt;/LI&gt;&lt;/UL&gt; 
 &lt;P&gt;&amp;nbsp;&lt;/P&gt; 
 &lt;H1&gt;Prerequisites&lt;/H1&gt; 
 &lt;P&gt;&amp;nbsp;&lt;/P&gt; 
 &lt;H2&gt;Setting up MapR&lt;/H2&gt; 
 &lt;OL&gt;&lt;LI&gt;Set up the MapR Client 6.0.1 on the system that runs your Job so that it can connect to your MapR cluster. For more information, see the &lt;A href="https://mapr.com/docs/home/AdvancedInstallation/SettingUptheClient-install-mapr-client.html" target="_blank"&gt;Installing the MapR Client&lt;/A&gt; page in the MapR 6.1 documentation.&lt;/LI&gt;&lt;LI&gt; &lt;P&gt;After setting up the MapR Client, generate a MapR ticket that your Job can use to communicate with the cluster:&lt;/P&gt; &lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" style="width: 641px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0693p000008uGxBAAU.jpg"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/122040i94F2E2A9D2495BE6/image-size/large?v=v2&amp;amp;px=999" role="button" title="0693p000008uGxBAAU.jpg" alt="0693p000008uGxBAAU.jpg" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; &lt;P&gt;&amp;nbsp;&lt;/P&gt; &lt;/LI&gt;&lt;/OL&gt; 
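 &lt;P&gt;The ticket step above can be checked from a script before the Job runs. The following is a minimal Python sketch, not part of the original procedure. It assumes a Linux client, and it assumes that the ticket is stored either at the path named by the &lt;STRONG&gt;MAPR_TICKETFILE_LOCATION&lt;/STRONG&gt; environment variable or at a default location under /tmp. The helper names &lt;STRONG&gt;ticket_path&lt;/STRONG&gt; and &lt;STRONG&gt;has_ticket&lt;/STRONG&gt; are hypothetical. Verify both locations against your own MapR installation.&lt;/P&gt; 
 &lt;PRE&gt;
```python
import os
from pathlib import Path


def ticket_path(env=None):
    """Resolve where a MapR ticket file is expected to live (assumed locations)."""
    env = os.environ if env is None else env
    override = env.get("MAPR_TICKETFILE_LOCATION")
    if override:
        return Path(override)
    # Assumed Linux default: /tmp/maprticket_UID for the current user.
    return Path("/tmp") / f"maprticket_{os.getuid()}"


def has_ticket(env=None):
    """Return True when a non-empty ticket file exists at the expected path."""
    path = ticket_path(env)
    return path.is_file() and path.stat().st_size != 0
```
 &lt;/PRE&gt; 
 &lt;P&gt;Calling &lt;STRONG&gt;has_ticket()&lt;/STRONG&gt; with no arguments reads the real environment. Passing a dictionary instead makes the check easy to exercise without touching the client configuration. The sketch only checks that a ticket file exists. It does not check whether the ticket has expired.&lt;/P&gt; 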
 &lt;H2&gt;Setting up Studio&lt;/H2&gt; 
 &lt;OL&gt;&lt;LI&gt; &lt;P&gt;Ensure that Studio can reach all of the cluster nodes, and that the nodes can connect back to Studio, as described in the &lt;A href="https://spark.apache.org/docs/2.2.0/security.html" target="_blank"&gt;Spark Security&lt;/A&gt; documentation. Talend runs Spark in YARN client mode, so the Spark driver starts on the same machine as the Job that launches it.&lt;/P&gt; &lt;/LI&gt;&lt;LI&gt; &lt;P&gt;Configure the Hadoop Cluster connection in metadata in Studio.&lt;/P&gt; 
   &lt;OL&gt;&lt;LI&gt; &lt;P&gt;Right-click Hadoop Cluster, then click &lt;STRONG&gt;Create Hadoop Cluster&lt;/STRONG&gt;.&lt;/P&gt; &lt;/LI&gt;&lt;LI&gt; &lt;P&gt;Select the distribution and version of your Hadoop cluster, then select &lt;STRONG&gt;Import configuration from local files&lt;/STRONG&gt;. Click &lt;STRONG&gt;Next&lt;/STRONG&gt;.&lt;/P&gt; &lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" style="width: 545px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0693p000008uGvCAAU.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/124365iD369DD6F9083EB6C/image-size/large?v=v2&amp;amp;px=999" role="button" title="0693p000008uGvCAAU.png" alt="0693p000008uGvCAAU.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; &lt;P&gt;&amp;nbsp;&lt;/P&gt; &lt;/LI&gt;&lt;LI&gt; &lt;P&gt;Ensure your system has a local copy of the &lt;STRONG&gt;hive-site.xml&lt;/STRONG&gt;, &lt;STRONG&gt;mapred-site.xml&lt;/STRONG&gt;, and &lt;STRONG&gt;yarn-site.xml&lt;/STRONG&gt; files to import into the Hadoop metadata wizard.&lt;/P&gt; &lt;/LI&gt;&lt;LI&gt; &lt;P&gt;Import the cluster configuration files.&lt;/P&gt; &lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" style="width: 796px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0693p000008uH00AAE.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/121433iD99D2012BF0BA535/image-size/large?v=v2&amp;amp;px=999" role="button" title="0693p000008uH00AAE.png" alt="0693p000008uH00AAE.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; &lt;P&gt;&amp;nbsp;&lt;/P&gt; &lt;/LI&gt;&lt;LI&gt; &lt;P&gt;Notice that after the configuration files are imported, not all of the information on the next screen is populated, and the wizard warns you that the Resource Manager needs to be specified. This is because the configuration files do not include specific hostnames for the Resource Manager and CLDB nodes. 
You still need the files later in this article, because they contain properties that help with using Resource Manager HA.&lt;/P&gt; &lt;/LI&gt;&lt;LI&gt; &lt;P&gt;To fully use CLDB and Resource Manager HA, complete the wizard as shown below:&lt;/P&gt; &lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" style="width: 699px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0693p000008uH3mAAE.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/121819i5A8B6BAD374E0050/image-size/large?v=v2&amp;amp;px=999" role="button" title="0693p000008uH3mAAE.png" alt="0693p000008uH3mAAE.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; &lt;P&gt;&amp;nbsp;&lt;/P&gt; &lt;/LI&gt;&lt;LI&gt; &lt;P&gt;Once the cluster information is populated, click &lt;STRONG&gt;Check Services&lt;/STRONG&gt; to ensure that Studio can connect successfully to the cluster.&lt;/P&gt; &lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" style="width: 602px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0693p000008uGnLAAU.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/121463i1FAF40B1816E709F/image-size/large?v=v2&amp;amp;px=999" role="button" title="0693p000008uGnLAAU.png" alt="0693p000008uGnLAAU.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; &lt;P&gt;&amp;nbsp;&lt;/P&gt; &lt;/LI&gt;&lt;/OL&gt; &lt;/LI&gt;&lt;/OL&gt; 
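 &lt;P&gt;The Resource Manager warning described above can be anticipated by inspecting the file before importing it. The following is a minimal Python sketch, not part of the original procedure. It assumes that yarn-site.xml uses the standard Hadoop layout of property elements with name and value children, and that a Resource Manager host would appear under &lt;STRONG&gt;yarn.resourcemanager.hostname&lt;/STRONG&gt; or &lt;STRONG&gt;yarn.resourcemanager.address&lt;/STRONG&gt;. Those are standard Hadoop property names, but whether your MapR cluster sets them is an assumption to check. The helper names are hypothetical.&lt;/P&gt; 
 &lt;PRE&gt;
```python
import xml.etree.ElementTree as ET

# Standard Hadoop property names. Whether a MapR-generated yarn-site.xml
# sets them is an assumption to verify on your own cluster.
RM_HOST_PROPERTIES = (
    "yarn.resourcemanager.hostname",
    "yarn.resourcemanager.address",
)


def read_properties(root):
    """Return a dict mapping property names to values for a *-site.xml root."""
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing_rm_properties(root):
    """List the Resource Manager host properties that are absent or empty."""
    props = read_properties(root)
    return [name for name in RM_HOST_PROPERTIES if not props.get(name)]
```
 &lt;/PRE&gt; 
 &lt;P&gt;For a local copy of the file, the call would be &lt;STRONG&gt;missing_rm_properties(ET.parse("yarn-site.xml").getroot())&lt;/STRONG&gt;. A non-empty result suggests that the wizard will ask you to supply the Resource Manager details by hand.&lt;/P&gt; 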
 &lt;P&gt;&amp;nbsp;&lt;/P&gt; 
 &lt;H1&gt;Building the Job&lt;/H1&gt; 
 &lt;OL&gt;&lt;LI&gt; &lt;P&gt;Right-click &lt;STRONG&gt;Job Designs&lt;/STRONG&gt;, click &lt;STRONG&gt;Create Big Data Batch Job&lt;/STRONG&gt;, then give the Job a name.&lt;/P&gt; &lt;/LI&gt;&lt;LI&gt; &lt;P&gt;From the Hadoop Cluster connection you created earlier, drag the HDFS connection to the canvas, then choose to add it as a &lt;STRONG&gt;tHDFSConfiguration&lt;/STRONG&gt; component. Notice that the component is populated right away and that, in the &lt;STRONG&gt;Run&lt;/STRONG&gt; tab, the &lt;STRONG&gt;Spark Configuration&lt;/STRONG&gt; information is completed for you. This information tells the Job how to communicate with Spark.&lt;/P&gt; &lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" style="width: 999px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0693p000008uH41AAE.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/123324iB08C5B407FAC64AD/image-size/large?v=v2&amp;amp;px=999" role="button" title="0693p000008uH41AAE.png" alt="0693p000008uH41AAE.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; &lt;P&gt;&amp;nbsp;&lt;/P&gt; &lt;/LI&gt;&lt;LI&gt; &lt;P&gt;Again using the Hadoop Cluster connection you created earlier, drag the Hive connection to the canvas, then choose to add it as a &lt;STRONG&gt;tHiveConfiguration&lt;/STRONG&gt; component.&lt;/P&gt; &lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" style="width: 912px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0693p000008uGozAAE.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/125139iACF79B2410B06C1A/image-size/large?v=v2&amp;amp;px=999" role="button" title="0693p000008uGozAAE.png" alt="0693p000008uGozAAE.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; &lt;P&gt;&amp;nbsp;&lt;/P&gt; &lt;/LI&gt;&lt;LI&gt; &lt;P&gt;Add one &lt;STRONG&gt;tLibraryLoad&lt;/STRONG&gt; component for each of the following libraries. 
The Hive components use these libraries to retrieve the data from the Hive View of your MapR-DB table:&lt;/P&gt; &lt;OL&gt;&lt;LI&gt; &lt;P&gt;&lt;STRONG&gt;hbase-common-1.1.8-mapr-1710.jar&lt;/STRONG&gt;&lt;/P&gt; &lt;/LI&gt;&lt;LI&gt; &lt;P&gt;&lt;STRONG&gt;hbase-client-1.1.8-mapr-1710.jar&lt;/STRONG&gt;&lt;/P&gt; &lt;/LI&gt;&lt;LI&gt; &lt;P&gt;&lt;STRONG&gt;hbase-server-1.1.8-mapr-1710.jar&lt;/STRONG&gt;&lt;/P&gt; &lt;/LI&gt;&lt;LI&gt; &lt;P&gt;&lt;STRONG&gt;hbase-spark-1.1.8-mapr-1710.jar&lt;/STRONG&gt;&lt;/P&gt; &lt;/LI&gt;&lt;LI&gt; &lt;P&gt;&lt;STRONG&gt;hbase-protocol-1.1.1-mapr-1710.jar&lt;/STRONG&gt;&lt;/P&gt; &lt;/LI&gt;&lt;LI&gt; &lt;P&gt;&lt;STRONG&gt;hive-hbase-handler-2.1.1-mapr-1710.jar&lt;/STRONG&gt;&lt;/P&gt; &lt;/LI&gt;&lt;LI&gt; &lt;P&gt;&lt;STRONG&gt;mapr-hbase-6.0.1-mapr.jar&lt;/STRONG&gt;&lt;/P&gt; &lt;/LI&gt;&lt;LI&gt; &lt;P&gt;&lt;STRONG&gt;maprdb-6.0.1-mapr.jar&lt;/STRONG&gt;&lt;/P&gt; &lt;/LI&gt;&lt;/OL&gt; &lt;/LI&gt;&lt;LI&gt; &lt;P&gt;Add a &lt;STRONG&gt;tHiveInput&lt;/STRONG&gt; component and configure it to read from the Hive View of your MapR-DB table.&lt;/P&gt; &lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" style="width: 549px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0693p000008uGzlAAE.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/124416i1413EC9FD1104E15/image-size/large?v=v2&amp;amp;px=999" role="button" title="0693p000008uGzlAAE.png" alt="0693p000008uGzlAAE.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; &lt;P&gt;&amp;nbsp;&lt;/P&gt; &lt;/LI&gt;&lt;LI&gt; &lt;P&gt;Connect the &lt;STRONG&gt;tHiveInput&lt;/STRONG&gt; component to a &lt;STRONG&gt;tLogRow&lt;/STRONG&gt; component so that the table values are printed and you can confirm that the table is read successfully.&lt;/P&gt; &lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" style="width: 505px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0693p000008uGenAAE.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/122182i4F68524CAB0B2510/image-size/large?v=v2&amp;amp;px=999" role="button" title="0693p000008uGenAAE.png" alt="0693p000008uGenAAE.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; &lt;P&gt;&amp;nbsp;&lt;/P&gt; &lt;/LI&gt;&lt;LI&gt; &lt;P&gt;The complete Job should look like this:&lt;/P&gt; &lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" style="width: 822px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0693p000008uGU5AAM.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/123189iA4961370576701EB/image-size/large?v=v2&amp;amp;px=999" role="button" title="0693p000008uGU5AAM.png" alt="0693p000008uGU5AAM.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt; &lt;P&gt;&amp;nbsp;&lt;/P&gt; &lt;/LI&gt;&lt;/OL&gt; 
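 &lt;P&gt;Before adding the &lt;STRONG&gt;tLibraryLoad&lt;/STRONG&gt; components listed in the steps above, it can help to confirm that every library is available locally. The following is a minimal Python sketch, not part of the original procedure. The jar names are copied from this article. The function name &lt;STRONG&gt;missing_jars&lt;/STRONG&gt; and the idea of collecting the jars in a single directory are assumptions made for illustration.&lt;/P&gt; 
 &lt;PRE&gt;
```python
from pathlib import Path

# Library names as listed in this article for MapR 6.0.1.
REQUIRED_JARS = [
    "hbase-common-1.1.8-mapr-1710.jar",
    "hbase-client-1.1.8-mapr-1710.jar",
    "hbase-server-1.1.8-mapr-1710.jar",
    "hbase-spark-1.1.8-mapr-1710.jar",
    "hbase-protocol-1.1.1-mapr-1710.jar",
    "hive-hbase-handler-2.1.1-mapr-1710.jar",
    "mapr-hbase-6.0.1-mapr.jar",
    "maprdb-6.0.1-mapr.jar",
]


def missing_jars(jar_dir):
    """Return the required jar names that are not present in jar_dir."""
    present = {path.name for path in Path(jar_dir).glob("*.jar")}
    return [jar for jar in REQUIRED_JARS if jar not in present]
```
 &lt;/PRE&gt; 
 &lt;P&gt;An empty list from &lt;STRONG&gt;missing_jars&lt;/STRONG&gt; means that all eight libraries were found in the directory you passed in. The check only looks at the top level of that directory and matches file names exactly, so a different MapR release with different version suffixes would need an updated list.&lt;/P&gt; 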
 &lt;H1&gt;Running the Job&lt;/H1&gt; 
 &lt;P&gt;Run the Job to confirm that it connects to the Hive View and reads the MapR-DB table data.&lt;/P&gt; 
 &lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" style="width: 999px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0693p000008uH4GAAU.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/124663i1A6B6C99ACACD502/image-size/large?v=v2&amp;amp;px=999" role="button" title="0693p000008uH4GAAU.png" alt="0693p000008uH4GAAU.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
 &lt;P&gt;&amp;nbsp;&lt;/P&gt; 
 &lt;H1&gt;Additional notes&lt;/H1&gt; 
 &lt;P&gt;The same Job design will work for MapR 5.2.0 and above.&lt;/P&gt; 
 &lt;P&gt;You can use MapR 6.0.1 with Talend 6.5.1 by applying a patch, available from &lt;A href="https://www.talend.com/services/technical-support/" target="_blank"&gt;Talend Support&lt;/A&gt;, that adds it as a supported version.&lt;/P&gt; 
&lt;/DIV&gt;&lt;/DIV&gt;</description>
    <pubDate>Fri, 09 Feb 2024 19:06:24 GMT</pubDate>
    <dc:creator>TalendSolutionExpert</dc:creator>
    <dc:date>2024-02-09T19:06:24Z</dc:date>
    <item>
      <title>Using Hive components in MapR Spark Jobs to read Hive MapR-DB tables</title>
      <link>https://community.qlik.com/t5/Official-Support-Articles/Using-Hive-components-in-MapR-Spark-Jobs-to-read-Hive-MapR-DB/ta-p/2151856</link>
      <description>&lt;DIV class="talend-tkb-migrated-content"&gt;&lt;DIV class="lia-message-template-content-zone"&gt; 
&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Fri, 09 Feb 2024 19:06:24 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Official-Support-Articles/Using-Hive-components-in-MapR-Spark-Jobs-to-read-Hive-MapR-DB/ta-p/2151856</guid>
      <dc:creator>TalendSolutionExpert</dc:creator>
      <dc:date>2024-02-09T19:06:24Z</dc:date>
    </item>
  </channel>
</rss>

