<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Tuning the query in tSQL in talend BigData Platform in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/Tuning-the-query-in-tSQL-in-talend-BigData-Platform/m-p/2336459#M104902</link>
    <description>&lt;P&gt;&lt;SPAN&gt;I am using Talend bigdata Platform 6.3 Studio. I am using tSql component to execute a hive query in cluster. Please find below query&lt;/SPAN&gt;&lt;/P&gt; 
&lt;PRE&gt; SELECT DISTINCT inVistaNonVistaComb.*,inLeasingProxy.LogicalFeedType as LPLogicalFeedType
FROM inVistaNonVistaComb
LEFT OUTER JOIN inLeasingProxy ON
UPPER(inLeasingProxy.LogicalFeedType) = UPPER(inVistaNonVistaComb.Logical_Feed_Type) AND
UPPER(inLeasingProxy.SalesOrgRegion) = UPPER(inVistaNonVistaComb.Sales_Org_Region) AND
(UPPER(inVistaNonVistaComb.Sales_Org_Sub_Region) = UPPER(inLeasingProxy.SalesOrgSubRegion) OR UPPER(inLeasingProxy.SalesOrgSubRegion)='ALL') AND
(
inLeasingProxy.LeasingProxyValue = inVistaNonVistaComb.Sales_Rep_Code_Mis OR 
inLeasingProxy.LeasingProxyValue = inVistaNonVistaComb.CBN OR 
inLeasingProxy.LeasingProxyValue = inVistaNonVistaComb.Transaction_Subtype OR
inLeasingProxy.LeasingProxyValue = inVistaNonVistaComb.Ice_Holder5
) AND
(inVistaNonVistaComb.Adjusted_Primary_Transaction_Date BETWEEN inLeasingProxy.EffectiveStartDate AND inLeasingProxy.EffectiveEndDate)&lt;/PRE&gt; 
&lt;P&gt;I am doing a left outer join with a reference table called&lt;/P&gt; 
&lt;BLOCKQUOTE&gt; 
 &lt;P&gt;&lt;U&gt;inLeasingProxy&lt;/U&gt;&lt;/P&gt; 
&lt;/BLOCKQUOTE&gt; 
&lt;P&gt;This is a small file with only 2MB size. However the transaction file called&lt;/P&gt; 
&lt;BLOCKQUOTE&gt; 
 &lt;P&gt;&lt;U&gt;inVistaNonVistaComb&lt;/U&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;will have a size 1 GB - 2GB&lt;/P&gt; 
&lt;/BLOCKQUOTE&gt; 
&lt;P&gt;The problem here is that, for some transaction files(1GB), This query will execute very fast (10 minutes). But for some trasaction files(1GB) , this query will run for 2 hours.&lt;/P&gt; 
&lt;P&gt;How can i tune this query? Please help&lt;/P&gt;</description>
    <pubDate>Sat, 16 Nov 2024 09:02:21 GMT</pubDate>
    <dc:creator>TomG1</dc:creator>
    <dc:date>2024-11-16T09:02:21Z</dc:date>
    <item>
      <title>Tuning the query in tSQL in talend BigData Platform</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Tuning-the-query-in-tSQL-in-talend-BigData-Platform/m-p/2336459#M104902</link>
      <description>&lt;P&gt;&lt;SPAN&gt;I am using Talend bigdata Platform 6.3 Studio. I am using tSql component to execute a hive query in cluster. Please find below query&lt;/SPAN&gt;&lt;/P&gt; 
&lt;PRE&gt; SELECT DISTINCT inVistaNonVistaComb.*,inLeasingProxy.LogicalFeedType as LPLogicalFeedType
FROM inVistaNonVistaComb
LEFT OUTER JOIN inLeasingProxy ON
UPPER(inLeasingProxy.LogicalFeedType) = UPPER(inVistaNonVistaComb.Logical_Feed_Type) AND
UPPER(inLeasingProxy.SalesOrgRegion) = UPPER(inVistaNonVistaComb.Sales_Org_Region) AND
(UPPER(inVistaNonVistaComb.Sales_Org_Sub_Region) = UPPER(inLeasingProxy.SalesOrgSubRegion) OR UPPER(inLeasingProxy.SalesOrgSubRegion)='ALL') AND
(
inLeasingProxy.LeasingProxyValue = inVistaNonVistaComb.Sales_Rep_Code_Mis OR 
inLeasingProxy.LeasingProxyValue = inVistaNonVistaComb.CBN OR 
inLeasingProxy.LeasingProxyValue = inVistaNonVistaComb.Transaction_Subtype OR
inLeasingProxy.LeasingProxyValue = inVistaNonVistaComb.Ice_Holder5
) AND
(inVistaNonVistaComb.Adjusted_Primary_Transaction_Date BETWEEN inLeasingProxy.EffectiveStartDate AND inLeasingProxy.EffectiveEndDate)&lt;/PRE&gt; 
&lt;P&gt;I am doing a left outer join with a reference table called&lt;/P&gt; 
&lt;BLOCKQUOTE&gt; 
 &lt;P&gt;&lt;U&gt;inLeasingProxy&lt;/U&gt;&lt;/P&gt; 
&lt;/BLOCKQUOTE&gt; 
&lt;P&gt;This is a small file with only 2MB size. However the transaction file called&lt;/P&gt; 
&lt;BLOCKQUOTE&gt; 
 &lt;P&gt;&lt;U&gt;inVistaNonVistaComb&lt;/U&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;will have a size 1 GB - 2GB&lt;/P&gt; 
&lt;/BLOCKQUOTE&gt; 
&lt;P&gt;The problem here is that, for some transaction files(1GB), This query will execute very fast (10 minutes). But for some trasaction files(1GB) , this query will run for 2 hours.&lt;/P&gt; 
&lt;P&gt;How can i tune this query? Please help&lt;/P&gt;</description>
      <pubDate>Sat, 16 Nov 2024 09:02:21 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Tuning-the-query-in-tSQL-in-talend-BigData-Platform/m-p/2336459#M104902</guid>
      <dc:creator>TomG1</dc:creator>
      <dc:date>2024-11-16T09:02:21Z</dc:date>
    </item>
    <item>
      <title>Re: Tuning the query in tSQL in talend BigData Platform</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Tuning-the-query-in-tSQL-in-talend-BigData-Platform/m-p/2336460#M104903</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;
&lt;P&gt;We have redirected your issue to our Bigdata expert and then come back to you as soon as we can.&lt;/P&gt;
&lt;P&gt;Thanks for your time.&lt;/P&gt;
&lt;P&gt;Best regards&lt;/P&gt;
&lt;P&gt;Sabrina&lt;/P&gt;</description>
      <pubDate>Tue, 28 Nov 2017 07:19:30 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Tuning-the-query-in-tSQL-in-talend-BigData-Platform/m-p/2336460#M104903</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-11-28T07:19:30Z</dc:date>
    </item>
    <item>
      <title>Re: Tuning the query in tSQL in talend BigData Platform</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Tuning-the-query-in-tSQL-in-talend-BigData-Platform/m-p/2336461#M104904</link>
      <description>&lt;P&gt;I don't think talend big data expert will be able to help.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I have identified the issue.&lt;/P&gt;
&lt;P&gt;Here the OR condition in the query is causing the problem.&lt;/P&gt;
&lt;P&gt;&lt;U&gt;&lt;STRONG&gt;WorkAround&lt;/STRONG&gt;&lt;/U&gt;&lt;/P&gt;
&lt;P&gt;I splitted the query into four queries.The individual OR condition is placed in seperate query. Each query is pasted in seperate tSQLRow.&lt;/P&gt;
&lt;P&gt;The TsqlRows are connected in series.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Ex:&lt;/P&gt;
&lt;PRE&gt;SELECT DISTINCT inVistaNonVistaComb.*,inLeasingProxy.LogicalFeedType as LPLogicalFeedType
FROM inVistaNonVistaComb
LEFT OUTER JOIN inLeasingProxy ON
UPPER(inLeasingProxy.LogicalFeedType) = UPPER(inVistaNonVistaComb.Logical_Feed_Type) AND
UPPER(inLeasingProxy.SalesOrgRegion) = UPPER(inVistaNonVistaComb.Sales_Org_Region) AND
(UPPER(inVistaNonVistaComb.Sales_Org_Sub_Region) = UPPER(inLeasingProxy.SalesOrgSubRegion) OR UPPER(inLeasingProxy.SalesOrgSubRegion)='ALL') AND
(
inLeasingProxy.LeasingProxyValue = inVistaNonVistaComb.Sales_Rep_Code_Mis
) AND
(inVistaNonVistaComb.Adjusted_Primary_Transaction_Date BETWEEN inLeasingProxy.EffectiveStartDate AND inLeasingProxy.EffectiveEndDate)&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 07 Feb 2018 06:40:21 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Tuning-the-query-in-tSQL-in-talend-BigData-Platform/m-p/2336461#M104904</guid>
      <dc:creator>TomG1</dc:creator>
      <dc:date>2018-02-07T06:40:21Z</dc:date>
    </item>
  </channel>
</rss>

