Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Join us in Toronto Sept 9th for Qlik's AI Reality Tour! Register Now
cancel
Showing results for 
Search instead for 
Did you mean: 
TomG1
Creator
Creator

Tuning the query in tSQL in talend BigData Platform

I am using Talend bigdata Platform 6.3 Studio. I am using tSql component to execute a hive query in cluster. Please find below query

 SELECT DISTINCT inVistaNonVistaComb.*,inLeasingProxy.LogicalFeedType as LPLogicalFeedType
FROM inVistaNonVistaComb
LEFT OUTER JOIN inLeasingProxy ON
UPPER(inLeasingProxy.LogicalFeedType) = UPPER(inVistaNonVistaComb.Logical_Feed_Type) AND
UPPER(inLeasingProxy.SalesOrgRegion) = UPPER(inVistaNonVistaComb.Sales_Org_Region) AND
(UPPER(inVistaNonVistaComb.Sales_Org_Sub_Region) = UPPER(inLeasingProxy.SalesOrgSubRegion) OR UPPER(inLeasingProxy.SalesOrgSubRegion)='ALL') AND
(
inLeasingProxy.LeasingProxyValue = inVistaNonVistaComb.Sales_Rep_Code_Mis OR 
inLeasingProxy.LeasingProxyValue = inVistaNonVistaComb.CBN OR 
inLeasingProxy.LeasingProxyValue = inVistaNonVistaComb.Transaction_Subtype OR
inLeasingProxy.LeasingProxyValue = inVistaNonVistaComb.Ice_Holder5
) AND
(inVistaNonVistaComb.Adjusted_Primary_Transaction_Date BETWEEN inLeasingProxy.EffectiveStartDate AND inLeasingProxy.EffectiveEndDate)

I am doing a left outer join with a reference table called

inLeasingProxy

This is a small file with only 2MB size. However the transaction file called

inVistaNonVistaComb will have a size 1 GB - 2GB

The problem here is that, for some transaction files(1GB), This query will execute very fast (10 minutes). But for some trasaction files(1GB) , this query will run for 2 hours.

How can i tune this query? Please help

Labels (3)
1 Solution

Accepted Solutions
TomG1
Creator
Creator
Author

I don't think talend big data expert will be able to help.

 

I have identified the issue.

Here the OR condition in the query is causing the problem.

WorkAround

I splitted the query into four queries.The individual OR condition is placed in seperate query. Each query is pasted in seperate tSQLRow.

The TsqlRows are connected in series.

 

Ex:

SELECT DISTINCT inVistaNonVistaComb.*,inLeasingProxy.LogicalFeedType as LPLogicalFeedType
FROM inVistaNonVistaComb
LEFT OUTER JOIN inLeasingProxy ON
UPPER(inLeasingProxy.LogicalFeedType) = UPPER(inVistaNonVistaComb.Logical_Feed_Type) AND
UPPER(inLeasingProxy.SalesOrgRegion) = UPPER(inVistaNonVistaComb.Sales_Org_Region) AND
(UPPER(inVistaNonVistaComb.Sales_Org_Sub_Region) = UPPER(inLeasingProxy.SalesOrgSubRegion) OR UPPER(inLeasingProxy.SalesOrgSubRegion)='ALL') AND
(
inLeasingProxy.LeasingProxyValue = inVistaNonVistaComb.Sales_Rep_Code_Mis
) AND
(inVistaNonVistaComb.Adjusted_Primary_Transaction_Date BETWEEN inLeasingProxy.EffectiveStartDate AND inLeasingProxy.EffectiveEndDate)

 

View solution in original post

2 Replies
Anonymous
Not applicable

Hello,

We have redirected your issue to our Bigdata expert and then come back to you as soon as we can.

Thanks for your time.

Best regards

Sabrina

TomG1
Creator
Creator
Author

I don't think talend big data expert will be able to help.

 

I have identified the issue.

Here the OR condition in the query is causing the problem.

WorkAround

I splitted the query into four queries.The individual OR condition is placed in seperate query. Each query is pasted in seperate tSQLRow.

The TsqlRows are connected in series.

 

Ex:

SELECT DISTINCT inVistaNonVistaComb.*,inLeasingProxy.LogicalFeedType as LPLogicalFeedType
FROM inVistaNonVistaComb
LEFT OUTER JOIN inLeasingProxy ON
UPPER(inLeasingProxy.LogicalFeedType) = UPPER(inVistaNonVistaComb.Logical_Feed_Type) AND
UPPER(inLeasingProxy.SalesOrgRegion) = UPPER(inVistaNonVistaComb.Sales_Org_Region) AND
(UPPER(inVistaNonVistaComb.Sales_Org_Sub_Region) = UPPER(inLeasingProxy.SalesOrgSubRegion) OR UPPER(inLeasingProxy.SalesOrgSubRegion)='ALL') AND
(
inLeasingProxy.LeasingProxyValue = inVistaNonVistaComb.Sales_Rep_Code_Mis
) AND
(inVistaNonVistaComb.Adjusted_Primary_Transaction_Date BETWEEN inLeasingProxy.EffectiveStartDate AND inLeasingProxy.EffectiveEndDate)