Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
aj96
Contributor III
Contributor III

Poor performance for Datamart Dimension table

I am facing performance issue for a Dimension table having multiple relation to other table. For one of my dimension it took almost 8 min to complete that is having no records that was inserted or updated in the previous DWH run. Sometimes a stat update helps other times  I tried rebuild the index of some of the tables used in that query and it ran in seconds. I also altered the sub query to use that base table only once and that also helped. Could some one help me this? Is there something that can be done in compose that would improve the performance?

I'm using Compose Version 2021.5.0.69 and Sql Server 2017 CU22.

Labels (1)
1 Solution

Accepted Solutions
TimGarrod
Employee
Employee

You won't be able to force a seek off that nested UNION All. Storing the ID results in the temp table and indexing it allows the nested loop and seek on the first table / HUB that it joins to.

View solution in original post

12 Replies
TimGarrod
Employee
Employee

Hi @aj96  - can you run a query plan to see what the plan is doing?    Since its SQL Server it could be stats / index out of date like you are suggesting - or perhaps a more "covering" index for the initial criteria (get the ID's for any changed records). 

Would want to see the plan to determine if any changes are required

aj96
Contributor III
Contributor III
Author

hi @TimGarrod 

I have attached the query plan for this query.

i was also wondering whether instead of having multiple union all and the same E1 table wouldn't it better to have those in the same join. I changed that and it got completed in a 1 sec for 144 records other took about 12 min.

SELECT [ID]
FROM ( SELECT [E1].[ID] [ID]
FROM [ODS_COMPOSE_DW].[ods].[TDWH_UDT_HUB] [E1]
WHERE [E1].[RUNNO_UPDATE] > 100
UNION ALL
SELECT [E1].[ID] [ID]
FROM [ODS_COMPOSE_DW].[ods].[TDWH_UDT_HUB] [E1]
INNER JOIN [ODS_COMPOSE_DW].[ods].[TDWH_MEMBER_HUB] [E2]
ON [E1].[MEMBER] = [E2].[ID]
AND [E2].[RUNNO_UPDATE] > 100
INNER JOIN [ODS_COMPOSE_DW].[ods].[TDWH_PRACTITIONER_HUB] [E3]
ON [E1].[PRACTITIONER] = [E3].[ID]
AND [E3].[RUNNO_UPDATE] > 100
INNER JOIN [ODS_COMPOSE_DW].[ods].[TDWH_SUPPLIER_HUB] [E4]
ON [E1].[SUPPLIER] = [E4].[ID]
AND [E4].[RUNNO_UPDATE] > 100
INNER JOIN [ODS_COMPOSE_DW].[ods].[TDWH_SUPPLIER_LOCATION_HUB] [E5]
ON [E1].[SUPPLIER_LOCATION] = [E5].[ID]
AND [E5].[RUNNO_UPDATE] > 100
INNER JOIN [ODS_COMPOSE_DW].[ods].[TDWH_AUTHORIZATION_HEADER_HUB] [E6]
ON [E1].[AUTHORIZATION_HEADER] = [E6].[ID]
AND [E6].[RUNNO_UPDATE] > 100
INNER JOIN [ODS_COMPOSE_DW].[ods].[TDWH_CLAIM_HUB] [E7]
ON [E1].[CLAIM] = [E7].[ID]
AND [E7].[RUNNO_UPDATE] > 100
WHERE NOT [E1].[RUNNO_UPDATE] > 100
) [QQ]
GROUP BY [ID]

TimGarrod
Employee
Employee

Problem is your query isn’t going to detect all the changes.

When you consider all the relationships, what Compose is doing is detecting the ID’s for the GRAIN of the dimension table that need to be adjusted.
So if I have Product with a relationship to Supplier and a relationship to Category, Compose needs to determine

* which product records have change
* which category records have changed (join to products to determine which product ID’s that impacts)
* where supplier records have changed (join to products to determine which product ID’s that impacts)

That’s a simple scenario. Then consider cascading changes (Category has a relationship to CategoryType)

* which categoryType records have changed (joined to category, joined to product to determine which product ID’s are impacted)

So trying to have those all in a single join gets complex with OR conditions having to be accounted for – which would certainly negate indexes and cause scans.

Will review the plan and get back to you 😊
TimGarrod
Employee
Employee

Can you try to run the attached - and tell me how long it takes?   

aj96
Contributor III
Contributor III
Author

Yeah I guessed that would something to look for but just tried to solve the query rather than looking the whole process here 😋

 

aj96
Contributor III
Contributor III
Author

I ran this and this got completed in 25 sec 😀

But I still see a high estimated count in the query plan.  So I was guessing when using the runno to get the records the estimate shouldn't be this high.

aj96_0-1621969673359.png

 

TimGarrod
Employee
Employee

If estimates are off - those are stats based in SQL Server. Remember a single change to a table that is say 4 relationships removed could impact a larger number of the "granular records". The issue is sometimes in SQl server having that union all / group by - followed by the join can cause index scans on the subsequent tables. The tmp table and index is likely ensuring an index seek when compiling the rest of the data together.
aj96
Contributor III
Contributor III
Author

Yes that make sense. So index wise we should be able to do something here right. So we can make sql server use a index seek rather than scan. Also checked index fragmentation, all these were below 5%.

TimGarrod
Employee
Employee

You won't be able to force a seek off that nested UNION All. Storing the ID results in the temp table and indexing it allows the nested loop and seek on the first table / HUB that it joins to.