Solved: Re: Poor performance for Datamart Dimension table - Qlik Community

aj96 · ‎2021-05-24

I am facing performance issue for a Dimension table having multiple relation to other table. For one of my dimension it took almost 8 min to complete that is having no records that was inserted or updated in the previous DWH run. Sometimes a stat update helps other times I tried rebuild the index of some of the tables used in that query and it ran in seconds. I also altered the sub query to use that base table only once and that also helped. Could some one help me this? Is there something that can be done in compose that would improve the performance?

I'm using Compose Version 2021.5.0.69 and Sql Server 2017 CU22.

TimGarrod · ‎2021-05-25

You won't be able to force a seek off that nested UNION All. Storing the ID results in the temp table and indexing it allows the nested loop and seek on the first table / HUB that it joins to.

View solution in original post

TimGarrod · ‎2021-05-24

Hi @aj96 - can you run a query plan to see what the plan is doing? Since its SQL Server it could be stats / index out of date like you are suggesting - or perhaps a more "covering" index for the initial criteria (get the ID's for any changed records).

Would want to see the plan to determine if any changes are required

aj96 · ‎2021-05-25

hi @TimGarrod

I have attached the query plan for this query.

i was also wondering whether instead of having multiple union all and the same E1 table wouldn't it better to have those in the same join. I changed that and it got completed in a 1 sec for 144 records other took about 12 min.

SELECT [ID]
FROM ( SELECT [E1].[ID] [ID]
FROM [ODS_COMPOSE_DW].[ods].[TDWH_UDT_HUB] [E1]
WHERE [E1].[RUNNO_UPDATE] > 100
UNION ALL
SELECT [E1].[ID] [ID]
FROM [ODS_COMPOSE_DW].[ods].[TDWH_UDT_HUB] [E1]
INNER JOIN [ODS_COMPOSE_DW].[ods].[TDWH_MEMBER_HUB] [E2]
ON [E1].[MEMBER] = [E2].[ID]
AND [E2].[RUNNO_UPDATE] > 100
INNER JOIN [ODS_COMPOSE_DW].[ods].[TDWH_PRACTITIONER_HUB] [E3]
ON [E1].[PRACTITIONER] = [E3].[ID]
AND [E3].[RUNNO_UPDATE] > 100
INNER JOIN [ODS_COMPOSE_DW].[ods].[TDWH_SUPPLIER_HUB] [E4]
ON [E1].[SUPPLIER] = [E4].[ID]
AND [E4].[RUNNO_UPDATE] > 100
INNER JOIN [ODS_COMPOSE_DW].[ods].[TDWH_SUPPLIER_LOCATION_HUB] [E5]
ON [E1].[SUPPLIER_LOCATION] = [E5].[ID]
AND [E5].[RUNNO_UPDATE] > 100
INNER JOIN [ODS_COMPOSE_DW].[ods].[TDWH_AUTHORIZATION_HEADER_HUB] [E6]
ON [E1].[AUTHORIZATION_HEADER] = [E6].[ID]
AND [E6].[RUNNO_UPDATE] > 100
INNER JOIN [ODS_COMPOSE_DW].[ods].[TDWH_CLAIM_HUB] [E7]
ON [E1].[CLAIM] = [E7].[ID]
AND [E7].[RUNNO_UPDATE] > 100
WHERE NOT [E1].[RUNNO_UPDATE] > 100
) [QQ]
GROUP BY [ID]

TimGarrod · ‎2021-05-25

Problem is your query isn’t going to detect all the changes.

When you consider all the relationships, what Compose is doing is detecting the ID’s for the GRAIN of the dimension table that need to be adjusted.
So if I have Product with a relationship to Supplier and a relationship to Category, Compose needs to determine

* which product records have change
* which category records have changed (join to products to determine which product ID’s that impacts)
* where supplier records have changed (join to products to determine which product ID’s that impacts)

That’s a simple scenario. Then consider cascading changes (Category has a relationship to CategoryType)

* which categoryType records have changed (joined to category, joined to product to determine which product ID’s are impacted)

So trying to have those all in a single join gets complex with OR conditions having to be accounted for – which would certainly negate indexes and cause scans.

Will review the plan and get back to you 😊

TimGarrod · ‎2021-05-25

Can you try to run the attached - and tell me how long it takes?

aj96 · ‎2021-05-25

Yeah I guessed that would something to look for but just tried to solve the query rather than looking the whole process here 😋

aj96 · ‎2021-05-25

I ran this and this got completed in 25 sec 😀

But I still see a high estimated count in the query plan. So I was guessing when using the runno to get the records the estimate shouldn't be this high.

TimGarrod · ‎2021-05-25

If estimates are off - those are stats based in SQL Server. Remember a single change to a table that is say 4 relationships removed could impact a larger number of the "granular records". The issue is sometimes in SQl server having that union all / group by - followed by the join can cause index scans on the subsequent tables. The tmp table and index is likely ensuring an index seek when compiling the rest of the data together.

aj96 · ‎2021-05-25

Yes that make sense. So index wise we should be able to do something here right. So we can make sql server use a index seek rather than scan. Also checked index fragmentation, all these were below 5%.

TimGarrod · ‎2021-05-25

You won't be able to force a seek off that nested UNION All. Storing the ID results in the temp table and indexing it allows the nested loop and seek on the first table / HUB that it joins to.

Poor performance for Datamart Dimension table

Performance