Here is thedeal. This is the statement which I need to tune: TempDates: LOAD 'D' as AggrLevel, *, if(dmDateHour = MonthStart(dmDateHour), ES) as MonthES, \\ calc ES for Monthlyaggregation if(dmDateHour = WeekStart(dmDateHour), ES) as WeekES, \\ calc ES for Weeklyaggregation MonthStart(dmDateHour) as MonthStartDay, WeekStart(dmDateHour) as WeekStartDay ; LOAD DPId, TZId, StationId, DayDateHour as dmDateHour, AggrCountry, sum(TLH) as TLH, sum(SS) as SS, sum(if(DPESDateHour = dmDateHour, ES)) as ES \\ calc ESfor Daily aggregation Group by DPId, TZId, StationId, DayDateHour, AggrCountry ; LOAD DPId, TZId, StationId, dmDateHour, AggrCountry, DayDateHour, sum(TLH) as TLH, sum(SS) as SS, sum(ES) as ES, Min(dmDateHour) as DPESDateHour \\first DateHour for a day part Group by DPId, TZId, StationId, dmDateHour, AggrCountry, DayDateHour ; LOAD DPId, TZId, StationId, dmDateHour, AggrCountry, DayDateHour, TLH, SS, ES Resident Temp Where DayDateHour >= $(vDateStart) \\dailyloop condition and DayDateHour < $(vDateEnd); Temp table: The table has aggregated information about sessions on different stations for predefined day parts. Field Description DPId 30 possible day parts (e.g. “Whole Week/Whole Day”, “Monday-Friday/10-13”, “Tuesday-Sunday / 9 - 20” and etc.) TZId 5 possible time zones (e.g. GMT, EST and etc.) StationId About 16 500 Stations’ Ids dmDateHour A timestamp rounded to each hour sharp (e.g. 1/1/2011 10:00:00) or in QV syntax = floor(timestamp * 24)/24 AggrCountry 2 possible countries – Main and Others DayDateHour A beginning of a day calculated derived from dmDateHour (= floor(dmDateHour)) TLH Total hours for a sessions during dmDateHour (> 0) SS # of Started Sessions during dmDateHour ES # of Entered Sessions from previous hour into current dmDateHour The table has 1 month data. As you can see it may have 30 * 5 * 16500 * (30 * 24) * 2 = 3564 000 000 records. Usually it’s not that huge because not all stations have an activity during a month. It’s about 30-40 millions of records. Main problem is related with ES. I need to derive number of Active Sessions (AS)which is Sum(SS) + Sum(ES of first dmDateHour in a day part). And I need to aggregate it to Day. That’s why I have these multi level aggregations. I’ve made this condition in a begging Where DayDateHour>= $(vDateStart) andDayDateHour < $(vDateEnd); to reduce a number of records for the aggregation. I have a loop where I derive vDateStart and vDateEnd for each day of a month. If I remove this condition it will kill whole memory on a server (98GB). So, daily aggregation takes 45 mins per each day (about 22 hours for whole month). If I have just one day data in the Temp table it takes about 3 mins to do sameaggregation for a day. But most interesting thing is all the rest days of a monthin this daily loop will calculates during 90 seconds to retrieve NO DATA fromthe Temp table. It means that a day aggregation takes about 90 second and other90 seconds for data retrieval for the condition. My assumption is – the more data I have in the table the longer this data retrievalis. (It’s too pity that we don’t have indexes in QV). Is there other way to make this retrieval faster? Or maybe there is another way for my statement. Please help. P.S. The Word version is in attachment.
PerformanceTuning.docx 14.8 K