Skip to main content
Announcements
Introducing Qlik Answers: A plug-and-play, Generative AI powered RAG solution. READ ALL ABOUT IT!
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Populate missing values inside a fact table

Hi to all,

I have a fact table where i have Product, Customer, Value and YearMonth fields, Dim_Progressivo as the key that concatenate product,customer,value,yearmonth and Key_Progressivo as the concatenate of product,customer,value.

I need to calculate the progressive of Value but some of the YearMonth are missing in the table. For istance I can have the product A in 201101 and 201103.

If I use:

load

rangesum((Value), if(peek('Dim_Progressivo') = Dim_Progressivo, peek('ProgressiveValue'))) as ProgressiveValue,

*

from FactTable

order by Key_Progressivo;

the result in the ProgressiveValue isn't correct.

The problem is that I need to add the missing YearMonth to have the correct progressivevalue calcuated. The fact table is about 10 ml records so i need to have a solution that take care of performance issues.



Thanks to all!

1 Solution

Accepted Solutions
johnw
Champion III
Champion III


danielato.a wrote: I tried your solution but I get exactly the same records


Several problems.

  1. I assumed that AnnoMese was a date field with a format of YYYYMM, not just a number. I recommend changing it in the QVD.
  2. You'd said that Key_Progressivo was "the concatenate of product,customer,value". Instead, it also includes the YearMonth. It appears that Key_Progressivo_SenzaAnnoMese is actually the one that doesn't include the YearMonth. So we'd need to use that instead.
  3. Previous(AnnoMese) wasn't behaving as I expected in a WHILE loop, so I had to create a new field, PreviousAnnoMese, before going into the WHILE loop. I probably just messed something up and the new field isn't necessary, but I'm not seeing it.

See attached. Script below.

[Fact Table]:
LOAD
Key_Progressivo_SenzaAnnoMese
,date(date#(AnnoMese,'YYYYMM'),'YYYYMM') as AnnoMese
,peek('AnnoMese') as PreviousAnnoMese
,Importo
FROM andrea.qvd (qvd)
;

[Fact Table 2]:
LOAD *
,Key_Progressivo_SenzaAnnoMese & date(AnnoMese,'YYYYMM') as Key_Progressivo
;
LOAD Key_Progressivo_SenzaAnnoMese
,addmonths(AnnoMese,1-iterno()) as AnnoMese
,Importo
RESIDENT [Fact Table]
WHILE iterno() = 1
OR ( Key_Progressivo_SenzaAnnoMese = previous(Key_Progressivo_SenzaAnnoMese)
AND addmonths(AnnoMese,1-iterno()) > PreviousAnnoMese)
ORDER BY Key_Progressivo_SenzaAnnoMese, AnnoMese
;
DROP TABLE [Fact Table]
;

View solution in original post

7 Replies
Miguel_Angel_Baeyens

Hello,

Aggregations in script may not be the fastest way when reloading, but it will save you some time (even lots of time) in chart rendering.

Check this post or this post and adapt it to your code, since it seems very similar to what you are asking for.

Note that Peek() returns the last loaded value, if the table is not ordered, it may return a different month and summing the wrong value.

Note as well that the code in the posts above uses several load steps, but that if you are able to do the SQL SELECT ordered you won't need to do that again. Although those example use RESIDENT loads, I'd recommend you to STORE the ordered table into QVD and then do the accumulation on the load of this file, if that's the case.

Hope that helps.

Not applicable
Author

Hi Miguel,

unfortunatly I can't do anything on the SQL SELECT part. I have take a look on the other posts but I still have problems on creating the lines that are missing in my fact table. If you look to this example:

Key_Progressivo AnnoMese sum(Importo)
39275,63
08623-00-BAN-_ns-_ns-00-SCCC301-SCCC301-D3-1030.000.100.0200-QLIK-000000093-20100320100310901,64
08623-00-BAN-_ns-_ns-00-SCCC301-SCCC301-D3-1030.000.100.0200-QLIK-000000093-2010042010043352,71
08623-00-BAN-_ns-_ns-00-SCCC301-SCCC301-D3-1030.000.100.0200-QLIK-000000093-2010052010051258,18
08623-00-BAN-_ns-_ns-00-SCCC301-SCCC301-D3-1030.000.100.0200-QLIK-000000093-2010072010073742,58
08623-00-BAN-_ns-_ns-00-SCCC301-SCCC301-D3-1030.000.100.0200-QLIK-000000093-201009201009179,72
08623-00-BAN-_ns-_ns-00-SCCC301-SCCC301-D3-1030.000.100.0200-QLIK-000000093-2010122010121673,86





you can see that the 201006 is missing. I need to add a line to my fact table with 201008 and Importo = 0 otherwise some additional things will not work as expected.

Thanks!

Andrea

johnw
Champion III
Champion III

So for EVERY value of Key_Progressivo, you want EVERY month in your table? Perhaps something like this:

[All]:
LOAD text(fieldvalue('Key_Progressivo',iterno())) as Key_Progressivo
AUTOGENERATE 1
WHILE len(fieldvalue('Key_Progressivo',iterno()))
;
LEFT JOIN ([All])
LOAD DISTINCT AnnoMese
RESIDENT [Calendar]
;
LEFT JOIN ([All])
LOAD *
,Key_Progressivo & AnnoMese as Dim_Progressivo // or however you define Dim_Progressivo
RESIDENT [All]
;
OUTER JOIN ([Fact Table])
LOAD *
RESIDENT [All]
;
DROP TABLE [All]
;

There might be a simpler way. That also leaves Importo=null for the added months. I don't know if that's a problem, but an extra pass through can set those nulls to 0 if need be.

johnw
Champion III
Champion III

Hmmm, you could probably do something similar with a while loop.

[Fact Table 2]:
LOAD *
,Key_Progressivo & AnnoMese as Dim_Progressivo
;
LOAD Key_Progressivo
,addmonths(AnnoMese,1-iterno()) as AnnoMese
,any other fields you want
RESIDENT FactTable
ORDER BY Key_Progressivo, AnnoMese
WHILE iterno() = 1
OR ( Key_Progressivo = previous(Key_Progressivo)
AND addmonths(AnnoMese,1-iterno()) > previous(AnnoMese))
;
DROP TABLE [Fact Table]
;

It's slightly different than the previous approach. Here, you'll only generate months between the first and last month for the specific Key_Progressivo. With the previous approach, you'd generate all months in the calendar. Not sure which you need.

Not applicable
Author

Hi John,

thank you, exactly what I was looking for, only the missing months from the first and the last one.

I tried your solution but I get exactly the same records, probably

I attach the qvw and the qvd where you can see the results.

Thanks a lot!

Andrea

johnw
Champion III
Champion III


danielato.a wrote: I tried your solution but I get exactly the same records


Several problems.

  1. I assumed that AnnoMese was a date field with a format of YYYYMM, not just a number. I recommend changing it in the QVD.
  2. You'd said that Key_Progressivo was "the concatenate of product,customer,value". Instead, it also includes the YearMonth. It appears that Key_Progressivo_SenzaAnnoMese is actually the one that doesn't include the YearMonth. So we'd need to use that instead.
  3. Previous(AnnoMese) wasn't behaving as I expected in a WHILE loop, so I had to create a new field, PreviousAnnoMese, before going into the WHILE loop. I probably just messed something up and the new field isn't necessary, but I'm not seeing it.

See attached. Script below.

[Fact Table]:
LOAD
Key_Progressivo_SenzaAnnoMese
,date(date#(AnnoMese,'YYYYMM'),'YYYYMM') as AnnoMese
,peek('AnnoMese') as PreviousAnnoMese
,Importo
FROM andrea.qvd (qvd)
;

[Fact Table 2]:
LOAD *
,Key_Progressivo_SenzaAnnoMese & date(AnnoMese,'YYYYMM') as Key_Progressivo
;
LOAD Key_Progressivo_SenzaAnnoMese
,addmonths(AnnoMese,1-iterno()) as AnnoMese
,Importo
RESIDENT [Fact Table]
WHILE iterno() = 1
OR ( Key_Progressivo_SenzaAnnoMese = previous(Key_Progressivo_SenzaAnnoMese)
AND addmonths(AnnoMese,1-iterno()) > PreviousAnnoMese)
ORDER BY Key_Progressivo_SenzaAnnoMese, AnnoMese
;
DROP TABLE [Fact Table]
;

Not applicable
Author

Hi John,

perfect! Tomorrow morning I will do the test on the live environment within my 10 ML records of fact table to see the performance. I will let you know.

Thanks a lot for your help!

Andrea