Qlik Community

QlikView Scripting

Discussion Board for collaboration on QlikView Scripting.

Announcements
QlikView Fans! We’d love to hear from you.
Share your QlikView feedback with the product team… Click here to participate in our 5-minute survey.
Rules, plus terms and conditions, can be found here.
datanibbler
Esteemed Contributor

High-cardinality keyfield - any way to improve?

Hi,

I am still reworking an old piece of code. In short, it does this:

- There is a target_value for every day, for each of the two shifts, for each of the areas
   -> there are about 30-40 values every day

- That table is joined, via the date, to one Autogenerate_table with minute_intervals from 5:20 am to 10:30 pm

-> Then - here comes the big one - a keyfield is loaded that is made up of the date and any of these minute_intervals - so there are

    a bit over 1.000 keyfields for one day. That blows up the table to millions and millions of records of course.

=> Is there any way of doing this more efficiently (without touching the key first, it is assumedly required on a minute_basis)?

Thanks a lot!

Best regards,

DataNibbler

1 Solution

Accepted Solutions
Not applicable

Re: High-cardinality keyfield - any way to improve?

Afaik, you can use two separate fields.

This works:

TableA:

load id,

       date_pk,

       minute_pk

from tableA.qvd (qvd);

TableB:

left join(TableA)

load date_column as date_pk,

       minute_column as minute_pk,

       target_value

from TableB.qvd (qvd);

eg:

TableA:

[id,date_pk,minute_pk

1,22,05

1,22,06

2,22,06

2,22,07];

TableB:

[date_column, minute_column, target_value

21,05,10

21,06,11

21,07,10.5

22,05,15

22,06,16

22,07,16

22,08,15];

Output would still be

[id,date_pk,minute_pk, target_value

1,22,05,15

1,22,06,16

2,22,06,16

2,22,07,16];

6 Replies

Re: High-cardinality keyfield - any way to improve?

Hi DataNibbler,

I think the best would be to avoid such a key-field. For joining your target-table you didn't need these key only date is enough. Are there joins to other tables? Then it could be a solution to create a bigger fact-table per joining and/or mapping and afterwards delete this key-field and use instead two fields "date" and "time" in your data-model.

- Marcus

datanibbler
Esteemed Contributor

Re: High-cardinality keyfield - any way to improve?

Hi Marcus,

I assume that it has been a requirement that we break down our target_value by minute so that we have all the possibilities - though we probably won't be able to calculate and display this every minute - so, yes, I do need that keyfield.

(those are the target_data which I then have to link to the current_data to show "where we are" production-wise)

Splitting such a thing in two saves a lot, I know, but I cannot use two separate fields as key without combining them, can I?

Not applicable

Re: High-cardinality keyfield - any way to improve?

Hi,

Am not clear on why a combined Date&minute key won't work. Will it be possible for you to paste the code here?

datanibbler
Esteemed Contributor

Re: High-cardinality keyfield - any way to improve?

Hi,

well, this is the critical question - it's probably me who is unclear about this.

AfaIk I cannot use two fields as key as long as they are separate, can I? This assumption is why I concatenate the date and the time (which looks like this)

>>> LOAD

       ..

       num(Date_PK) & '_' & Interval_Min as Keyfield,

       ...

RESIDENT ...

; <<<

Of course, if I could leave those two fields separate and still use their combination(s) as key, that would save me a few million records. Would that work?

Not applicable

Re: High-cardinality keyfield - any way to improve?

Afaik, you can use two separate fields.

This works:

TableA:

load id,

       date_pk,

       minute_pk

from tableA.qvd (qvd);

TableB:

left join(TableA)

load date_column as date_pk,

       minute_column as minute_pk,

       target_value

from TableB.qvd (qvd);

eg:

TableA:

[id,date_pk,minute_pk

1,22,05

1,22,06

2,22,06

2,22,07];

TableB:

[date_column, minute_column, target_value

21,05,10

21,06,11

21,07,10.5

22,05,15

22,06,16

22,07,16

22,08,15];

Output would still be

[id,date_pk,minute_pk, target_value

1,22,05,15

1,22,06,16

2,22,06,16

2,22,07,16];

datanibbler
Esteemed Contributor

Re: High-cardinality keyfield - any way to improve?


Hi,

well, that would be cool. Just by splitting that one high-cardinality field into two, I will save some mio records, and hopefully some processing time.

Thanks a lot!

Community Browser