How can we identify "first time" data in Qlik? - Qlik Community

Anonymous · ‎2018-05-23

We have an interesting challenge, in that we are looking thru a dataset that uploads each week, and we want to identify what is new this week.

For example, let's say that our people are named A, B, C, thru ZZ.

They have attributes, that we'll call 1, 2, 3, ...n.

What we want to know, is when a person first has an attribute. The data would look like this.

Week 1: A1, A4, C1, C29, F2, F100.

Week 2: B1, A4, C1, C29, F2, F100.

Week 3: B1, B4, C2, C29, F2, F100, G5

Our analysis would yield

A1, A4, C1, C29, F2, F100 in Week 1

B1 in Week 2

B4, G5 in Week 3

since those are the first times those combinations were added.

Our list of People and Attributes is not of a fixed length, new ones are added all the time. They are, however, uniquely identified. The only thing we care about, for this challenge, is finding the first time that there is a new Person-Attribute combination.

Ideas on whether this is a good use case within Qlik, or would this be better handled at the database level, before bringing into Qlik?

mikaelsc · ‎2018-05-23

if you work with incremental loads:

1) load previously available data (containing all previously existing combinations User-Attribute)

2) load data from the new week that not existed before using a where not exists(User-Attribute, New-User-Attribute) condition.

ensure to use a certain flag or a field "first loaded in week" containing the week in which the value first appeared

3) load data from the new week that existed before (if needed)

Anonymous · ‎2018-05-23

Mikael, if I understand correctly, we would need to set a "first loaded" flag outside of Qlik. Then Qlik could easily display that data. It wouldn't be a good use case within Qlik to point-out which records are appearing for the 1st time. Is that correct?

mikaelsc · ‎2018-05-23

David, everything in Qlik of Course 🙂

Maybe with some more scripting hints:

1) Load previously available data:

Data:

Load

*

from OldData

Which will give you the table:

Week, User, Attribute, UserAttribute

1,A,1,A1

1,A,4,A4

1,C,1,C1

1,C,29,C29

1,F,2,F2

1,F,100,F100

2) Load only New data from week 2

concatenate(Data)

Load

*

from Week2

where not exists (UserAttribute,UserAttribute); --> the loaded UserAttribute is compared to previously loaded values of UserAttribute