Ideas/Opinions in approaching big data set - Qlik Community

2014-11-06

Hi all,

This is more of a conceptual question: I'm gathering opinions and ideas on the best way to proceed in the following scenario.

My table in the database looks like this:

City, CustomerId, Type, Date, Hour1, Hour2, Hour3 ... Hour24

A, Customer1, 1, 01/01/2014, 1,2,3,4,5,6 ... 24

A, Customer1, 2, 01/01/2014, 1,2,3,4,5,6 ... 24

A, Customer2, 1, 01/01/2014, 1,2,3,4,5,6 ... 24

A, Customer2, 2, 01/01/2014, 1,2,3,4,5,6 ... 24

B, Customer3, 1, 01/01/2014, 1,2,3,4,5,6 ... 24

B, Customer3, 2, 01/01/2014, 1,2,3,4,5,6 ... 24

...

The data set itself is pretty big: around 150M rows.

This format is a bit tricky to use in QV, since I need to be able to do day/hour analysis. I've grouped and transposed the table in QV, and the new format looks like this:

City, Type, Date, Hour, Data, CustomersCount

A, 1, 01/01/2014, 10,100

A, 1, 01/01/2014, 20,200

A, 1, 01/01/2014, 30,300

A, 1, 01/01/2014, 40,400

A, 1, 01/01/2014, 50,500

A, 1, 01/01/2014, 60,600

...

This increases the number of rows but reduces the number of columns, and expressions such as sum(Data) are simple and fast.
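To make the reshaping concrete, here is a minimal sketch of the group-and-transpose step. It is written in Python rather than QV script, purely as an illustration; the sample rows, the three-hour width, and the name unpivot_and_group are all made up for the example and are not taken from the real load script.

```python
from collections import defaultdict

# Hypothetical sample rows in the wide layout:
# City, CustomerId, Type, Date, then one value per hour.
# Only 3 hours are used to keep the sketch short; the real table has 24.
wide_rows = [
    ("A", "Customer1", 1, "01/01/2014", [1, 2, 3]),
    ("A", "Customer2", 1, "01/01/2014", [1, 2, 3]),
    ("B", "Customer3", 1, "01/01/2014", [1, 2, 3]),
]


def unpivot_and_group(rows):
    """Turn one row per customer/day into one row per City, Type, Date, Hour.

    Returns a dict mapping (city, type, date, hour) to
    (sum of hourly values, number of distinct customers).
    """
    totals = defaultdict(int)
    customers = defaultdict(set)
    for city, customer_id, type_, date, hours in rows:
        # Each hourly column becomes its own row (the "transpose" part).
        for hour, value in enumerate(hours, start=1):
            key = (city, type_, date, hour)
            totals[key] += value              # the "group" part: sum per key
            customers[key].add(customer_id)   # remember who contributed
    return {key: (totals[key], len(customers[key])) for key in totals}


long_table = unpivot_and_group(wide_rows)
```

Note that CustomerId only survives as a pre-computed count per group; the individual ids are gone once the table is grouped.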

But in this scenario there is no way to show a count of distinct CustomerIds, since CustomerId is no longer in the table.

I have a few ideas for how this might be done (including Direct Discovery, which I'm leaving as my last option).

Can you please share your ideas or experience on how such a situation can be approached?
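A small sketch of why a stored CustomersCount does not help here: distinct counts are not additive, so summing the per-group counts over several dates overstates the number of customers. Again this is Python for illustration only, with made-up rows and a hypothetical helper name.

```python
# Hypothetical customer activity for a single City and Type,
# kept at the (Date, CustomerId) grain.
activity = [
    ("01/01/2014", "Customer1"),
    ("01/01/2014", "Customer2"),
    ("02/01/2014", "Customer1"),
    ("02/01/2014", "Customer3"),
]


def count_per_date(rows):
    """Distinct customers per date, as a pre-aggregated table would store them."""
    per_date = {}
    for date, customer_id in rows:
        per_date.setdefault(date, set()).add(customer_id)
    return {date: len(ids) for date, ids in per_date.items()}


per_date_counts = count_per_date(activity)

# Summing the stored counts counts Customer1 twice.
summed_counts = sum(per_date_counts.values())

# The true distinct count needs the individual ids.
true_distinct = len({customer_id for _, customer_id in activity})
```

Here summed_counts is 4 while true_distinct is 3, because Customer1 appears on both dates. Any selection that spans more than one stored group runs into the same problem.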

Stefan