Solved: Re: One dimension made from two different dimensio... - Qlik Community

Report Inappropriate Content · ‎2013-11-05

Hi everyone,

I already searched the web and this forum and asked a bunch of people, but no one seemed to have an answer to the following:

Our users come from a source, click through our system and go to a sink afterward. Now sink and source are from the same pool of partners, so we would like to have a table with the partners as dimension, where we can show certain clicks sorted by source in one column and those clicks sorted by sink in the next column.

The data is loaded in the following way:

user	source	sink
User 1	Partner 1	Partner 3
User 2	Partner 2	Partner 1
User 3	Partner 1	Partner 2
User 4	Partner 2	Partner 2

and we would like to have the following table of this:

Partner	# as source	# as sink
Partner 1	2	1
Partner 2	2	2
Partner 3	0	1

Partner	# as source	# as sink	sum
Partner 1	2	1	3
Partner 2	2	2	4
Partner 3	0	1	1

Without the sum, we could use two different tables, but like this, it doesn't work unfortunately.

I tried to use source and sink as dimensions, then limit the second dimension by the first and restrict the sums by the first resp. by the second dimension, but this doesn't seem to help.

We have quite large tables (a couple of Gigabytes per table) and we already did an aggregation per day and still need up to 20GB RAM, so we would like to avoid loading double data. Is there a way to get the second table from the first one?

Thank you for your help

Ulrich

Gysbert_Wassenaar · ‎2013-11-05

Load the data so it is in this form instead:

User, Partner, Type

User 1, Partner 1, Source

User 1, Partner 3, Sink

User 2, Partner 2, Source

User 2, Partner 1, Sink

...etc

You can then use Partner as dimension and expressions like:

Source: count({<Type={'Source'}>} User)

Sink: count({<Type={'Sink'}>} User)

Total: count(User)

talk is cheap, supply exceeds demand

View solution in original post

Gysbert_Wassenaar · ‎2013-11-05

Load the data so it is in this form instead:

User, Partner, Type

User 1, Partner 1, Source

User 1, Partner 3, Sink

User 2, Partner 2, Source

User 2, Partner 1, Sink

...etc

You can then use Partner as dimension and expressions like:

Source: count({<Type={'Source'}>} User)

Sink: count({<Type={'Sink'}>} User)

Total: count(User)

talk is cheap, supply exceeds demand

Report Inappropriate Content · ‎2013-11-05

Thank you for your reply. I thought about a reformatting like this as well. But unfortunately a refomatting won't work, since our table is much bigger than this example and the redundancies will make the needs for RAM explode.

If it is the only solution though, we need a better system, I just hoped, there could be a solution avoiding this.

Report Inappropriate Content · ‎2013-11-05

I agree with Gysbert, and I don't think that a crosstable is too demanding for the system!!

If you want you can post your script.

Gio

Gysbert_Wassenaar · ‎2013-11-05

Have you actually tested that? Try it on a limited set of a couply of million records. Another thing you could try is to generate id numbers for users and partners with the autonumber function. Such fields compress pretty good. And you could use a flag field for Source/Sink. Though it has a cardinality of only two already. So QV will use one bit per record for the pointers to the symbol table. See this blog post: Symbol Tables and Bit-Stuffed Pointers

Anyway, I don't think you have an alternative. Trying to 'fix' this in the UI is going to perform far worse.

talk is cheap, supply exceeds demand

Report Inappropriate Content · ‎2013-11-06

Very interesting link, thanks for that

My first quick test with loading those data a second time resulted in a very large file. But probably there is very much to be optimized.

Report Inappropriate Content · ‎2013-11-14

I did it and it worked. Thanks a lot for your help.

One dimension made from two different dimensioins