Hi all!
I have a CSV file with 10,000,000 lines and these columns:
Date | Domain | Connections | Bytes in | Bytes out |
---|---|---|---|---|
I need to display in a table the top 20 domains by Connections, the top 20 by Bytes in, and the top 20 by Bytes out.
My virtual machine has 128 GB of RAM.
If I use Dimension Limits (show only the largest 20), I have one problem: QlikView fails because it tries to select the top 20 out of 10,000,000 domains.
I tried using set analysis in my expression, but it still tries to select the top 20 out of all 10,000,000.
Could you tell me what I should do to resolve this problem?
Kseniya,
It is better to calculate the top domains in the script; that will help front-end performance.
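To illustrate the pre-aggregation idea outside QlikView, here is a minimal Python sketch (not QlikView load script). It assumes a comma-delimited file with the column names from the question; the sample rows and the `top_domains` helper are made up for illustration.

```python
import csv
import heapq
import io
from collections import Counter

def top_domains(csv_file, metric, n=20):
    """Sum `metric` per domain while streaming rows, then keep the n largest."""
    totals = Counter()
    for row in csv.DictReader(csv_file):
        totals[row["Domain"]] += int(row[metric])
    # nlargest avoids a full sort over every distinct domain
    return heapq.nlargest(n, totals.items(), key=lambda kv: kv[1])

# Tiny made-up sample standing in for the real 10,000,000-line file
sample = io.StringIO(
    "Date,Domain,Connections,Bytes in,Bytes out\n"
    "2015-01-01,a.example,5,100,10\n"
    "2015-01-01,b.example,9,50,20\n"
    "2015-01-02,a.example,7,30,5\n"
)
top_by_connections = top_domains(sample, "Connections", n=2)
```

The same function can be called with "Bytes in" or "Bytes out" as the metric. The result is fixed at load time, so it does not react to later selections.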
Regards,
Michael
I thought about it, but I have filters such as service: streaming media, peer-to-peer, HTTP, File Sharing, Remote Access, IP Protocols, Web Browsing, etc. (~200 filters), and if I calculate the top domains in the script, I won't be able to filter them.
OK, what about this:
Create three rank fields in the data - RankConnections, RankBytesIn, and RankBytesOut - each holding an integer rank for every domain. Now, if you don't have selections, your top 20 by RankConnections will be ranks 1 to 20. If you make selections, the remaining ranks could be anything (e.g. 120, 500, 666, ... 5215045) - but they are still sortable without dynamic rank calculations.
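A rough Python sketch of the rank-field idea follows (again, not QlikView script). The rows, the service column, and the `build_rank` and `top_in_selection` helpers are hypothetical. Note that the rank is computed once from overall totals, so within a selection the order follows the overall rank rather than totals recomputed for that selection.

```python
from collections import defaultdict

# Hypothetical rows: (domain, service, connections)
rows = [
    ("a.example", "http", 50),
    ("b.example", "streaming media", 40),
    ("c.example", "http", 30),
    ("d.example", "streaming media", 20),
    ("a.example", "streaming media", 5),
]

def build_rank(rows):
    """Precompute an integer rank per domain (1 = most connections overall)."""
    totals = defaultdict(int)
    for domain, _service, connections in rows:
        totals[domain] += connections
    ordered = sorted(totals, key=totals.get, reverse=True)
    return {domain: position for position, domain in enumerate(ordered, start=1)}

def top_in_selection(rows, rank, service, n=20):
    """Keep domains present in the selection and sort them by the stored rank."""
    selected = {domain for domain, svc, _ in rows if svc == service}
    return sorted(selected, key=rank.get)[:n]

rank_connections = build_rank(rows)
top_streaming = top_in_selection(rows, rank_connections, "streaming media", n=2)
```

In this sample, a.example comes first for streaming media because its overall total is the highest, even though only 5 of its connections fall inside that selection. Whether that approximation is acceptable depends on how the top 20 is meant to react to filters.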
Hope it works...