This is probably a pure newbie question, but I need to ask it anyway.
I have to build an application for investigating serial numbers, which are by nature distinct values. The number of serial numbers is in the range of 250 million to 1 billion, and they are normally alphanumeric. A straight table (dimensions only) would be more than sufficient for the investigation, but the amount of data creates the problem.
If I load the data, the application takes 30 to 80 GB of memory, and rendering the table takes very long. Filtering with the search field just doesn't work: when you finally get your results, the system does not mark them but does very strange things.
Since I learned that the rendering is done on the client computer, I want to try to limit the number of rows loaded into my table object to 50'000. I thought I could try this with a maxrows parameter or by limiting the dimensions of my table, but the property for limiting the dimensions is locked, so I can't use it.
Is this property locked because I don't have a KPI in my table, or am I doing something wrong?
I have seen some discussions where setting a default filtering criterion is suggested, but I would prefer simply limiting the number of loaded records, if that is possible.
If somebody has a suggestion for how to solve this problem, I would appreciate it very much.
Thank you very much for your help.
I meant using the Calculation Condition (as shown in the picture), not the Dimension Limitation. See attached app for a working example.
You probably can't do a filter search over 250M values. I suggest using some other fields to do the filtering.
Let's suppose you're currently loading your data as follows:
from source ;
Alter your script as follows:
from source ;
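The original snippets above are truncated, so here is a minimal sketch of what a limited load might look like. The table name, field name, and file path are assumptions on my part, not from the original reply; the `First` prefix restricts how many records the LOAD reads:

```qlik
// Hypothetical example: load only the first 50'000 records from the source.
// Table, field, and path names are placeholders.
SerialNumbers:
First 50000
LOAD
    SerialNumber
FROM [lib://Data/serials.qvd] (qvd);
```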
Thank you very much for your reply, but that's not what I'm looking for. If I load only 50'000 records and the serial number I search for is not among those 50'000 records, then it will not be found.
I need to have all records in the database, but it would be sufficient to display a lower number of rows if this improves performance. It looks like you can limit the number of rows in a table in QlikView, but I don't know how to do this in Qlik Sense.
I checked this option. It's enabled only if you have a KPI included in your table, and even then I'm not sure how to use it. From what I understand, it limits the rows to those whose KPI matches the criteria.
But I don't have a criterion at the beginning, and in principle I would be happy to use the table's search field. But there it takes forever, and making a selection is also terribly slow.
I'm not sure if a filter box would perform better.
Do you have another idea?
In any case, thank you very much for your help.
Try using the calculation condition (in the data handling section) to suppress calculation until you are dealing with a reasonable amount of data. Something like:
count(serialNumber) < 50000
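As a possibly cheaper variant (my suggestion, not from the reply above), `GetPossibleCount()` reads the number of possible values of a field directly, which is usually faster than aggregating `Count()` over hundreds of millions of rows:

```qlik
// Calculation condition (Data handling section of the table's properties).
// GetPossibleCount() is typically cheaper than Count() on very large fields.
GetPossibleCount(serialNumber) < 50000
```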
If you need to potentially have all the serial numbers, then you might think about a fact table that has all the unique serial numbers and then associate it with other tables that have information about the serial numbers. I doubt there are 250 million to 1 billion unique serial numbers. Divide your data into keys and attributes, and that may help.
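A sketch of the suggested split, with hypothetical table, field, and file names: a narrow key table holding only the distinct serial numbers, associated with an attribute table through the serial number itself:

```qlik
// Hypothetical data model: one narrow table with the distinct keys,
// attributes kept in a separate table linked by SerialNumber.
SerialKeys:
LOAD DISTINCT SerialNumber
FROM [lib://Data/facts.qvd] (qvd);

SerialAttributes:
LOAD SerialNumber,
     ProductionDate,
     Plant
FROM [lib://Data/attributes.qvd] (qvd);
```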
Thank you very much for your suggestion. Actually, there is one product where the range is even much bigger. But that's the drawback if you're in the business of producing a product everybody wants in sufficient quantity. If you check your wallet (not the compartment for coins or the section where you store your credit cards), you'll understand what I'm talking about.
I am trying to follow Rob's advice about using a dynamic set analysis to reduce the amount of data, but my first try ended with a timeout error during calculation.
Thank you very much for the hint. I tried to use the dimension limitation, but somehow I don't understand how it is supposed to work.
The amount of data is obviously so big that the app starts behaving strangely. It took about 5 minutes before the Limitation properties appeared after I changed the setting from no limitation to a limitation type.
If I use the calculation condition, is there a way to filter the data BEFORE loading them into the table?
Doing a reload based on filter criteria is not an option; that would take too long.
And it looks like calculating a KPI (I used the formula shown in the second screenshot above) runs into a timeout.
So it looks like I might have to stick with a table without KPIs, but I'm not sure yet.
If there is a way to filter the data before calculation without reloading them, I would be glad about any hint on how to do this.
I tried to define a set analysis expression to reduce the number of possible values. Would this be correct syntax? I don't see anything in my text box, so I wonder if it is the calculation time or my syntax.
The idea is to return 100 records.
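For reference, a set analysis expression of the kind being attempted might look like the sketch below. The variable name and the search pattern are assumptions, not the expression from the screenshot:

```qlik
// Hypothetical: count only serial numbers matching a user-supplied
// wildcard pattern stored in a variable vSearch (e.g. vSearch = AB123*).
Count({<SerialNumber = {"$(vSearch)"}>} DISTINCT SerialNumber)
```

If the dollar-sign expansion of the variable is empty or malformed, the whole expression evaluates to null, which would explain an empty text box independently of calculation time.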
Thanks a lot.