Hi,
I have a very basic question for QV experts. I'm pulling lots of data (millions of rows), and one of my fields holds a 3-digit numeric value, the byte size to be exact. The field name is 'Bytes'.
Currently I'm seeing about 8 different byte sizes across the transactions. I would like to label everything under 200 bytes as 'small' and everything 200 bytes and above as 'large'.
Since I am dealing with a lot of data, I am not sure which is the best method from a processing point of view. What would be the best method to do so?
Option 1: Edit Script - can someone please help me write a line that would do this for me?
Option 2: Create a new dimension reference
My only concern with creating a new dimension is that I see maybe 8 different byte sizes so far. If I start to see more, will I have to keep adding to the dimension look-up that I created? Is there a way to capture every number that is added to Bytes? I would prefer not to list every single number below 200 as small and then every number from 200 up as large.
Example of Dimension reference:
Bytes (Column A) Size (Column B)
112 small
115 small
120 small
150 small
155 small
200 large
210 large
220 large
etc.
This can be done in your load statement itself. Create another field in your load with something like:
if(Bytes<200,'Small','Large') as Size
Now your Size field will have the value 'Small' or 'Large' for each of your Bytes values. If you have more buckets, just modify the condition in the if statement.
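As a rough sketch of how that line might sit inside a complete load: the table name Transactions, the file transactions.qvd, the SizeBand field and the 1000 cut-off are all made up for illustration, so swap in your own source and thresholds.

```qlik
// Table name and file path are placeholders for illustration only.
Transactions:
LOAD
    Bytes,
    // Two buckets, as in the line above: under 200 is Small, the rest Large.
    If(Bytes < 200, 'Small', 'Large') AS Size,
    // Three buckets via a nested If; the 1000 cut-off is a made-up example.
    If(Bytes < 200, 'Small',
        If(Bytes < 1000, 'Medium', 'Large')) AS SizeBand
FROM [transactions.qvd] (qvd);
```

Because the condition is a range test rather than a list of values, any new byte size that shows up in a later reload falls into a bucket without further changes to the script.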
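You can also add another field based on the number of digits, like:
LOAD
if( Len( Bytes ) = 3,'Small',
if( Len( Bytes ) = 5,'Large')) as Type
From location;
Note that this classifies by digit count rather than by value, so every 3-digit size (including 200 and above) comes out as 'Small', and any length other than 3 or 5 is left null.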
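On the worry about maintaining a look-up by hand: one possible alternative, assuming the data has already been loaded into a table here called Transactions (a hypothetical name) that does not itself contain a Size field, is to derive the look-up from the distinct Bytes values. It is only a sketch, but it would pick up new byte sizes on every reload.

```qlik
// Builds one row per distinct Bytes value found in the (hypothetical)
// Transactions table and labels it with the same range test.
SizeLookup:
LOAD DISTINCT
    Bytes,
    If(Bytes < 200, 'Small', 'Large') AS Size
RESIDENT Transactions;
```

The two tables then associate on the shared Bytes field. If Transactions also carried a Size field, the two common fields would produce a synthetic key, so only one of the two approaches should create Size.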