joybratas1
Contributor II

Sample Data in Straight Table based on Stratified Sampling

How can I get sample data in a straight table based on a sampling algorithm such as stratified sampling?

3 Replies
rwunderlich
Partner Ambassador/MVP

I'm not sure how you could do meaningful sampling in a chart. Can you describe your use case?

In the load script you can use the Sample load prefix to load a subset of data.

https://help.qlik.com/en-US/sense/November2022/Subsystems/Hub/Content/Sense_Hub/Scripting/ScriptPref...
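
As a rough illustration of the Sample prefix, here is a minimal sketch. The table name Source, its fields and the flag field IsSampled are made up for the example. The number after Sample is the probability that each row is kept, not a row count, so the size of the result varies from reload to reload.

```
// Hypothetical five-row source table, used only for illustration
Source:
LOAD * Inline [
Name, Segment
A, Business
B, Business
C, Service
D, Business
E, Service
];

// Each row of Source is kept with probability 0.4
Subset:
Sample 0.4
LOAD *,
  1 as IsSampled   // extra field keeps Subset from auto-concatenating into Source
Resident Source;
```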

If you want to do stratified sampling, you can loop through the values of your strata field and sample each slice individually. If you want proportional sampling, you can use the row count of each stratum to adjust the sample weight so that it reflects that stratum's proportion of the data.
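
To make the proportional case concrete, here is the usual proportional-allocation arithmetic with hypothetical numbers: a target of n = 3 rows out of N = 5, where the strata hold 3 and 2 rows.

```latex
\[
n_h = n \cdot \frac{N_h}{N}, \qquad
n_1 = 3 \cdot \frac{3}{5} = 1.8 \approx 2, \qquad
n_2 = 3 \cdot \frac{2}{5} = 1.2 \approx 1
\]
```

Because the Sample prefix works with a per-row probability, the number of rows actually loaded from each stratum can differ from these targets on any given reload.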

-Rob

joybratas1
Contributor II
Author

Thanks Rob, 

I managed to make some modifications to get the overall sample size and the size of each stratum, but I am not able to extract the values from each stratum.

For example:

Stratum #1 size: 2

Stratum #2 size: 1

 

My table:

Col1    Col2        Count(Col1)
Mr X    Business    1
Mr N    Business    1
Mr Y    Service     1
Mr A    Business    1
Mr P    Service     1

 

So my sample should look like this:

Mr X
Mr N
Mr Y

i.e. 2 from the first stratum and 1 from the second stratum.

rwunderlich
Partner Ambassador/MVP

I think a generalized Stratified sampling script could look like this:

Data:
LOAD * Inline [
Col1, Col2
Mr X,     Business
Mr N,     Business
Mr Y,      Service
Mr A,      Business
Mr P,      Service
];

// One row per stratum: its row count and its share of all Data rows
Strata:
LOAD
  *,
  Count / NoOfRows('Data') as Pct
;
LOAD
  Col2,
  Count(Col2) as Count
Resident Data
Group By Col2
;

Let vSampleRate = 1.0; // Overall sample rate
// Sample each stratum separately; the per-row probability is share * overall rate
For i = 0 to NoOfRows('Strata')-1
  Let vCol2 = Peek('Col2', $(i), 'Strata');
  Let vSampleSize = Peek('Pct', $(i), 'Strata') * vSampleRate;
  Samples:
  Sample $(vSampleSize)
  Load *,
    $(vSampleSize) as SampleSize  // Extra field to avoid auto concat
  Resident Data
  Where Col2 = '$(vCol2)'
  ;
Next i
Drop Table Data;

-Rob